A neural attention model for speech command recognition

Douglas Coimbra de Andrade; Sabato Leo; Martin Loesener Da Silva Viana; Christoph Bernkopf

Abstract

This paper introduces a convolutional recurrent network with attention for speech command recognition. Attention models are powerful tools for improving performance on natural language, image captioning and speech tasks. The proposed model establishes a new state-of-the-art accuracy of 94.1% on the Google Speech Commands dataset V1 and 94.5% on V2 (for the 20-command recognition task), while keeping a small footprint of only 202K trainable parameters. Results are compared with previous convolutional implementations on 5 different tasks: 20-command recognition (V1 and V2), 12-command recognition (V1), 35-word recognition (V1) and left-right (V1). We show detailed performance results and demonstrate that the proposed attention mechanism not only improves performance but also allows inspecting which regions of the audio the network took into consideration when outputting a given category.
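
To make the idea of attention over recurrent outputs concrete, here is a minimal NumPy sketch of attention-weighted pooling. It is an illustration under stated assumptions, not the authors' implementation: the function name `attention_pool`, the projection matrix `w_query`, the choice of the last time step as the query, and the random stand-in for the recurrent outputs are all hypothetical. The returned weights are what one would plot against time to see which audio regions contributed most.

```python
import numpy as np

def softmax(x):
    """Numerically stable softmax over a 1-D array."""
    e = np.exp(x - np.max(x))
    return e / e.sum()

def attention_pool(hidden, w_query, query_index=-1):
    """Pool a sequence of recurrent outputs with dot-product attention.

    hidden:      (T, D) array, one D-dim vector per time step (assumed RNN output)
    w_query:     (D, D) projection that turns one time step into a query (assumed)
    query_index: which time step supplies the query (assumption: the last one)
    """
    query = hidden[query_index] @ w_query   # (D,) query vector
    scores = hidden @ query                 # (T,) one score per time step
    weights = softmax(scores)               # (T,) non-negative, sums to 1
    context = weights @ hidden              # (D,) weighted average of time steps
    return context, weights

# Stand-in data: 50 time steps of 16-dim features (random, for illustration only).
rng = np.random.default_rng(0)
hidden = rng.standard_normal((50, 16))
w_query = rng.standard_normal((16, 16)) * 0.1

context, weights = attention_pool(hidden, w_query)
print(context.shape, weights.shape)
```

In a full model the `context` vector would feed a small classifier over the command categories, and `weights` could be inspected per utterance.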

Benchmarks

Benchmark: keyword-spotting-on-google-speech-commands
Methodology: Attention RNN
Metrics:
Google Speech Commands V1 12: 95.6
Google Speech Commands V1 2: 99.2
Google Speech Commands V1 20: 94.1
Google Speech Commands V1 35: 94.3
Google Speech Commands V2 12: 96.9
Google Speech Commands V2 2: 99.4
Google Speech Commands V2 20: 94.5
Google Speech Commands V2 35: 93.9
