Jan Schlüter, Gerald Gutenbrunner

Abstract
In audio classification, differentiable auditory filterbanks with few parameters cover the middle ground between hard-coded spectrograms and raw audio. LEAF (arXiv:2101.08596), a Gabor-based filterbank combined with Per-Channel Energy Normalization (PCEN), has shown promising results, but is computationally expensive. With inhomogeneous convolution kernel sizes and strides, and by replacing PCEN with better parallelizable operations, we can reach similar results more efficiently. In experiments on six audio classification tasks, our frontend matches the accuracy of LEAF at 3% of the cost, but both fail to consistently outperform a fixed mel filterbank. The quest for learnable audio frontends is not solved.
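
For readers unfamiliar with PCEN, the following is a minimal NumPy sketch of the operation as it is commonly defined: a per-channel first-order smoother, followed by adaptive gain control and root compression. It is not the authors' implementation. The function name `pcen` and all default parameter values are illustrative assumptions, not settings taken from this paper. The loop over frames shows why the smoother is hard to parallelize over time: each smoothed frame depends on the previous one.

```python
import numpy as np


def pcen(energy, s=0.04, alpha=0.96, delta=2.0, r=0.5, eps=1e-12):
    """Apply PCEN to a (channels, frames) array of non-negative energies.

    All defaults are illustrative assumptions, not values from the paper.
    """
    energy = np.asarray(energy, dtype=np.float64)
    smoothed = np.empty_like(energy)

    # First-order IIR smoother per channel:
    #   M[t] = (1 - s) * M[t - 1] + s * E[t]
    # Frame t needs frame t - 1, so this loop is sequential in time.
    smoothed[:, 0] = energy[:, 0]
    for t in range(1, energy.shape[1]):
        smoothed[:, t] = (1.0 - s) * smoothed[:, t - 1] + s * energy[:, t]

    # Adaptive gain control (division by the smoothed energy),
    # then root compression with offset delta.
    return (energy / (eps + smoothed) ** alpha + delta) ** r - delta ** r


# Usage: three channels, five frames of constant energy.
normalized = pcen(np.ones((3, 5)))
```

In a learnable frontend, `s`, `alpha`, `delta` and `r` would typically be trainable per-channel parameters. They are fixed scalars here only to keep the sketch short.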
Benchmarks
| Benchmark | Methodology | Metrics |
|---|---|---|
| audio-classification-on-birdclef-2021 | melspect | Accuracy: 39.9 |
| audio-classification-on-birdclef-2021 | EfficientLEAF | Accuracy: 42.9 |
| audio-classification-on-birdclef-2021 | LEAF | Accuracy: 42.3 |
| audio-classification-on-birdclef-2021 | EfficientLEAF (8s) | Accuracy: 72.2 |
| audio-classification-on-crema-d | LEAF | Accuracy: 50.2 |
| audio-classification-on-crema-d | melspect | Accuracy: 58.8 |
| audio-classification-on-crema-d | EfficientLEAF | Accuracy: 60.2 |
| audio-classification-on-speech-commands-1 | melspect | Accuracy: 95.1 |
| audio-classification-on-speech-commands-1 | EfficientLEAF | Accuracy: 95.2 |
| audio-classification-on-speech-commands-1 | LEAF | Accuracy: 95.1 |
| instrument-recognition-on-nsynth | EfficientLEAF | Accuracy: 71.7 |
| instrument-recognition-on-nsynth | LEAF | Accuracy: 69.2 |
| instrument-recognition-on-nsynth | melspect | Accuracy: 72.1 |
| spoken-language-identification-on-voxforge-2 | LEAF | Accuracy: 91.5 |
| spoken-language-identification-on-voxforge-2 | EfficientLEAF | Accuracy: 86.6 |
| spoken-language-identification-on-voxforge-2 | melspect | Accuracy: 85.6 |