Vibravox: A Dataset of French Speech Captured with Body-conduction Audio Sensors
Julien Hauret, Malo Olivier, Thomas Joubaud, Christophe Langrenne, Sarah Poirée, Véronique Zimpfer, Éric Bavu

Abstract
Vibravox is a dataset compliant with the General Data Protection Regulation (GDPR) containing audio recordings made with five different body-conduction audio sensors: two in-ear microphones, two bone-conduction vibration pickups, and a laryngophone. The dataset also includes audio data from an airborne microphone used as a reference. The Vibravox corpus contains 38 hours of speech samples and physiological sounds recorded by 188 participants under different acoustic conditions imposed by a high-order ambisonics 3D spatializer. Annotations about the recording conditions and linguistic transcriptions are also included in the corpus. We conducted a series of experiments on various speech-related tasks, including speech recognition, speech enhancement, and speaker verification. These experiments were carried out using state-of-the-art models to evaluate and compare their performance on signals captured by the different audio sensors offered by the Vibravox dataset, with the aim of gaining a better grasp of their individual characteristics.
Benchmarks
| Benchmark | Methodology | Metrics |
|---|---|---|
| automatic-phoneme-recognition-on-vibravox | medium wav2vec2.0 | Test PER: 0.028 |
| automatic-phoneme-recognition-on-vibravox-1 | medium wav2vec2.0 | Test PER: 0.046 |
| automatic-phoneme-recognition-on-vibravox-2 | medium wav2vec2.0 | Test PER: 0.041 |
| automatic-phoneme-recognition-on-vibravox-3 | medium wav2vec2.0 | Test PER: 0.045 |
| automatic-phoneme-recognition-on-vibravox-4 | medium wav2vec2.0 | Test PER: 0.073 |
| automatic-phoneme-recognition-on-vibravox-5 | medium wav2vec2.0 | Test PER: 0.142 |
| bandwidth-extension-on-vibravox | Configurable EBEN (M=4, P=2, Q=4) | EER (ECAPA2): 0.0364; Noresqua-MOS: 4.285; PER (wav2vec2): 0.084; STOI: 0.877 |
| bandwidth-extension-on-vibravox-forehead | Configurable EBEN (M=4, P=4, Q=4) | EER (ECAPA2): 0.0183; Noresqua-MOS: 4.250; PER (wav2vec2): 0.091; STOI: 0.855 |
| bandwidth-extension-on-vibravox-soft-in-ear | Configurable EBEN (M=4, P=2, Q=4) | EER (ECAPA2): 0.0488; Noresqua-MOS: 4.331; PER (wav2vec2): 0.087; STOI: 0.868 |
| bandwidth-extension-on-vibravox-temple | Configurable EBEN (M=4, P=1, Q=4) | EER (ECAPA2): 0.1622; Noresqua-MOS: 3.632; PER (wav2vec2): 0.391; STOI: 0.763 |
| bandwidth-extension-on-vibravox-throat | Configurable EBEN (M=4, P=2, Q=4) | EER (ECAPA2): 0.0847; Noresqua-MOS: 3.862; PER (wav2vec2): 0.179; STOI: 0.834 |
| speaker-verification-on-vibravox-forehead | ECAPA2 | Test EER: 0.009; Test min-DCF: 0.06 |
| speaker-verification-on-vibravox-headset | ECAPA2 | Test EER: 0.0026; Test min-DCF: 0.02 |
| speaker-verification-on-vibravox-rigid-in-ear | ECAPA2 | Test EER: 0.0316; Test min-DCF: 0.21 |
| speaker-verification-on-vibravox-soft-in-ear | ECAPA2 | Test EER: 0.0172; Test min-DCF: 0.10 |
| speaker-verification-on-vibravox-temple | ECAPA2 | Test EER: 0.08; Test min-DCF: 0.58 |
| speaker-verification-on-vibravox-throat | ECAPA2 | Test EER: 0.0353; Test min-DCF: 0.20 |
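
For readers unfamiliar with two of the metrics reported in the table, the sketch below shows how a phoneme error rate (PER) and an equal error rate (EER) are commonly computed. It is a minimal illustration under stated assumptions, not the evaluation code used to produce the numbers above: the function names, the corpus-level PER aggregation, and the simple threshold sweep for the EER are all hypothetical choices made for clarity.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences of phoneme symbols."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def phoneme_error_rate(refs, hyps):
    """Corpus-level PER: total edit operations / total reference phonemes.

    Assumption: errors are pooled over all utterances before dividing.
    """
    total = sum(len(r) for r in refs)
    if total == 0:
        raise ValueError("reference transcripts are empty")
    errors = sum(edit_distance(r, h) for r, h in zip(refs, hyps))
    return errors / total


def equal_error_rate(target_scores, nontarget_scores):
    """Approximate EER by sweeping thresholds over the observed scores.

    A trial is accepted when its score is >= the threshold. The returned
    value is the mean of the false acceptance and false rejection rates at
    the threshold where they are closest (no interpolation is performed).
    """
    thresholds = sorted(set(target_scores) | set(nontarget_scores))
    best_gap, best_eer = None, None
    for t in thresholds:
        far = sum(s >= t for s in nontarget_scores) / len(nontarget_scores)
        frr = sum(s < t for s in target_scores) / len(target_scores)
        gap = abs(far - frr)
        if best_gap is None or gap < best_gap:
            best_gap, best_eer = gap, (far + frr) / 2
    return best_eer


# Toy data, invented for illustration only.
per = phoneme_error_rate([["a", "b", "c", "d"]], [["a", "x", "c", "d"]])
eer = equal_error_rate([0.9, 0.8, 0.7, 0.4], [0.6, 0.3, 0.2, 0.1])
```

On the toy data, one substitution out of four reference phonemes gives a PER of 0.25, and the EER is 0.25 because one of four target trials and one of four non-target trials fall on the wrong side of the best threshold. In both cases lower values are better, which is how the PER and EER columns in the table should be read.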