
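Abstract
Vibravox is a dataset compliant with the General Data Protection Regulation (GDPR) containing audio recordings made with five different body-conduction audio sensors: two in-ear microphones, two bone-conduction vibration pickups, and a laryngophone. The dataset also includes audio from an airborne microphone used as a reference. The Vibravox corpus contains 38 hours of speech samples and physiological sounds recorded by 188 participants under different acoustic conditions imposed by a high-order Ambisonics 3D spatializer. Annotations of the recording conditions and linguistic transcriptions are also included in the corpus. We conducted a series of experiments on several speech-related tasks, including speech recognition, speech enhancement, and speaker verification. These experiments used state-of-the-art models to evaluate and compare performance on the signals captured by the different audio sensors, with the aim of better understanding the individual characteristics of each sensor.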
Code Repositories
jhauret/vibravox (official, PyTorch, mentioned in GitHub)
Benchmarks
| Benchmark | Method | Metrics |
|---|---|---|
| automatic-phoneme-recognition-on-vibravox | medium wav2vec2.0 | Test PER: 0.028 |
| automatic-phoneme-recognition-on-vibravox-1 | medium wav2vec2.0 | Test PER: 0.046 |
| automatic-phoneme-recognition-on-vibravox-2 | medium wav2vec2.0 | Test PER: 0.041 |
| automatic-phoneme-recognition-on-vibravox-3 | medium wav2vec2.0 | Test PER: 0.045 |
| automatic-phoneme-recognition-on-vibravox-4 | medium wav2vec2.0 | Test PER: 0.073 |
| automatic-phoneme-recognition-on-vibravox-5 | medium wav2vec2.0 | Test PER: 0.142 |
| bandwidth-extension-on-vibravox | Configurable EBEN (M=4, P=2, Q=4) | EER (ECAPA2): 0.0364; Noresqua-MOS: 4.285; PER (wav2vec2): 0.084; STOI: 0.877 |
| bandwidth-extension-on-vibravox-forehead | Configurable EBEN (M=4, P=4, Q=4) | EER (ECAPA2): 0.0183; Noresqua-MOS: 4.250; PER (wav2vec2): 0.091; STOI: 0.855 |
| bandwidth-extension-on-vibravox-soft-in-ear | Configurable EBEN (M=4, P=2, Q=4) | EER (ECAPA2): 0.0488; Noresqua-MOS: 4.331; PER (wav2vec2): 0.087; STOI: 0.868 |
| bandwidth-extension-on-vibravox-temple | Configurable EBEN (M=4, P=1, Q=4) | EER (ECAPA2): 0.1622; Noresqua-MOS: 3.632; PER (wav2vec2): 0.391; STOI: 0.763 |
| bandwidth-extension-on-vibravox-throat | Configurable EBEN (M=4, P=2, Q=4) | EER (ECAPA2): 0.0847; Noresqua-MOS: 3.862; PER (wav2vec2): 0.179; STOI: 0.834 |
| speaker-verification-on-vibravox-forehead | ECAPA2 | Test EER: 0.009; Test min-DCF: 0.06 |
| speaker-verification-on-vibravox-headset | ECAPA2 | Test EER: 0.0026; Test min-DCF: 0.02 |
| speaker-verification-on-vibravox-rigid-in-ear | ECAPA2 | Test EER: 0.0316; Test min-DCF: 0.21 |
| speaker-verification-on-vibravox-soft-in-ear | ECAPA2 | Test EER: 0.0172; Test min-DCF: 0.10 |
| speaker-verification-on-vibravox-temple | ECAPA2 | Test EER: 0.08; Test min-DCF: 0.58 |
| speaker-verification-on-vibravox-throat | ECAPA2 | Test EER: 0.0353; Test min-DCF: 0.20 |
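
To make the two most frequent metrics in the table concrete, the following is a minimal sketch of how a phoneme error rate (PER) and an equal error rate (EER) are commonly computed. It is not the evaluation code from the jhauret/vibravox repository: the function names, the simple threshold sweep used for the EER, and the toy phoneme sequences and scores are all assumptions made for illustration.

```python
from typing import Sequence


def edit_distance(ref: Sequence[str], hyp: Sequence[str]) -> int:
    """Levenshtein distance between two phoneme sequences."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def phoneme_error_rate(refs: Sequence[Sequence[str]],
                       hyps: Sequence[Sequence[str]]) -> float:
    """Total edit errors divided by total number of reference phonemes."""
    errors = sum(edit_distance(r, h) for r, h in zip(refs, hyps))
    n_ref = sum(len(r) for r in refs)
    return errors / n_ref


def equal_error_rate(target_scores: Sequence[float],
                     nontarget_scores: Sequence[float]) -> float:
    """Approximate EER: sweep the observed scores as thresholds and return
    the mean of FAR and FRR where the two rates are closest."""
    thresholds = sorted(set(target_scores) | set(nontarget_scores))
    best_gap, best_eer = float("inf"), 1.0
    for t in thresholds:
        # False rejection: same-speaker trials scored below the threshold.
        frr = sum(s < t for s in target_scores) / len(target_scores)
        # False acceptance: different-speaker trials scored at or above it.
        far = sum(s >= t for s in nontarget_scores) / len(nontarget_scores)
        gap = abs(far - frr)
        if gap < best_gap:
            best_gap, best_eer = gap, (far + frr) / 2
    return best_eer


# Toy data (hypothetical, not taken from Vibravox).
refs = [["a", "b", "c"], ["d", "e"]]
hyps = [["a", "x", "c"], ["d", "e"]]
per = phoneme_error_rate(refs, hyps)  # 1 substitution over 5 phonemes

target = [0.9, 0.8, 0.7, 0.3]      # same-speaker trial scores
nontarget = [0.6, 0.4, 0.2, 0.1]   # different-speaker trial scores
eer = equal_error_rate(target, nontarget)
print(per, eer)
```

Under these definitions, lower is better for both metrics, which is how the PER and EER columns above should be read. Production toolkits typically interpolate the ROC curve rather than sweeping raw scores, so values from this sketch may differ slightly from reported ones.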