5 months ago

Dual Knowledge Distillation for Efficient Sound Event Detection

Xiao Yang; Das Rohan Kumar

Abstract

Sound event detection (SED) is essential for recognizing specific sounds and their temporal locations within acoustic signals. This becomes challenging particularly for on-device applications, where computational resources are limited. To address this issue, we introduce a novel framework, referred to as dual knowledge distillation, for developing efficient SED systems. Our proposed dual knowledge distillation commences with temporal-averaging knowledge distillation (TAKD), utilizing a mean student model derived from the temporal averaging of the student model's parameters. This allows the student model to indirectly learn from a pre-trained teacher model, ensuring stable knowledge distillation. Subsequently, we introduce embedding-enhanced feature distillation (EEFD), which incorporates an embedding distillation layer within the student model to bolster contextual learning. On the DCASE 2023 Task 4A public evaluation dataset, our proposed SED system with dual knowledge distillation, having merely one-third of the baseline model's parameters, demonstrates superior performance in terms of PSDS1 and PSDS2. This highlights the importance of the proposed dual knowledge distillation for compact SED systems, which can be ideal for edge devices.
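
The abstract describes the mean student model only as a temporal average of the student model's parameters. A minimal sketch of one common way to form such an average is given below, assuming an exponential moving average (EMA) with a fixed decay. The function name `update_mean_student`, the `decay` value, and the dictionary-of-lists parameter layout are illustrative assumptions, not details taken from the paper.

```python
from typing import Dict, List

# Hypothetical parameter layout: parameter name -> flat list of weights.
Params = Dict[str, List[float]]


def update_mean_student(mean_params: Params,
                        student_params: Params,
                        decay: float = 0.999) -> Params:
    """Return a new mean-student parameter set after one averaging step.

    Assumes an exponential moving average:
        mean <- decay * mean + (1 - decay) * student
    The paper's exact averaging rule may differ.
    """
    if not 0.0 <= decay <= 1.0:
        raise ValueError("decay must lie in [0, 1]")
    updated: Params = {}
    for name, student_values in student_params.items():
        mean_values = mean_params[name]
        updated[name] = [
            decay * m + (1.0 - decay) * s
            for m, s in zip(mean_values, student_values)
        ]
    return updated


# Toy usage: one averaging step with a deliberately small decay.
student = {"w": [1.0, 2.0]}
mean_student = {"w": [0.0, 0.0]}
mean_student = update_mean_student(mean_student, student, decay=0.5)
```

In a training loop, a step like this would typically run after each optimizer update of the student, so the mean student changes more slowly than the student itself. This slower drift is the usual motivation for averaging parameters over time.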

Benchmarks

Benchmark: sound-event-detection-on-desed
Methodology: SE-CRNN-16 with DualKD
Metrics:
PSDS1: 0.474
PSDS2: 0.698
event-based F1 score: 55.6
