SG-VAD: Stochastic Gates Based Speech Activity Detection

Jonathan Svirsky, Ofir Lindenbaum

Abstract

We propose a novel voice activity detection (VAD) model for low-resource environments. Our key idea is to treat VAD as a denoising task and to construct a network that identifies nuisance features for a speech classification task. The model is trained to identify irrelevant features and to predict the type of speech event simultaneously. It contains only 7.8K parameters, outperforms previously proposed methods on the AVA-Speech evaluation set, and provides comparable results on the HAVIC dataset. We present its architecture, experimental results, and an ablation study of the model's components. The code and models are available at https://www.github.com/jsvir/vad.
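
The abstract does not spell out how irrelevant features are identified, so the following is only a minimal sketch of one common formulation of stochastic gates: a clipped-Gaussian relaxation of Bernoulli gates with a penalty on the expected number of open gates. The function names, the value of sigma, and the toy inputs are assumptions made for illustration and are not taken from the jsvir/vad repository.

```python
import math
import random


def sample_gates(mu, sigma=0.5, rng=None, training=True):
    """Return one gate per feature: z = clip(mu + sigma * eps, 0, 1).

    During training eps ~ N(0, 1); at inference the noise is dropped,
    so each gate is simply clip(mu, 0, 1).
    """
    rng = rng or random.Random()
    gates = []
    for m in mu:
        eps = rng.gauss(0.0, 1.0) if training else 0.0
        gates.append(min(1.0, max(0.0, m + sigma * eps)))
    return gates


def expected_open_gates(mu, sigma=0.5):
    """Sparsity penalty: sum of P(z > 0) = Phi(mu / sigma) over all gates."""
    return sum(0.5 * (1.0 + math.erf(m / (sigma * math.sqrt(2.0)))) for m in mu)


def apply_gates(features, gates):
    """Multiply each feature by its gate; a gate of 0 removes the feature."""
    return [x * z for x, z in zip(features, gates)]


# Hypothetical gate parameters: the first feature is kept, the last is dropped.
mu = [2.0, 0.5, -2.0]
features = [1.0, 1.0, 1.0]

gates = sample_gates(mu, training=False)
denoised = apply_gates(features, gates)
penalty = expected_open_gates(mu)
```

In a setup like this, the penalty would be added to the classification loss with a weighting coefficient, so that gates of features that do not help predict the speech event are pushed toward zero.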

Code Repositories

jsvir/vad (official implementation, PyTorch)

Benchmarks

Benchmark | Methodology | Metrics
activity-detection-on-ava-speech | SG-VAD (ours) | ROC-AUC: 94.3
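
For readers unfamiliar with the metric, ROC-AUC is the probability that a randomly chosen speech frame receives a higher score than a randomly chosen non-speech frame, with ties counted as half. The sketch below computes it by direct pairwise comparison; the labels and scores are made-up toy values, not data from AVA-Speech, and the benchmark figure above appears to be this quantity expressed as a percentage.

```python
def roc_auc(labels, scores):
    """ROC-AUC as the fraction of (positive, negative) pairs ranked correctly.

    labels: 1 for speech, 0 for non-speech. Ties count as half a correct pair.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Toy example (hypothetical scores): 5 of the 6 pairs are ordered correctly.
labels = [1, 1, 1, 0, 0]
scores = [0.9, 0.8, 0.3, 0.4, 0.1]
auc = roc_auc(labels, scores)
```

The pairwise form is quadratic in the number of frames, so it suits small illustrations; a rank-based computation gives the same value on large evaluation sets.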
