G-Augment: Searching for the Meta-Structure of Data Augmentation Policies for ASR

Gary Wang, Ekin D. Cubuk, Andrew Rosenberg, Shuyang Cheng, Ron J. Weiss, Bhuvana Ramabhadran, Pedro J. Moreno, Quoc V. Le, Daniel S. Park

Abstract

Data augmentation is a ubiquitous technique used to provide robustness to automatic speech recognition (ASR) training. However, even as so much of the ASR training process has become automated and more "end-to-end", the data augmentation policy (which augmentation functions to use, and how to apply them) remains hand-crafted. We present Graph-Augment (G-Augment), a technique that defines the augmentation space as directed acyclic graphs (DAGs) and searches over this space to optimize the augmentation policy itself. We show that, given the same computational budget, policies produced by G-Augment outperform SpecAugment policies obtained by random search on fine-tuning tasks on CHiME-6 and AMI. G-Augment also establishes a new state-of-the-art ASR performance on the CHiME-6 evaluation set (30.7% WER). We further demonstrate that G-Augment policies transfer better than random-searched SpecAugment policies, both from warm-start to cold-start training and across model sizes.
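To make the idea of a graph-structured augmentation policy concrete, here is a minimal Python sketch. It is not the authors' implementation: the abstract does not specify the node operations, how branches are merged, or the search algorithm, so everything below is an assumption chosen for illustration. The operation names (`identity`, `half_gain`, `add_noise`, `time_mask`), the averaging of multiple parents, and the helpers `apply_policy`, `sample_policy`, and `random_search` are all hypothetical. The signal is a plain list of floats standing in for audio features, and random search is used only as a simple stand-in for whatever search procedure G-Augment uses.

```python
import random
from graphlib import TopologicalSorter  # standard library, Python 3.9+


# Hypothetical toy augmentation ops on a 1-D list of floats.
def identity(x, rng):
    return list(x)


def half_gain(x, rng):
    return [0.5 * v for v in x]


def add_noise(x, rng):
    return [v + rng.gauss(0.0, 0.1) for v in x]


def time_mask(x, rng):
    # Zero out one contiguous span of random width and position.
    if not x:
        return []
    width = rng.randint(0, max(1, len(x) // 4))
    start = rng.randint(0, len(x) - width)
    return [0.0 if start <= i < start + width else v for i, v in enumerate(x)]


OPS = {
    "identity": identity,
    "half_gain": half_gain,
    "add_noise": add_noise,
    "time_mask": time_mask,
}


def apply_policy(policy, output_node, signal, rng):
    """Run a DAG policy: {node: (op_name, [parent nodes])}.

    Nodes without parents read the raw signal. Nodes with several
    parents average their parents' outputs (an assumption of this sketch).
    """
    graph = {name: parents for name, (_, parents) in policy.items()}
    # static_order() yields parents before children and raises
    # graphlib.CycleError if the graph is not acyclic.
    order = TopologicalSorter(graph).static_order()
    results = {}
    for name in order:
        op_name, parents = policy[name]
        if parents:
            inputs = [results[p] for p in parents]
            merged = [sum(vals) / len(vals) for vals in zip(*inputs)]
        else:
            merged = list(signal)
        results[name] = OPS[op_name](merged, rng)
    return results[output_node]


def sample_policy(num_nodes, rng):
    """Sample a random DAG: each node may only pick earlier nodes as parents."""
    policy = {}
    for i in range(num_nodes):
        earlier = [f"n{j}" for j in range(i)]
        k = rng.randint(0, min(2, len(earlier)))
        parents = rng.sample(earlier, k)
        policy[f"n{i}"] = (rng.choice(sorted(OPS)), parents)
    return policy, f"n{num_nodes - 1}"


def random_search(score_fn, trials, num_nodes, rng):
    """Return (score, policy, output_node) with the lowest score seen."""
    best = None
    for _ in range(trials):
        policy, out = sample_policy(num_nodes, rng)
        score = score_fn(policy, out)
        if best is None or score < best[0]:
            best = (score, policy, out)
    return best
```

In a real setting, `score_fn` would train or fine-tune an ASR model with the candidate policy and return its dev-set WER; here it is left to the caller so the sketch stays self-contained.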

Benchmarks

| Benchmark | Methodology | Word Error Rate (WER, %) |
| --- | --- | --- |
| speech-recognition-on-chime-6-dev-gss12 | ConformerXXL-PS + G-Augment | 26 |
| speech-recognition-on-chime-6-eval | ConformerXXL-PS + G-Augment | 30.7 |
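
For readers unfamiliar with the metric reported above, word error rate is conventionally computed as the word-level edit distance (substitutions, deletions, and insertions) between a reference transcript and the recognizer's hypothesis, divided by the number of reference words. The following is a minimal sketch of that standard definition; the function name `word_error_rate` is hypothetical, whitespace tokenization is an assumption, and benchmark scoring tools typically apply additional text normalization that is not shown here.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference prefix
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion of a reference word
                curr[j - 1] + 1,     # insertion of a hypothesis word
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref)
```

Under this definition, a WER of 30.7% corresponds to roughly 31 word errors per 100 reference words. Because insertions are counted, the value can exceed 1.0 when the hypothesis is much longer than the reference.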
