a month ago

Advancing Speech Understanding in Speech-Aware Language Models with GRPO

Avishai Elmakies Hagai Aronowitz Nimrod Shabtay Eli Schwartz Ron Hoory Avihu Dekel

Abstract

In this paper, we introduce a Group Relative Policy Optimization (GRPO)-based method for training Speech-Aware Large Language Models (SALLMs) on open-format speech understanding tasks, such as Spoken Question Answering and Automatic Speech Translation. SALLMs have proven highly effective for speech understanding tasks. GRPO has recently gained traction for its efficiency in training LLMs, and prior work has explored its application to SALLMs, primarily in multiple-choice tasks. Building on this, we focus on open-format tasks that better reflect the generative abilities of the models. Our approach leverages GRPO with BLEU as the reward signal to optimize SALLMs, and we demonstrate empirically that it surpasses standard SFT across several key metrics. Finally, we explore the potential of incorporating off-policy samples within GRPO for these tasks, highlighting avenues for further improvement and research.
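To make the idea of "GRPO with BLEU as the reward signal" concrete, the following is a minimal sketch, not the authors' implementation. It assumes a simplified, add-one-smoothed sentence-level BLEU over whitespace tokens (the abstract does not say which BLEU variant or tokenizer is used) and the commonly used GRPO normalization, in which each sampled response's reward is standardized against the other responses sampled for the same prompt. The names `simple_bleu` and `group_relative_advantages` are hypothetical and introduced only for illustration.

```python
import math
from collections import Counter
from statistics import mean, pstdev


def ngrams(tokens, n):
    """Count all n-grams of length n in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def simple_bleu(hypothesis, reference, max_n=4):
    """Simplified sentence-level BLEU in [0, 1] (assumption: add-one
    smoothing and whitespace tokenization, not a standard toolkit)."""
    hyp, ref = hypothesis.split(), reference.split()
    if not hyp:
        return 0.0
    log_precision = 0.0
    for n in range(1, max_n + 1):
        hyp_counts, ref_counts = ngrams(hyp, n), ngrams(ref, n)
        # Counter intersection keeps the minimum count, i.e. clipped matches.
        overlap = sum((hyp_counts & ref_counts).values())
        total = sum(hyp_counts.values())
        log_precision += math.log((overlap + 1) / (total + 1)) / max_n
    # Brevity penalty discourages hypotheses shorter than the reference.
    brevity_penalty = min(1.0, math.exp(1 - len(ref) / len(hyp)))
    return brevity_penalty * math.exp(log_precision)


def group_relative_advantages(rewards, eps=1e-8):
    """Standardize rewards within one group of samples for the same prompt."""
    mu, sigma = mean(rewards), pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]


# Hypothetical group of sampled translations for a single spoken input.
reference = "the cat sat on the mat"
candidates = [
    "the cat sat on the mat",
    "a cat is sitting on the mat",
    "the dog ran away",
]
rewards = [simple_bleu(c, reference) for c in candidates]
advantages = group_relative_advantages(rewards)
for candidate, reward, advantage in zip(candidates, rewards, advantages):
    print(f"{reward:.3f}  {advantage:+.3f}  {candidate}")
```

In this sketch, responses that score above the group's mean BLEU receive positive advantages and those below receive negative ones, so the policy update favors the better candidates without requiring a separate learned value model.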
