TalkNCE: Improving Active Speaker Detection with Talk-Aware Contrastive Learning

Jung Chaeyoung; Lee Suyeon; Nam Kihyun; Rho Kyeongha; Kim You Jin; Jang Youngjoon; Chung Joon Son

Abstract

The goal of this work is Active Speaker Detection (ASD), a task to determine whether a person is speaking or not in a series of video frames. Previous works have dealt with the task by exploring network architectures, while learning effective representations has been less explored. In this work, we propose TalkNCE, a novel talk-aware contrastive loss. The loss is only applied to part of the full segments where a person on the screen is actually speaking. This encourages the model to learn effective representations through the natural correspondence of speech and facial movements. Our loss can be jointly optimized with the existing objectives for training ASD models without the need for additional supervision or training data. The experiments demonstrate that our loss can be easily integrated into the existing ASD frameworks, improving their performance. Our method achieves state-of-the-art performance on the AVA-ActiveSpeaker and ASW datasets.
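
The idea of restricting a contrastive loss to the speaking portion of a clip can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes per-frame audio and visual embeddings are already available, uses a one-directional (visual-to-audio) InfoNCE with cosine similarity, and treats other speaking frames of the same clip as negatives. The function name `talk_aware_nce`, the `temperature` value, and the choice of negatives are all assumptions made for this sketch; the exact formulation in the paper may differ.

```python
import math


def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def talk_aware_nce(audio, visual, speaking, temperature=0.1):
    """Frame-level InfoNCE computed only over frames labeled as speaking.

    audio, visual: lists of per-frame embedding vectors (same length).
    speaking:      list of 0/1 labels, 1 where the on-screen person speaks.
    The name, temperature, and negative sampling are hypothetical choices.
    """
    # Keep only the frames where the person is actually speaking.
    active = [t for t, label in enumerate(speaking) if label]
    if len(active) < 2:
        return 0.0  # no negatives available, so the term is skipped

    a = [audio[t] for t in active]
    v = [visual[t] for t in active]

    total = 0.0
    for i in range(len(active)):
        # Positive: audio at the same frame. Negatives: audio at other
        # speaking frames of the same clip.
        logits = [cosine(v[i], a[j]) / temperature for j in range(len(active))]
        peak = max(logits)  # subtract the max for a stable log-sum-exp
        log_denominator = peak + math.log(sum(math.exp(x - peak) for x in logits))
        total += log_denominator - logits[i]
    return total / len(active)
```

Under the same assumptions, joint optimization with an existing ASD objective would amount to adding this term to the usual classification loss with some weight, for example `total_loss = asd_loss + weight * talk_aware_nce(audio, visual, speaking)`, where `asd_loss` and `weight` are placeholders rather than values taken from the paper. Because the speaking labels are the ones already used for ASD training, no extra supervision is needed.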

Benchmarks

Benchmark: audio-visual-active-speaker-detection-on-ava
Methodology: LoCoNet+TalkNCE
Metrics: validation mean average precision: 95.5%
