Hyperbolic Audio-visual Zero-shot Learning

Jie Hong, Zeeshan Hayder, Junlin Han, Pengfei Fang, Mehrtash Harandi, Lars Petersson

Abstract

Audio-visual zero-shot learning aims to classify samples consisting of a pair of corresponding audio and video sequences from classes that are not present during training. An analysis of the audio-visual data reveals a large degree of hyperbolicity, indicating the potential benefit of using a hyperbolic transformation to achieve curvature-aware geometric learning, with the aim of exploring more complex hierarchical data structures for this task. The proposed approach employs a novel loss function that incorporates cross-modality alignment between video and audio features in the hyperbolic space. Additionally, we explore the use of multiple adaptive curvatures for hyperbolic projections. The experimental results on this very challenging task demonstrate that our proposed hyperbolic approach for zero-shot learning outperforms the SOTA method on three datasets: VGGSound-GZSL, UCF-GZSL, and ActivityNet-GZSL, achieving a harmonic mean (HM) improvement of around 3.0%, 7.0%, and 5.3%, respectively.
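To make the idea of a hyperbolic projection with several curvatures concrete, here is a minimal NumPy sketch. It assumes the Poincaré ball model and the exponential map at the origin, neither of which is specified in the abstract, and it uses fixed curvature values where the paper describes adaptive ones. The function names (`expmap0`, `mobius_add`, `poincare_dist`, `project_multi`) are hypothetical and chosen for illustration only; this is not the authors' implementation or loss.

```python
import numpy as np


def expmap0(v, c=1.0, eps=1e-9):
    """Map Euclidean vectors v onto the Poincare ball of curvature -c
    using the exponential map at the origin."""
    v = np.asarray(v, dtype=np.float64)
    sqrt_c = np.sqrt(c)
    # Clamp the norm away from zero to avoid division by zero.
    norm = np.maximum(np.linalg.norm(v, axis=-1, keepdims=True), eps)
    return np.tanh(sqrt_c * norm) * v / (sqrt_c * norm)


def mobius_add(x, y, c=1.0):
    """Mobius addition of x and y on the Poincare ball of curvature -c."""
    xy = np.sum(x * y, axis=-1, keepdims=True)
    x2 = np.sum(x * x, axis=-1, keepdims=True)
    y2 = np.sum(y * y, axis=-1, keepdims=True)
    num = (1 + 2 * c * xy + c * y2) * x + (1 - c * x2) * y
    den = 1 + 2 * c * xy + c ** 2 * x2 * y2
    return num / den


def poincare_dist(x, y, c=1.0):
    """Geodesic distance between points x and y on the Poincare ball."""
    sqrt_c = np.sqrt(c)
    diff_norm = np.linalg.norm(mobius_add(-x, y, c), axis=-1)
    # Keep the arctanh argument strictly below 1 for numerical safety.
    arg = np.clip(sqrt_c * diff_norm, 0.0, 1.0 - 1e-7)
    return 2.0 / sqrt_c * np.arctanh(arg)


def project_multi(v, curvatures=(0.1, 1.0, 2.0)):
    """Project the same features with several curvatures (assumed fixed
    here; the abstract describes them as adaptive)."""
    return [expmap0(v, c) for c in curvatures]


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    video = rng.normal(size=(4, 8))  # hypothetical video features
    audio = rng.normal(size=(4, 8))  # hypothetical audio features
    for c in (0.1, 1.0, 2.0):
        d = poincare_dist(expmap0(video, c), expmap0(audio, c), c)
        print(f"c={c}: mean cross-modal distance {d.mean():.3f}")
```

A cross-modal alignment term could, under these assumptions, penalize `poincare_dist` between the projected video and audio features of the same sample; the exact form of the paper's loss is not given in the abstract.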

Benchmarks

Benchmark                                        Methodology     HM     ZSL
gzsl-video-classification-on-activitynet-gzsl    Hyper-multiple  15.25  10.39
gzsl-video-classification-on-activitynet-gzsl-1  Hyper-multiple  12.65   9.50
gzsl-video-classification-on-ucf-gzsl-cls        Hyper-multiple  48.30  52.11
gzsl-video-classification-on-ucf-gzsl-main       Hyper-multiple  29.32  22.24
gzsl-video-classification-on-vggsound-gzsl       Hyper-multiple   8.67   7.31
gzsl-video-classification-on-vggsound-gzsl-1     Hyper-multiple   9.32   7.97
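For readers unfamiliar with the HM metric: in generalized zero-shot learning it is conventionally the harmonic mean of the accuracy on seen classes and the accuracy on unseen classes. The sketch below assumes that convention; the function name and the input values are hypothetical and are not taken from the benchmark figures above, which do not report the seen and unseen accuracies separately.

```python
def harmonic_mean(seen_acc, unseen_acc):
    """Harmonic mean of seen-class and unseen-class accuracy."""
    total = seen_acc + unseen_acc
    if total == 0:
        return 0.0  # both accuracies are zero
    return 2.0 * seen_acc * unseen_acc / total


if __name__ == "__main__":
    # Hypothetical accuracies, for illustration only.
    print(harmonic_mean(20.0, 10.0))
```

The harmonic mean is dominated by the smaller of the two accuracies, so a model cannot score well on HM by performing well on seen classes alone.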
