LAMV: Learning to Align and Match Videos With Kernelized Temporal Layers

Hervé Jégou, Rita Cucchiara, Matthijs Douze, Lorenzo Baraldi

Abstract

This paper considers a learnable approach for comparing and aligning videos. Our architecture builds upon and revisits temporal match kernels within neural networks: we propose a new temporal layer that finds temporal alignments by maximizing the scores between two sequences of vectors, according to a time-sensitive similarity metric parametrized in the Fourier domain. We learn this layer with a temporal proposal strategy, in which we minimize a triplet loss that takes into account both the localization accuracy and the recognition rate. We evaluate our approach on video alignment, copy detection and event retrieval. Our approach outperforms the state of the art on temporal video alignment and video copy detection datasets in comparable setups. It also attains the best reported results for particular event search, while precisely aligning videos.
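To make the idea of "finding a temporal alignment by maximizing the scores between two sequences of vectors" concrete, here is a minimal sketch. It is not the LAMV layer: it assumes plain dot products between frame descriptors and a brute-force search over offsets in the time domain, whereas the paper uses a learned similarity parametrized in the Fourier domain. The function names `alignment_scores` and `best_offset` and the toy data are hypothetical, chosen only for illustration.

```python
def alignment_scores(x, y):
    """Score every temporal offset between two sequences of frame vectors.

    score(offset) = sum over t of <x[t], y[t + offset]>, using only the
    frames where both sequences overlap. Assumption: a plain dot product
    stands in for the learned similarity described in the abstract.
    """
    scores = {}
    for offset in range(-(len(x) - 1), len(y)):
        total = 0.0
        for t, xt in enumerate(x):
            u = t + offset
            if 0 <= u < len(y):
                total += sum(a * b for a, b in zip(xt, y[u]))
        scores[offset] = total
    return scores


def best_offset(x, y):
    """Return the offset with the highest alignment score."""
    scores = alignment_scores(x, y)
    return max(scores, key=scores.get)


# Toy example: the query clip appears in the database video starting at frame 2.
query = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
database = [[0.0, 0.0], [0.0, 0.0]] + query + [[0.0, 0.0]]

print(best_offset(query, database))  # 2
```

The brute-force loop costs time proportional to the product of the two sequence lengths. Computing the same kind of score for all offsets at once is one motivation for working in the Fourier domain, as the abstract describes.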

Benchmarks

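| Benchmark | Methodology | Metric | Value |
| --- | --- | --- | --- |
| video-alignment-on-msu-video-alignment-and | TMK | Accuracy w/ 3 frames error (Hard) | 0.0554 |
| video-alignment-on-msu-video-alignment-and | TMK | Accuracy w/ 3 frames error (Light) | 0.0571 |
| video-alignment-on-msu-video-alignment-and | TMK | Accuracy w/ 3 frames error (Medium color) | 0.0607 |
| video-alignment-on-msu-video-alignment-and | TMK | Accuracy w/ 3 frames error (Medium geometric) | 0.0446 |
| video-retrieval-on-fivr-200k | LAMV | mAP (CSVR) | 0.466 |
| video-retrieval-on-fivr-200k | LAMV | mAP (DSVR) | 0.496 |
| video-retrieval-on-fivr-200k | LAMV | mAP (ISVR) | 0.371 |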
