Dual Encoding for Video Retrieval by Text

Dong Jianfeng; Li Xirong; Xu Chaoxi; Yang Xun; Yang Gang; Wang Xun; Wang Meng

Abstract

This paper attacks the challenging problem of video retrieval by text. In such a retrieval paradigm, an end user searches for unlabeled videos by ad-hoc queries described exclusively in the form of a natural-language sentence, with no visual example provided. Given videos as sequences of frames and queries as sequences of words, an effective sequence-to-sequence cross-modal matching is crucial. To that end, the two modalities need to be first encoded into real-valued vectors and then projected into a common space. In this paper we achieve this by proposing a dual deep encoding network that encodes videos and queries into powerful dense representations of their own. Our novelty is two-fold. First, different from prior art that resorts to a specific single-level encoder, the proposed network performs multi-level encoding that represents the rich content of both modalities in a coarse-to-fine fashion. Second, different from a conventional common space learning algorithm which is either concept based or latent space based, we introduce hybrid space learning which combines the high performance of the latent space and the good interpretability of the concept space. Dual encoding is conceptually simple, practically effective and end-to-end trained with hybrid space learning. Extensive experiments on four challenging video datasets show the viability of the new method.
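
The encode-then-project pipeline described above can be illustrated with a minimal sketch. This is not the authors' network: it assumes a single coarse level (mean pooling) in place of multi-level encoding, and it uses hand-picked, untrained projection matrices. All names here (`mean_pool`, `project`, `rank_videos`, `W_TEXT`, `W_VIDEO`) and the toy feature values are hypothetical and chosen only for illustration.

```python
import math

def mean_pool(vectors):
    """Average a sequence of equal-length vectors into one vector.
    Assumption: this stands in for a single, coarse encoding level."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]

def project(matrix, vector):
    """Linear projection: multiply a (d_out x d_in) matrix by a d_in vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]

def cosine(a, b):
    """Cosine similarity between two vectors; 0.0 if either has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def rank_videos(query_words, videos, w_text, w_video):
    """Encode the query and each video, project both into the common
    space, and sort videos by descending cosine similarity to the query."""
    q = project(w_text, mean_pool(query_words))
    scores = {
        name: cosine(q, project(w_video, mean_pool(frames)))
        for name, frames in videos.items()
    }
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)

# Hypothetical, untrained projections into a 2-d common space.
W_TEXT = [[1.0, 0.0], [0.0, 1.0]]              # 2-d word vectors -> 2-d
W_VIDEO = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]   # 3-d frame features -> 2-d

# Toy query (two word vectors) and two toy videos (two frames each).
query = [[1.0, 0.0], [0.8, 0.2]]
videos = {
    "video_a": [[0.9, 0.1, 0.5], [1.0, 0.0, 0.3]],
    "video_b": [[0.0, 1.0, 0.2], [0.1, 0.9, 0.4]],
}

ranking = rank_videos(query, videos, W_TEXT, W_VIDEO)
print(ranking)  # video_a is listed before video_b
```

In a trained system the projection matrices would be learned so that matching sentence and video pairs score higher than mismatched ones. The sketch only shows where encoding, projection and similarity sit relative to each other.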

Code Repositories

danieljf24/hybrid_space (PyTorch; mentioned in GitHub)

Benchmarks

Benchmark                                      Methodology      Metrics
ad-hoc-video-search-on-trecvid-avs16-iacc-3    Dual Encoding    infAP: 0.152
ad-hoc-video-search-on-trecvid-avs17-iacc-3    Dual Encoding    infAP: 0.231
ad-hoc-video-search-on-trecvid-avs18-iacc-3    Dual Encoding    infAP: 0.121
