Video Summarization with Attention-Based Encoder-Decoder Networks

Zhong Ji; Kailin Xiong; Yanwei Pang; Xuelong Li


Abstract

This paper addresses supervised video summarization by formulating it as a sequence-to-sequence learning problem: the input is a sequence of original video frames, and the output is a keyshot sequence. Our key idea is to learn a deep summarization network with an attention mechanism that mimics the way humans select keyshots. To this end, we propose a novel video summarization framework named Attentive encoder-decoder networks for Video Summarization (AVS), in which the encoder uses a Bidirectional Long Short-Term Memory (BiLSTM) network to encode the contextual information among the input video frames. For the decoder, two attention-based LSTM networks are explored, using additive and multiplicative scoring functions, respectively. Extensive experiments are conducted on two video summarization benchmark datasets, SumMe and TVSum. The results demonstrate the superiority of the proposed AVS-based approaches over the state-of-the-art approaches, with improvements ranging from 0.8% to 3% on the two datasets, respectively.
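The two decoder variants differ only in how the attention score between a decoder query and each encoder frame feature is computed. Below is a minimal NumPy sketch of the two standard scoring functions (additive, as in Bahdanau et al., and multiplicative, as in Luong et al.) — this is an illustration of the general technique, not the authors' implementation; all weight matrices and dimensions are hypothetical.

```python
import numpy as np

def softmax(x):
    # Numerically stable softmax over a 1-D score vector
    e = np.exp(x - x.max())
    return e / e.sum()

def additive_attention(query, keys, W_q, W_k, v):
    """Additive scoring: score_i = v^T tanh(W_q q + W_k k_i).
    query: (d,), keys: (T, d); W_q, W_k: (d, d); v: (d,).
    Returns a (T,) vector of attention weights summing to 1."""
    scores = np.array([v @ np.tanh(W_q @ query + W_k @ k) for k in keys])
    return softmax(scores)

def multiplicative_attention(query, keys, W):
    """Multiplicative (bilinear) scoring: score_i = q^T W k_i.
    query: (d,), keys: (T, d); W: (d, d)."""
    scores = np.array([query @ W @ k for k in keys])
    return softmax(scores)
```

In an AVS-style decoder, `keys` would be the BiLSTM encoder's per-frame hidden states and `query` the decoder's current hidden state; the resulting weights form the context vector used to predict each frame's importance.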

Benchmarks

Benchmark                      Methodology   F1-score (Canonical)   F1-score (Augmented)
video-summarization-on-summe   M-AVS         44.4                   46.1
video-summarization-on-tvsum   M-AVS         61.0                   61.8
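The F1-scores above are conventionally computed from the temporal overlap between predicted and ground-truth keyshots, treated as binary per-frame selection masks. A minimal sketch of that metric (an illustration of the standard protocol, not the benchmark's exact evaluation code):

```python
def keyshot_f1(pred, gt):
    """F1 between two binary per-frame selection masks of equal length.
    pred, gt: sequences of 0/1 flags, one per video frame."""
    overlap = sum(p and g for p, g in zip(pred, gt))
    if overlap == 0:
        return 0.0
    precision = overlap / sum(pred)  # fraction of selected frames that are correct
    recall = overlap / sum(gt)       # fraction of ground-truth frames recovered
    return 2 * precision * recall / (precision + recall)
```

For example, a prediction sharing half its frames with the ground truth at equal lengths yields precision = recall = 0.5, hence F1 = 0.5.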
