An Efficient Keyframes Selection Based Framework for Video Captioning

Sivaji Bandyopadhyay, Thoudam Doren Singh, Salam Michael Singh, Loitongbam Sanayai Meetei, Alok Singh

Abstract

Describing a video is a challenging yet attractive task since it falls at the intersection of computer vision and natural language generation. Attention-based models have reported the best performance. However, all these models follow similar procedures, such as segmenting videos into chunks of frames or sampling frames at equal intervals for visual encoding. Because a video consists of a sequence of similar frames and suffers from inescapable noise such as uneven illumination, occlusion and motion effects, these procedures encode redundant visual information and incur additional computational cost. In this paper, a boundary-based keyframes selection approach for video description is proposed that allows the system to select a compact subset of keyframes to encode the visual information and generate a description for a video without much degradation. The proposed approach uses 3 to 4 frames per video and yields competitive performance on two benchmark datasets, MSVD and MSR-VTT (in both English and Hindi).
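
The abstract does not say how boundaries are detected or which frame is kept from each segment, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes a boundary is declared wherever the intensity histograms of two consecutive frames differ by more than a threshold, and that the middle frame of each resulting segment serves as its keyframe. The function names, the `threshold` value and the `max_keyframes` cap are all hypothetical choices made for illustration.

```python
import numpy as np


def frame_histogram(frame, bins=16):
    """Normalized intensity histogram of a frame with values in [0, 255]."""
    hist, _ = np.histogram(frame, bins=bins, range=(0, 256))
    return hist / max(hist.sum(), 1)


def detect_boundaries(frames, threshold=0.5):
    """Return the indices at which a new segment starts.

    Assumption: a boundary is declared when the L1 distance between the
    histograms of consecutive frames exceeds `threshold` (hypothetical value).
    """
    if len(frames) == 0:
        return []
    hists = [frame_histogram(f) for f in frames]
    boundaries = [0]  # the first frame always opens a segment
    for i in range(1, len(hists)):
        if np.abs(hists[i] - hists[i - 1]).sum() > threshold:
            boundaries.append(i)
    return boundaries


def select_keyframes(frames, threshold=0.5, max_keyframes=4):
    """Pick the middle frame of each segment, keeping the longest segments.

    Assumption: at most `max_keyframes` segments are kept, echoing the
    3 to 4 frames per video mentioned in the abstract.
    """
    starts = detect_boundaries(frames, threshold)
    if not starts:
        return []
    ends = starts[1:] + [len(frames)]
    # Sort segments by length, longest first, and keep the top ones.
    segments = sorted(zip(starts, ends), key=lambda s: s[1] - s[0], reverse=True)
    kept = segments[:max_keyframes]
    # Middle index of each kept segment, returned in temporal order.
    return sorted((start + end - 1) // 2 for start, end in kept)
```

For a synthetic clip made of three visually distinct runs of ten frames each, this sketch returns one index per run, in contrast to equal-interval sampling, which would pick frames without regard to where the content changes. The selected frames would then be passed to whatever visual encoder the captioning model uses.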

Benchmarks

Benchmark: video-captioning-on-hindi-msr-vtt
Methodology: SBD_Keyframe
Metrics: BLEU4: 41.01
