3 months ago

Global Object Proposals for Improving Multi-Sentence Video Descriptions

Pushpak Bhattacharyya, Sriparna Saha, Chandresh S. Kanani

Abstract

There has been significant progress in image captioning in recent years. The generation of video descriptions, however, is still in its early stages, owing to the complexity of videos compared with images. Generating paragraph descriptions of a video is even more challenging. Among the main issues are temporal object dependencies and complex object-object relationships. Recently, many works have been proposed on the generation of multi-sentence video descriptions. The majority of them are based on a two-step approach: 1) event proposals and 2) caption generation. While these approaches produce good results, they miss out on globally available information. Here we propose the use of global object proposals while generating the video captions. Experimental results on the ActivityNet dataset illustrate that the use of global object proposals can produce more informative and correct captions. We also propose three scores to evaluate the object detection capacity of the generator. A qualitative comparison of captions generated by the proposed method and by state-of-the-art techniques demonstrates the efficacy of the proposed method.

Benchmarks

Benchmark: dense-video-captioning-on-activitynet
Methodology: ADV-INF + Global
Metrics:
BLEU-4: 9.45
CIDEr: 19.40
DIV-1: 0.60
DIV-2: 0.78
METEOR: 16.36
RE-4: 0.05
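
The page does not define the DIV-1, DIV-2, and RE-4 columns. As a rough illustration only, the sketch below assumes the definitions commonly used in the multi-sentence captioning literature: DIV-n as the ratio of unique n-grams to the total number of words in a paragraph, and RE-4 as the fraction of 4-grams that repeat an earlier 4-gram. The function names `div_n` and `re_n`, the whitespace tokenization, and the lowercasing are assumptions made for this example, not the authors' evaluation code, so scores computed this way may not match the table exactly.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return the list of n-grams (as tuples) in a token sequence."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def div_n(text, n):
    """Assumed DIV-n: unique n-grams divided by the total word count."""
    tokens = text.lower().split()  # naive whitespace tokenization (assumption)
    if not tokens:
        return 0.0
    return len(set(ngrams(tokens, n))) / len(tokens)


def re_n(text, n=4):
    """Assumed RE-n: share of n-grams that repeat an earlier n-gram."""
    tokens = text.lower().split()
    grams = ngrams(tokens, n)
    if not grams:
        return 0.0
    counts = Counter(grams)
    repeated = sum(c - 1 for c in counts.values())  # occurrences beyond the first
    return repeated / len(grams)


if __name__ == "__main__":
    paragraph = "a man is playing a guitar"
    print(div_n(paragraph, 1), div_n(paragraph, 2), re_n(paragraph))
```

Under these assumed definitions, higher DIV-1 and DIV-2 values indicate a more varied vocabulary, while a lower RE-4 value indicates less repetition within the generated paragraph.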
