3 months ago

IntentQA: Context-aware Video Intent Reasoning

Lifeng Fan, Wenjuan Han, Ping Wei, Jiapeng Li

Abstract

In this paper, we propose a novel task, IntentQA, a special VideoQA task focusing on video intent reasoning, which has become increasingly important for AI because it equips AI agents with the capability to reason beyond mere recognition in daily tasks. We also contribute a large-scale VideoQA dataset for this task. We propose a Context-aware Video Intent Reasoning model (CaVIR) consisting of i) a Video Query Language (VQL) for better cross-modal representation of the situational context, ii) a Contrastive Learning module for utilizing the contrastive context, and iii) a Commonsense Reasoning module for incorporating the commonsense context. Comprehensive experiments on this challenging task demonstrate the effectiveness of each model component, the superiority of our full model over other baselines, and the generalizability of our model to a new VideoQA task. The dataset and code are open-sourced at: https://github.com/JoseponLee/IntentQA.git

Benchmarks

Benchmark | Methodology | Metrics
video-question-answering-on-intentqa | IntentQA | Accuracy: 57.6; CH: 65.5; CW: 58.4; TP&TN: 50.5
video-question-answering-on-intentqa | Human | Accuracy: 78.5; CH: 80.2; CW: 77.8; TP&TN: 79.1
