Faithful Multimodal Explanation for Visual Question Answering

Jialin Wu; Raymond J. Mooney

Abstract

AI systems' ability to explain their reasoning is critical to their utility and trustworthiness. Deep neural networks have enabled significant progress on many challenging problems such as visual question answering (VQA). However, most of them are opaque black boxes with limited explanatory capability. This paper presents a novel approach to developing a high-performing VQA system that can elucidate its answers with integrated textual and visual explanations that faithfully reflect important aspects of its underlying reasoning while capturing the style of comprehensible human explanations. Extensive experimental evaluation demonstrates the advantages of this approach compared to competing methods with both automatic evaluation metrics and human evaluation metrics.

Code Repositories

explainableml/clevr-x (PyTorch; mentioned in GitHub)

Benchmarks

Benchmark: explanatory-visual-question-answering-on-gqa
Methodology: EXP
Metrics:
BLEU-4: 42.45
CIDEr: 357.10
GQA-test: 56.92
GQA-val: 65.17
Grounding: 33.52
METEOR: 34.46
ROUGE-L: 73.51
SPICE: 40.35
