REX: Reasoning-aware and Grounded Explanation

Chen Shi; Zhao Qi

Abstract

Effectiveness and interpretability are two essential properties for trustworthy AI systems. Most recent studies in visual reasoning are dedicated to improving the accuracy of predicted answers, and less attention is paid to explaining the rationales behind the decisions. As a result, they commonly take advantage of spurious biases instead of actually reasoning on the visual-textual data, and have yet to develop the capability to explain their decision making by considering key information from both modalities. This paper aims to close the gap from three distinct perspectives. First, we define a new type of multi-modal explanations that explain the decisions by progressively traversing the reasoning process and grounding keywords in the images. We develop a functional program to sequentially execute different reasoning steps and construct a new dataset with 1,040,830 multi-modal explanations. Second, we identify the critical need to tightly couple important components across the visual and textual modalities for explaining the decisions, and propose a novel explanation generation method that explicitly models the pairwise correspondence between words and regions of interest. It improves the visual grounding capability by a considerable margin, resulting in enhanced interpretability and reasoning performance. Finally, with our new data and method, we perform extensive analyses to study the effectiveness of our explanation under different settings, including multi-task learning and transfer learning. Our code and data are available at https://github.com/szzexpoi/rex.
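
To make the idea of pairwise word-region correspondence concrete, here is a minimal sketch in plain Python. It is not the authors' implementation: the dot-product scoring, the softmax normalization, the toy vectors, and the helper names `word_region_attention` and `ground_words` are all assumptions chosen for illustration. It scores every explanation word against every image region and grounds each word in its highest-weighted region.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def word_region_attention(word_vecs, region_vecs):
    """Return one row per word: attention weights over all regions.

    Assumption: similarity is a plain dot product between a word
    embedding and a region feature of the same dimensionality.
    """
    attention = []
    for word in word_vecs:
        scores = [sum(a * b for a, b in zip(word, region))
                  for region in region_vecs]
        attention.append(softmax(scores))
    return attention


def ground_words(words, attention):
    """Pair each word with the index of its highest-weighted region."""
    return [(word, max(range(len(row)), key=row.__getitem__))
            for word, row in zip(words, attention)]


# Toy inputs (hypothetical): two explanation words, three image regions.
words = ["dog", "frisbee"]
word_vecs = [[1.0, 0.0], [0.0, 1.0]]
region_vecs = [[0.1, 0.9], [0.8, 0.2], [0.5, 0.5]]

attention = word_region_attention(word_vecs, region_vecs)
grounding = ground_words(words, attention)
print(grounding)
```

In a trained model the embeddings would be learned and the correspondence would feed into explanation generation; the sketch only shows the shape of a words-by-regions weight matrix.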

Code Repositories

szzexpoi/rex
Official
pytorch

Benchmarks

Benchmark: explanatory-visual-question-answering-on-gqa
Methodology: REX-LXMERT
Metrics:
BLEU-4: 54.79
CIDEr: 466.01
GQA-test: 58.15
GQA-val: 78.19
Grounding: 70.79
METEOR: 39.51
ROUGE-L: 79.41
SPICE: 49.98

Benchmark: explanatory-visual-question-answering-on-gqa
Methodology: REX-VisualBert
Metrics:
BLEU-4: 54.59
CIDEr: 464.20
GQA-test: 57.77
GQA-val: 66.16
Grounding: 67.95
METEOR: 39.22
ROUGE-L: 78.56
SPICE: 46.80

Benchmark: fs-mevqa-on-sme
Methodology: REX
Metrics:
#Learning Samples (N): 16
ACC: 17.77
BLEU-4: 0.00
CIDEr: 0.89
Detection: 0.00
METEOR: 4.37
ROUGE-L: 23.23
SPICE: 0.00
