Iterative Context-Aware Graph Inference for Visual Dialog

Dan Guo, Hui Wang, Hanwang Zhang, Zheng-Jun Zha, Meng Wang

Abstract

Visual dialog is a challenging task that requires the comprehension of the semantic dependencies among implicit visual and textual contexts. This task can be cast as relation inference in a graphical model with sparse contexts and an unknown graph structure (relation descriptor), so modeling the underlying context-aware relation inference is critical. To this end, we propose a novel Context-Aware Graph (CAG) neural network. Each node in the graph corresponds to a joint semantic feature that includes both object-based (visual) and history-related (textual) context representations. The graph structure (relations in the dialog) is iteratively updated using an adaptive top-$K$ message passing mechanism. Specifically, in every message passing step, each node selects the $K$ most relevant nodes and receives messages only from them. After the update, we impose graph attention on all the nodes to obtain the final graph embedding and infer the answer. In CAG, each node has dynamic relations in the graph (a different set of $K$ related neighbor nodes), and only the most relevant nodes contribute to the context-aware relational graph inference. Experimental results on the VisDial v0.9 and v1.0 datasets show that CAG outperforms comparative methods. Visualization results further validate the interpretability of our method.
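The top-$K$ message passing and the final graph attention described in the abstract can be illustrated with a minimal, dependency-free sketch. It is not the authors' implementation: the dot-product relevance score, the fixed 0.5/0.5 mix of old state and incoming message, the `query` vector used for pooling, and all function names are assumptions made for illustration. The published model presumably uses learned projections and conditions on the question and dialog history.

```python
import math


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def softmax(xs):
    m = max(xs)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def top_k_neighbors(nodes, i, k):
    """Indices of the k nodes most relevant to node i (self excluded).

    Relevance is a plain dot product here (an assumption).
    """
    scored = [(dot(nodes[i], nodes[j]), j) for j in range(len(nodes)) if j != i]
    scored.sort(key=lambda s: (-s[0], s[1]))  # highest score first, ties by index
    return [j for _, j in scored[:k]]


def message_passing_step(nodes, k):
    """One step: every node receives messages only from its top-k neighbors."""
    updated = []
    for i, h in enumerate(nodes):
        nbrs = top_k_neighbors(nodes, i, k)
        weights = softmax([dot(h, nodes[j]) for j in nbrs])
        msg = [sum(w * nodes[j][d] for w, j in zip(weights, nbrs))
               for d in range(len(h))]
        # Assumed update rule: equal mix of the old state and the message.
        updated.append([0.5 * a + 0.5 * b for a, b in zip(h, msg)])
    return updated


def graph_attention_pool(nodes, query):
    """Attention over all nodes, returning one graph embedding."""
    weights = softmax([dot(query, h) for h in nodes])
    dim = len(nodes[0])
    return [sum(w * h[d] for w, h in zip(weights, nodes)) for d in range(dim)]


def cag_sketch(nodes, query, k=2, steps=3):
    """Iterate top-k message passing, then pool with graph attention."""
    for _ in range(steps):
        # Neighbors are re-selected each step, so the graph structure is dynamic.
        nodes = message_passing_step(nodes, k)
    return graph_attention_pool(nodes, query)


if __name__ == "__main__":
    toy_nodes = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
    print(cag_sketch(toy_nodes, query=[1.0, 0.0], k=1, steps=2))
```

Because neighbors are recomputed from the updated node states at every step, each node can end up with a different neighbor set over time, which is the "dynamic relations" property the abstract emphasizes.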

Code Repositories

wh0330/CAG_VisDial (PyTorch, mentioned in GitHub)

Benchmarks

| Benchmark | Methodology | Metrics |
| --- | --- | --- |
| visual-dialog-on-visdial-v09-val | CAG | MRR: 0.6756; Mean Rank: 3.75; R@1: 54.64; R@5: 83.72; R@10: 91.48 |
| visual-dialog-on-visual-dialog-v1-0-test-std | CAG | MRR (x 100): 63.49; Mean: 4.11; NDCG (x 100): 56.64; R@1: 49.85; R@5: 80.63; R@10: 90.15 |
