Visual Question Answering on VQA v1 test-dev
Evaluation Metric
Accuracy
Evaluation Results
Performance of each model on this benchmark
| Model | Accuracy (%) | Paper Title | Repository |
|---|---|---|---|
| SAAA (ResNet) | 64.5 | Show, Ask, Attend, and Answer: A Strong Baseline For Visual Question Answering | |
| DAN (ResNet) | 64.3 | Dual Attention Networks for Multimodal Reasoning and Matching | |
| MCB (ResNet) | 64.2 | Multimodal Compact Bilinear Pooling for Visual Question Answering and Visual Grounding | |
| RAU (ResNet) | 63.3 | Training Recurrent Answering Units with Joint Loss Minimization for VQA | - |
| HieCoAtt (ResNet) | 61.8 | Hierarchical Question-Image Co-Attention for Visual Question Answering | |
| DMN+ | 60.3 | Dynamic Memory Networks for Visual and Textual Question Answering | |
| NMN+LSTM+FT | 58.6 | Neural Module Networks | |