Visual Question Answering on CLEVR-Humans
Evaluation Metric
Accuracy
Results
Performance of each model on this benchmark.
| Model | Accuracy (%) | Paper Title | Repository |
|---|---|---|---|
| MDETR | 81.7 | MDETR -- Modulated Detection for End-to-End Multi-Modal Understanding | |
| MAC | 81.5 | Compositional Attention Networks for Machine Reasoning | |
| CNN+GRU+FiLM | 75.9 | FiLM: Visual Reasoning with a General Conditioning Layer | |
| NS-VQA (1K programs) | 67.8 | Neural-Symbolic VQA: Disentangling Reasoning from Vision and Language Understanding | |
| IEP-18K | 66.6 | Inferring and Executing Programs for Visual Reasoning | |