Visual Question Answering (VQA) on ImageNet

ClipMatch@1

ClipMatch@5

Contains

ExactMatch

Follow-up ClipMatch@1

Follow-up ClipMatch@5

Follow-up Contains

Follow-up ExactMatch

Evaluation Results

Performance of each model on this benchmark

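Model	ClipMatch@1	ClipMatch@5	Contains	ExactMatch	Follow-up ClipMatch@1	Follow-up ClipMatch@5	Follow-up Contains	Follow-up ExactMatch	Paper Title	Repository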
BLIP-2 OPT	57.10	77.24	35.49	0.87	67.22	83.54	40.31	2.54	Open-ended VQA benchmarking of Vision-Language models by exploiting Classification datasets and their semantic hierarchy
