Visual Question Answering (VQA) on InfoSeek
Evaluation Metric
Accuracy
Evaluation Results
Performance of each model on this benchmark.
| Model | Accuracy | Paper Title | Repository |
| --- | --- | --- | --- |
| RA-VQAv2 w/ PreFLMR | 30.65 | PreFLMR: Scaling Up Fine-Grained Late-Interaction Multi-modal Retrievers | |
| PaLI-X | 24 | PaLI-X: On Scaling up a Multilingual Vision and Language Model | |
| CLIP + FiD | 20.9 | Can Pre-trained Vision and Language Models Answer Visual Information-Seeking Questions? | |
| CLIP + PaLM (540B) | 20.4 | Can Pre-trained Vision and Language Models Answer Visual Information-Seeking Questions? | |
| PaLI | 19.7 | Can Pre-trained Vision and Language Models Answer Visual Information-Seeking Questions? | |
| BLIP2 | 14.6 | BLIP-2: Bootstrapping Language-Image Pre-training with Frozen Image Encoders and Large Language Models | |
| InstructBLIP | 14.5 | - | - |