Zero Shot Transfer Image Classification On 1

Evaluation Metrics

Accuracy (Private)

Evaluation Results

Performance of each model on this benchmark

Model | Accuracy (Private) | Paper Title | Repository
M2-Encoder | 88.5 | M2-Encoder: Advancing Bilingual Image-Text Understanding by Large-scale Efficient Pretraining |
BASIC (Lion) | 88.3 | - | -
CoCa | 86.3 | CoCa: Contrastive Captioners are Image-Text Foundation Models |
LiT-22B | 85.9 | Scaling Vision Transformers to 22 Billion Parameters |
BASIC | 85.7 | Combined Scaling for Zero-shot Transfer Learning | -
LiT ViT-e | 85.4 | PaLI: A Jointly-Scaled Multilingual Language-Image Model |
LiT-tuning | 84.5 | LiT: Zero-Shot Transfer with Locked-image text Tuning |
IMP-MoE-L | 83.9 | Alternating Gradient Descent and Mixture-of-Experts for Integrated Multimodal Perception | -
EVA-CLIP-18B | 83.8 | EVA-CLIP-18B: Scaling CLIP to 18 Billion Parameters |
InternVL-C | 83.2 | InternVL: Scaling up Vision Foundation Models and Aligning for Generic Visual-Linguistic Tasks |
MAWS (ViT-2B) | 82.1 | The effectiveness of MAE pre-pretraining for billion-scale pretraining |
EVA-CLIP-E/14+ | 82 | EVA-CLIP: Improved Training Techniques for CLIP at Scale |
CLIPA (ViT-H/14-336px) | 81.8 | - | -
MAWS (ViT-H) | 81.1 | The effectiveness of MAE pre-pretraining for billion-scale pretraining |
REACT | 78.5 | Learning Customized Visual Models with Retrieval-Augmented Knowledge |
ALIGN | 76.4 | Scaling Up Visual and Vision-Language Representation Learning With Noisy Text Supervision |
CLIP (ViT-L/14-336px) | 76.2 | Learning Transferable Visual Models From Natural Language Supervision |
AltCLIP | 74.5 | AltCLIP: Altering the Language Encoder in CLIP for Extended Language Capabilities |
PaLI | 72.11 | PaLI: A Jointly-Scaled Multilingual Language-Image Model |
Diffusion Classifier (zero-shot) | 61.4 | Your Diffusion Model is Secretly a Zero-Shot Classifier |