Domain Generalization on ImageNet-A

Evaluation Metric

Top-1 accuracy %
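
As a rough illustration of the metric, top-1 accuracy is the percentage of images whose highest-scoring predicted class matches the ground-truth label. The sketch below is a minimal pure-Python version; the function name `top1_accuracy` and the toy scores and labels are hypothetical and are not taken from any of the listed papers or from the benchmark's evaluation code.

```python
def top1_accuracy(scores, labels):
    """Return top-1 accuracy in percent.

    scores: list of per-class score lists, one per example (hypothetical input format).
    labels: list of integer ground-truth class indices.
    """
    if len(scores) != len(labels):
        raise ValueError("scores and labels must have the same length")
    if not labels:
        raise ValueError("at least one example is required")

    correct = 0
    for row, label in zip(scores, labels):
        # The predicted class is the index of the highest score.
        predicted = max(range(len(row)), key=row.__getitem__)
        if predicted == label:
            correct += 1
    return 100.0 * correct / len(labels)


# Toy data (made up for illustration): 4 examples, 3 classes.
example_scores = [
    [0.1, 0.7, 0.2],  # predicts class 1
    [0.6, 0.3, 0.1],  # predicts class 0
    [0.2, 0.2, 0.6],  # predicts class 2
    [0.5, 0.4, 0.1],  # predicts class 0
]
example_labels = [1, 0, 2, 1]
example_accuracy = top1_accuracy(example_scores, example_labels)
```

With these toy inputs, three of the four predictions match their labels, so `example_accuracy` is 75.0.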

Evaluation Results

Performance of each model on this benchmark.

| Model | Top-1 accuracy (%) | Paper Title |
| --- | --- | --- |
| Model soups (BASIC-L) | 94.17 | Model soups: averaging weights of multiple fine-tuned models improves accuracy without increasing inference time |
| Model soups (ViT-G/14) | 92.67 | Model soups: averaging weights of multiple fine-tuned models improves accuracy without increasing inference time |
| µ2Net+ (ViT-L/16) | 84.53 | A Continual Development Methodology for Large-scale Multitask Dynamic ML Systems |
| CAR-FT (CLIP, ViT-L/14@336px) | 81.5 | Context-Aware Robust Fine-Tuning |
| CAFormer-B36 (IN-21K, 384) | 79.5 | MetaFormer Baselines for Vision |
| MAE (ViT-H, 448) | 76.7 | Masked Autoencoders Are Scalable Vision Learners |
| FAN-Hybrid-L (IN-21K, 384) | 74.5 | Understanding The Robustness in Vision Transformers |
| ConvFormer-B36 (IN-21K, 384) | 73.5 | MetaFormer Baselines for Vision |
| CAFormer-B36 (IN-21K) | 69.4 | MetaFormer Baselines for Vision |
| ConvNeXt-XL (Im21k, 384) | 69.3 | A ConvNet for the 2020s |
| MAE+DAT (ViT-H) | 68.92 | Enhance the Visual Representation via Discrete Adversarial Training |
| ConvFormer-B36 (IN-21K) | 63.3 | MetaFormer Baselines for Vision |
| Pyramid Adversarial Training Improves ViT (Im21k) | 62.44 | Pyramid Adversarial Training Improves ViT Performance |
| CAFormer-B36 (384) | 61.9 | MetaFormer Baselines for Vision |
| TransNeXt-Base (IN-1K supervised, 384) | 61.6 | TransNeXt: Robust Foveal Visual Perception for Vision Transformers |
| TransNeXt-Small (IN-1K supervised, 384) | 58.3 | TransNeXt: Robust Foveal Visual Perception for Vision Transformers |
| ConvFormer-B36 (384) | 55.3 | MetaFormer Baselines for Vision |
| SEER (RegNet10B) | 52.7 | Vision Models Are More Robust And Fair When Pretrained On Uncurated Images Without Supervision |
| TransNeXt-Base (IN-1K supervised, 224) | 50.6 | TransNeXt: Robust Foveal Visual Perception for Vision Transformers |
| CAFormer-B36 | 48.5 | MetaFormer Baselines for Vision |