| CAFormer-B36 (IN21K, 384) | - | MetaFormer Baselines for Vision | |
| CAFormer-B36 (IN21K) | - | MetaFormer Baselines for Vision | |
| ResNet-50 (PushPull-Conv) + PRIME | 69.4 | PushPull-Net: Inhibition-driven ResNet robust to image corruptions | |
| ConvNeXt-XL (Im21k) (augmentation overlap with ImageNet-C) | - | A ConvNet for the 2020s | |
| DINOv2 (ViT-S/14, frozen model, linear eval) | - | DINOv2: Learning Robust Visual Features without Supervision | |
| DINOv2 (ViT-g/14, frozen model, linear eval) | - | DINOv2: Learning Robust Visual Features without Supervision | |