| MetaFormer (MetaFormer-2,384,extra_info) | 88.7% | MetaFormer: A Unified Meta Framework for Fine-Grained Recognition | |
| MetaFormer (MetaFormer-2,384) | 84.3% | MetaFormer: A Unified Meta Framework for Fine-Grained Recognition | |
| RDNet-L (224 res, IN-1K pretrained) | 81.8% | DenseNets Reloaded: Paradigm Shift Beyond ResNets and ViTs | |
| RDNet-B (224 res, IN-1K pretrained) | 80.5% | DenseNets Reloaded: Paradigm Shift Beyond ResNets and ViTs | |
| CeiT-S (384 finetune resolution) | 79.4% | Incorporating Convolution Designs into Visual Transformers | |
| RDNet-S (224 res, IN-1K pretrained) | 79.1% | DenseNets Reloaded: Paradigm Shift Beyond ResNets and ViTs | |