Image Classification on iNaturalist 2018

Evaluation Metrics

Top-1 Accuracy
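
For reference, Top-1 accuracy is the percentage of test images whose single highest-scoring predicted class matches the ground-truth label. The snippet below is a minimal sketch of that calculation; the function name `top1_accuracy` and the toy scores and labels are hypothetical illustrations, not the evaluation code used by this benchmark.

```python
def top1_accuracy(scores, labels):
    """Return the percentage of samples whose top-scoring class equals the label.

    scores: list of per-class score lists, one per sample (hypothetical input)
    labels: list of integer ground-truth class indices
    """
    if len(scores) != len(labels):
        raise ValueError("scores and labels must have the same length")
    if not labels:
        raise ValueError("at least one sample is required")

    correct = 0
    for row, label in zip(scores, labels):
        # Index of the highest score is the model's top-1 prediction.
        predicted = max(range(len(row)), key=row.__getitem__)
        if predicted == label:
            correct += 1
    return 100.0 * correct / len(labels)


# Toy example: 4 samples, 3 classes.
scores = [
    [0.1, 0.7, 0.2],  # predicts class 1
    [0.8, 0.1, 0.1],  # predicts class 0
    [0.3, 0.3, 0.4],  # predicts class 2
    [0.2, 0.5, 0.3],  # predicts class 1
]
labels = [1, 0, 1, 1]
accuracy = top1_accuracy(scores, labels)
print(f"Top-1 accuracy: {accuracy:.1f}%")
```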

Evaluation Results

Performance of each model on this benchmark

| Model | Top-1 Accuracy (%) | Paper Title | Repository |
| --- | --- | --- | --- |
| OmniVec2 | 94.6 | OmniVec2 - A Novel Transformer based Network for Large Scale Multimodal and Multitask Learning | - |
| OmniVec | 93.8 | OmniVec: Learning robust representations with cross modal sharing | - |
| InternImage-H | 92.6 | InternImage: Exploring Large-Scale Vision Foundation Models with Deformable Convolutions | |
| MAWS (ViT-2B) | 91.3 | The effectiveness of MAE pre-pretraining for billion-scale pretraining | |
| MetaFormer (MetaFormer-2,384,extra_info) | 88.7 | MetaFormer: A Unified Meta Framework for Fine-Grained Recognition | |
| Hiera-H (448px) | 87.3 | Hiera: A Hierarchical Vision Transformer without the Bells-and-Whistles | |
| MAE (ViT-H, 448) | 86.8 | Masked Autoencoders Are Scalable Vision Learners | |
| SWAG (ViT H/14) | 86.0 | Revisiting Weakly Supervised Pre-Training of Visual Perception Models | |
| SEER (RegNet10B - finetuned - 384px) | 84.7 | Vision Models Are More Robust And Fair When Pretrained On Uncurated Images Without Supervision | |
| MetaFormer (MetaFormer-2,384) | 84.3 | MetaFormer: A Unified Meta Framework for Fine-Grained Recognition | |
| OMNIVORE (Swin-L) | 84.1 | Omnivore: A Single Model for Many Visual Modalities | |
| RDNet-L (224 res, IN-1K pretrained) | 81.8 | DenseNets Reloaded: Paradigm Shift Beyond ResNets and ViTs | |
| RegNet-8GF | 81.2 | Grafit: Learning fine-grained image representations with coarse labels | - |
| VL-LTR (ViT-B-16) | 81.0 | VL-LTR: Learning Class-wise Visual-Linguistic Representation for Long-Tailed Visual Recognition | |
| µ2Net+ (ViT-L/16) | 80.97 | A Continual Development Methodology for Large-scale Multitask Dynamic ML Systems | |
| RDNet-B (224 res, IN-1K pretrained) | 80.5 | DenseNets Reloaded: Paradigm Shift Beyond ResNets and ViTs | |
| MixMIM-L | 80.3 | MixMAE: Mixed and Masked Autoencoder for Efficient Pretraining of Hierarchical Vision Transformers | |
| DeiT-B | 79.5 | Training data-efficient image transformers & distillation through attention | |
| CeiT-S (384 finetune resolution) | 79.4 | Incorporating Convolution Designs into Visual Transformers | |
| RDNet-S (224 res, IN-1K pretrained) | 79.1 | DenseNets Reloaded: Paradigm Shift Beyond ResNets and ViTs | |