A ConvNet for the 2020s
Zhuang Liu, Hanzi Mao, Chao-Yuan Wu, Christoph Feichtenhofer, Trevor Darrell, Saining Xie

Abstract
The "Roaring 20s" of visual recognition began with the introduction of Vision Transformers (ViTs), which quickly superseded ConvNets as the state-of-the-art image classification model. A vanilla ViT, on the other hand, faces difficulties when applied to general computer vision tasks such as object detection and semantic segmentation. It is the hierarchical Transformers (e.g., Swin Transformers) that reintroduced several ConvNet priors, making Transformers practically viable as a generic vision backbone and demonstrating remarkable performance on a wide variety of vision tasks. However, the effectiveness of such hybrid approaches is still largely credited to the intrinsic superiority of Transformers, rather than the inherent inductive biases of convolutions. In this work, we reexamine the design spaces and test the limits of what a pure ConvNet can achieve. We gradually "modernize" a standard ResNet toward the design of a vision Transformer, and discover several key components that contribute to the performance difference along the way. The outcome of this exploration is a family of pure ConvNet models dubbed ConvNeXt. Constructed entirely from standard ConvNet modules, ConvNeXts compete favorably with Transformers in terms of accuracy and scalability, achieving 87.8% ImageNet top-1 accuracy and outperforming Swin Transformers on COCO detection and ADE20K segmentation, while maintaining the simplicity and efficiency of standard ConvNets.
Code Repositories
facebookresearch/ConvNeXt (official PyTorch implementation): https://github.com/facebookresearch/ConvNeXt
Benchmarks
| Benchmark | Model | Metrics |
|---|---|---|
| classification-on-indl | ConvNeXt | Average Recall: 93.47% |
| domain-generalization-on-imagenet-a | ConvNeXt-XL (Im21k, 384) | Top-1 Accuracy (%): 69.3 |
| domain-generalization-on-imagenet-c | ConvNeXt-XL (Im21k; augmentation overlap with ImageNet-C) | Number of params: 350M; mean Corruption Error (mCE): 38.8 |
| domain-generalization-on-imagenet-r | ConvNeXt-XL (Im21k, 384) | Top-1 Error Rate: 31.8 |
| domain-generalization-on-imagenet-sketch | ConvNeXt-XL (Im21k, 384) | Top-1 Accuracy: 55.0 |
| domain-generalization-on-vizwiz | ConvNeXt-B | Accuracy (All Images): 53.5; Accuracy (Clean Images): 56.0; Accuracy (Corrupted Images): 46.9 |
| image-classification-on-imagenet | ConvNeXt-XL (ImageNet-22k) | GFLOPs: 179; Number of params: 350M; Top-1 Accuracy: 87.8% |
| image-classification-on-imagenet | Adlik-ViT-SG+Swin_large+Convnext_xlarge(384) | Number of params: 1827M; Top-1 Accuracy: 88.36% |
| image-classification-on-imagenet | ConvNeXt-L (384 res) | GFLOPs: 101; Number of params: 198M; Top-1 Accuracy: 85.5% |
| image-classification-on-imagenet | ConvNeXt-T | GFLOPs: 4.5; Number of params: 29M; Top-1 Accuracy: 82.1% |
| object-detection-on-coco-o | ConvNeXt-XL (Cascade Mask R-CNN) | Average mAP: 37.5; Effective Robustness: 12.68 |
| semantic-segmentation-on-ade20k | ConvNeXt-S | GFLOPs (512 x 512): 1027; Params (M): 82; Validation mIoU: 49.6 |
| semantic-segmentation-on-ade20k | ConvNeXt-B++ | GFLOPs (512 x 512): 1828; Params (M): 122; Validation mIoU: 53.1 |
| semantic-segmentation-on-ade20k | ConvNeXt-B | GFLOPs (512 x 512): 1170; Params (M): 122; Validation mIoU: 49.9 |
| semantic-segmentation-on-ade20k | ConvNeXt-T | GFLOPs (512 x 512): 939; Params (M): 60; Validation mIoU: 46.7 |
| semantic-segmentation-on-ade20k | ConvNeXt-L++ | GFLOPs (512 x 512): 2458; Params (M): 235; Validation mIoU: 53.7 |
| semantic-segmentation-on-ade20k | ConvNeXt-XL++ | GFLOPs (512 x 512): 3335; Params (M): 391; Validation mIoU: 54.0 |
| semantic-segmentation-on-imagenet-s | ConvNeXt-Tiny (P4, 224x224, SUP) | mIoU (test): 48.8; mIoU (val): 48.7 |
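For reference, the ImageNet classification rows above can be approximately reproduced with the pretrained ConvNeXt weights that ship in torchvision (0.13 or later). A minimal usage sketch follows, with the caveat that torchvision's checkpoints and preprocessing may differ slightly from the paper's original release:

```python
# Hedged usage sketch: torchvision's pretrained ConvNeXt-T, which should
# roughly match the ConvNeXt-T ImageNet row above (82.1% top-1).
import torch
from torchvision.models import convnext_tiny, ConvNeXt_Tiny_Weights

weights = ConvNeXt_Tiny_Weights.IMAGENET1K_V1
model = convnext_tiny(weights=weights).eval()
preprocess = weights.transforms()  # resize, center-crop, normalize

# Classify a dummy uint8 image tensor (substitute a real image in practice).
img = torch.zeros(3, 256, 256, dtype=torch.uint8)
with torch.no_grad():
    logits = model(preprocess(img).unsqueeze(0))
top1 = logits.argmax(dim=1).item()
print(weights.meta["categories"][top1])
```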