
Abstract
This paper introduces EfficientNetV2, a new family of convolutional networks with faster training speed and better parameter efficiency than previous models. To develop this family of models, we combine training-aware neural architecture search with model scaling to jointly optimize training speed and parameter efficiency. The models are searched in a search space enriched with new operations such as Fused-MBConv. Our experiments show that EfficientNetV2 models train much faster than state-of-the-art models while being up to 6.8x smaller. We further find that training can be sped up by progressively increasing the image size during training, but this often causes a drop in accuracy. To compensate for this accuracy drop, we propose adaptively adjusting regularization (e.g., dropout rate and data augmentation) so that the models achieve both fast training and good accuracy. With this progressive learning strategy, EfficientNetV2 significantly outperforms previous models on ImageNet and on CIFAR, Cars, and Flowers. By pretraining on the same ImageNet21k dataset, EfficientNetV2 achieves 87.3% top-1 accuracy on ImageNet ILSVRC2012, outperforming the recent Vision Transformer (ViT) by 2.0 percentage points while training 5x-11x faster using the same computing resources. Code will be available at: https://github.com/google/automl/tree/master/efficientnetv2.
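As a rough illustration of the Fused-MBConv operation mentioned above (it replaces MBConv's 1x1 expansion conv and 3x3 depthwise conv with a single regular 3x3 conv, followed by a 1x1 projection), here is a minimal PyTorch sketch. The normalization, activation, and expansion choices are assumptions for illustration, not the official implementation:

```python
import torch
from torch import nn

class FusedMBConv(nn.Module):
    """Minimal sketch of a Fused-MBConv block: a single regular 3x3 conv
    performs both channel expansion and spatial mixing, followed by a 1x1
    projection back to the output width. Hyperparameter choices here are
    illustrative assumptions."""

    def __init__(self, in_ch, out_ch, expand_ratio=4, stride=1):
        super().__init__()
        mid_ch = in_ch * expand_ratio
        self.use_residual = stride == 1 and in_ch == out_ch
        self.fused = nn.Sequential(
            # fused 3x3 conv replaces the 1x1 expansion + 3x3 depthwise pair
            nn.Conv2d(in_ch, mid_ch, 3, stride=stride, padding=1, bias=False),
            nn.BatchNorm2d(mid_ch),
            nn.SiLU(inplace=True),
        )
        self.project = nn.Sequential(
            # 1x1 projection back to the output channel count
            nn.Conv2d(mid_ch, out_ch, 1, bias=False),
            nn.BatchNorm2d(out_ch),
        )

    def forward(self, x):
        out = self.project(self.fused(x))
        return x + out if self.use_residual else out

# Example: a downsampling block mapping 24 -> 48 channels.
y = FusedMBConv(24, 48, stride=2)(torch.randn(1, 24, 64, 64))
```

The progressive learning strategy described in the abstract can likewise be sketched as a stage-wise schedule that increases image size together with regularization strength (dropout rate, augmentation magnitude). The stage count and endpoint values below are illustrative assumptions, not the paper's exact settings:

```python
def progressive_schedule(stage, num_stages=4,
                         min_size=128, max_size=300,
                         min_dropout=0.1, max_dropout=0.3,
                         min_randaug=5, max_randaug=15):
    """Linearly interpolate image size and regularization for a training
    stage: small images with weak regularization early, large images with
    strong regularization late. Endpoint values are assumptions."""
    t = stage / max(num_stages - 1, 1)
    image_size = int(min_size + t * (max_size - min_size))
    dropout = min_dropout + t * (max_dropout - min_dropout)
    randaug_magnitude = min_randaug + t * (max_randaug - min_randaug)
    return image_size, dropout, randaug_magnitude

# Example: four training stages, each with its own image size and regularization.
for stage in range(4):
    print(progressive_schedule(stage))
```

In practice, each stage would rebuild the data pipeline at the new image size and update the model's dropout rate and augmentation settings before continuing training.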
Code Repositories

| Repository | Framework | Mentioned in GitHub |
|---|---|---|
| lukemelas/EfficientNet-PyTorch | pytorch | Yes |
| james77777778/keras-image-models | pytorch | Yes |
| DavidLandup0/deepvision | pytorch | |
| pytorch/vision | pytorch | |
| rwightman/pytorch-image-models | pytorch | Yes |
| Klassikcat/project-NEXTLab-CNN-EfficientNet | tf | Yes |
| hankyul2/EfficientNetV2-pytorch | pytorch | Yes |
| IMvision12/keras-vision-models | pytorch | Yes |
| abhuse/pytorch-efficientnet | pytorch | Yes |
| leondgarse/Keras_efficientnet_v2 | tf | Yes |
| 2023-MindSpore-1/ms-code-193 | mindspore | |
| phykn/film-defect-detection | pytorch | Yes |
| jahongir7174/EffcientNetV2 | pytorch | Yes |
| ardeal/EfficientNetV2 | pytorch | Yes |
| szq0214/fkd | pytorch | Yes |
| danstowell/insect_classifier_GDSC23_insecteffnet | pytorch | Yes |
| d-li14/efficientnetv2.pytorch | pytorch | Yes |
| seermer/TensorFlow2-EfficientNetV2 | tf | Yes |
| open-edge-platform/geti | pytorch | Yes |
| jahongir7174/EfficientNetV2 | pytorch | Yes |
Benchmarks
| Benchmark | Method | Metrics |
|---|---|---|
| image-classification-on-cifar-10 | EfficientNetV2-M | Percentage correct: 99.0 |
| image-classification-on-cifar-10 | EfficientNetV2-S | Percentage correct: 98.7 |
| image-classification-on-cifar-10 | EfficientNetV2-L | Percentage correct: 99.1 |
| image-classification-on-cifar-100 | EfficientNetV2-L | Percentage correct: 92.3 |
| image-classification-on-cifar-100 | EfficientNetV2-M | Percentage correct: 92.2 |
| image-classification-on-cifar-100 | EfficientNetV2-S | Percentage correct: 91.5 |
| image-classification-on-flowers-102 | EfficientNetV2-S | Accuracy: 97.9 |
| image-classification-on-flowers-102 | EfficientNetV2-M | Accuracy: 98.5 |
| image-classification-on-flowers-102 | EfficientNetV2-L | Accuracy: 98.8 |
| image-classification-on-imagenet | EfficientNetV2-M (21k) | GFLOPs: 24, Number of params: 54M, Top-1 Accuracy: 86.2% |
| image-classification-on-imagenet | EfficientNetV2-S (21k) | GFLOPs: 8.8, Number of params: 22M, Top-1 Accuracy: 84.9% |
| image-classification-on-imagenet | EfficientNetV2-L | GFLOPs: 53, Top-1 Accuracy: 85.7% |
| image-classification-on-imagenet | EfficientNetV2-M | Top-1 Accuracy: 85.1% |
| image-classification-on-imagenet | EfficientNetV2-S | Top-1 Accuracy: 83.9% |
| image-classification-on-imagenet | EfficientNetV2-XL (21k) | GFLOPs: 94, Number of params: 208M, Top-1 Accuracy: 87.3% |
| image-classification-on-imagenet | EfficientNetV2-L (21k) | GFLOPs: 53, Number of params: 120M, Top-1 Accuracy: 86.8% |
| image-classification-on-stanford-cars | EfficientNetV2-M | Accuracy: 94.6 |
| image-classification-on-stanford-cars | EfficientNetV2-S | Accuracy: 93.8 |
| image-classification-on-stanford-cars | EfficientNetV2-L | Accuracy: 95.1 |