Danish Fungi 2020 - Not Just Another Image Recognition Dataset

Abstract

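We introduce a new fine-grained dataset and benchmark, Danish Fungi 2020 (DF20). The dataset is built from observations submitted to the Atlas of Danish Fungi and is distinguished by its taxonomically accurate class labels, small number of errors, highly unbalanced long-tailed class distribution, rich observation metadata, and well-defined class hierarchy. DF20 has no overlap with ImageNet, which allows an unbiased comparison of models fine-tuned from publicly available ImageNet checkpoints. The proposed evaluation protocol makes it possible to test how well metadata (for example precise geographic location, habitat, and substrate) can be used to improve classification, facilitates classifier calibration testing, and allows studying the impact of device settings on classification performance. Experiments with convolutional neural networks (CNN) and the recent vision transformers (ViT) show that DF20 poses a challenging task. Interestingly, ViT reaches 80.45% accuracy and a macro F1 score of 0.743, reducing the CNN error by 9% and 12%, respectively. A simple procedure for including metadata in the decision process improves classification accuracy by more than 2.95 percentage points, reducing the error rate by 15%. The source code for all methods and experiments is available at https://sites.google.com/view/danish-fungi-dataset.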
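The abstract states two related figures for the metadata gain: an absolute improvement in percentage points and a relative reduction of the error rate. The short sketch below shows how the two are connected. It assumes the gain is measured against the 80.45% ViT accuracy quoted above (the abstract does not say which baseline is meant), and the function name `relative_error_reduction` is purely illustrative.

```python
def relative_error_reduction(acc_before: float, acc_after: float) -> float:
    """Return the fraction of the error rate removed when accuracy rises.

    Both accuracies are percentages in the range [0, 100].
    """
    error_before = 100.0 - acc_before
    error_after = 100.0 - acc_after
    if error_before <= 0:
        raise ValueError("baseline error must be positive")
    return (error_before - error_after) / error_before


# Assumption: the 2.95-point gain is applied to the 80.45% ViT accuracy.
baseline_acc = 80.45
gain_pp = 2.95
reduction = relative_error_reduction(baseline_acc, baseline_acc + gain_pp)
print(f"relative error reduction: {reduction:.1%}")
```

Under that assumption the baseline error is 19.55%, so removing 2.95 points corresponds to roughly 15% of the error, which agrees with the figure in the abstract.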

Benchmarks

The number in parentheses after a method name is the value listed with it on the source page (299, 224 or 384). A dash marks a metric that is not listed for that entry.

| Benchmark | Method | Top-1 | Top-3 | F1 - macro |
|---|---|---|---|---|
| image-classification-on-df20 | Inception-V3 (299) | 72.1 | 86.58 | - |
| image-classification-on-df20 | SE-ResNeXt-101-32x4d (224) | 74.26 | 87.78 | 0.66 |
| image-classification-on-df20 | ResNet-34 (299) | - | 84.76 | 0.60 |
| image-classification-on-df20 | Inception-ResNet-V2 (299) | 74.01 | 87.49 | 0.651 |
| image-classification-on-df20 | EfficientNet-B1 (299) | 74.08 | 87.68 | 0.654 |
| image-classification-on-df20 | EfficientNet-B3 (224) | 72.51 | 86.77 | 0.634 |
| image-classification-on-df20 | ResNet-50 (299) | 73.49 | 87.13 | - |
| image-classification-on-df20 | ViT-Large/16 (384) | 80.45 | 91.68 | 0.743 |
| image-classification-on-df20 | EfficientNet-B3 (299) | 75.69 | 88.72 | 0.673 |
| image-classification-on-df20 | ViT-Base/16 (384) | 79.48 | 90.95 | 0.727 |
| image-classification-on-df20 | MobileNet-V2 (299) | 69.77 | 85.01 | - |
| image-classification-on-df20 | ViT-Large/16 (224) | 75.29 | 88.34 | 0.675 |
| image-classification-on-df20 | SE-ResNeXt-101-32x4d (299) | - | 89.48 | 0.693 |
| image-classification-on-df20 | ResNet-18 | 67.13 | 82.65 | 0.580 |
| image-classification-on-df20 | EfficientNet-B5 (299) | 76.1 | 88.85 | 0.678 |
| image-classification-on-df20 | Inception-V4 (299) | 73 | 86.87 | 0.637 |
| image-classification-on-df20 | EfficientNet-B0 (224) | 70.33 | 85.19 | 0.613 |
| image-classification-on-df20 | EfficientNet-B0 (299) | 73.65 | - | - |
| image-classification-on-df20 | SE-ResNeXt-101-32x4d | 77.13 | - | - |
| image-classification-on-df20-mini | ResNet-18 | 62.91 | 81.65 | 0.514 |
| image-classification-on-df20-mini | EfficientNet-B5 (299) | 68.76 | 85 | - |
| image-classification-on-df20-mini | ResNet-50 (299) | 68.49 | 85.22 | - |
| image-classification-on-df20-mini | ResNet-34 (299) | - | 83.52 | 0.559 |
| image-classification-on-df20-mini | ViT-Large/16 (384) | 75.85 | 89.95 | 0.669 |
| image-classification-on-df20-mini | Inception-ResNet-V2 (299) | 64.67 | 81.42 | - |
| image-classification-on-df20-mini | EfficientNet-B3 (224) | 67.39 | 83.74 | 0.55 |
| image-classification-on-df20-mini | EfficientNet-B0 (224) | 65.66 | 83.65 | 0.531 |
| image-classification-on-df20-mini | EfficientNet-B1 (299) | 68.35 | 84.67 | - |
| image-classification-on-df20-mini | EfficientNet-B0 (299) | 67.94 | 85.71 | 0.567 |
| image-classification-on-df20-mini | ViT-Base/16 (384) | 74.23 | 89.12 | 0.639 |
| image-classification-on-df20-mini | EfficientNet-B3 (299) | 69.59 | 85.55 | 0.59 |
| image-classification-on-df20-mini | ViT-Large/16 (224) | 71.04 | 86.15 | 0.603 |
| image-classification-on-df20-mini | Inception-V3 (299) | 65.91 | 82.97 | 0.535 |
| image-classification-on-df20-mini | SE-ResNeXt-101-32x4d | 72.23 | - | - |
| image-classification-on-df20-mini | SE-ResNeXt-101-32x4d (224) | 68.87 | 85.14 | 0.585 |
| image-classification-on-df20-mini | MobileNet-V2 (299) | 65.58 | - | - |
| image-classification-on-df20-mini | Inception-V4 (299) | 67.45 | 82.78 | - |
| image-classification-on-df20-mini | SE-ResNeXt-101-32x4d (299) | - | 87.28 | 0.62 |
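
The benchmark results above are reported as Top-1 accuracy, Top-3 accuracy and macro F1. As a reminder of what these metrics measure, here is a minimal pure-Python sketch. It is not the authors' evaluation code: the function names and the toy scores are invented for illustration, and macro F1 is averaged over every class that appears in either the labels or the predictions, which is an assumption about the convention used.

```python
from collections import defaultdict
from typing import Sequence


def top_k_accuracy(scores: Sequence[Sequence[float]],
                   labels: Sequence[int], k: int) -> float:
    """Percentage of samples whose true label is among the k highest scores."""
    hits = 0
    for row, label in zip(scores, labels):
        ranked = sorted(range(len(row)), key=lambda c: row[c], reverse=True)
        if label in ranked[:k]:
            hits += 1
    return 100.0 * hits / len(labels)


def macro_f1(predictions: Sequence[int], labels: Sequence[int]) -> float:
    """Unweighted mean of per-class F1, so rare classes count as much as common ones."""
    tp, fp, fn = defaultdict(int), defaultdict(int), defaultdict(int)
    for pred, label in zip(predictions, labels):
        if pred == label:
            tp[label] += 1
        else:
            fp[pred] += 1   # predicted class gains a false positive
            fn[label] += 1  # true class gains a false negative
    classes = sorted(set(labels) | set(predictions))
    per_class = []
    for c in classes:
        denom = 2 * tp[c] + fp[c] + fn[c]
        per_class.append(2 * tp[c] / denom if denom else 0.0)
    return sum(per_class) / len(per_class)


# Toy data (hypothetical, not DF20 results): 4 samples, 3 classes.
scores = [
    [0.70, 0.20, 0.10],
    [0.10, 0.50, 0.40],
    [0.35, 0.40, 0.25],
    [0.20, 0.30, 0.50],
]
labels = [0, 2, 0, 2]
predictions = [max(range(len(row)), key=row.__getitem__) for row in scores]

top1 = top_k_accuracy(scores, labels, k=1)
top2 = top_k_accuracy(scores, labels, k=2)
f1 = macro_f1(predictions, labels)
print(top1, top2, round(f1, 3))
```

Because macro F1 weights every class equally, it penalizes mistakes on rare classes more than plain accuracy does, which is why it is informative on a long-tailed dataset such as DF20.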
