3 months ago

Vision Models Are More Robust and Fair When Pretrained on Uncurated Images Without Supervision

Abstract

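Discriminative self-supervised learning allows models to be trained on any randomly selected collection of internet images, and can potentially recover the salient information that helps distinguish one image from another. Applied to ImageNet, this approach yields object-centric features that perform on par with supervised features on most object-centric downstream tasks. In this work, we ask whether the same ability is enough to learn more representative and salient information from a diverse, unbounded set of images from across the globe. To do so, we train models on billions of random images without any data pre-processing and without prior assumptions about what the model should learn. To avoid underfitting on data of this size, we scale the model to a dense architecture with up to 10 billion parameters. We systematically evaluate and validate model performance on more than 50 benchmarks, covering fairness, robustness to distribution shift, geographical diversity, fine-grained recognition, image copy detection, and many image classification datasets. The results show that the model not only captures semantic information well, but also learns salient properties such as artistic style and geolocation, as well as multilingual word embeddings, from visual content alone. More importantly, we find that such models are more robust, more fair, less harmful, and less biased than supervised models or models trained on object-centric datasets such as ImageNet.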

Code Repositories

facebookresearch/vissl (official; PyTorch; mentioned in GitHub)

Benchmarks

| Benchmark | Method | Metrics |
| --- | --- | --- |
| action-classification-on-kinetics-700 | SEER (RegNet10B) | Top-1 Accuracy: 51.9 |
| domain-generalization-on-imagenet-a | SEER (RegNet10B) | Top-1 accuracy %: 52.7 |
| domain-generalization-on-imagenet-r | SEER (RegNet10B) | Top-1 Error Rate: 43.9 |
| domain-generalization-on-imagenet-sketch | SEER (RegNet10B) | Top-1 accuracy: 45.6 |
| fine-grained-image-classification-on-caltech | SEER (RegNet10B - linear eval) | Accuracy: 91.0; Top-1 Error Rate: 9.0% |
| fine-grained-image-classification-on-fgvc | SEER (RegNet10B) | Accuracy: 54.82% |
| fine-grained-image-classification-on-oxford-1 | SEER (RegNet10B) | Accuracy: 85.3% |
| fine-grained-image-classification-on-stanford | SEER (RegNet10B) | Accuracy: 68.03% |
| fine-grained-image-classification-on-sun397 | SEER (RegNet10B - linear eval) | Accuracy: 80.0 |
| image-classification-on-cifar-10 | SEER (RegNet10B) | Percentage correct: 90 |
| image-classification-on-cifar-100 | SEER (RegNet10B) | Percentage correct: 81.53 |
| image-classification-on-clevr-count | SEER (RegNet10B) | Top 1 Accuracy: 89.28 |
| image-classification-on-clevr-count | SEER (RegNetY-128GF) | Top 1 Accuracy: 87.98 |
| image-classification-on-clevr-dist | SEER (RegNet10B) | Top 1 Accuracy: 74.98 |
| image-classification-on-clevr-dist | SEER (RegNetY-128GF) | Top 1 Accuracy: 72.67 |
| image-classification-on-dtd | SEER (RegNet10B - linear eval) | Accuracy: 80.5 |
| image-classification-on-eurosat | SEER (RegNet10B - linear eval) | Accuracy (%): 97.5 |
| image-classification-on-flowers-102 | SEER (RegNet10B) | Accuracy: 96.3 |
| image-classification-on-food-101-1 | SEER (RegNet10B - linear eval) | Accuracy (%): 90.3 |
| image-classification-on-imagenet | SEER (RG-10B) | Number of params: 10000M; Top 1 Accuracy: 85.8% |
| image-classification-on-imagenet-real | SEER (RegNet10B) | Accuracy: 89.8%; Params: 10000M |
| image-classification-on-imagenet-v2 | SEER (RegNet10B) | Top 1 Accuracy: 76.2 |
| image-classification-on-inaturalist-2018 | SEER (RegNet10B - finetuned - 384px) | Top-1 Accuracy: 84.7% |
| image-classification-on-kitti-dist | SEER (RegNet10B) | Top 1 Accuracy: 78.34 |
| image-classification-on-mnist | SEER (RegNet10B) | Accuracy: 99.42; Percentage error: 0.58 |
| image-classification-on-objectnet | SEER (RegNet10B) | Top-1 Accuracy: 60.2 |
| image-classification-on-places205 | SEER (RegNet10B - finetuned - 384px) | Top 1 Accuracy: 69.0 |
| image-classification-on-resisc45 | ResNet50 (ImageNet-supervised) | Top 1 Accuracy: 88.56 |
| image-classification-on-resisc45 | DeiT-B/16 | Top 1 Accuracy: 92.48 |
| image-classification-on-resisc45 | SimCLR-v2 (ResNet152-w3 + SK) | Top 1 Accuracy: 89.77 |
| image-classification-on-resisc45 | MoCo-v3 (ViT-B/16) | Top 1 Accuracy: 93.35 |
| image-classification-on-resisc45 | SwAV (ResNet50-w5) | Top 1 Accuracy: 94.73 |
| image-classification-on-resisc45 | MoCo-v2 (ResNet50) | Top 1 Accuracy: 85.4 |
| image-classification-on-resisc45 | SEER (RegNet10B) | Top 1 Accuracy: 95.61 |
| image-classification-on-resisc45 | CLIP (ViT-B/16) | Top 1 Accuracy: 92.7 |
| image-classification-on-resisc45 | DINO (DeiT-B/16) | Top 1 Accuracy: 93.97 |
| image-classification-on-stl-10 | SEER (RegNet10B) | PARAMS: 10000M; Percentage correct: 97.3 |
| image-classification-on-svhn | SEER (RegNet10B) | Percentage error: 13.6 |
| meme-classification-on-hateful-memes | SEER (RegNet10B) | ROC-AUC: 0.734 |
| self-supervised-image-classification-on-1 | SEER (Regnet10B) | Number of Params: 10000M; Top 1 Accuracy: 85.8% |
| semi-supervised-image-classification-on-1 | SEER (RegNet10B) | Top 1 Accuracy: 62.4% |
| semi-supervised-image-classification-on-2 | SEER (RegNet10B) | Top 1 Accuracy: 78.8% |
| traffic-sign-recognition-on-gtsrb | SEER (RegNet10B) | Accuracy: 90.71% |
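
The benchmark entries above mix two conventions: most report an accuracy, while a few (ImageNet-R, SVHN, MNIST) report an error rate. The sketch below shows one way to put them on a common accuracy scale before comparing rows. It assumes every value is a top-1 percentage on a 0 to 100 scale, so that accuracy = 100 - error; this matches the Caltech and MNIST entries, which list both figures. The helper name `to_accuracy` is hypothetical and not part of any library, and ROC-AUC entries are out of scope.

```python
def to_accuracy(metric_name: str, value: float) -> float:
    """Return `value` as an accuracy percentage.

    Assumption: metrics whose name contains "error" are top-1 error
    percentages, so accuracy = 100 - error. Everything else is taken
    to be an accuracy already and is returned unchanged.
    """
    if "error" in metric_name.lower():
        # Round to two decimals to hide floating-point noise.
        return round(100.0 - value, 2)
    return value


# A few rows copied from the benchmark list above.
rows = [
    ("domain-generalization-on-imagenet-r", "Top-1 Error Rate", 43.9),
    ("image-classification-on-svhn", "Percentage error", 13.6),
    ("image-classification-on-mnist", "Percentage error", 0.58),
    ("image-classification-on-cifar-100", "Percentage correct", 81.53),
]

for benchmark, metric, value in rows:
    print(f"{benchmark}: {to_accuracy(metric, value)}")
```

Under that assumption, the ImageNet-R error rate of 43.9 corresponds to an accuracy of 56.1, and the MNIST error of 0.58 recovers the listed accuracy of 99.42.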