
A Large-scale Study of Representation Learning with the Visual Task Adaptation Benchmark

Abstract

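Representation learning promises to unlock deep learning for the long tail of vision tasks without expensive labelled datasets. Yet the absence of a unified evaluation for general visual representations hinders progress. Popular evaluation protocols are often too constrained (linear classification), limited in diversity (ImageNet, CIFAR, Pascal-VOC), or only weakly related to representation quality (ELBO, reconstruction error). We present the Visual Task Adaptation Benchmark (VTAB), which defines good representations as those that adapt to diverse, unseen tasks with few examples. With VTAB, we conduct a large-scale study of many popular, publicly available representation learning algorithms, carefully controlling confounders such as architecture and tuning budget. We address questions such as: How effective are ImageNet representations beyond standard natural-image datasets? How do representations trained via generative and discriminative models compare? To what extent can self-supervision replace labels? And how close are we to general visual representations?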
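The evaluation idea in the abstract (adapt to each task from a few examples, then measure how well the adapted model does) can be sketched in a few lines. This is only an illustrative sketch under stated assumptions: the aggregate is taken to be the mean top-1 accuracy over tasks, and `benchmark_score`, `majority_class_adapter`, and `TOY_TASKS` are hypothetical names with toy data. The actual task list, training-set sizes, and adaptation procedure are defined by the paper and the google-research/task_adaptation repository, not by this snippet.

```python
from statistics import mean
from typing import Callable, Dict, List, Sequence, Tuple

Example = Tuple[Sequence[float], int]          # (features, label)
Predictor = Callable[[Sequence[float]], int]   # maps features to a predicted label
Task = Tuple[List[Example], List[Example]]     # (small training set, test set)


def top1_accuracy(predict: Predictor, examples: List[Example]) -> float:
    """Percentage of examples whose predicted label equals the true label."""
    correct = sum(1 for features, label in examples if predict(features) == label)
    return 100.0 * correct / len(examples)


def benchmark_score(adapt: Callable[[List[Example]], Predictor],
                    tasks: Dict[str, Task]) -> float:
    """Adapt on each task's training set, then average test accuracy over tasks."""
    per_task = []
    for train, test in tasks.values():
        predictor = adapt(train)               # stand-in for fine-tuning a representation
        per_task.append(top1_accuracy(predictor, test))
    return mean(per_task)


def majority_class_adapter(train: List[Example]) -> Predictor:
    """Trivial stand-in for adaptation: always predict the most common training label."""
    labels = [label for _, label in train]
    most_common = max(set(labels), key=labels.count)
    return lambda features: most_common


# Toy placeholder tasks, NOT real benchmark data.
TOY_TASKS: Dict[str, Task] = {
    "toy_a": ([([0.0], 0), ([1.0], 0), ([2.0], 1)], [([0.5], 0), ([1.5], 1)]),
    "toy_b": ([([0.0], 1), ([1.0], 1)], [([0.5], 1), ([1.5], 1)]),
}

if __name__ == "__main__":
    print(benchmark_score(majority_class_adapter, TOY_TASKS))
```

Averaging per-task accuracy, rather than pooling all test examples, keeps a task with a large test set from dominating the score, which matches the emphasis on diversity across tasks.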

Code Repositories

google-research/task_adaptation (official; TensorFlow; mentioned on GitHub)
facebookresearch/vissl (PyTorch; mentioned on GitHub)

Benchmarks

Benchmark: image-classification-on-vtab-1k-1

Method                                    Top-1 Accuracy
SelfSup-RelativePatchLoc-ResNet50         50.8
BigBiGAN-ResNet50                         59.1
ResNet50-LargeHyperSweep                  59.2
SelfSup-Rotation-ResNet50                 59.5
Conditional-BigGAN                        35.3
SelfSup-Jigsaw-ResNet50                   51.1
ImageNet-ResNet50-LargeHyperSweep         71.2
ResNet50                                  42.1
S4L-10%-Exemplar-ResNet50                 63.9
SelfSup-Exemplar-ResNet50                 57.5
VAE                                       37.5
ImageNet-10%-ResNet50                     61.6
S4L-Rotation-ResNet50-LargeHyperSweep     71.5
WAE-UKL                                   31.0
WAE-GAN                                   32.0
ImageNet-ResNet50                         65.6
S4L-Exemplar-ResNet50                     67.0
WAE-MMD                                   37.3
S4L-Exemplar-ResNet50-LargeHyperSweep     72.7
Unconditional-BigGAN-ResNet50             44.0
S4L-10%-Rotation-ResNet50                 64.8
S4L-Rotation-ResNet50                     67.5
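
The listing above is unsorted, so the comparisons the abstract asks about (supervised versus self-supervised, generative versus discriminative, the effect of a larger hyperparameter sweep) are hard to read off directly. The snippet below is a small convenience sketch, not part of the benchmark tooling: it copies the reported Top-1 Accuracy values into a dictionary, ranks them, and computes the difference between two entries. The helper names `rank` and `gain` are hypothetical.

```python
# Top-1 Accuracy on VTAB-1k, copied verbatim from the listing above.
SCORES = {
    "SelfSup-RelativePatchLoc-ResNet50": 50.8,
    "BigBiGAN-ResNet50": 59.1,
    "ResNet50-LargeHyperSweep": 59.2,
    "SelfSup-Rotation-ResNet50": 59.5,
    "Conditional-BigGAN": 35.3,
    "SelfSup-Jigsaw-ResNet50": 51.1,
    "ImageNet-ResNet50-LargeHyperSweep": 71.2,
    "ResNet50": 42.1,
    "S4L-10%-Exemplar-ResNet50": 63.9,
    "SelfSup-Exemplar-ResNet50": 57.5,
    "VAE": 37.5,
    "ImageNet-10%-ResNet50": 61.6,
    "S4L-Rotation-ResNet50-LargeHyperSweep": 71.5,
    "WAE-UKL": 31.0,
    "WAE-GAN": 32.0,
    "ImageNet-ResNet50": 65.6,
    "S4L-Exemplar-ResNet50": 67.0,
    "WAE-MMD": 37.3,
    "S4L-Exemplar-ResNet50-LargeHyperSweep": 72.7,
    "Unconditional-BigGAN-ResNet50": 44.0,
    "S4L-10%-Rotation-ResNet50": 64.8,
    "S4L-Rotation-ResNet50": 67.5,
}


def rank(scores):
    """Return (method, score) pairs sorted from highest to lowest score."""
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


def gain(scores, method, baseline):
    """Score difference between two entries, rounded to one decimal place."""
    return round(scores[method] - scores[baseline], 1)


if __name__ == "__main__":
    for method, score in rank(SCORES):
        print(f"{score:5.1f}  {method}")
    # Effect of the larger hyperparameter sweep on the ImageNet-supervised entry.
    print(gain(SCORES, "ImageNet-ResNet50-LargeHyperSweep", "ImageNet-ResNet50"))
```

Keeping the scores in a plain dictionary keyed by the method names used in the listing makes it easy to add further pairwise comparisons without reformatting the data.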
