DABS: A Domain-Agnostic Benchmark for Self-Supervised Learning

Alex Tamkin, Vincent Liu, Rongfei Lu, Daniel Fein, Colin Schultz, Noah Goodman

Abstract

Self-supervised learning algorithms, including BERT and SimCLR, have enabled significant strides in fields like natural language processing, computer vision, and speech processing. However, these algorithms are domain-specific, meaning that new self-supervised learning algorithms must be developed for each new setting, including myriad healthcare, scientific, and multimodal domains. To catalyze progress toward domain-agnostic methods, we introduce DABS: a Domain-Agnostic Benchmark for Self-supervised learning. To perform well on DABS, an algorithm must do well across seven diverse domains on which it is evaluated: natural images, multichannel sensor data, English text, speech recordings, multilingual text, chest x-rays, and images with text descriptions. Each domain contains an unlabeled dataset for pretraining; the model is then scored based on its downstream performance on a set of labeled tasks in the domain. We also present e-Mix and ShED: two baseline domain-agnostic algorithms; their relatively modest performance demonstrates that significant progress is needed before self-supervised learning is an out-of-the-box solution for arbitrary domains. Code for benchmark datasets and baseline algorithms is available at https://github.com/alextamkin/dabs.
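
The protocol described in the abstract (pretrain on a domain's unlabeled data, then score on that domain's labeled tasks) can be sketched roughly as follows. This is a minimal illustration, not the API of the alextamkin/dabs repository: the `Domain` container, `evaluate_domain`, and the toy `toy_pretrain` / `toy_score` functions are all hypothetical names, and averaging task scores with an unweighted mean is an assumption rather than something the abstract states.

```python
from dataclasses import dataclass
from statistics import mean
from typing import Callable, Dict, List, Sequence, Tuple

Example = Sequence[float]
Labeled = Tuple[Example, int]
Encoder = Callable[[Example], Sequence[float]]


@dataclass
class Domain:
    """Hypothetical container: one unlabeled pretraining set plus labeled tasks."""
    name: str
    unlabeled: List[Example]
    downstream: Dict[str, List[Labeled]]


def evaluate_domain(
    domain: Domain,
    pretrain: Callable[[List[Example]], Encoder],
    score_task: Callable[[Encoder, List[Labeled]], float],
) -> float:
    # Step 1: pretrain using only the domain's unlabeled data.
    encoder = pretrain(domain.unlabeled)
    # Step 2: score the pretrained encoder on each labeled downstream task.
    scores = [score_task(encoder, task) for task in domain.downstream.values()]
    # Assumption: summarize the domain as the unweighted mean over its tasks.
    return mean(scores)


# Toy stand-ins so the sketch runs end to end; they are not real SSL methods.
def toy_pretrain(unlabeled: List[Example]) -> Encoder:
    # "Learn" the per-feature mean of the unlabeled data and center inputs on it.
    dims = len(unlabeled[0])
    center = [mean(x[d] for x in unlabeled) for d in range(dims)]
    return lambda x: [x[d] - center[d] for d in range(dims)]


def toy_score(encoder: Encoder, task: List[Labeled]) -> float:
    # Predict class 1 when the first centered feature is positive; return accuracy in percent.
    correct = sum(int(encoder(x)[0] > 0) == y for x, y in task)
    return 100.0 * correct / len(task)


toy = Domain(
    name="toy-sensors",
    unlabeled=[[0.0], [1.0], [2.0], [3.0]],
    downstream={
        "task_a": [([0.5], 0), ([2.5], 1)],
        "task_b": [([1.0], 0), ([3.0], 1)],
    },
)
toy_result = evaluate_domain(toy, toy_pretrain, toy_score)
print(toy_result)
```

The point of the structure is that `pretrain` never sees labels and the same `pretrain` callable would be reused unchanged across every domain, which is what makes a method domain-agnostic in the benchmark's sense.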

Code Repositories

alextamkin/dabs (official, PyTorch)

Benchmarks

Benchmark: self-supervised-learning-on-dabs

Pretraining  Images & Text  Med. Imaging  Natural Images  Sensors  Speech  Text
ShED         54.3           74.5          20.9            88.7     36.5    48.4
e-Mix        48.9           72.4          27.9            79.5     41.8    44.1
None         57.5           68.1          10.1            69.8     24.9    42.3
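
To make the comparison with the no-pretraining baseline easier to read, the short script below computes each method's per-domain difference from the "None" row and the top-scoring entry per domain. The numbers are copied from the benchmark listing on this page (six domains are listed here); the helper names `gains_over_baseline` and `best_per_domain` are illustrative only and do not come from the paper or its repository.

```python
# Scores copied from the benchmark listing on this page.
scores = {
    "ShED": {
        "Images & Text": 54.3, "Med. Imaging": 74.5, "Natural Images": 20.9,
        "Sensors": 88.7, "Speech": 36.5, "Text": 48.4,
    },
    "e-Mix": {
        "Images & Text": 48.9, "Med. Imaging": 72.4, "Natural Images": 27.9,
        "Sensors": 79.5, "Speech": 41.8, "Text": 44.1,
    },
    "None": {
        "Images & Text": 57.5, "Med. Imaging": 68.1, "Natural Images": 10.1,
        "Sensors": 69.8, "Speech": 24.9, "Text": 42.3,
    },
}


def gains_over_baseline(table, method, baseline="None"):
    """Per-domain score difference between `method` and the baseline row."""
    return {
        domain: round(table[method][domain] - table[baseline][domain], 1)
        for domain in table[baseline]
    }


def best_per_domain(table):
    """Name of the top-scoring row for each domain."""
    domains = next(iter(table.values())).keys()
    return {d: max(table, key=lambda m: table[m][d]) for d in domains}


shed_gains = gains_over_baseline(scores, "ShED")
emix_gains = gains_over_baseline(scores, "e-Mix")
winners = best_per_domain(scores)

for name, gains in (("ShED", shed_gains), ("e-Mix", emix_gains)):
    print(name, gains)
print(winners)
```

Read this way, both baselines improve on no pretraining in five of the six listed domains, while on Images & Text the model without pretraining scores highest (57.5 versus 54.3 for ShED and 48.9 for e-Mix).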
