Scalable Transfer Learning with Expert Models

Joan Puigcerver, Carlos Riquelme, Basil Mustafa, Cedric Renggli, André Susano Pinto, Sylvain Gelly, Daniel Keysers, Neil Houlsby

Abstract

Transfer of pre-trained representations can improve sample efficiency and reduce computational requirements for new tasks. However, representations used for transfer are usually generic, and are not tailored to a particular distribution of downstream tasks. We explore the use of expert representations for transfer with a simple, yet effective, strategy. We train a diverse set of experts by exploiting existing label structures, and use cheap-to-compute performance proxies to select the relevant expert for each target task. This strategy scales the process of transferring to new tasks, since it does not revisit the pre-training data during transfer. Accordingly, it requires little extra compute per target task, and results in a speed-up of 2-3 orders of magnitude compared to competing approaches. Further, we provide an adapter-based architecture able to compress many experts into a single model. We evaluate our approach on two different data sources and demonstrate that it outperforms baselines on over 20 diverse vision tasks in both cases.
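
The expert-selection step the abstract describes can be made concrete. Below is a minimal sketch, assuming each expert exposes a feature-extraction callable `embed(images) -> np.ndarray` over its frozen backbone; the k-NN cross-validation proxy and the helper names (`proxy_score`, `select_expert`) are illustrative assumptions, not the authors' exact implementation.

```python
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier


def proxy_score(features: np.ndarray, labels: np.ndarray, k: int = 5) -> float:
    """Cheap proxy for downstream accuracy: cross-validated k-NN
    accuracy on the expert's frozen features of the target train set."""
    knn = KNeighborsClassifier(n_neighbors=k)
    return cross_val_score(knn, features, labels, cv=3).mean()


def select_expert(experts: dict, images: np.ndarray, labels: np.ndarray) -> str:
    """Pick the expert whose frozen representation scores best under the
    proxy. `experts` maps an expert name to its feature extractor."""
    scores = {name: proxy_score(embed(images), labels)
              for name, embed in experts.items()}
    return max(scores, key=scores.get)
```

Because the proxy only fits a small k-NN classifier on frozen features, selection touches neither the pre-training data nor the expert weights, which is what keeps the per-task cost low.

The adapter-based architecture that compresses many experts into a single model can likewise be sketched as a residual bottleneck module attached to a shared frozen backbone. The zero-initialized up-projection (so each adapter starts as the identity) and the shapes below are assumptions for illustration, not the paper's exact configuration.

```python
import numpy as np


class Adapter:
    """Residual bottleneck adapter: project down, nonlinearity, project
    up, add the input back. Only these small matrices are stored per
    expert, so many experts can share one frozen backbone."""

    def __init__(self, dim: int, bottleneck: int, seed: int = 0):
        rng = np.random.default_rng(seed)
        self.w_down = rng.normal(0.0, 0.02, size=(dim, bottleneck))
        self.w_up = np.zeros((bottleneck, dim))  # zero init: starts as identity

    def __call__(self, x: np.ndarray) -> np.ndarray:
        h = np.maximum(x @ self.w_down, 0.0)  # down-projection + ReLU
        return x + h @ self.w_up              # residual connection
```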

Benchmarks

Benchmark                            Methodology                   Metrics
image-classification-on-vtab-1k-1    ScalableExperts (I21k+JFT)    Top-1 Accuracy: 72.3
