Sparse Mixture-of-Experts are Domain Generalizable Learners

Bo Li, Yifei Shen, Jingkang Yang, Yezhen Wang, Jiawei Ren, Tong Che, Jun Zhang, Ziwei Liu

Abstract

Human visual perception can easily generalize to out-of-distribution visual data, which is far beyond the capability of modern machine learning models. Domain generalization (DG) aims to close this gap, with existing DG methods mainly focusing on the loss function design. In this paper, we propose to explore an orthogonal direction, i.e., the design of the backbone architecture. It is motivated by an empirical finding that transformer-based models trained with empirical risk minimization (ERM) outperform CNN-based models employing state-of-the-art (SOTA) DG algorithms on multiple DG datasets. We develop a formal framework to characterize a network's robustness to distribution shifts by studying its architecture's alignment with the correlations in the dataset. This analysis guides us to propose a novel DG model built upon vision transformers, namely Generalizable Mixture-of-Experts (GMoE). Extensive experiments on DomainBed demonstrate that GMoE trained with ERM outperforms SOTA DG baselines by a large margin. Moreover, GMoE is complementary to existing DG methods and its performance is substantially improved when trained with DG algorithms.
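
The core architectural idea can be illustrated with a short sketch: a sparse Mixture-of-Experts layer that replaces the dense feed-forward block of a vision transformer and routes each token to a small number of experts. The following is a minimal PyTorch illustration under assumed names and hyperparameters (`SparseMoE`, `num_experts`, `top_k`, ViT-S/16-like widths), not the authors' implementation; see the official repository listed below for the actual GMoE code.

```python
# Minimal sketch of a sparse MoE layer standing in for a ViT feed-forward block.
# Class and argument names are illustrative assumptions, not from the GMoE repo.
import torch
import torch.nn as nn


class SparseMoE(nn.Module):
    """Token-level top-k routing over a set of expert FFNs."""

    def __init__(self, dim: int, hidden_dim: int, num_experts: int = 8, top_k: int = 2):
        super().__init__()
        self.top_k = top_k
        self.gate = nn.Linear(dim, num_experts, bias=False)
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(dim, hidden_dim), nn.GELU(), nn.Linear(hidden_dim, dim))
            for _ in range(num_experts)
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, tokens, dim) -> flatten so routing decisions are per token
        b, t, d = x.shape
        tokens = x.reshape(-1, d)

        # Pick the top-k experts per token and renormalize their gate weights.
        logits = self.gate(tokens)                           # (b*t, num_experts)
        weights, indices = logits.topk(self.top_k, dim=-1)   # (b*t, top_k)
        weights = weights.softmax(dim=-1)

        # Each token's output is the weighted sum of its selected experts.
        out = torch.zeros_like(tokens)
        for e, expert in enumerate(self.experts):
            for k in range(self.top_k):
                mask = indices[:, k] == e
                if mask.any():
                    out[mask] += weights[mask, k].unsqueeze(1) * expert(tokens[mask])
        return out.reshape(b, t, d)


if __name__ == "__main__":
    layer = SparseMoE(dim=384, hidden_dim=1536)   # ViT-S/16-like widths (assumed)
    x = torch.randn(2, 197, 384)                  # 196 patch tokens + CLS token
    print(layer(x).shape)                         # torch.Size([2, 197, 384])
```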

Code Repositories

luodian/sf-moe-dg (official implementation, PyTorch)
KU-CVLAB/MoA (PyTorch)

Benchmarks

Benchmark                                   Methodology     Average Accuracy (%)
domain-generalization-on-domainnet          Hybrid-SF-MoE   52.0
domain-generalization-on-domainnet          GMoE-S/16       48.7
domain-generalization-on-office-home        GMoE-S/16       74.2
domain-generalization-on-pacs-2             GMoE-S/16       88.1
domain-generalization-on-terraincognita     GMoE-S/16       48.5
domain-generalization-on-vlcs               GMoE-S/16       80.2
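
For reference, the Average Accuracy column follows the standard DomainBed protocol: each domain of a dataset is held out in turn as the test environment, a model is trained on the remaining domains, and the per-test-domain accuracies are averaged. A minimal sketch of that aggregation follows; the domain names match PACS, but the accuracy values are placeholders, not results from the paper.

```python
# Leave-one-domain-out aggregation behind a DomainBed-style "Average Accuracy".
# The per-domain accuracies below are illustrative placeholders only.
from statistics import mean


def average_accuracy(per_test_domain_acc: dict[str, float]) -> float:
    """Mean test accuracy over runs where each domain is held out in turn."""
    return mean(per_test_domain_acc.values())


if __name__ == "__main__":
    placeholder = {"art_painting": 0.90, "cartoon": 0.85, "photo": 0.99, "sketch": 0.80}
    print(f"Average Accuracy: {100 * average_accuracy(placeholder):.1f}%")
```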
