Hoang Phan, Lam Tran, Quyen Tran, Trung Le

Abstract
Prior Unsupervised Domain Adaptation (UDA) methods often aim to train a domain-invariant feature extractor, which may hinder the model from learning sufficiently discriminative features. To tackle this, a line of work based on prompt learning leverages the power of large-scale pre-trained vision-language models to learn both domain-invariant and domain-specific features through a set of domain-agnostic and domain-specific learnable prompts. These studies typically enforce invariance constraints on the representation, output, or prompt space to learn such prompts. In contrast, we cast UDA as a multi-objective optimization problem in which each objective is represented by a domain loss. Under this new framework, we propose to align per-objective gradients to foster consensus among them. Additionally, to prevent potential overfitting when fine-tuning this deep architecture, we penalize the norm of these gradients. To achieve these goals, we devise a practical gradient update procedure that works under both single-source and multi-source UDA. Empirically, our method consistently outperforms other vision-language model adaptation methods. The implementation is available at https://github.com/VietHoang1512/PGA.
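The abstract describes two ingredients applied to the per-domain gradients: an alignment term that fosters consensus between objectives and a penalty on the gradient norm. The sketch below illustrates one way such an update could be assembled in PyTorch; it is a minimal illustration under assumptions, not the authors' implementation (the helper name `aligned_gradient_objective` and the weights `align_weight` and `norm_weight` are hypothetical; the actual procedure is in the linked repository).

```python
# Minimal sketch of the idea in the abstract: treat each domain loss as one
# objective, reward agreement (alignment) between per-objective gradients,
# and penalize gradient norms to curb overfitting. Illustrative only; the
# helper name and hyperparameters are assumptions, not the PGA codebase API.
import torch
import torch.nn.functional as F


def aligned_gradient_objective(prompt_params, domain_losses,
                               align_weight=1.0, norm_weight=0.01):
    """Return a scalar combining the domain losses, a pairwise gradient-
    alignment bonus, and a gradient-norm penalty (hypothetical helper)."""
    flat_grads = []
    for loss in domain_losses:
        # create_graph=True keeps the graph so the alignment and norm terms
        # can themselves be differentiated w.r.t. the learnable prompts.
        grads = torch.autograd.grad(loss, prompt_params,
                                    create_graph=True, retain_graph=True)
        flat_grads.append(torch.cat([g.reshape(-1) for g in grads]))

    total = sum(domain_losses)
    # Foster consensus: maximize pairwise cosine similarity between gradients.
    for i in range(len(flat_grads)):
        for j in range(i + 1, len(flat_grads)):
            cos = F.cosine_similarity(flat_grads[i], flat_grads[j], dim=0)
            total = total - align_weight * cos
    # Penalize gradient magnitude to discourage sharp, overfitted solutions.
    total = total + norm_weight * sum(g.norm() for g in flat_grads)
    return total
```

In training, one would back-propagate this scalar through the learnable prompts and step the optimizer as usual (`optimizer.zero_grad(); aligned_gradient_objective(...).backward(); optimizer.step()`), with one loss per source or target domain in the single- or multi-source setting.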
Benchmarks
| Benchmark | Method | Accuracy (%) |
|---|---|---|
| Domain Adaptation on Office-Home | PGA (ViT-B/16) | 85.1 |
| Domain Adaptation on Office-Home | PGA (ViT-L/14) | 89.4 |
| Domain Adaptation on Office-Home | PGA (RN50) | 75.8 |
| Domain Adaptation on S2RDA-49 | PGA | 74.1 |
| Domain Adaptation on S2RDA-MS-39 | PGA | 38 |