Chitwan Saharia, William Chan, Huiwen Chang, Chris A. Lee, Jonathan Ho, Tim Salimans, David J. Fleet, Mohammad Norouzi

Abstract
This paper develops a unified framework for image-to-image translation based on conditional diffusion models and evaluates this framework on four challenging image-to-image translation tasks, namely colorization, inpainting, uncropping, and JPEG restoration. Our simple implementation of image-to-image diffusion models outperforms strong GAN and regression baselines on all tasks, without task-specific hyper-parameter tuning, architecture customization, or any auxiliary loss or sophisticated new techniques needed. We uncover the impact of an L2 vs. L1 loss in the denoising diffusion objective on sample diversity, and demonstrate the importance of self-attention in the neural architecture through empirical studies. Importantly, we advocate a unified evaluation protocol based on ImageNet, with human evaluation and sample quality scores (FID, Inception Score, Classification Accuracy of a pre-trained ResNet-50, and Perceptual Distance against original images). We expect this standardized evaluation protocol to play a role in advancing image-to-image translation research. Finally, we show that a generalist, multi-task diffusion model performs as well or better than task-specific specialist counterparts. Check out https://diffusion-palette.github.io for an overview of the results.
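The abstract contrasts an L2 and an L1 loss in the denoising diffusion objective without spelling the objective out. As a minimal sketch, assuming the common noise-prediction formulation (a network is asked to recover the Gaussian noise mixed into the target image, given the conditioning image and the noise level), one Monte Carlo sample of such a loss might look like the following. The function name `denoising_loss`, the callable `predict_noise`, and the noise-level parameter `gamma` are illustrative assumptions, not names taken from this page.

```python
import numpy as np


def denoising_loss(y0, x_cond, predict_noise, gamma, rng, p=2):
    """One Monte Carlo sample of a conditional denoising loss (illustrative).

    y0            : clean target image (array)
    x_cond        : conditioning image, e.g. a grayscale or masked input
    predict_noise : hypothetical callable (x_cond, y_noisy, gamma) -> noise estimate
    gamma         : assumed noise level in (0, 1); larger means less noise
    rng           : numpy random Generator used to draw the noise
    p             : 1 for an L1 loss, 2 for an L2 loss
    """
    # Draw Gaussian noise and mix it with the clean target.
    eps = rng.standard_normal(y0.shape)
    y_noisy = np.sqrt(gamma) * y0 + np.sqrt(1.0 - gamma) * eps

    # The model sees the conditioning image, the noisy target, and the noise level.
    residual = predict_noise(x_cond, y_noisy, gamma) - eps

    if p == 1:
        return float(np.mean(np.abs(residual)))  # L1: mean absolute error
    if p == 2:
        return float(np.mean(residual ** 2))     # L2: mean squared error
    raise ValueError("p must be 1 or 2")
```

In this sketch the only difference between the two variants is the norm applied to the residual, so switching `p` is enough to compare them under otherwise identical training settings.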
Benchmarks
| Benchmark | Methodology | Metrics |
|---|---|---|
| colorization-on-imagenet-ctest10k | Palette | FID: 3.4 |
| colorization-on-imagenet-val | Palette | FID-5K: 15.78 |
| image-inpainting-on-places2-val | Palette (20-30% free form) | FID: 11.7 PD: 35.0 |
| image-inpainting-on-places2-val | Palette (128×128 center mask) | FID: 11.9 PD: 57.3 |
| jpeg-decompression-on-imagenet | Palette (QF: 5) | CA: 64.2 FID-5K: 8.3 IS: 133.6 PD: 95.5 |
| jpeg-decompression-on-imagenet | Palette (QF: 10) | CA: 70.7 FID-5K: 5.4 IS: 180.5 PD: 58.3 |
| jpeg-decompression-on-imagenet | Palette (QF: 20) | CA: 73.5 FID-5K: 4.3 IS: 208.7 PD: 37.1 |
| jpeg-decompression-on-imagenet | Regression (QF: 5) | CA: 52.8 FID-5K: 29.0 IS: 73.9 PD: 155.4 |
| jpeg-decompression-on-imagenet | Regression (QF: 10) | CA: 63.5 FID-5K: 18.0 IS: 117.2 PD: 102.2 |
| jpeg-decompression-on-imagenet | Regression (QF: 20) | CA: 69.7 FID-5K: 11.5 IS: 158.7 PD: 65.4 |
| uncropping-on-places2-val | Palette | FID: 3.53 Fool rate: 39.9 PD: 103.3 |