Palette: Image-to-Image Diffusion Models

Saharia Chitwan ; Chan William ; Chang Huiwen ; Lee Chris A. ; Ho Jonathan ; Salimans Tim ; Fleet David J. ; Norouzi Mohammad

Abstract

This paper develops a unified framework for image-to-image translation based on conditional diffusion models and evaluates this framework on four challenging image-to-image translation tasks, namely colorization, inpainting, uncropping, and JPEG restoration. Our simple implementation of image-to-image diffusion models outperforms strong GAN and regression baselines on all tasks, without task-specific hyper-parameter tuning, architecture customization, or any auxiliary loss or sophisticated new techniques needed. We uncover the impact of an L2 vs. L1 loss in the denoising diffusion objective on sample diversity, and demonstrate the importance of self-attention in the neural architecture through empirical studies. Importantly, we advocate a unified evaluation protocol based on ImageNet, with human evaluation and sample quality scores (FID, Inception Score, Classification Accuracy of a pre-trained ResNet-50, and Perceptual Distance against original images). We expect this standardized evaluation protocol to play a role in advancing image-to-image translation research. Finally, we show that a generalist, multi-task diffusion model performs as well or better than task-specific specialist counterparts. Check out https://diffusion-palette.github.io for an overview of the results.
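The L2 vs. L1 choice mentioned in the abstract refers to the norm applied to the noise-prediction error in the denoising objective. The following is a minimal sketch of a single-sample estimate of that loss, not the authors' implementation. It assumes the common formulation in which a clean target y is mixed with Gaussian noise as sqrt(gamma) * y + sqrt(1 - gamma) * eps and a denoiser conditioned on the source image x predicts eps. All function names here (noisy_target, denoising_loss, oracle_denoiser_for, zero_denoiser) are hypothetical, and images are represented as flat lists of floats for brevity.

```python
import math
import random


def noisy_target(y, eps, gamma):
    """Mix the clean target y with noise eps at noise level gamma in [0, 1]."""
    a, b = math.sqrt(gamma), math.sqrt(1.0 - gamma)
    return [a * yi + b * ei for yi, ei in zip(y, eps)]


def denoising_loss(denoiser, x, y, gamma, p=2, rng=random):
    """Single-sample estimate of the L_p noise-prediction loss (p = 1 or 2)."""
    if p not in (1, 2):
        raise ValueError("p must be 1 or 2")
    eps = [rng.gauss(0.0, 1.0) for _ in y]      # noise the model should recover
    y_noisy = noisy_target(y, eps, gamma)
    eps_hat = denoiser(x, y_noisy, gamma)       # model prediction of eps
    return sum(abs(eh - e) ** p for eh, e in zip(eps_hat, eps)) / len(y)


def oracle_denoiser_for(y):
    """Hypothetical denoiser that knows y and inverts the mixing exactly."""
    def denoiser(x, y_noisy, gamma):
        a, b = math.sqrt(gamma), math.sqrt(1.0 - gamma)
        return [(yn - a * yi) / b for yn, yi in zip(y_noisy, y)]
    return denoiser


def zero_denoiser(x, y_noisy, gamma):
    """Hypothetical untrained baseline that always predicts zero noise."""
    return [0.0 for _ in y_noisy]


if __name__ == "__main__":
    x = [0.5, 0.5, 0.5, 0.5]    # toy conditioning signal (e.g. grayscale input)
    y = [0.1, 0.9, 0.4, 0.6]    # toy target image
    rng = random.Random(0)
    for p in (1, 2):
        loss = denoising_loss(zero_denoiser, x, y, gamma=0.5, p=p, rng=rng)
        print(f"L{p} loss of the zero baseline: {loss:.4f}")
```

Switching p between 1 and 2 changes only the penalty on the prediction error; in a real training loop the denoiser would be a neural network and the loss would be averaged over batches and noise levels.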

Code Repositories

omerb01/puq (PyTorch, mentioned in GitHub)
crosszamirski/guided-i2i (PyTorch, mentioned in GitHub)
kylelo/roofdiffusion (PyTorch, mentioned in GitHub)

Benchmarks

Benchmark                           Methodology                     Metrics
colorization-on-imagenet-ctest10k   Palette                         FID: 3.4
colorization-on-imagenet-val        Palette                         FID-5K: 15.78
image-inpainting-on-places2-val     Palette (20-30% free form)      FID: 11.7, PD: 35.0
image-inpainting-on-places2-val     Palette (128×128 center mask)   FID: 11.9, PD: 57.3
jpeg-decompression-on-imagenet      Palette (QF: 5)                 CA: 64.2, FID-5K: 8.3, IS: 133.6, PD: 95.5
jpeg-decompression-on-imagenet      Regression (QF: 5)              CA: 52.8, FID-5K: 29.0, IS: 73.9, PD: 155.4
jpeg-decompression-on-imagenet      Regression (QF: 20)             CA: 69.7, FID-5K: 11.5, IS: 158.7, PD: 65.4
jpeg-decompression-on-imagenet      Palette (QF: 10)                CA: 70.7, FID-5K: 5.4, IS: 180.5, PD: 58.3
jpeg-decompression-on-imagenet      Palette (QF: 20)                CA: 73.5, FID-5K: 4.3, IS: 208.7, PD: 37.1
jpeg-decompression-on-imagenet      Regression (QF: 10)             CA: 63.5, FID-5K: 18.0, IS: 117.2, PD: 102.2
uncropping-on-places2-val           Palette                         FID: 3.53, Fool rate: 39.9, PD: 103.3