Scalable Diffusion Models with Transformers

William Peebles, Saining Xie

Abstract

We explore a new class of diffusion models based on the transformer architecture. We train latent diffusion models of images, replacing the commonly-used U-Net backbone with a transformer that operates on latent patches. We analyze the scalability of our Diffusion Transformers (DiTs) through the lens of forward pass complexity as measured by Gflops. We find that DiTs with higher Gflops -- through increased transformer depth/width or increased number of input tokens -- consistently have lower FID. In addition to possessing good scalability properties, our largest DiT-XL/2 models outperform all prior diffusion models on the class-conditional ImageNet 512x512 and 256x256 benchmarks, achieving a state-of-the-art FID of 2.27 on the latter.
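To make the token-count lever concrete, below is a minimal, hypothetical PyTorch sketch of the patchify step described in the abstract. It is not the official facebookresearch/DiT code; the 32x32x4 latent shape (a 256x256 image encoded by an 8x-downsampling VAE) and the 1152-dimensional hidden size are assumptions matching commonly reported DiT-XL settings. The point it illustrates: halving the patch size quadruples the number of input tokens, and therefore the transformer's forward-pass Gflops, which is the "/2" in DiT-XL/2.

```python
# Minimal sketch (an assumption, not the official facebookresearch/DiT code) of the
# patchify step: a latent image is cut into non-overlapping patches, and each
# patch becomes one transformer token.
import torch
import torch.nn as nn

class PatchEmbed(nn.Module):
    def __init__(self, patch_size=2, in_channels=4, hidden_size=1152):
        super().__init__()
        # A convolution with kernel_size == stride is equivalent to
        # "split into p x p patches, then apply a shared linear projection".
        self.proj = nn.Conv2d(in_channels, hidden_size,
                              kernel_size=patch_size, stride=patch_size)

    def forward(self, x):
        # x: (B, C, H, W) latent from a VAE encoder
        x = self.proj(x)                     # (B, hidden, H/p, W/p)
        return x.flatten(2).transpose(1, 2)  # (B, num_tokens, hidden)

# Assumed latent: a 256x256 image encoded by an 8x-downsampling VAE -> 32x32x4.
latents = torch.randn(1, 4, 32, 32)
for p in (8, 4, 2):  # smaller patches -> more tokens -> more Gflops per forward pass
    tokens = PatchEmbed(patch_size=p)(latents)
    print(f"patch size {p}: {tokens.shape[1]} tokens")
# patch size 8: 16 tokens; patch size 4: 64 tokens; patch size 2: 256 tokens
```

The convolution-as-patchify formulation keeps the sketch short; an equivalent reshape-plus-linear-layer version would produce the same token sequence.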

Code Repositories

senmaoy/RAT-Diffusion (PyTorch, mentioned in GitHub)
facebookresearch/DiT (Official; PyTorch, mentioned in GitHub)
milmor/diffusion-transformer (PyTorch, mentioned in GitHub)
FineDiffusion/FineDiffusion (PyTorch, mentioned in GitHub)
nyu-systems/grendel-gs (PyTorch, mentioned in GitHub)
locuslab/get (PyTorch, mentioned in GitHub)
chuanyangjin/fast-dit (PyTorch, mentioned in GitHub)
hustvl/dig (PyTorch, mentioned in GitHub)

Benchmarks

Benchmark                              Methodology   Metrics
image-generation-on-imagenet-256x256   DiT-XL/2      FID: 2.27
image-generation-on-imagenet-512x512   DiT-XL/2      FID: 3.04; Inception score: 240.82
