Alleviating Distortion in Image Generation via Multi-Resolution Diffusion Models

Qihao Liu, Zhanpeng Zeng, Ju He, Qihang Yu, Xiaohui Shen, Liang-Chieh Chen

Abstract

This paper presents innovative enhancements to diffusion models by integrating a novel multi-resolution network and time-dependent layer normalization. Diffusion models have gained prominence for their effectiveness in high-fidelity image generation. While conventional approaches rely on convolutional U-Net architectures, recent Transformer-based designs have demonstrated superior performance and scalability. However, Transformer architectures, which tokenize input data (via "patchification"), face a trade-off between visual fidelity and computational complexity due to the quadratic nature of self-attention operations concerning token length. While larger patch sizes enable attention computation efficiency, they struggle to capture fine-grained visual details, leading to image distortions. To address this challenge, we propose augmenting the Diffusion model with the Multi-Resolution network (DiMR), a framework that refines features across multiple resolutions, progressively enhancing detail from low to high resolution. Additionally, we introduce Time-Dependent Layer Normalization (TD-LN), a parameter-efficient approach that incorporates time-dependent parameters into layer normalization to inject time information and achieve superior performance. Our method's efficacy is demonstrated on the class-conditional ImageNet generation benchmark, where DiMR-XL variants outperform prior diffusion models, setting new state-of-the-art FID scores of 1.70 on ImageNet 256 x 256 and 2.89 on ImageNet 512 x 512. Project page: https://qihao067.github.io/projects/DiMR
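
The two technical ideas in the abstract can be illustrated with a small, hedged sketch in plain Python. The first part only does arithmetic: it counts tokens for a given patch size and shows how the number of query-key pairs in full self-attention grows quadratically with that count. The 32 x 32 input grid is a hypothetical size chosen for illustration. The second part sketches one possible form of a time-dependent layer normalization: the abstract does not give the formula, so the sigmoid-gated blend of two parameter sets, and every name in the function (`gamma1`, `gamma2`, `beta1`, `beta2`, `w`, `b`), is an assumption and not the authors' implementation.

```python
import math


def num_tokens(input_size: int, patch_size: int) -> int:
    """Tokens from splitting a square input into non-overlapping square patches."""
    if input_size % patch_size != 0:
        raise ValueError("input_size must be divisible by patch_size")
    side = input_size // patch_size
    return side * side


def attention_pairs(n_tokens: int) -> int:
    """Query-key pairs scored by full self-attention (quadratic in token count)."""
    return n_tokens * n_tokens


def td_layer_norm(x, t, gamma1, gamma2, beta1, beta2, w, b, eps=1e-6):
    """Layer norm whose scale and shift depend on the timestep t.

    ASSUMPTION: scale and shift are blended from two learned parameter sets
    by a scalar gate s = sigmoid(w * t + b). This is an illustrative
    parameterization, not one confirmed by the abstract.
    """
    s = 1.0 / (1.0 + math.exp(-(w * t + b)))
    mean = sum(x) / len(x)
    var = sum((v - mean) ** 2 for v in x) / len(x)
    inv_std = 1.0 / math.sqrt(var + eps)
    out = []
    for i, v in enumerate(x):
        gamma = s * gamma1[i] + (1.0 - s) * gamma2[i]  # time-dependent scale
        beta = s * beta1[i] + (1.0 - s) * beta2[i]     # time-dependent shift
        out.append(gamma * (v - mean) * inv_std + beta)
    return out


if __name__ == "__main__":
    # Hypothetical 32 x 32 input grid: halving the patch size gives 4x the
    # tokens and 16x the attention pairs.
    for p in (4, 2):
        n = num_tokens(32, p)
        print(f"patch {p}: {n} tokens, {attention_pairs(n)} attention pairs")

    # Time-dependent normalization of a 3-element feature vector.
    y = td_layer_norm([1.0, 2.0, 3.0], t=0.5,
                      gamma1=[2.0] * 3, gamma2=[0.0] * 3,
                      beta1=[0.0] * 3, beta2=[0.0] * 3, w=0.0, b=0.0)
    print(y)
```

In this sketch the only time-specific parameters per layer are the two scale/shift sets plus the scalars `w` and `b`, which is one way to read "parameter-efficient"; the trade-off on the patch side is that a smaller patch preserves more detail at a steeply higher attention cost.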

Code Repositories

qihao067/DiMR (official, PyTorch)

Benchmarks

| Benchmark | Methodology | Metrics |
|---|---|---|
| image-generation-on-imagenet-256x256 | DiMR-G/2R | FID: 1.63 |
| image-generation-on-imagenet-256x256 | DiMR-XL/2R | FID: 1.70 |
| image-generation-on-imagenet-512x512 | DiMR-XL/3R | FID: 2.89 |
