5 months ago

LaCon: Late-Constraint Diffusion for Steerable Guided Image Synthesis

Liu Chang ; Li Rui ; Zhang Kaidong ; Luo Xin ; Liu Dong

Abstract

Diffusion models have demonstrated impressive abilities in generating photo-realistic and creative images. To offer more controllability for the generation process, existing studies, termed as early-constraint methods in this paper, leverage extra conditions and incorporate them into pre-trained diffusion models. Particularly, some of them adopt condition-specific modules to handle conditions separately, where they struggle to generalize across other conditions. Although follow-up studies present unified solutions to solve the generalization problem, they also require extra resources to implement, e.g., additional inputs or parameter optimization, where more flexible and efficient solutions are expected to perform steerable guided image synthesis. In this paper, we present an alternative paradigm, namely Late-Constraint Diffusion (LaCon), to simultaneously integrate various conditions into pre-trained diffusion models. Specifically, LaCon establishes an alignment between the external condition and the internal features of diffusion models, and utilizes the alignment to incorporate the target condition, guiding the sampling process to produce tailored results. Experimental results on COCO dataset illustrate the effectiveness and superior generalization capability of LaCon under various conditions and settings. Ablation studies investigate the functionalities of different components in LaCon, and illustrate its great potential to serve as an efficient solution to offer flexible controllability for diffusion models.
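
The abstract describes guiding sampling through an alignment between the external condition and the model's internal features. The following is only a toy sketch of that general idea, not the authors' implementation: the "features" and the "alignment" are small fixed linear maps, and all names (`features`, `predict_condition`, `guided_step`, `W`, `A`, `scale`) are hypothetical and do not come from the LCDG repository. It shows how the gradient of a condition-matching error could nudge a sample toward a target condition.

```python
# Toy illustration only: every name below is hypothetical, and the linear
# maps stand in for a frozen diffusion model and a learned alignment module.

def matvec(m, v):
    """Multiply matrix m (list of rows) by vector v."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]

def transpose(m):
    return [list(col) for col in zip(*m)]

def features(x, W):
    """Stand-in for internal features of a frozen diffusion model."""
    return matvec(W, x)

def predict_condition(f, A):
    """Stand-in for the alignment: maps features to condition space."""
    return matvec(A, f)

def condition_loss(x, W, A, target):
    """Squared error between the predicted and the target condition."""
    pred = predict_condition(features(x, W), A)
    return sum((p - t) ** 2 for p, t in zip(pred, target))

def guidance_grad(x, W, A, target):
    """Analytic gradient of condition_loss with respect to the sample x."""
    pred = predict_condition(features(x, W), A)
    residual = [2.0 * (p - t) for p, t in zip(pred, target)]
    grad_f = matvec(transpose(A), residual)   # back through the alignment
    return matvec(transpose(W), grad_f)       # back through the features

def guided_step(x, W, A, target, scale):
    """Move the sample against the gradient of the condition error."""
    g = guidance_grad(x, W, A, target)
    return [xi - scale * gi for xi, gi in zip(x, g)]

W = [[1.0, 0.5], [0.0, 1.0]]   # assumed toy feature extractor
A = [[1.0, -1.0]]              # assumed toy alignment map
target = [2.0]                 # assumed toy target condition
x = [0.0, 0.0]

losses = [condition_loss(x, W, A, target)]
for _ in range(20):
    x = guided_step(x, W, A, target, scale=0.1)
    losses.append(condition_loss(x, W, A, target))
```

In an actual sampler, a step like `guided_step` would be interleaved with the model's own denoising update at each timestep; here only the guidance term is shown, so the condition error in `losses` simply shrinks from one iteration to the next.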

Code Repositories

AlonzoLeeeooo/LCDG (official, PyTorch)

Benchmarks

| Benchmark | Methodology | Metrics |
| --- | --- | --- |
| conditional-text-to-image-synthesis-on-coco | SD using SDEdit | FID: 71.16 |
| conditional-text-to-image-synthesis-on-coco | SD using SDEdit (evaluated under color stroke) | CLIP Score: 0.2257; FID: 32.93 |
| conditional-text-to-image-synthesis-on-coco | SD using SDEdit (evaluated under image palette) | CLIP Score: 0.2138 |
| conditional-text-to-image-synthesis-on-coco | LCDG (Color, evaluated under image palette) | CLIP Score: 0.2580; FID: 20.61 |
| conditional-text-to-image-synthesis-on-coco | SD (text) | CLIP Score: 0.2673; FID: 27.99 |
| conditional-text-to-image-synthesis-on-coco | LCDG (Edge) | FID: 21.02 |
| conditional-text-to-image-synthesis-on-coco | LCDG | FID: 20.27 |
| conditional-text-to-image-synthesis-on-coco | T2I-Adapter (Sketch) | CLIP Score: 0.2580; FID: 21.72 |
| conditional-text-to-image-synthesis-on-coco | T2I-Adapter (Color, evaluated under image palette) | CLIP Score: 0.2613; FID: 26.54 |
| conditional-text-to-image-synthesis-on-coco | T2I-Adapter (Color, evaluated under color stroke) | FID: 30.84 |
| conditional-text-to-image-synthesis-on-coco | LCDG (Mask) | CLIP Score: 0.2617; FID: 20.94 |
| conditional-text-to-image-synthesis-on-coco | ControlNet (HED Edge) | CLIP Score: 0.2525; FID: 28.09 |
