
Abstract

Diffusion models have demonstrated impressive abilities in generating photo-realistic and creative images. To offer more controllability over the generation process, existing studies, termed early-constraint methods in this paper, leverage extra conditions and incorporate them into pre-trained diffusion models. In particular, some of them adopt condition-specific modules to handle conditions separately, and therefore struggle to generalize to other conditions. Although follow-up studies present unified solutions to the generalization problem, they also require extra resources to implement, e.g., additional inputs or parameter optimization, so more flexible and efficient solutions for steerable guided image synthesis are still needed. In this paper, we present an alternative paradigm, namely Late-Constraint Diffusion (LaCon), to simultaneously integrate various conditions into pre-trained diffusion models. Specifically, LaCon establishes an alignment between the external condition and the internal features of diffusion models, and utilizes the alignment to incorporate the target condition, guiding the sampling process to produce tailored results. Experimental results on the COCO dataset illustrate the effectiveness and superior generalization capability of LaCon under various conditions and settings. Ablation studies investigate the functionalities of different components in LaCon, and illustrate its great potential to serve as an efficient solution that offers flexible controllability for diffusion models.
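
To make the late-constraint idea concrete, the following is a minimal toy sketch, not the LaCon implementation. It assumes that the internal features and the feature-to-condition alignment are both fixed linear maps (`w_feat` and `w_align`, hypothetical names), so the gradient can be written by hand. In a real pipeline, the features would come from a pre-trained denoising network, the alignment would be learned, and the gradient would be obtained by automatic differentiation. `toy_denoise_step` is likewise a placeholder for an actual sampler update.

```python
import random


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


def transpose(matrix):
    """Return the transpose of a matrix given as a list of rows."""
    return [list(col) for col in zip(*matrix)]


def alignment_loss_and_grad(x, target, w_feat, w_align):
    """Squared error between predicted and target condition, and its gradient w.r.t. x."""
    features = matvec(w_feat, x)            # stand-in for internal diffusion features
    predicted = matvec(w_align, features)   # hypothetical feature-to-condition alignment
    residual = [p - c for p, c in zip(predicted, target)]
    loss = 0.5 * sum(r * r for r in residual)
    grad = matvec(transpose(w_feat), matvec(transpose(w_align), residual))
    return loss, grad


def safe_step_size(w_feat, w_align):
    """Conservative step: 1 / squared Frobenius norm of (w_align @ w_feat)."""
    product_cols = [matvec(w_align, col) for col in transpose(w_feat)]
    return 1.0 / sum(v * v for col in product_cols for v in col)


def guided_sampling(x, target, w_feat, w_align, denoise_step, num_steps=50):
    """Alternate a denoising update with a gradient step on the alignment loss."""
    step = safe_step_size(w_feat, w_align)
    for t in reversed(range(num_steps)):
        x = denoise_step(x, t)
        _, grad = alignment_loss_and_grad(x, target, w_feat, w_align)
        # Late constraint: nudge the sample toward the target condition.
        x = [xi - step * gi for xi, gi in zip(x, grad)]
    return x


def toy_denoise_step(x, t):
    """Placeholder for a real sampler update: shrink the latent slightly."""
    return [0.98 * xi for xi in x]


rng = random.Random(0)
w_feat = [[rng.gauss(0, 1) for _ in range(4)] for _ in range(8)]   # latent -> features
w_align = [[rng.gauss(0, 1) for _ in range(8)] for _ in range(3)]  # features -> condition
target = [rng.gauss(0, 1) for _ in range(3)]
x_init = [rng.gauss(0, 1) for _ in range(4)]

x_final = guided_sampling(x_init, target, w_feat, w_align, toy_denoise_step)
print(alignment_loss_and_grad(x_init, target, w_feat, w_align)[0])
print(alignment_loss_and_grad(x_final, target, w_feat, w_align)[0])
```

The point of the sketch is that the condition enters only through the guidance gradient at sampling time, so the stand-in model weights are never updated. The step size is deliberately conservative so that a single guidance step cannot increase the alignment loss in this linear toy setting; a real guidance scale would be tuned empirically.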
