Lvmin Zhang, Anyi Rao, Maneesh Agrawala

Abstract
We present ControlNet, a neural network architecture to add spatial conditioning controls to large, pretrained text-to-image diffusion models. ControlNet locks the production-ready large diffusion models and reuses their deep and robust encoding layers, pretrained with billions of images, as a strong backbone to learn a diverse set of conditional controls. The neural architecture is connected with "zero convolutions" (zero-initialized convolution layers) that progressively grow the parameters from zero and ensure that no harmful noise could affect the finetuning. We test various conditioning controls, e.g., edges, depth, segmentation, human pose, etc., with Stable Diffusion, using single or multiple conditions, with or without prompts. We show that the training of ControlNets is robust with small (<50k) and large (>1M) datasets. Extensive results show that ControlNet may facilitate wider applications to control image diffusion models.
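To make the "zero convolution" idea concrete, here is a minimal PyTorch sketch of how a zero-initialized 1x1 convolution can gate a trainable branch attached to a locked, pretrained block. This is an illustration under assumptions, not the authors' implementation: the names `zero_conv`, `ControlNetBlock`, `locked_block`, and `trainable_block` are hypothetical, and the real ControlNet wires these connections into Stable Diffusion's U-Net encoder.

```python
import torch
import torch.nn as nn


def zero_conv(channels: int) -> nn.Conv2d:
    # 1x1 convolution whose weights and bias start at zero; its parameters
    # "progressively grow from zero" during finetuning.
    conv = nn.Conv2d(channels, channels, kernel_size=1)
    nn.init.zeros_(conv.weight)
    nn.init.zeros_(conv.bias)
    return conv


class ControlNetBlock(nn.Module):
    """Illustrative wrapper: a frozen pretrained block plus a trainable copy
    conditioned on a control signal, joined by zero convolutions."""

    def __init__(self, locked_block: nn.Module, trainable_block: nn.Module, channels: int):
        super().__init__()
        self.locked_block = locked_block
        for p in self.locked_block.parameters():
            p.requires_grad_(False)  # lock the production-ready backbone
        self.trainable_block = trainable_block  # finetuned copy
        self.zero_in = zero_conv(channels)      # injects the conditioning input
        self.zero_out = zero_conv(channels)     # gates the trainable branch's output

    def forward(self, x: torch.Tensor, condition: torch.Tensor) -> torch.Tensor:
        locked_out = self.locked_block(x)
        control = self.trainable_block(x + self.zero_in(condition))
        # At initialization both zero convs output zeros, so the result equals the
        # pretrained block's output and no harmful noise perturbs finetuning.
        return locked_out + self.zero_out(control)
```

Because the zero convolutions output exactly zero before training, the wrapped model starts out identical to the pretrained backbone, and the conditional control path only takes effect as its parameters grow during finetuning.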
Benchmarks
| Benchmark | Method | AP |
|---|---|---|
| layout-to-image-generation-on-layoutbench-1 | ControlNet | 9.2 |
| layout-to-image-generation-on-layoutbench-2 | ControlNet | 15.3 |
| layout-to-image-generation-on-layoutbench-3 | ControlNet | 10.8 |
| layout-to-image-generation-on-layoutbench-4 | ControlNet | 6.4 |