StackGAN: Text to Photo-realistic Image Synthesis with Stacked Generative Adversarial Networks

Han Zhang; Tao Xu; Hongsheng Li; Shaoting Zhang; Xiaogang Wang; Xiaolei Huang; Dimitris Metaxas

Abstract

Synthesizing high-quality images from text descriptions is a challenging problem in computer vision and has many practical applications. Samples generated by existing text-to-image approaches can roughly reflect the meaning of the given descriptions, but they lack necessary details and vivid object parts. In this paper, we propose Stacked Generative Adversarial Networks (StackGAN) to generate 256×256 photo-realistic images conditioned on text descriptions. We decompose the hard problem into more manageable sub-problems through a sketch-refinement process. The Stage-I GAN sketches the primitive shape and colors of the object based on the given text description, yielding low-resolution Stage-I images. The Stage-II GAN takes the Stage-I results and the text descriptions as inputs and generates high-resolution images with photo-realistic details. Through this refinement process, it can rectify defects in the Stage-I results and add compelling details. To improve the diversity of the synthesized images and stabilize the training of the conditional GAN, we introduce a novel Conditioning Augmentation technique that encourages smoothness in the latent conditioning manifold. Extensive experiments and comparisons with state-of-the-art methods on benchmark datasets demonstrate that the proposed method achieves significant improvements in generating photo-realistic images conditioned on text descriptions.
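The abstract names Conditioning Augmentation without describing its mechanics. The following is a minimal sketch, assuming the usual formulation in which the conditioning vector is sampled from a diagonal Gaussian whose mean and log-variance are predicted from the text embedding, with a KL penalty toward the standard normal as the smoothness regularizer. The function names and the toy values of `mu` and `log_var` are hypothetical; in a real model they would come from a learned layer applied to the text embedding.

```python
import math
import random


def sample_conditioning(mu, log_var, rng):
    """Reparameterized sample c = mu + sigma * eps, with eps ~ N(0, I)."""
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0)
            for m, lv in zip(mu, log_var)]


def kl_to_standard_normal(mu, log_var):
    """KL( N(mu, diag(exp(log_var))) || N(0, I) ), summed over dimensions."""
    return 0.5 * sum(math.exp(lv) + m * m - 1.0 - lv
                     for m, lv in zip(mu, log_var))


# Hypothetical stand-ins for the outputs of a learned layer on a text embedding.
rng = random.Random(0)
mu = [0.5, -1.0, 0.0]
log_var = [0.0, -2.0, 0.1]

# The same description yields a different conditioning vector on every draw,
# which is the source of the added diversity.
c1 = sample_conditioning(mu, log_var, rng)
c2 = sample_conditioning(mu, log_var, rng)

# Regularization term that would be added to the generator loss.
kl_penalty = kl_to_standard_normal(mu, log_var)
```

Under these assumptions, each sampled vector would be concatenated with a noise vector and fed to the Stage-I generator, and the Stage-II generator would receive the Stage-I image together with a conditioning vector derived from the same text.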

Code Repositories

savya08/Text-to-Image
Vishal-V/GSoC-TensorFlow (TensorFlow)
suryar510/StackGAN (TensorFlow)
hg1722/fashionista (TensorFlow)
hanzhanggit/StackGAN (official, TensorFlow)
Vishal-V/StackGAN (TensorFlow)
CorneliusHsiao/FoodMethodGAN (PyTorch)
uditss03/txt2img
alinstein/Modify-image-by-text (PyTorch)
hanzhanggit/StackGAN-Pytorch (PyTorch)
vdopp234/Text2Image (TensorFlow)
sdai654416/Joint-GAN (TensorFlow)
charchit7/QuickDraw-App (PyTorch)
Vishal-V/GSoC (TensorFlow)

Benchmarks

Benchmark                                 Methodology   Metrics
text-to-image-generation-on-cub           StackGAN      Inception score: 3.7
text-to-image-generation-on-oxford-102    StackGAN      Inception score: 3.2
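
The page reports Inception scores but does not say how they are computed. As a rough sketch, assuming the standard definition IS = exp(E_x[KL(p(y|x) || p(y))]), the score can be computed from per-image class probabilities as below. The function name and the toy probability tables are hypothetical; in practice the probabilities come from a pretrained Inception classifier run on generated images, and the result is often averaged over several splits.

```python
import math


def inception_score(probs):
    """Compute exp(mean_x KL(p(y|x) || p(y))) from class probabilities.

    probs: list of per-image class-probability lists, each summing to 1.
    """
    n = len(probs)
    k = len(probs[0])
    # Marginal class distribution p(y), averaged over all images.
    marginal = [sum(p[j] for p in probs) / n for j in range(k)]
    total_kl = 0.0
    for p in probs:
        # Terms with p(y|x) = 0 contribute nothing to the KL divergence.
        total_kl += sum(pj * math.log(pj / marginal[j])
                        for j, pj in enumerate(p) if pj > 0.0)
    return math.exp(total_kl / n)


# Toy inputs (hypothetical): confident and diverse predictions score high,
# identical uninformative predictions score 1.
confident_diverse = [[1.0, 0.0], [0.0, 1.0]]
uninformative = [[0.5, 0.5], [0.5, 0.5]]
score_high = inception_score(confident_diverse)
score_low = inception_score(uninformative)
```

With k classes the score ranges from 1 to k, so higher values indicate images that are both confidently classified and spread across classes.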
