5 months ago

TediGAN: Text-Guided Diverse Face Image Generation and Manipulation

Xia Weihao ; Yang Yujiu ; Xue Jing-Hao ; Wu Baoyuan

Abstract

In this work, we propose TediGAN, a novel framework for multi-modal image generation and manipulation with textual descriptions. The proposed method consists of three components: StyleGAN inversion module, visual-linguistic similarity learning, and instance-level optimization. The inversion module maps real images to the latent space of a well-trained StyleGAN. The visual-linguistic similarity learns the text-image matching by mapping the image and text into a common embedding space. The instance-level optimization is for identity preservation in manipulation. Our model can produce diverse and high-quality images with an unprecedented resolution at 1024. Using a control mechanism based on style-mixing, our TediGAN inherently supports image synthesis with multi-modal inputs, such as sketches or semantic labels, with or without instance guidance. To facilitate text-guided multi-modal synthesis, we propose the Multi-Modal CelebA-HQ, a large-scale dataset consisting of real face images and corresponding semantic segmentation map, sketch, and textual descriptions. Extensive experiments on the introduced dataset demonstrate the superior performance of our proposed method. Code and data are available at https://github.com/weihaox/TediGAN.
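
The style-mixing control mentioned in the abstract can be illustrated with a small sketch. This is not the authors' implementation: it assumes a latent code is stored as one style vector per generator layer (StyleGAN at 1024 resolution has 18 such layers), and the function name `style_mix`, the toy vector width, and the choice of which layers to swap are all hypothetical, chosen only for illustration.

```python
from typing import List, Sequence

# Assumption: a latent code is a list with one style vector per generator layer.
Latent = List[List[float]]


def style_mix(image_code: Latent, text_code: Latent,
              text_layers: Sequence[int]) -> Latent:
    """Return a new code that takes the layers listed in `text_layers`
    from `text_code` and every other layer from `image_code`."""
    if len(image_code) != len(text_code):
        raise ValueError("both codes must have the same number of layers")
    chosen = set(text_layers)
    out_of_range = [i for i in chosen if not 0 <= i < len(image_code)]
    if out_of_range:
        raise IndexError(f"layer indices out of range: {sorted(out_of_range)}")
    # Copy each row so the inputs are never modified in place.
    return [
        list(text_code[i]) if i in chosen else list(image_code[i])
        for i in range(len(image_code))
    ]


# Toy data: 18 layers, with a width of 4 instead of a real style dimension.
NUM_LAYERS, DIM = 18, 4
image_code = [[0.0] * DIM for _ in range(NUM_LAYERS)]  # stands in for an inverted image
text_code = [[1.0] * DIM for _ in range(NUM_LAYERS)]   # stands in for an embedded text

# Hypothetical split: keep layers 0-7 from the image, take layers 8-17 from the text.
mixed = style_mix(image_code, text_code, text_layers=range(8, NUM_LAYERS))
```

In this sketch, layers left out of `text_layers` keep the image's styles and the listed layers follow the text, so changing the list changes which attributes the text is allowed to affect.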

Code Repositories

weihaox/Multi-Modal-CelebA-HQ-Dataset (PyTorch)
iigroup/mm-celeba-hq-dataset (PyTorch)
weihaox/TediGAN (Official, PyTorch)
IIGROUP/TediGAN (PyTorch)
IIGROUP/Multi-Modal-CelebA-HQ-Dataset (PyTorch)

Benchmarks

Benchmark: text-to-image-generation-on-multi-modal
Methodology: TediGAN-A
Metrics: Acc: 18.4; FID: 106.37; LPIPS: 0.456; Real: 22.6
