Multimodal Conditional Image Synthesis with Product-of-Experts GANs

Xun Huang, Arun Mallya, Ting-Chun Wang, Ming-Yu Liu

Abstract

Existing conditional image synthesis frameworks generate images based on user inputs in a single modality, such as text, segmentation, sketch, or style reference. They are often unable to leverage multimodal user inputs when available, which reduces their practicality. To address this limitation, we propose the Product-of-Experts Generative Adversarial Networks (PoE-GAN) framework, which can synthesize images conditioned on multiple input modalities or any subset of them, even the empty set. PoE-GAN consists of a product-of-experts generator and a multimodal multiscale projection discriminator. Through our carefully designed training scheme, PoE-GAN learns to synthesize images with high quality and diversity. Besides advancing the state of the art in multimodal conditional image synthesis, PoE-GAN also outperforms the best existing unimodal conditional image synthesis approaches when tested in the unimodal setting. The project website is available at https://deepimagination.github.io/PoE-GAN.
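
The idea of conditioning on "any subset of modalities, even the empty set" can be illustrated with a generic product of Gaussian experts. The sketch below is only an illustration under stated assumptions, not the PoE-GAN implementation: the abstract does not say that the experts are Gaussian, and the names `product_of_gaussians`, `fuse_modalities`, and the standard-normal prior expert are hypothetical. Each present modality contributes one diagonal Gaussian over a latent vector, a prior expert is always included, and missing modalities are simply left out of the product.

```python
from typing import Dict, List, Optional, Tuple

# A diagonal Gaussian expert: (mean, variance), one entry per latent dimension.
Gaussian = Tuple[List[float], List[float]]


def product_of_gaussians(experts: List[Gaussian]) -> Gaussian:
    """Multiply diagonal Gaussian densities and return the resulting Gaussian.

    Precisions (1 / variance) add up, and the mean is the
    precision-weighted average of the expert means.
    """
    dim = len(experts[0][0])
    mean: List[float] = []
    var: List[float] = []
    for d in range(dim):
        precision = sum(1.0 / v[d] for _, v in experts)
        weighted_sum = sum(m[d] / v[d] for m, v in experts)
        var.append(1.0 / precision)
        mean.append(weighted_sum / precision)
    return mean, var


def fuse_modalities(inputs: Dict[str, Optional[Gaussian]], dim: int) -> Gaussian:
    """Fuse whichever modality experts are present (None means missing).

    Assumption for this sketch: a standard-normal prior expert is always
    part of the product, so the empty set of inputs yields the prior.
    """
    prior: Gaussian = ([0.0] * dim, [1.0] * dim)
    present = [g for g in inputs.values() if g is not None]
    return product_of_gaussians([prior] + present)


# Hypothetical 1-D experts for two modalities.
text_expert: Gaussian = ([2.0], [1.0])
sketch_expert: Gaussian = ([4.0], [0.5])

no_input = fuse_modalities({"text": None, "sketch": None}, dim=1)
text_only = fuse_modalities({"text": text_expert, "sketch": None}, dim=1)
both = fuse_modalities({"text": text_expert, "sketch": sketch_expert}, dim=1)
```

With no inputs the fused distribution equals the prior; each additional expert adds precision, so the fused variance shrinks as more modalities are supplied, and the expert with the smaller variance pulls the mean more strongly.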

Benchmarks

Benchmark | Methodology | Metrics
image-to-image-translation-on-coco-stuff | PoE-GAN | FID: 15.8
