MoVQ: Modulating Quantized Vectors for High-Fidelity Image Generation

Chuanxia Zheng Long Tung Vuong Jianfei Cai Dinh Phung

Abstract

Although two-stage Vector Quantized (VQ) generative models can synthesize high-fidelity, high-resolution images, their quantization operator encodes similar patches within an image into the same index, which produces repeated artifacts in similar adjacent regions with existing decoder architectures. To address this issue, we propose incorporating spatially conditional normalization to modulate the quantized vectors, inserting spatially variant information into the embedded index maps and encouraging the decoder to generate more photorealistic images. Moreover, we use multichannel quantization to increase the recombination capability of the discrete codes without increasing the cost of the model and codebook. Additionally, to generate discrete tokens in the second stage, we adopt a Masked Generative Image Transformer (MaskGIT) to learn an underlying prior distribution in the compressed latent space, which is much faster than a conventional autoregressive model. Experiments on two benchmark datasets demonstrate that our proposed modulated VQGAN greatly improves reconstructed image quality and provides high-fidelity image generation.
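
The two ideas named in the abstract, multichannel quantization and modulation of normalized decoder activations by the quantized vectors, can be illustrated with a small pure-Python sketch. This is not the authors' implementation: the function names, the toy codebook, and the linear maps `gamma_w` and `beta_w` (stand-ins for whatever learned layers produce the scale and shift) are all assumptions made for illustration.

```python
import math

def nearest_index(vec, codebook):
    """Index of the codebook entry closest to vec (squared L2 distance)."""
    def sq_dist(entry):
        return sum((v - c) ** 2 for v, c in zip(vec, entry))
    return min(range(len(codebook)), key=lambda k: sq_dist(codebook[k]))

def quantize_multichannel(vec, codebook, num_chunks):
    """Split vec into equal chunks and quantize each with the shared codebook."""
    size = len(vec) // num_chunks
    return [nearest_index(vec[i * size:(i + 1) * size], codebook)
            for i in range(num_chunks)]

def modulate(activations, position_codes, gamma_w, beta_w, eps=1e-5):
    """Normalize activations across positions, then scale and shift each
    position with values computed from that position's quantized vector.
    gamma_w and beta_w are hypothetical linear maps, not the paper's layers."""
    mean = sum(activations) / len(activations)
    var = sum((a - mean) ** 2 for a in activations) / len(activations)
    std = math.sqrt(var + eps)
    out = []
    for a, z in zip(activations, position_codes):
        gamma = 1.0 + sum(w * x for w, x in zip(gamma_w, z))
        beta = sum(w * x for w, x in zip(beta_w, z))
        out.append(gamma * (a - mean) / std + beta)
    return out

# Toy codebook with two 2-dimensional entries (assumed values).
codebook = [[0.0, 0.0], [1.0, 1.0]]

# A 4-dimensional feature split into 2 chunks yields 2 indices per position.
indices = quantize_multichannel([0.1, -0.1, 0.9, 1.2], codebook, num_chunks=2)

# With 2 chunks, a codebook of size K can express K**2 distinct combinations
# per position without adding any codebook entries.
combinations = len(codebook) ** 2

# Two spatial positions, each carrying one quantized vector.
position_codes = [codebook[0], codebook[1]]
modulated = modulate([1.0, 3.0], position_codes,
                     gamma_w=[0.5, 0.5], beta_w=[0.1, 0.1])
```

In this sketch the codebook size stays fixed while the number of expressible code combinations per position grows with the number of chunks, which is one way to read the abstract's claim about recombination capability. The modulation step leaves the discrete indices unchanged and only alters how decoder activations are scaled and shifted at each position.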

Code Repositories

ai-forever/movqgan (PyTorch; mentioned in GitHub)

Benchmarks

Benchmark: image-reconstruction-on-imagenet
Methodology: Mo-VQGAN (16x16x4)
Metrics:
FID: 1.12
LPIPS: 0.113
PSNR: 22.42
SSIM: 0.673
