3 months ago

Open-MAGVIT2: An Open-Source Project Toward Democratizing Auto-regressive Visual Generation

Zhuoyan Luo Fengyuan Shi Yixiao Ge Yujiu Yang Limin Wang Ying Shan


Abstract

We present Open-MAGVIT2, a family of auto-regressive image generation models ranging from 300M to 1.5B parameters. The Open-MAGVIT2 project produces an open-source replication of Google's MAGVIT-v2 tokenizer, a tokenizer with a super-large codebook (i.e., $2^{18}$ codes), and achieves state-of-the-art reconstruction performance (1.17 rFID) on ImageNet $256 \times 256$. Furthermore, we explore its application in plain auto-regressive models and validate scalability properties. To assist auto-regressive models in predicting with a super-large vocabulary, we factorize it into two sub-vocabularies of different sizes by asymmetric token factorization, and further introduce "next sub-token prediction" to enhance sub-token interaction for better generation quality. We release all models and code to foster innovation and creativity in the field of auto-regressive visual generation.
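
The asymmetric token factorization mentioned in the abstract can be illustrated with a minimal sketch. It assumes each code is an integer index in a $2^{18}$-entry codebook and that the index is split by bit position into two sub-tokens of unequal width. The 12-bit/6-bit split and the function names `factorize` and `merge` are illustrative assumptions, not details stated in the abstract, and the sketch does not cover how the model predicts the second sub-token conditioned on the first.

```python
from typing import Tuple

TOTAL_BITS = 18  # codebook size 2**18, as stated in the abstract
LOW_BITS = 6     # assumed width of the smaller sub-vocabulary (hypothetical)


def factorize(token: int, low_bits: int = LOW_BITS,
              total_bits: int = TOTAL_BITS) -> Tuple[int, int]:
    """Split one codebook index into (high, low) sub-token indices."""
    if not 0 <= token < (1 << total_bits):
        raise ValueError("token index out of codebook range")
    low = token & ((1 << low_bits) - 1)  # keep the lowest `low_bits` bits
    high = token >> low_bits             # remaining upper bits
    return high, low


def merge(high: int, low: int, low_bits: int = LOW_BITS) -> int:
    """Inverse of factorize: rebuild the original codebook index."""
    return (high << low_bits) | low


if __name__ == "__main__":
    high, low = factorize(123456)
    print(high, low, merge(high, low))
```

Under this assumed split, a model would predict over two sub-vocabularies of 4,096 and 64 entries rather than a single vocabulary of 262,144 entries, which is the motivation the abstract gives for factorizing.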

Code Repositories

tencentarc/seed-voken (Official, PyTorch, mentioned in GitHub)
tencentarc/open-magvit2 (PyTorch, mentioned in GitHub)

Benchmarks

| Benchmark | Methodology | Metrics |
| --- | --- | --- |
| image-generation-on-imagenet-256x256 | Open-MAGVIT2-XL | FID: 2.33 |
| image-reconstruction-on-imagenet | Open-Magvit2 (16x16) | FID: 1.17, PSNR: 21.90 |
