4 months ago

OmniGen2: Exploration to Advanced Multimodal Generation

Abstract

In this work, we introduce OmniGen2, a versatile and open-source generative model designed to provide a unified solution for diverse generation tasks, including text-to-image, image editing, and in-context generation. Unlike OmniGen v1, OmniGen2 features two distinct decoding pathways for text and image modalities, utilizing unshared parameters and a decoupled image tokenizer. This design enables OmniGen2 to build upon existing multimodal understanding models without the need to re-adapt VAE inputs, thereby preserving the original text generation capabilities. To facilitate the training of OmniGen2, we developed comprehensive data construction pipelines, encompassing image editing and in-context generation data. Additionally, we introduce a reflection mechanism tailored for image generation tasks and curate a dedicated reflection dataset based on OmniGen2. Despite its relatively modest parameter size, OmniGen2 achieves competitive results on multiple task benchmarks, including text-to-image and image editing. To further evaluate in-context generation, also referred to as subject-driven tasks, we introduce a new benchmark named OmniContext. OmniGen2 achieves state-of-the-art performance among open-source models in terms of consistency. We will release our models, training code, datasets, and data construction pipeline to support future research in this field. Project Page: https://vectorspacelab.github.io/OmniGen2; GitHub Link: https://github.com/VectorSpaceLab/OmniGen2
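
The idea of decoding pathways with unshared parameters can be illustrated with a toy sketch. This is not the OmniGen2 implementation: the class names (`Pathway`, `DualPathwayModel`), the `update_image_pathway` method, and the linear "decoder" are all hypothetical, chosen only to show why changing image-side parameters leaves text-side outputs untouched when nothing is shared between the two.

```python
import random
from dataclasses import dataclass


@dataclass
class Pathway:
    """A hypothetical decoding pathway that owns its own parameters."""
    name: str
    weights: list

    def decode(self, hidden):
        # Toy linear readout standing in for a real decoder.
        return sum(w * h for w, h in zip(self.weights, hidden))


class DualPathwayModel:
    """Toy model with separate (unshared) text and image pathways."""

    def __init__(self, dim, seed=0):
        rng = random.Random(seed)
        # Each pathway gets its own independently created weight list.
        self.text = Pathway("text", [rng.uniform(-1, 1) for _ in range(dim)])
        self.image = Pathway("image", [rng.uniform(-1, 1) for _ in range(dim)])

    def decode(self, hidden, modality):
        # Route the shared hidden state to the pathway for this modality.
        if modality == "text":
            return self.text.decode(hidden)
        if modality == "image":
            return self.image.decode(hidden)
        raise ValueError(f"unknown modality: {modality}")

    def update_image_pathway(self, delta):
        # Stand-in for training the image side only; text weights are not touched.
        self.image.weights = [w + delta for w in self.image.weights]
```

In this sketch, calling `update_image_pathway` changes the result of `decode(hidden, "image")` but not `decode(hidden, "text")`, which mirrors, in a very simplified form, the abstract's claim that the original text generation capabilities are preserved.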

Code Repositories

vectorspacelab/omnigen2
Official
pytorch
Mentioned in GitHub
