NUWA-Infinity: Autoregressive over Autoregressive Generation for Infinite Visual Synthesis

Wu Chenfei ; Liang Jian ; Hu Xiaowei ; Gan Zhe ; Wang Jianfeng ; Wang Lijuan ; Liu Zicheng ; Fang Yuejian ; Duan Nan

Abstract

In this paper, we present NUWA-Infinity, a generative model for infinite visual synthesis, which is defined as the task of generating arbitrarily-sized high-resolution images or long-duration videos. An autoregressive over autoregressive generation mechanism is proposed to deal with this variable-size generation task, where a global patch-level autoregressive model considers the dependencies between patches, and a local token-level autoregressive model considers dependencies between visual tokens within each patch. A Nearby Context Pool (NCP) is introduced to cache related patches already generated as the context for the current patch being generated, which can significantly save computation costs without sacrificing patch-level dependency modeling. An Arbitrary Direction Controller (ADC) is used to decide suitable generation orders for different visual synthesis tasks and learn order-aware positional embeddings. Compared to DALL-E, Imagen and Parti, NUWA-Infinity can generate high-resolution images with arbitrary sizes and additionally supports long-duration video generation. Compared to NUWA, which also covers images and videos, NUWA-Infinity has superior visual synthesis capabilities in terms of resolution and variable-size generation. The GitHub link is https://github.com/microsoft/NUWA. The homepage link is https://nuwa-infinity.microsoft.com.
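
The nested generation loop described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the token sampler is a uniform random placeholder rather than a trained model, the generation order is a fixed raster scan standing in for the ADC, and the context pool is a plain dictionary with a simple eviction rule standing in for the NCP. All function names and parameters (`generation_order`, `nearby_context`, `sample_token`, `generate`, `radius`, `tokens_per_patch`) are hypothetical and chosen only for illustration.

```python
import random
from typing import Dict, List, Tuple

Patch = List[int]        # a patch is a flat list of visual token ids
Coord = Tuple[int, int]  # (row, col) of a patch in the patch grid


def generation_order(rows: int, cols: int) -> List[Coord]:
    """Stand-in for the ADC: a fixed raster order (left to right, top to bottom)."""
    return [(r, c) for r in range(rows) for c in range(cols)]


def nearby_context(pool: Dict[Coord, Patch], pos: Coord, radius: int = 1) -> List[Patch]:
    """Stand-in for an NCP lookup: cached patches within `radius` of `pos`."""
    r, c = pos
    return [patch for (pr, pc), patch in sorted(pool.items())
            if abs(pr - r) <= radius and abs(pc - c) <= radius]


def sample_token(context: List[Patch], prefix: Patch,
                 rng: random.Random, vocab: int) -> int:
    """Placeholder for the token-level model: ignores its inputs, samples uniformly."""
    return rng.randrange(vocab)


def generate(rows: int, cols: int, tokens_per_patch: int = 4, vocab: int = 16,
             radius: int = 1, seed: int = 0) -> Tuple[Dict[Coord, Patch], int]:
    """Generate a rows x cols grid of patches; return it with the peak pool size."""
    rng = random.Random(seed)
    pool: Dict[Coord, Patch] = {}    # cache of recently generated patches
    canvas: Dict[Coord, Patch] = {}  # every patch generated so far
    max_pool = 0
    for pos in generation_order(rows, cols):   # global, patch-level loop
        context = nearby_context(pool, pos, radius)
        patch: Patch = []
        for _ in range(tokens_per_patch):      # local, token-level loop
            patch.append(sample_token(context, patch, rng, vocab))
        canvas[pos] = patch
        pool[pos] = patch
        # In raster order, rows more than `radius` above the current row
        # can never be near a future patch, so they are evicted.
        for key in [k for k in pool if k[0] < pos[0] - radius]:
            del pool[key]
        max_pool = max(max_pool, len(pool))
    return canvas, max_pool


if __name__ == "__main__":
    canvas, max_pool = generate(rows=5, cols=3)
    print(len(canvas), "patches generated; peak pool size:", max_pool)
```

In this sketch the pool never holds more than `(radius + 1) * cols` patches, so the context cost per patch stays bounded as the canvas grows, which is the kind of saving the abstract attributes to the NCP.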

Code Repositories

microsoft/nuwa (official; mentioned in GitHub)

Benchmarks

Benchmark | Methodology | Metrics
image-outpainting-on-lhqc | NUWA-Infinity w/o text | Block-FID (Right Extend): 6.43; Block-FID (Down Extend): 11.47; Block-FID (Left Extend): 6.71; Block-FID (Up Extend): 8.03
image-outpainting-on-lhqc | NUWA-Infinity | Block-FID (Right Extend): 6.45; Block-FID (Down Extend): 9.84; Block-FID (Left Extend): 6.72; Block-FID (Up Extend): 7.43
text-to-image-generation-on-lhqc | NUWA-Infinity | Block-FID: 9.71
