NUWA-Infinity: Autoregressive over Autoregressive Generation for Infinite Visual Synthesis
Wu Chenfei; Liang Jian; Hu Xiaowei; Gan Zhe; Wang Jianfeng; Wang Lijuan; Liu Zicheng; Fang Yuejian; Duan Nan

Abstract
In this paper, we present NUWA-Infinity, a generative model for infinite visual synthesis, which is defined as the task of generating arbitrarily-sized high-resolution images or long-duration videos. An autoregressive over autoregressive generation mechanism is proposed to deal with this variable-size generation task, where a global patch-level autoregressive model considers the dependencies between patches, and a local token-level autoregressive model considers dependencies between visual tokens within each patch. A Nearby Context Pool (NCP) is introduced to cache related patches already generated as the context for the current patch being generated, which can significantly save computation costs without sacrificing patch-level dependency modeling. An Arbitrary Direction Controller (ADC) is used to decide suitable generation orders for different visual synthesis tasks and to learn order-aware positional embeddings. Compared to DALL-E, Imagen and Parti, NUWA-Infinity can generate high-resolution images of arbitrary size and additionally supports long-duration video generation. Compared to NUWA, which also covers images and videos, NUWA-Infinity has superior visual synthesis capabilities in terms of resolution and variable-size generation. The GitHub link is https://github.com/microsoft/NUWA. The homepage link is https://nuwa-infinity.microsoft.com.
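The two-level generation loop described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the function names (`nearby_patches`, `sample_token`, `generate`), the raster-scan order, the `extent` parameter, and the random sampler standing in for a learned model are all assumptions made for illustration. It only shows the control flow: an outer loop over patches, an inner loop over tokens within a patch, and a context restricted to nearby patches that were already generated.

```python
import random


def nearby_patches(pos, generated, extent=1):
    """Return already-generated patches within `extent` grid steps of `pos`.

    Hypothetical stand-in for the Nearby Context Pool: the context is limited
    to close neighbours instead of every previously generated patch.
    """
    r, c = pos
    return {
        (rr, cc): generated[(rr, cc)]
        for rr in range(r - extent, r + extent + 1)
        for cc in range(c - extent, c + extent + 1)
        if (rr, cc) in generated
    }


def sample_token(context_tokens, prefix, rng, vocab_size=16):
    """Placeholder for a learned token-level model: returns a random token id."""
    return rng.randrange(vocab_size)


def generate(rows, cols, tokens_per_patch=4, extent=1, seed=0):
    rng = random.Random(seed)
    generated = {}      # finished patches, keyed by (row, col)
    max_context = 0     # largest number of context patches seen
    # Global, patch-level autoregression. Raster-scan order is assumed here;
    # the paper's ADC chooses the order per task.
    for r in range(rows):
        for c in range(cols):
            context = nearby_patches((r, c), generated, extent)
            max_context = max(max_context, len(context))
            context_tokens = [t for p in sorted(context) for t in context[p]]
            patch = []
            # Local, token-level autoregression within the current patch.
            for _ in range(tokens_per_patch):
                patch.append(sample_token(context_tokens, patch, rng))
            generated[(r, c)] = patch
    return generated, max_context


grid, max_context = generate(rows=3, cols=5)
print(len(grid), max_context)
```

In this sketch the context size stays bounded by the neighbourhood rather than growing with the canvas, which is the intuition behind the computation savings the abstract attributes to the NCP.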
Benchmarks
| Benchmark | Methodology | Metrics |
|---|---|---|
| image-outpainting-on-lhqc | NUWA-Infinity w/o text | Block-FID (Right Extend): 6.43; Block-FID (Down Extend): 11.47; Block-FID (Left Extend): 6.71; Block-FID (Up Extend): 8.03 |
| image-outpainting-on-lhqc | NUWA-Infinity | Block-FID (Right Extend): 6.45; Block-FID (Down Extend): 9.84; Block-FID (Left Extend): 6.72; Block-FID (Up Extend): 7.43 |
| text-to-image-generation-on-lhqc | NUWA-Infinity | Block-FID: 9.71 |