Yume: An Interactive World Generation Model

Xiaofeng Mao, Shaoheng Lin, Zhen Li, Chuanhao Li, Wenshuo Peng, Tong He, Jiangmiao Pang, Mingmin Chi, Yu Qiao, Kaipeng Zhang

Abstract

Yume aims to use images, text, or videos to create an interactive, realistic, and dynamic world, which allows exploration and control using peripheral devices or neural signals. In this report, we present a preview version of Yume, which creates a dynamic world from an input image and allows exploration of the world using keyboard actions. To achieve this high-fidelity and interactive video world generation, we introduce a well-designed framework, which consists of four main components, including camera motion quantization, video generation architecture, advanced sampler, and model acceleration. First, we quantize camera motions for stable training and user-friendly interaction using keyboard inputs. Then, we introduce the Masked Video Diffusion Transformer (MVDT) with a memory module for infinite video generation in an autoregressive manner. After that, training-free Anti-Artifact Mechanism (AAM) and Time Travel Sampling based on Stochastic Differential Equations (TTS-SDE) are introduced to the sampler for better visual quality and more precise control. Moreover, we investigate model acceleration by synergistic optimization of adversarial distillation and caching mechanisms. We use the high-quality world exploration dataset Sekai to train Yume, and it achieves remarkable results in diverse scenes and applications. All data, codebase, and model weights are available on https://github.com/stdstu12/YUME. Yume will update monthly to achieve its original goal. Project page: https://stdstu12.github.io/YUME-Project/.
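
To make the idea of camera motion quantization more concrete, the following is a minimal sketch of how continuous per-step camera motion might be mapped to discrete keyboard-style actions. It is not taken from the Yume codebase: the function names, the thresholds `MOVE_EPS` and `TURN_EPS`, the key labels, and the axis convention (x to the right, z forward, positive yaw turning right) are all assumptions chosen for illustration, and the abstract does not specify the actual scheme.

```python
import math

# Hypothetical thresholds; the abstract does not state actual values.
MOVE_EPS = 0.05  # translation magnitude below this counts as standing still
TURN_EPS = 0.02  # yaw/pitch change (radians) below this counts as no turn

# Eight 45-degree sectors, starting at "forward" and going clockwise.
# The WASD labels are an assumed mapping, not the paper's vocabulary.
SECTORS = ["W", "W+D", "D", "S+D", "S", "S+A", "A", "W+A"]


def quantize_translation(dx: float, dz: float) -> str:
    """Map a camera-frame translation (dx right, dz forward) to a key label."""
    if math.hypot(dx, dz) < MOVE_EPS:
        return "stay"
    # Angle from the forward axis: 0 = forward, +90 degrees = right.
    angle = math.degrees(math.atan2(dx, dz))
    # Shift by half a sector so each key direction sits at a sector centre.
    index = int(((angle + 22.5) % 360) // 45) % 8
    return SECTORS[index]


def quantize_rotation(dyaw: float, dpitch: float) -> str:
    """Map a yaw/pitch change to the dominant discrete view rotation."""
    if max(abs(dyaw), abs(dpitch)) < TURN_EPS:
        return "none"
    if abs(dyaw) >= abs(dpitch):
        return "right" if dyaw > 0 else "left"
    return "up" if dpitch > 0 else "down"


if __name__ == "__main__":
    # A step that moves mostly forward while turning slightly to the right.
    print(quantize_translation(0.01, 0.30), quantize_rotation(0.05, 0.0))
```

Under these assumptions, each training clip could be annotated with a short sequence of such labels, and the same small vocabulary could be driven directly by key presses at inference time, which is the kind of user-friendly interaction the abstract describes.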
