2 months ago

GenCompositor: Generative Video Compositing with Diffusion Transformer

Shuzhou Yang, Xiaoyu Li, Xiaodong Cun, Guangzhi Wang, Lingen Li, Ying Shan, Jian Zhang

Abstract

Video compositing combines live-action footage to create video productions, serving as a crucial technique in video creation and film production. Traditional pipelines require intensive labor and expert collaboration, resulting in lengthy production cycles and high manpower costs. To address this issue, we automate this process with generative models, a task we call generative video compositing. This new task strives to adaptively inject the identity and motion information of a foreground video into the target video in an interactive manner, allowing users to customize the size, motion trajectory, and other attributes of the dynamic elements added to the final video. Specifically, we designed a novel Diffusion Transformer (DiT) pipeline based on its intrinsic properties. To maintain consistency of the target video before and after editing, we revised a lightweight DiT-based background preservation branch with masked token injection. To inherit dynamic elements from other sources, we propose a DiT fusion block using full self-attention, along with a simple yet effective foreground augmentation for training. In addition, to fuse background and foreground videos with different layouts based on user control, we developed a novel position embedding, named Extended Rotary Position Embedding (ERoPE). Finally, we curated a dataset comprising 61K sets of videos for our new task, called VideoComp. This dataset includes complete dynamic elements and high-quality target videos. Experiments demonstrate that our method effectively realizes generative video compositing, outperforming existing possible solutions in fidelity and consistency.
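
The abstract does not specify how ERoPE is computed, so the following is only a rough illustration of the general idea it points to: giving two token streams with different layouts position indices that do not collide. It is a minimal sketch that applies a standard rotary position embedding and, as an assumption, places foreground tokens at indices continuing after the background ones. The helper names `rope_rotate`, `extended_positions`, and `embed`, and the choice to extend along a single 1-D axis, are hypothetical and not taken from the paper.

```python
import math


def rope_rotate(vec, position, base=10000.0):
    """Apply a standard rotary position embedding to one token vector.

    Each consecutive feature pair (vec[i], vec[i + 1]) is rotated by the
    angle position * base ** (-i / dim), where i is the even feature index.
    """
    dim = len(vec)
    assert dim % 2 == 0, "feature dimension must be even"
    out = []
    for i in range(0, dim, 2):
        angle = position * base ** (-i / dim)
        c, s = math.cos(angle), math.sin(angle)
        out.append(vec[i] * c - vec[i + 1] * s)
        out.append(vec[i] * s + vec[i + 1] * c)
    return out


def extended_positions(n_background, n_foreground):
    """Hypothetical indexing: foreground indices continue after background ones,
    so the two streams never share a position."""
    bg = list(range(n_background))
    fg = list(range(n_background, n_background + n_foreground))
    return bg, fg


def embed(background, foreground):
    """Rotate both token streams and concatenate them for joint self-attention."""
    bg_pos, fg_pos = extended_positions(len(background), len(foreground))
    rotated = [rope_rotate(v, p) for v, p in zip(background, bg_pos)]
    rotated += [rope_rotate(v, p) for v, p in zip(foreground, fg_pos)]
    return rotated


# Toy example: identical token content in both streams.
background = [[1.0, 0.0, 0.5, 0.5]] * 3
foreground = [[1.0, 0.0, 0.5, 0.5]] * 2
tokens = embed(background, foreground)
```

In this toy setup, a background token and a foreground token with identical content still receive different embeddings because their position indices differ, which is what lets attention tell the two sources apart. The actual ERoPE formulation for video tokens may differ.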
