5 hours ago

Donghao Zhou Guisheng Liu Hao Yang Jiatong Li Jingyu Lin Xiaohu Huang Yichen Liu Xin Gao Cunjian Chen Shilei Wen

Abstract

In this work, we study Human-Object Interaction Video Generation (HOIVG), which aims to synthesize high-quality human-object interaction videos conditioned on text, reference images, audio, and pose. This task holds significant practical value for automating content creation in real-world applications, such as e-commerce demonstrations, short video production, and interactive entertainment. However, existing approaches fail to accommodate all these requisite conditions. We present OmniShow, an end-to-end framework tailored for this practical yet challenging task, capable of harmonizing multimodal conditions and delivering industry-grade performance. To overcome the trade-off between controllability and quality, we introduce Unified Channel-wise Conditioning for efficient image and pose injection, and Gated Local-Context Attention to ensure precise audio-visual synchronization. To effectively address data scarcity, we develop a Decoupled-Then-Joint Training strategy that leverages a multi-stage training process with model merging to efficiently harness heterogeneous sub-task datasets. Furthermore, to fill the evaluation gap in this field, we establish HOIVG-Bench, a dedicated and comprehensive benchmark for HOIVG. Extensive experiments demonstrate that OmniShow achieves overall state-of-the-art performance across various multimodal conditioning settings, setting a solid standard for the emerging HOIVG task.

OmniShow: Unifying Multimodal Conditions for Human-Object Interaction Video Generation
