OmniHuman-1.5: Instilling an Active Mind in Avatars via Cognitive Simulation

Jianwen Jiang, Weihong Zeng, Zerong Zheng, Jiaqi Yang, Chao Liang, Wang Liao, Han Liang, Yuan Zhang, Mingyuan Gao

Abstract

Existing video avatar models can produce fluid human animations, yet they struggle to move beyond mere physical likeness to capture a character's authentic essence. Their motions typically synchronize with low-level cues like audio rhythm, lacking a deeper semantic understanding of emotion, intent, or context. To bridge this gap, we propose a framework designed to generate character animations that are not only physically plausible but also semantically coherent and expressive. Our model, OmniHuman-1.5, is built upon two key technical contributions. First, we leverage Multimodal Large Language Models to synthesize a structured textual representation of conditions that provides high-level semantic guidance. This guidance steers our motion generator beyond simplistic rhythmic synchronization, enabling the production of actions that are contextually and emotionally resonant. Second, to ensure the effective fusion of these multimodal inputs and mitigate inter-modality conflicts, we introduce a specialized Multimodal DiT architecture with a novel Pseudo Last Frame design. The synergy of these components allows our model to accurately interpret the joint semantics of audio, images, and text, thereby generating motions that are deeply coherent with the character, scene, and linguistic content. Extensive experiments demonstrate that our model achieves leading performance across a comprehensive set of metrics, including lip-sync accuracy, video quality, motion naturalness, and semantic consistency with textual prompts. Furthermore, our approach shows remarkable extensibility to complex scenarios, such as those involving multi-person and non-human subjects.

Homepage: https://omnihuman-lab.github.io/v1_5/