4 months ago

Phantom-Data: Towards a General Subject-Consistent Video Generation Dataset

Abstract

Subject-to-video generation has witnessed substantial progress in recent years. However, existing models still face significant challenges in faithfully following textual instructions. This limitation, commonly known as the copy-paste problem, arises from the widely used in-pair training paradigm. This approach inherently entangles subject identity with background and contextual attributes by sampling reference images from the same scene as the target video. To address this issue, we introduce Phantom-Data, the first general-purpose cross-pair subject-to-video consistency dataset, containing approximately one million identity-consistent pairs across diverse categories. Our dataset is constructed via a three-stage pipeline: (1) a general and input-aligned subject detection module, (2) large-scale cross-context subject retrieval from more than 53 million videos and 3 billion images, and (3) prior-guided identity verification to ensure visual consistency under contextual variation. Comprehensive experiments show that training with Phantom-Data significantly improves prompt alignment and visual quality while preserving identity consistency on par with in-pair baselines.
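
To make the difference between in-pair and cross-pair sampling concrete, here is a toy sketch of the retrieval and verification stages. It is not the authors' implementation. The `Subject` record, the `build_cross_pairs` function, cosine similarity as the identity measure, and the `min_sim` threshold are all hypothetical choices made for illustration. Detection (stage 1) is assumed to have already produced the subject records.

```python
from dataclasses import dataclass
from math import sqrt


@dataclass(frozen=True)
class Subject:
    """A detected subject crop (hypothetical record, not from the paper)."""
    source_id: str    # ID of the video or image the crop came from
    embedding: tuple  # assumed identity feature vector


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = sqrt(sum(x * x for x in a)) * sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def build_cross_pairs(targets, gallery, min_sim=0.9):
    """Pair each target subject with a reference from a different source.

    min_sim is an assumed threshold standing in for identity verification.
    """
    pairs = []
    for target in targets:
        # Retrieval sketch: exclude the target's own source, so the
        # reference never shares the target's scene (cross-pair, not in-pair).
        candidates = [g for g in gallery if g.source_id != target.source_id]
        if not candidates:
            continue
        best = max(candidates, key=lambda g: cosine(target.embedding, g.embedding))
        # Verification sketch: keep the pair only if identity similarity is high.
        if cosine(target.embedding, best.embedding) >= min_sim:
            pairs.append((target, best))
    return pairs


targets = [Subject("video_a", (1.0, 0.0))]
gallery = [
    Subject("video_a", (1.0, 0.0)),   # same scene: excluded from retrieval
    Subject("image_b", (0.99, 0.1)),  # different context, similar identity
    Subject("image_c", (0.0, 1.0)),   # different context, different identity
]
pairs = build_cross_pairs(targets, gallery)
print([(t.source_id, r.source_id) for t, r in pairs])
```

In this sketch the same-scene crop is never chosen as a reference even though it is the closest match. That is the property the abstract attributes to cross-pair data: the reference shares the subject's identity but not its background.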
