5 months ago

Video Generation

Computer Vision

Shen Sang, Tiancheng Zhi, Tianpei Gu, Jing Liu, Linjie Luo

Abstract

We present Lynx, a high-fidelity model for personalized video synthesis from a single input image. Built on an open-source Diffusion Transformer (DiT) foundation model, Lynx introduces two lightweight adapters to ensure identity fidelity. The ID-adapter employs a Perceiver Resampler to convert ArcFace-derived facial embeddings into compact identity tokens for conditioning, while the Ref-adapter integrates dense VAE features from a frozen reference pathway, injecting fine-grained details across all transformer layers through cross-attention. These modules collectively enable robust identity preservation while maintaining temporal coherence and visual realism. Through evaluation on a curated benchmark of 40 subjects and 20 unbiased prompts, which yielded 800 test cases, Lynx has demonstrated superior face resemblance, competitive prompt following, and strong video quality, thereby advancing the state of personalized video generation.
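To make the resampling idea in the abstract concrete, the following is a minimal, illustrative sketch of the cross-attention step at the core of a Perceiver-Resampler-style module: a small set of latent queries attends over input features and returns a fixed number of tokens. It is not the Lynx implementation. The names `cross_attention`, `latent_queries`, and `face_features`, the shapes, and the random values are all assumptions made for illustration, and the learned projections, multiple heads, and feed-forward layers of a real module are omitted.

```python
import math
import random


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def softmax(row):
    """Numerically stable softmax over one list of scores."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(queries, context):
    """Single-head cross-attention without learned projections (a simplification).

    Each query row attends over all context rows and returns a weighted
    average of them, so the output has one row per query.
    """
    d = len(queries[0])
    context_t = [list(col) for col in zip(*context)]  # transpose of context
    scores = matmul(queries, context_t)               # shape: n_queries x n_context
    weights = [softmax([s / math.sqrt(d) for s in row]) for row in scores]
    return matmul(weights, context), weights


def random_matrix(rows, cols, rng):
    """Hypothetical stand-in for learned or extracted features."""
    return [[rng.gauss(0.0, 1.0) for _ in range(cols)] for _ in range(rows)]


rng = random.Random(0)
dim = 8                                        # assumed feature width
face_features = random_matrix(5, dim, rng)     # stand-in for face-embedding features
latent_queries = random_matrix(4, dim, rng)    # stand-in for learned latent queries
identity_tokens, attn = cross_attention(latent_queries, face_features)
```

In this sketch the number of output tokens is fixed by the number of latent queries (4 here) regardless of how many input feature rows are supplied, which is the property that makes a resampler useful for producing compact conditioning tokens.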

