ImagerySearch: Adaptive Test-Time Search for Video Generation Beyond Semantic Dependency Constraints
Meiqi Wu, Jiashu Zhu, Xiaokun Feng, Chubin Chen, Chen Zhu, Bingze Song, Fangyuan Mao, Jiahong Wu, Xiangxiang Chu, Kaiqi Huang

Abstract
Video generation models have achieved remarkable progress, particularly excelling in realistic scenarios; however, their performance degrades notably in imaginative scenarios. These prompts often involve rarely co-occurring concepts with long-distance semantic relationships, falling outside training distributions. Existing methods typically apply test-time scaling to improve video quality, but their fixed search spaces and static reward designs limit adaptability to imaginative scenarios. To fill this gap, we propose ImagerySearch, a prompt-guided adaptive test-time search strategy that dynamically adjusts both the inference search space and the reward function according to semantic relationships in the prompt. This enables more coherent and visually plausible videos in challenging imaginative settings. To evaluate progress in this direction, we introduce LDT-Bench, the first dedicated benchmark for long-distance semantic prompts, consisting of 2,839 diverse concept pairs and an automated protocol for assessing creative generation capabilities. Extensive experiments show that ImagerySearch consistently outperforms strong video generation baselines and existing test-time scaling approaches on LDT-Bench, and achieves competitive improvements on VBench, demonstrating its effectiveness across diverse prompt types. We will release LDT-Bench and code to facilitate future research on imaginative video generation.
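The sketch below is a minimal, hypothetical illustration of the adaptive idea described in the abstract: a semantic-distance signal computed from the prompt's key concepts widens the candidate search space and re-weights the reward for prompts with rarely co-occurring concepts. All names (semantic_distance, adaptive_search, the weighting scheme) and the placeholder generation/reward stubs are assumptions for illustration, not the released ImagerySearch implementation.

```python
# Hedged sketch of prompt-adaptive test-time search (not the authors' code).
import math
import random
from itertools import combinations


def semantic_distance(concept_a: str, concept_b: str) -> float:
    """Stand-in for an embedding-based distance in [0, 1].
    A real system would use text-encoder embeddings; here each string is
    hashed into a pseudo-random unit vector so the example runs standalone."""
    va = [random.Random(concept_a).gauss(0, 1) for _ in range(64)]
    vb = [random.Random(concept_b).gauss(0, 1) for _ in range(64)]
    cos = sum(x * y for x, y in zip(va, vb)) / (
        math.sqrt(sum(x * x for x in va)) * math.sqrt(sum(x * x for x in vb))
    )
    return (1.0 - cos) / 2.0  # map cosine similarity to a [0, 1] distance


def prompt_distance(concepts: list[str]) -> float:
    """Largest pairwise semantic distance among the prompt's key concepts."""
    return max(semantic_distance(a, b) for a, b in combinations(concepts, 2))


def adaptive_search(concepts: list[str], base_candidates: int = 4):
    """Grow the candidate pool and shift the reward toward cross-concept
    coherence as the prompt's semantic distance increases (an assumed policy)."""
    d = prompt_distance(concepts)
    num_candidates = max(base_candidates, int(base_candidates * (1 + 4 * d)))
    coherence_weight = d        # long-distance prompts: reward concept binding
    fidelity_weight = 1.0 - d   # in-distribution prompts: reward visual fidelity

    best_score, best_video = -math.inf, None
    for seed in range(num_candidates):
        video = {"seed": seed}                    # placeholder for a generated clip
        fidelity = random.Random(seed).random()   # placeholder reward signals
        coherence = random.Random(seed + 1).random()
        score = fidelity_weight * fidelity + coherence_weight * coherence
        if score > best_score:
            best_score, best_video = score, video
    return best_video, best_score, num_candidates


if __name__ == "__main__":
    video, score, n = adaptive_search(["glass octopus", "snow-covered volcano"])
    print(f"searched {n} candidates, best score {score:.3f}")
```

In this reading, the prompt itself sets the compute and reward trade-off: semantically distant concept pairs trigger a wider search and a reward tilted toward binding the concepts coherently, while ordinary prompts keep a small search space and a fidelity-dominated reward.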