STream3R: Scalable Sequential 3D Reconstruction with Causal Transformer
Yushi Lan, Yihang Luo, Fangzhou Hong, Shangchen Zhou, Honghua Chen, Zhaoyang Lyu, Shuai Yang, Bo Dai, Chen Change Loy, Xingang Pan

Abstract
We present STream3R, a novel approach to 3D reconstruction that reformulates pointmap prediction as a decoder-only Transformer problem. Existing state-of-the-art methods for multi-view reconstruction either depend on expensive global optimization or rely on simplistic memory mechanisms that scale poorly with sequence length. In contrast, STream3R introduces a streaming framework that processes image sequences efficiently using causal attention, inspired by advances in modern language modeling. By learning geometric priors from large-scale 3D datasets, STream3R generalizes well to diverse and challenging scenarios, including dynamic scenes where traditional methods often fail. Extensive experiments show that our method consistently outperforms prior work across both static and dynamic scene benchmarks. Moreover, STream3R is inherently compatible with LLM-style training infrastructure, enabling efficient large-scale pretraining and fine-tuning for various downstream 3D tasks. Our results underscore the potential of causal Transformer models for online 3D perception, paving the way for real-time 3D understanding in streaming environments. More details can be found on our project page: https://nirvanalan.github.io/projects/stream3r.
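As a rough illustration of the streaming idea described above, the following minimal NumPy sketch shows how causal attention with a key/value cache lets each incoming frame attend to all earlier frames without reprocessing them. It is not the authors' implementation: the class `StreamingCausalAttention`, the single-head layout, the random projection weights, and the choice of frame-level (block-causal) masking are all assumptions made for illustration.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def attend(q, k, v, mask=None):
    # Scaled dot-product attention; mask[i, j] is True where query i may see key j.
    scores = q @ k.T / np.sqrt(q.shape[-1])
    if mask is not None:
        scores = np.where(mask, scores, -np.inf)
    return softmax(scores) @ v

def block_causal_mask(num_frames, tokens_per_frame):
    # Assumed masking rule: a token sees every token of its own and earlier frames.
    frame_id = np.repeat(np.arange(num_frames), tokens_per_frame)
    return frame_id[None, :] <= frame_id[:, None]

class StreamingCausalAttention:
    """Hypothetical single-head layer that caches keys/values of past frames."""

    def __init__(self, dim, seed=0):
        rng = np.random.default_rng(seed)
        # Random projections stand in for learned weights.
        self.wq, self.wk, self.wv = (
            rng.standard_normal((dim, dim)) / np.sqrt(dim) for _ in range(3)
        )
        self.k_cache = np.zeros((0, dim))
        self.v_cache = np.zeros((0, dim))

    def step(self, frame_tokens):
        # Append the new frame's keys/values, then attend over the whole cache.
        q = frame_tokens @ self.wq
        self.k_cache = np.concatenate([self.k_cache, frame_tokens @ self.wk])
        self.v_cache = np.concatenate([self.v_cache, frame_tokens @ self.wv])
        return attend(q, self.k_cache, self.v_cache)

# Usage: stream 3 frames of 4 tokens each (feature dimension 8).
rng = np.random.default_rng(1)
frames = rng.standard_normal((3, 4, 8))
layer = StreamingCausalAttention(dim=8)
streamed = np.concatenate([layer.step(f) for f in frames])

# Reference: process the full sequence at once with a block-causal mask.
ref = StreamingCausalAttention(dim=8)
x = frames.reshape(-1, 8)
full = attend(x @ ref.wq, x @ ref.wk, x @ ref.wv, block_causal_mask(3, 4))
```

Under these assumptions the streamed outputs match the full-sequence block-causal result, which is why past frames never need to be recomputed as new ones arrive. The cache still grows with the number of frames seen.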