3 months ago

π^3: Scalable Permutation-Equivariant Visual Geometry Learning

Yifan Wang, Jianjun Zhou, Haoyi Zhu, Wenzheng Chang, Yang Zhou, Zizun Li, Junyi Chen, Jiangmiao Pang, Chunhua Shen, Tong He

Abstract

We introduce π^3, a feed-forward neural network that offers a novel approach to visual geometry reconstruction, breaking the reliance on a conventional fixed reference view. Previous methods often anchor their reconstructions to a designated viewpoint, an inductive bias that can lead to instability and failures if the reference is suboptimal. In contrast, π^3 employs a fully permutation-equivariant architecture to predict affine-invariant camera poses and scale-invariant local point maps without any reference frames. This design makes our model inherently robust to input ordering and highly scalable. These advantages enable our simple and bias-free approach to achieve state-of-the-art performance on a wide range of tasks, including camera pose estimation, monocular/video depth estimation, and dense point map reconstruction. Code and models are publicly available.
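To make the term "permutation-equivariant" concrete, the following is a minimal sketch and not the paper's architecture. It assumes a toy single-head self-attention layer over one token per view; the names `toy_view_attention`, `reference_bias`, and `is_equivariant` are hypothetical. Without any order-dependent embedding, reordering the input views simply reorders the outputs. Adding an embedding to the first view, as a stand-in for a fixed reference view, breaks that property.

```python
import numpy as np


def toy_view_attention(tokens, w_q, w_k, w_v, reference_bias=None):
    """Toy single-head self-attention across views (one token per view).

    If reference_bias is given, it is added to the first token only,
    mimicking a model that treats the first view as a special reference.
    """
    if reference_bias is not None:
        tokens = tokens.copy()
        tokens[0] = tokens[0] + reference_bias
    q, k, v = tokens @ w_q, tokens @ w_k, tokens @ w_v
    scores = q @ k.T / np.sqrt(q.shape[-1])
    scores = scores - scores.max(axis=-1, keepdims=True)  # numerical stability
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=-1, keepdims=True)
    return weights @ v


def is_equivariant(use_reference, num_views=5, dim=8, seed=0):
    """Check whether f(tokens[perm]) equals f(tokens)[perm] for a cyclic shift."""
    rng = np.random.default_rng(seed)
    tokens = rng.normal(size=(num_views, dim))
    w_q, w_k, w_v = (rng.normal(size=(dim, dim)) for _ in range(3))
    bias = rng.normal(size=dim) if use_reference else None
    perm = np.roll(np.arange(num_views), 1)  # deterministic reordering of views
    out = toy_view_attention(tokens, w_q, w_k, w_v, bias)
    out_perm = toy_view_attention(tokens[perm], w_q, w_k, w_v, bias)
    return bool(np.allclose(out_perm, out[perm]))


if __name__ == "__main__":
    print("no reference view:", is_equivariant(use_reference=False))
    print("with reference view:", is_equivariant(use_reference=True))
```

In this toy setting, the first check holds because attention treats its inputs as an unordered set, while the second fails because the output now depends on which view happens to come first.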
