5 months ago

Long-Range Grouping Transformer for Multi-View 3D Reconstruction

Yang Liying ; Zhu Zhenwei ; Lin Xuxin ; Nong Jian ; Liang Yanyan

Abstract

Nowadays, transformer networks have demonstrated superior performance in many computer vision tasks. In a multi-view 3D reconstruction algorithm following this paradigm, self-attention has to process intricate image tokens carrying massive information when given a large number of input views. This curse of information content makes model learning extremely difficult. To alleviate this problem, recent methods either compress the number of tokens representing each view or discard the attention operations between tokens from different views. Both choices degrade performance. Therefore, we propose long-range grouping attention (LGA) based on the divide-and-conquer principle. Tokens from all views are grouped for separate attention operations. The tokens in each group are sampled from all views and can provide a macro representation of the view they reside in. The richness of feature learning is guaranteed by the diversity among different groups. An effective and efficient encoder can be established which connects inter-view features using LGA and extracts intra-view features using the standard self-attention layer. Moreover, a novel progressive upsampling decoder is also designed for voxel generation at relatively high resolution. Building on the above, we construct a powerful transformer-based network, called LRGT. Experimental results on ShapeNet verify that our method achieves SOTA accuracy in multi-view reconstruction. Code will be available at https://github.com/LiyingCV/Long-Range-Grouping-Transformer.
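
The grouping idea in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the abstract does not specify how tokens are sampled into groups, so the strided sampling below is an assumption, as are the function names `group_tokens`, `ungroup_tokens`, and `grouped_attention`. The learned query, key, and value projections of a real attention layer are also omitted, so the sketch only shows how tokens from every view end up in each group and how attention is then computed per group.

```python
import numpy as np


def softmax(x, axis=-1):
    # Numerically stable softmax.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def group_tokens(tokens, num_groups):
    """Split (V, N, D) tokens into (G, V * N // G, D) groups.

    Assumed sampling rule: group g takes token positions g, g + G,
    g + 2G, ... from EVERY view, so each group spans all views.
    """
    V, N, D = tokens.shape
    assert N % num_groups == 0, "N must be divisible by num_groups"
    t = tokens.reshape(V, N // num_groups, num_groups, D)
    t = t.transpose(2, 0, 1, 3)  # (G, V, N // G, D)
    return t.reshape(num_groups, V * (N // num_groups), D)


def ungroup_tokens(grouped, num_views):
    """Inverse of group_tokens: (G, M, D) back to (V, N, D)."""
    G, M, D = grouped.shape
    per_view = M // num_views
    t = grouped.reshape(G, num_views, per_view, D)
    t = t.transpose(1, 2, 0, 3)  # (V, N // G, G, D)
    return t.reshape(num_views, per_view * G, D)


def grouped_attention(tokens, num_groups):
    """Self-attention computed separately inside each group.

    Simplification: queries, keys, and values are the tokens themselves
    (no learned projections, single head).
    """
    V, N, D = tokens.shape
    g = group_tokens(tokens, num_groups)             # (G, M, D)
    scores = g @ g.transpose(0, 2, 1) / np.sqrt(D)   # (G, M, M)
    out = softmax(scores, axis=-1) @ g               # (G, M, D)
    return ungroup_tokens(out, V)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    views, tokens_per_view, dim, groups = 3, 8, 4, 4
    x = rng.normal(size=(views, tokens_per_view, dim))
    y = grouped_attention(x, groups)
    print(y.shape)
```

Under these assumptions, each group attends over V * N / G tokens rather than all V * N, so the attention score matrices hold G * (V * N / G)^2 entries in total instead of (V * N)^2, a reduction by a factor of G, while every group still contains tokens from all views.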

Benchmarks

Benchmark | Methodology | Metrics
3d-object-reconstruction-on-data3dr2n2 | LRGT | 3DIoU: 0.696
