5 months ago

Aligned Novel View Image and Geometry Synthesis via Cross-modal Attention Instillation

Min-Seop Kwak, Junho Kim, Sangdoo Yun, Dongyoon Han, Taekyoung Kim, Seungryong Kim, Jin-Hwa Kim

Abstract

We introduce a diffusion-based framework that performs aligned novel view image and geometry generation via a warping-and-inpainting methodology. Unlike prior methods that require dense posed images or pose-embedded generative models limited to in-domain views, our method leverages off-the-shelf geometry predictors to predict partial geometries viewed from reference images, and formulates novel-view synthesis as an inpainting task for both image and geometry. To ensure accurate alignment between generated images and geometry, we propose cross-modal attention instillation, where attention maps from the image diffusion branch are injected into a parallel geometry diffusion branch during both training and inference. This multi-task approach achieves synergistic effects, facilitating geometrically robust image synthesis as well as well-defined geometry prediction. We further introduce proximity-based mesh conditioning to integrate depth and normal cues, interpolating between points of the point cloud and preventing erroneously predicted geometry from influencing the generation process. Empirically, our method achieves high-fidelity extrapolative view synthesis on both image and geometry across a range of unseen scenes, delivers competitive reconstruction quality under interpolation settings, and produces geometrically aligned colored point clouds for comprehensive 3D completion. The project page is available at https://cvlab-kaist.github.io/MoAI.
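The idea of injecting image-branch attention maps into a geometry branch can be illustrated with a toy sketch. This is not the authors' implementation: the single-head layout, the token and feature sizes, the function names, and the choice to have the geometry branch fully reuse the image map (rather than blend it with its own) are all assumptions made for illustration.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def attention_map(q, k):
    # Scaled dot-product attention weights, shape (n_query, n_key).
    return softmax(q @ k.T / np.sqrt(q.shape[-1]), axis=-1)

def image_branch_attention(q_img, k_img, v_img):
    # The image branch computes its own attention map and also returns it.
    attn = attention_map(q_img, k_img)
    return attn @ v_img, attn

def geometry_branch_attention(v_geo, attn_from_image):
    # Assumption: the geometry branch aggregates its own values using the
    # attention map taken from the image branch, so both modalities
    # attend to the same spatial locations.
    return attn_from_image @ v_geo

# Hypothetical sizes: 6 spatial tokens, 4 feature channels.
rng = np.random.default_rng(0)
n_tokens, dim = 6, 4
q_img = rng.normal(size=(n_tokens, dim))
k_img = rng.normal(size=(n_tokens, dim))
v_img = rng.normal(size=(n_tokens, dim))
v_geo = rng.normal(size=(n_tokens, dim))

img_out, attn = image_branch_attention(q_img, k_img, v_img)
geo_out = geometry_branch_attention(v_geo, attn)
```

Because both outputs are weighted sums under the same map, each geometry token draws on the same source locations as the corresponding image token, which is one plausible reading of how sharing attention encourages image-geometry alignment.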
