3 months ago

INFERNO: Inferring Object-Centric 3D Scene Representations without Supervision

Aaron Courville, Nicolas Ballas, Lluis Castrejon

Abstract

We propose INFERNO, a method to infer object-centric representations of visual scenes without relying on annotations. Our method learns to decompose a scene into multiple objects, each object having a structured representation that disentangles its shape, appearance and 3D pose. To impose this structure we rely on recent advances in neural 3D rendering. Each object representation defines a localized neural radiance field that is used to generate 2D views of the scene through a differentiable rendering process. Our model is subsequently trained by minimizing a reconstruction loss between inputs and corresponding rendered scenes. We empirically show that INFERNO discovers objects in a scene without supervision. We also validate the interpretability of the learned representations by manipulating inferred scenes and showing the corresponding effect in the rendered output. Finally, we demonstrate the usefulness of our 3D object representations in a visual reasoning task using the CATER dataset.
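The rendering and training signal described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the abstract does not say how per-object radiance fields are combined, so the density-weighted composition in `composite_objects` is an assumption, and all function names here are hypothetical. The sketch renders a single ray with the standard volume-rendering quadrature and scores it with a mean-squared reconstruction loss.

```python
import numpy as np


def composite_objects(sigmas, colors, eps=1e-8):
    """Merge per-object fields sampled at the same points along a ray.

    ASSUMPTION: densities are summed and colours are averaged with
    density weights; the abstract does not specify the composition rule.

    sigmas: (K, S) densities for K objects at S ray samples.
    colors: (K, S, 3) RGB values for K objects at S ray samples.
    Returns the combined density (S,) and combined colour (S, 3).
    """
    sigma = sigmas.sum(axis=0)
    color = (sigmas[..., None] * colors).sum(axis=0) / (sigma[:, None] + eps)
    return sigma, color


def render_ray(sigma, color, deltas):
    """Volume-rendering quadrature along one ray.

    sigma: (S,) densities, color: (S, 3) RGB, deltas: (S,) sample spacings.
    Returns the pixel colour (3,) and the per-sample weights (S,).
    """
    # Opacity contributed by each sample interval.
    alpha = 1.0 - np.exp(-sigma * deltas)
    # Transmittance: fraction of light that reaches sample i unoccluded.
    trans = np.concatenate([[1.0], np.cumprod(1.0 - alpha)[:-1]])
    weights = trans * alpha
    pixel = (weights[:, None] * color).sum(axis=0)
    return pixel, weights


def reconstruction_loss(rendered, target):
    """Mean-squared error between rendered and observed pixel values."""
    return float(np.mean((np.asarray(rendered) - np.asarray(target)) ** 2))


# Toy scene: a dense red object in front of a dense blue object.
red, blue = np.array([1.0, 0.0, 0.0]), np.array([0.0, 0.0, 1.0])
sigmas = np.array([[0.0, 50.0, 0.0, 0.0],
                   [0.0, 0.0, 0.0, 50.0]])
colors = np.stack([np.tile(red, (4, 1)), np.tile(blue, (4, 1))])
sigma, color = composite_objects(sigmas, colors)
pixel, weights = render_ray(sigma, color, deltas=np.ones(4))
loss = reconstruction_loss(pixel, red)
```

In this toy scene the red object occludes the blue one, so the rendered pixel is almost entirely red and the loss against a red target is close to zero. Because every step is differentiable, the same computation written in an autodiff framework would let the loss drive updates to the object representations.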

Benchmarks

Benchmark: video-object-tracking-on-cater
Methodology: Inferno
Metrics: Top 1 Accuracy: 71.7; Top 5 Accuracy: 88.9
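For readers unfamiliar with the metrics, the following is a small sketch of how top-k accuracy is commonly computed. It assumes the benchmark is scored as multi-class classification over per-class scores and that the figures are percentages; the page does not confirm either point, and `top_k_accuracy` is a hypothetical helper, not part of any benchmark toolkit.

```python
import numpy as np


def top_k_accuracy(scores, labels, k):
    """Percentage of examples whose true label is among the k best scores.

    scores: (N, C) array of per-class scores, higher is better.
    labels: (N,) array of integer class labels.
    k: number of top-scoring classes to accept (1 <= k <= C).
    """
    scores = np.asarray(scores)
    labels = np.asarray(labels)
    # Indices of the k highest-scoring classes per example (unordered).
    top_k = np.argpartition(-scores, k - 1, axis=1)[:, :k]
    hits = (top_k == labels[:, None]).any(axis=1)
    return 100.0 * hits.mean()


# Toy data: three examples, three classes.
scores = [[0.1, 0.7, 0.2],
          [0.5, 0.3, 0.2],
          [0.2, 0.3, 0.5]]
labels = [1, 1, 0]
top1 = top_k_accuracy(scores, labels, k=1)
top2 = top_k_accuracy(scores, labels, k=2)
```

Top-1 counts only exact matches with the highest-scoring class, while top-5 also credits predictions where the true class appears anywhere among the five best scores, which is why the top-5 figure is always at least as high.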
