Hongwei Yi, Chun-Hao P. Huang, Shashank Tripathi, Lea Hering, Justus Thies, Michael J. Black

Abstract
Generating realistic 3D worlds occupied by moving humans has many applications in games, architecture, and synthetic data creation. But generating such scenes is expensive and labor intensive. Recent work generates human poses and motions given a 3D scene. Here, we take the opposite approach and generate 3D indoor scenes given 3D human motion. Such motions can come from archival motion capture or from IMU sensors worn on the body, effectively turning human movement into a "scanner" of the 3D world. Intuitively, human movement indicates the free space in a room, and human contact indicates surfaces or objects that support activities such as sitting, lying, or touching. We propose MIME (Mining Interaction and Movement to infer 3D Environments), a generative model of indoor scenes that produces furniture layouts consistent with the human movement. MIME uses an auto-regressive transformer architecture that takes the already generated objects in the scene, as well as the human motion, as input and outputs the next plausible object. To train MIME, we build a dataset by populating the 3D FRONT scene dataset with 3D humans. Our experiments show that MIME produces more diverse and plausible 3D scenes than a recent generative scene method that does not know about human movement. Code and data will be available for research at https://mime.is.tue.mpg.de.
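The abstract describes an auto-regressive generation loop: the model conditions on the human motion (free space swept by walking, contact points from sitting or touching) and on the objects placed so far, then emits the next object until the scene is complete. The sketch below illustrates only that loop structure with a toy placement rule; `SceneState`, `ObjectToken`, and `predict_next_object` are illustrative names assumed here, not the authors' actual architecture or API, and the stand-in rule replaces the real transformer.

```python
# Minimal sketch of a MIME-style autoregressive scene-generation loop.
# All names and the placement rule are assumptions for illustration;
# the real method uses a transformer conditioned on motion and objects.
from dataclasses import dataclass, field
import random


@dataclass
class ObjectToken:
    category: str    # e.g. "chair", "sofa", "bed"
    position: tuple  # (x, y) floor-grid cell


@dataclass
class SceneState:
    free_space: set                       # cells swept by walking; keep empty
    contact_points: list                  # cells where the human sits/lies/touches
    objects: list = field(default_factory=list)


def predict_next_object(scene, rng):
    """Stand-in for the transformer step: support the next unexplained
    contact point with an object; return None as end-of-sequence."""
    used = {o.position for o in scene.objects}
    for p in scene.contact_points:
        if p not in used and p not in scene.free_space:
            return ObjectToken(rng.choice(["chair", "sofa", "bed"]), p)
    return None


def generate_scene(scene, seed=0):
    rng = random.Random(seed)
    while (obj := predict_next_object(scene, rng)) is not None:
        scene.objects.append(obj)  # generated objects are fed back as context
    return scene


scene = SceneState(free_space={(0, 0), (1, 0)},
                   contact_points=[(2, 1), (3, 3)])
generate_scene(scene)
```

The loop terminates when every contact point is supported; the essential property it mirrors is that objects land on contact cells and never intrude on the motion's free space.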
Benchmarks
| Benchmark | Methodology | Metrics |
|---|---|---|
| 3d-semantic-scene-completion-on-pro-text | MIME | CD: 2.0493, EMD: 1.3832, F1: 0.0990 |
| indoor-scene-synthesis-on-pro-text | MIME | CD: 2.0493, EMD: 1.3832, F1: 0.0990 |