Floors are Flat: Leveraging Semantics for Real-Time Surface Normal Prediction
Steven Hickson; Karthik Raveendran; Alireza Fathi; Kevin Murphy; Irfan Essa

Abstract
We propose 4 insights that help to significantly improve the performance of deep learning models that predict surface normals and semantic labels from a single RGB image. These insights are: (1) denoise the "ground truth" surface normals in the training set to ensure consistency with the semantic labels; (2) concurrently train on a mix of real and synthetic data, instead of pretraining on synthetic and finetuning on real; (3) jointly predict normals and semantics using a shared model, but only backpropagate errors on pixels that have valid training labels; (4) slim down the model and use grayscale instead of color inputs. Despite the simplicity of these steps, we demonstrate consistently improved results on several datasets, using a model that runs at 12 fps on a standard mobile phone.
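Insight (3) can be illustrated with a small loss function. The following is a minimal NumPy sketch, not the authors' implementation: the function name `masked_joint_loss`, the cosine-based normal loss, the cross-entropy segmentation loss, the `ignore_label` convention, and the `seg_weight` parameter are all assumptions made for illustration. It shows only the masking idea, that pixels without a valid label contribute nothing to the loss and therefore nothing to the gradient.

```python
import numpy as np

def masked_joint_loss(pred_normals, gt_normals, normal_valid,
                      seg_logits, seg_labels,
                      ignore_label=-1, seg_weight=1.0):
    """Joint normal + semantic loss averaged over labeled pixels only.

    pred_normals, gt_normals: (H, W, 3) float arrays
    normal_valid:             (H, W) bool array, True where gt normal exists
    seg_logits:               (H, W, C) float array
    seg_labels:               (H, W) int array, ignore_label where unlabeled
    All names and loss choices here are hypothetical.
    """
    eps = 1e-8
    # Normalize both normal maps to unit length.
    p = pred_normals / np.maximum(
        np.linalg.norm(pred_normals, axis=-1, keepdims=True), eps)
    g = gt_normals / np.maximum(
        np.linalg.norm(gt_normals, axis=-1, keepdims=True), eps)
    # Cosine distance per pixel; invalid pixels are zeroed out.
    normal_err = np.where(normal_valid, 1.0 - np.sum(p * g, axis=-1), 0.0)
    normal_loss = normal_err.sum() / max(int(normal_valid.sum()), 1)

    # Numerically stable log-softmax over the class axis.
    shifted = seg_logits - seg_logits.max(axis=-1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=-1, keepdims=True))
    seg_valid = seg_labels != ignore_label
    # Replace ignored labels with class 0 so indexing stays in range;
    # their loss is masked out on the next lines.
    safe_labels = np.where(seg_valid, seg_labels, 0)
    nll = -np.take_along_axis(log_probs, safe_labels[..., None], axis=-1)[..., 0]
    seg_loss = np.where(seg_valid, nll, 0.0).sum() / max(int(seg_valid.sum()), 1)

    return normal_loss + seg_weight * seg_loss
```

Dividing by the number of valid pixels rather than by the image size keeps the loss scale comparable between densely and sparsely labeled images, which matters when real and synthetic data with different label coverage are mixed in one batch.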
Benchmarks
| Benchmark | Methodology | Metrics |
|---|---|---|
| semantic-segmentation-on-scannetv2 | Floors are Flat | Pixel Accuracy: 65.6 |
| surface-normals-estimation-on-nyu-depth-v2-1 | Floors are Flat | % < 11.25: 59.5; % < 22.5: 72.2; % < 30: 77.3; Mean Angle Error: 19.7; RMSE: 19.3 |
| surface-normals-estimation-on-scannetv2 | Floors are Flat | % < 11.25: 50.9; % < 22.5: 65.2; % < 30: 70; Mean Angle Error: 28 |
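
The surface-normal metrics in the table are angular-error statistics. The sketch below shows one common way to compute such statistics with NumPy. It is an assumption about the usual definitions, not the evaluation code behind these numbers, and the function name `normal_metrics` and its dictionary keys are hypothetical. Under this definition the RMSE can never be smaller than the mean error, so a reported RMSE below the mean suggests a different definition was used for that entry.

```python
import numpy as np

def normal_metrics(pred, gt, valid, thresholds=(11.25, 22.5, 30.0)):
    """Angular-error statistics between predicted and reference normals.

    pred, gt: (H, W, 3) float arrays; valid: (H, W) bool mask.
    Returns mean error and RMSE in degrees, plus the percentage of
    valid pixels whose error is below each threshold (in degrees).
    """
    if not valid.any():
        raise ValueError("no valid pixels to evaluate")
    eps = 1e-8
    # Normalize to unit length so the dot product is the cosine of the angle.
    p = pred / np.maximum(np.linalg.norm(pred, axis=-1, keepdims=True), eps)
    g = gt / np.maximum(np.linalg.norm(gt, axis=-1, keepdims=True), eps)
    # Clip to guard arccos against rounding slightly outside [-1, 1].
    cos = np.clip(np.sum(p * g, axis=-1), -1.0, 1.0)
    ang = np.degrees(np.arccos(cos))[valid]

    out = {
        "mean": float(ang.mean()),
        "rmse": float(np.sqrt(np.mean(ang ** 2))),
    }
    for t in thresholds:
        out[f"pct_below_{t}"] = float(100.0 * np.mean(ang < t))
    return out
```

Only pixels selected by `valid` enter the statistics, mirroring the training-time rule that unlabeled pixels are ignored.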