Kaiming He, Georgia Gkioxari, Piotr Dollár, Ross Girshick

Abstract
We present a conceptually simple, flexible, and general framework for object instance segmentation. Our approach efficiently detects objects in an image while simultaneously generating a high-quality segmentation mask for each instance. The method, called Mask R-CNN, extends Faster R-CNN by adding a branch for predicting an object mask in parallel with the existing branch for bounding box recognition. Mask R-CNN is simple to train and adds only a small overhead to Faster R-CNN, running at 5 fps. Moreover, Mask R-CNN is easy to generalize to other tasks, e.g., allowing us to estimate human poses in the same framework. We show top results in all three tracks of the COCO suite of challenges, including instance segmentation, bounding-box object detection, and person keypoint detection. Without bells and whistles, Mask R-CNN outperforms all existing, single-model entries on every task, including the COCO 2016 challenge winners. We hope our simple and effective approach will serve as a solid baseline and help ease future research in instance-level recognition. Code has been made available at: https://github.com/facebookresearch/Detectron
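The abstract describes the mask branch as running in parallel with the box branch. One way to picture this is through the per-RoI training loss, which the full paper (not this abstract) defines as the sum of a classification term, a box-regression term, and a mask term, where the mask term is an average per-pixel binary cross-entropy computed only on the mask of the ground-truth class. The following is a minimal pure-Python sketch of that idea. The function names `bce_with_logits`, `mask_loss`, and `roi_loss` are hypothetical, and the classification and box losses are passed in as plain numbers rather than computed.

```python
import math


def bce_with_logits(z: float, t: float) -> float:
    """Numerically stable binary cross-entropy for one logit z and target t in {0, 1}."""
    return max(z, 0.0) - z * t + math.log1p(math.exp(-abs(z)))


def mask_loss(mask_logits, gt_class, gt_mask):
    """Average per-pixel BCE on the mask predicted for the ground-truth class.

    mask_logits: list of K masks (one per class), each an m x m list of logits.
    gt_class:    index k of the ground-truth class; only mask k contributes.
    gt_mask:     m x m list of 0/1 targets.
    """
    logits = mask_logits[gt_class]
    total, count = 0.0, 0
    for row_logits, row_targets in zip(logits, gt_mask):
        for z, t in zip(row_logits, row_targets):
            total += bce_with_logits(z, t)
            count += 1
    return total / count


def roi_loss(cls_loss, box_loss, mask_logits, gt_class, gt_mask):
    """Illustrative multi-task loss: classification + box regression + mask."""
    return cls_loss + box_loss + mask_loss(mask_logits, gt_class, gt_mask)


if __name__ == "__main__":
    # Two classes, 2 x 2 masks. Class 0 is uninformative; class 1 fits the target.
    logits = [[[0.0, 0.0], [0.0, 0.0]], [[10.0, -10.0], [-10.0, 10.0]]]
    target = [[1, 0], [0, 1]]
    print(mask_loss(logits, 0, target))
    print(mask_loss(logits, 1, target))
```

Because only the ground-truth class's mask enters the loss, the masks for different classes do not compete with one another at the pixel level. Class selection is left to the classification branch.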
Benchmarks
| Benchmark | Methodology | Metrics |
|---|---|---|
| instance-segmentation-on-bdd100k-val | Mask R-CNN | AP: 20.5 |
| instance-segmentation-on-coco | Mask R-CNN (ResNeXt-101-FPN) | AP50: 60.0 AP75: 39.4 APL: 53.5 APM: 39.9 APS: 16.9 mask AP: 37.1 |
| instance-segmentation-on-isaid | Mask-RCNN+ | Average Precision: 37.18 |
| instance-segmentation-on-isaid | Mask-RCNN | Average Precision: 36.50 |
| keypoint-detection-on-coco-1 | Mask R-CNN | Test AP: 63.1 Validation AP: 69.2 |
| keypoint-detection-on-coco-test-challenge | Mask R-CNN* | AP: 68.9 AP50: 89.2 AP75: 75.2 APL: 82.6 AR: 75.4 AR50: 93.2 AR75: 81.2 ARL: 76.8 ARM: 70.2 |
| keypoint-detection-on-coco-test-dev | Mask R-CNN | AP50: 87.3 AP75: 68.7 APL: 71.4 APM: 57.8 |
| multi-human-parsing-on-mhp-v10 | Mask R-CNN | AP 0.5: 52.68% |
| multi-human-parsing-on-mhp-v20 | Mask R-CNN | AP 0.5: 14.9 |
| multi-person-pose-estimation-on-crowdpose | Mask R-CNN | AP Easy: 69.4 AP Hard: 45.8 AP Medium: 57.9 mAP @0.5:0.95: 57.2 |
| multi-person-pose-estimation-on-ochuman | Mask R-CNN | AP50: 33.2 AP75: 24.5 Validation AP: 20.2 |
| multi-tissue-nucleus-segmentation-on-kumar | Mask R-CNN (e) | Dice: 0.760 Hausdorff Distance (mm): 50.9 |
| nuclear-segmentation-on-cell17 | Mask R-CNN | Dice: 0.707 F1-score: 0.8004 Hausdorff: 12.6723 |
| object-detection-on-coco | Mask R-CNN (ResNeXt-101-FPN) | AP50: 62.3 AP75: 43.4 APL: 51.2 APM: 43.2 APS: 22.1 Hardware Burden: 9G box mAP: 39.8 |
| object-detection-on-coco | Mask R-CNN (ResNet-101-FPN) | AP50: 60.3 AP75: 41.7 APL: 50.2 APM: 41.1 APS: 20.1 Hardware Burden: 9G box mAP: 38.2 |
| object-detection-on-coco-minival | Mask R-CNN (ResNeXt-101-FPN) | AP50: 59.5 AP75: 38.9 box AP: 36.7 |
| object-detection-on-coco-minival | Mask R-CNN (ResNet-50-FPN) | box AP: 37.7 |
| object-detection-on-coco-minival | Mask R-CNN (ResNet-101-FPN) | box AP: 40.0 |
| object-detection-on-coco-o | Mask R-CNN (ResNet-50) | Average mAP: 17.1 Effective Robustness: -0.11 |
| object-detection-on-isaid | Mask-RCNN | Average Precision: 36.50 |
| object-detection-on-isaid | Mask-RCNN+ | Average Precision: 37.18 |
| object-localization-on-grit | Mask R-CNN | Localization (ablation): 44.7 Localization (test): 45.1 |
| panoptic-segmentation-on-cityscapes-val | Mask R-CNN+COCO | PQth: 54.0 |
| pose-estimation-on-coco-test-dev | Mask-RCNN | AP: 63.1 AP50: 87.3 AP75: 68.7 APL: 71.4 |
| real-time-object-detection-on-coco-1 | Mask R-CNN X-152-32x8d | box AP: 45.2 |
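
For the instance-segmentation rows, AP50 and AP75 follow the COCO convention of average precision at mask IoU thresholds of 0.50 and 0.75. Keypoint rows use an analogous keypoint similarity measure instead of IoU. The sketch below shows only the IoU test that decides whether a single predicted mask counts as a match at a given threshold. It is not a full AP computation, and the helper names `mask_iou` and `matches_at` are hypothetical.

```python
def mask_iou(pred, gt):
    """Intersection over union of two binary masks given as equal-sized 2D lists."""
    inter = 0
    union = 0
    for pred_row, gt_row in zip(pred, gt):
        for p, g in zip(pred_row, gt_row):
            if p and g:
                inter += 1
            if p or g:
                union += 1
    # Two empty masks have no overlap to measure; treat that case as IoU 0.
    return inter / union if union else 0.0


def matches_at(pred, gt, threshold):
    """True if the predicted mask overlaps the ground truth at or above the threshold."""
    return mask_iou(pred, gt) >= threshold


if __name__ == "__main__":
    pred = [[1, 1, 1, 1, 0]]
    gt = [[0, 1, 1, 1, 1]]
    print(mask_iou(pred, gt))          # 3 shared pixels out of 5 covered
    print(matches_at(pred, gt, 0.50))  # counts toward AP50
    print(matches_at(pred, gt, 0.75))  # does not count toward AP75
```

A prediction that passes at 0.50 but fails at 0.75, as in this example, raises AP50 without raising AP75, which is why AP75 is the lower of the two in every row above.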