WeakSAM: Segment Anything Meets Weakly-supervised Instance-level Recognition
Zhu Lianghui; Zhou Junwei; Liu Yan; Hao Xin; Liu Wenyu; Wang Xinggang

Abstract
Weakly supervised visual recognition using inexact supervision is a critical yet challenging learning problem. It significantly reduces human labeling costs and traditionally relies on multi-instance learning and pseudo-labeling. This paper introduces WeakSAM and solves weakly-supervised object detection (WSOD) and weakly-supervised instance segmentation (WSIS) by utilizing the pre-learned world knowledge contained in a vision foundation model, i.e., the Segment Anything Model (SAM). WeakSAM addresses two critical limitations in traditional WSOD retraining, i.e., pseudo ground truth (PGT) incompleteness and noisy PGT instances, through adaptive PGT generation and Region of Interest (RoI) drop regularization. It also addresses SAM's problems of requiring prompts and category unawareness for automatic object detection and segmentation. Our results indicate that WeakSAM surpasses previous state-of-the-art methods on WSOD and WSIS benchmarks by large margins, i.e., average improvements of 7.4% and 8.5%, respectively. The code is available at https://github.com/hustvl/WeakSAM.
Benchmarks
| Benchmark | Methodology | Metrics |
|---|---|---|
| image-level-supervised-instance-segmentation | WeakSAM-Mask RCNN (with SAM) | mAP@0.25: 70.3 mAP@0.5: 59.6 mAP@0.7: 43.1 mAP@0.75: 36.2 |
| image-level-supervised-instance-segmentation | WeakSAM-Mask2Former (with SAM) | mAP@0.25: 73.4 mAP@0.5: 64.4 mAP@0.7: 49.7 mAP@0.75: 45.3 |
| image-level-supervised-instance-segmentation-1 | WeakSAM-Mask RCNN (with SAM) | AP: 21.0 AP@50: 34.5 AP@75: 22.2 |
| image-level-supervised-instance-segmentation-1 | WeakSAM-Mask2Former (with SAM) | AP: 25.9 AP@50: 39.9 AP@75: 27.9 |
| image-level-supervised-instance-segmentation-2 | WeakSAM-Mask RCNN (with SAM) | AP: 20.6 AP@50: 33.9 AP@75: 22.0 |
| image-level-supervised-instance-segmentation-2 | WeakSAM-Mask2Former (with SAM) | AP: 25.2 AP@50: 38.4 AP@75: 27.0 |
| weakly-supervised-object-detection-on-ms-coco | WeakSAM-OICR-DINO (with SAM) | AP: 24.9 |
| weakly-supervised-object-detection-on-ms-coco | WeakSAM-MIST-Faster RCNN (with SAM) | AP: 23.8 |
| weakly-supervised-object-detection-on-ms-coco | WeakSAM-MIST-DINO (with SAM) | AP: 26.6 |
| weakly-supervised-object-detection-on-ms-coco | WeakSAM-MIST (with SAM) | AP: 22.9 |
| weakly-supervised-object-detection-on-ms-coco | WeakSAM-OICR (with SAM) | AP: 19.9 |
| weakly-supervised-object-detection-on-ms-coco | WeakSAM-OICR-Faster RCNN (with SAM) | AP: 22.3 |
| weakly-supervised-object-detection-on-pascal | WeakSAM-OICR-DINO (with SAM) | mAP: 63.7 |
| weakly-supervised-object-detection-on-pascal | WeakSAM-MIST-DINO (with SAM) | mAP: 70.2 |
| weakly-supervised-object-detection-on-pascal | WeakSAM-MIST-Faster RCNN (with SAM) | mAP: 69.2 |
| weakly-supervised-object-detection-on-pascal | WeakSAM-MIST (with SAM) | mAP: 66.9 |
| weakly-supervised-object-detection-on-pascal | WeakSAM-OICR-Faster RCNN (with SAM) | mAP: 62.9 |
| weakly-supervised-object-detection-on-pascal | WeakSAM-OICR (with SAM) | mAP: 58.4 |
| weakly-supervised-object-detection-on-pascal-1 | WeakSAM-OICR (with SAM) | mAP: 58.9 |
| weakly-supervised-object-detection-on-pascal-1 | WeakSAM-MIST-Faster RCNN (with SAM) | mAP: 71.8 |
| weakly-supervised-object-detection-on-pascal-1 | WeakSAM-MIST (with SAM) | mAP: 67.4 |
| weakly-supervised-object-detection-on-pascal-1 | WeakSAM-MIST-DINO (with SAM) | mAP: 73.4 |
| weakly-supervised-object-detection-on-pascal-1 | WeakSAM-OICR-Faster RCNN (with SAM) | mAP: 65.7 |
| weakly-supervised-object-detection-on-pascal-1 | WeakSAM-OICR-DINO (with SAM) | mAP: 66.1 |
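
The thresholds attached to the metrics in the table (for example mAP@0.5 or AP@75) refer to the intersection-over-union (IoU) a prediction must reach with a ground-truth instance to count as correct. The following is a minimal Python sketch of that matching criterion for axis-aligned boxes; it is not taken from the WeakSAM code, and the names `box_iou` and `is_true_positive` are hypothetical helpers introduced only for illustration. For the instance segmentation rows the same ratio is computed over mask pixels rather than box areas, and the exact handling of ties at the threshold may differ between evaluation toolkits.

```python
from typing import Tuple

# Hypothetical box type for this sketch: (x_min, y_min, x_max, y_max).
Box = Tuple[float, float, float, float]


def box_iou(a: Box, b: Box) -> float:
    """Intersection-over-union of two axis-aligned boxes."""
    # Corners of the overlapping rectangle (may be empty).
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)

    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def is_true_positive(pred: Box, gt: Box, threshold: float = 0.5) -> bool:
    """Assumed matching rule: a prediction counts if IoU >= threshold."""
    return box_iou(pred, gt) >= threshold


if __name__ == "__main__":
    pred = (0.0, 0.0, 10.0, 10.0)
    gt = (0.0, 0.0, 10.0, 5.0)
    print(box_iou(pred, gt))
    print(is_true_positive(pred, gt, threshold=0.5))
    print(is_true_positive(pred, gt, threshold=0.75))
```

A stricter threshold such as 0.75 rejects predictions that a 0.5 threshold would accept, which is why the @0.75 figures in the table are lower than the @0.5 figures for the same model.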