Abstract
High-resolution representations are essential for position-sensitive vision problems, such as human pose estimation, semantic segmentation, and object detection. Existing state-of-the-art frameworks first encode the input image as a low-resolution representation through a subnetwork formed by connecting high-to-low resolution convolutions *in series* (e.g., ResNet, VGGNet), and then recover the high-resolution representation from the encoded low-resolution representation. Instead, our proposed network, named the High-Resolution Network (HRNet), maintains high-resolution representations through the whole process. There are two key characteristics: (i) connect the high-to-low resolution convolution streams *in parallel*; (ii) repeatedly exchange the information across resolutions. The benefit is that the resulting representation is semantically richer and spatially more precise. We show the superiority of the proposed HRNet in a wide range of applications, including human pose estimation, semantic segmentation, and object detection, suggesting that the HRNet is a stronger backbone for computer vision problems. All the code is available at https://github.com/HRNet.
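The two key characteristics above can be illustrated with a minimal NumPy sketch of one fusion step: each parallel stream keeps its own resolution, and every stream receives the other streams resampled to its resolution (average pooling to go down, nearest-neighbour repetition to go up). This is a deliberately simplified sketch, not the paper's implementation: real HRNet uses strided and 1x1 convolutions for resampling and channel matching, and each stream has a different channel width; here all streams share one channel so the tensors can be summed directly.

```python
import numpy as np

def downsample(x, factor):
    """Average-pool an (H, W, C) feature map by an integer factor."""
    h, w, c = x.shape
    return x.reshape(h // factor, factor, w // factor, factor, c).mean(axis=(1, 3))

def upsample(x, factor):
    """Nearest-neighbour upsample an (H, W, C) feature map by an integer factor."""
    return x.repeat(factor, axis=0).repeat(factor, axis=1)

def exchange(streams):
    """One HRNet-style exchange step over parallel streams.

    Stream i is assumed to hold features at resolution (H / 2**i, W / 2**i).
    Each output stream is the sum of all input streams resampled to its
    own resolution, so every resolution sees information from every other.
    """
    fused = []
    for i, target in enumerate(streams):
        acc = target.copy()
        for j, other in enumerate(streams):
            if j == i:
                continue
            factor = 2 ** abs(i - j)
            acc += downsample(other, factor) if j < i else upsample(other, factor)
        fused.append(acc)
    return fused

# Two parallel streams: an 8x8 high-resolution map and a 4x4 low-resolution map.
high = np.ones((8, 8, 1))
low = np.full((4, 4, 1), 2.0)
fused_high, fused_low = exchange([high, low])
print(fused_high.shape, fused_low.shape)  # (8, 8, 1) (4, 4, 1)
```

Note that both output streams keep their original resolution: the high-resolution stream is never collapsed to low resolution, which is the design point distinguishing HRNet from serial encode-then-recover backbones.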
Benchmarks
| Benchmark | Methodology | Metrics |
|---|---|---|
| dichotomous-image-segmentation-on-dis-te1 | HRNet | E-measure: 0.797 HCE: 262 MAE: 0.088 S-Measure: 0.742 max F-Measure: 0.668 weighted F-measure: 0.579 |
| dichotomous-image-segmentation-on-dis-te2 | HRNet | E-measure: 0.840 HCE: 555 MAE: 0.087 S-Measure: 0.784 max F-Measure: 0.747 weighted F-measure: 0.664 |
| dichotomous-image-segmentation-on-dis-te3 | HRNet | E-measure: 0.869 HCE: 1049 MAE: 0.080 S-Measure: 0.805 max F-Measure: 0.784 weighted F-measure: 0.700 |
| dichotomous-image-segmentation-on-dis-te4 | HRNet | E-measure: 0.854 HCE: 3864 MAE: 0.092 S-Measure: 0.792 max F-Measure: 0.772 weighted F-measure: 0.687 |
| dichotomous-image-segmentation-on-dis-vd | HRNet | E-measure: 0.824 HCE: 1560 MAE: 0.095 S-Measure: 0.767 max F-Measure: 0.726 weighted F-measure: 0.641 |
| face-alignment-on-300w | HRNet | NME_inter-ocular (%, Challenge): 5.15 NME_inter-ocular (%, Common): 2.87 NME_inter-ocular (%, Full): 3.32 |
| face-alignment-on-cofw | HRNet | NME (inter-ocular): 3.45 |
| face-alignment-on-cofw-68 | HRNetV2-W18 | NME (inter-ocular): 5.06 |
| face-alignment-on-wflw | HRNet | NME (inter-ocular): 4.60 |
| instance-segmentation-on-bdd100k-val | HRNet | AP: 22.5 |
| instance-segmentation-on-coco-minival | HTC (HRNetV2p-W48) | mask AP: 41.0 |
| object-detection-on-coco | Mask R-CNN (HRNetV2p-W48 + cascade) | AP50: 64.0 AP75: 50.3 APL: 58.3 APM: 48.6 APS: 27.1 Hardware Burden: 15G Operations per network pass: 61.8G box mAP: 46.1 |
| object-detection-on-coco | Mask R-CNN (HRNetV2p-W32 + cascade) | AP50: 62.5 AP75: 48.6 APL: 56.3 Hardware Burden: 16G Operations per network pass: 50.6G |
| object-detection-on-coco | CenterNet (HRNetV2-W48) | AP75: 46.5 APL: 57.8 APS: 22.2 Hardware Burden: 16G Operations per network pass: 21.7G box mAP: 43.5 |
| object-detection-on-coco | FCOS (HRNetV2p-W48) | AP50: 59.3 APL: 51.0 APM: 42.6 APS: 23.4 Hardware Burden: 16G Operations per network pass: 27.3G box mAP: 40.5 |
| object-detection-on-coco | Faster R-CNN (HRNetV2p-W48) | AP50: 63.6 AP75: 46.4 APL: 53.0 APM: 44.6 APS: 24.9 Hardware Burden: 16G Operations per network pass: 20.8G box mAP: 42.4 |
| object-detection-on-coco | HTC (HRNetV2p-W48) | AP50: 65.9 AP75: 51.2 APL: 59.8 APM: 49.7 APS: 28.0 Hardware Burden: 15G Operations per network pass: 71.7G box mAP: 47.3 |
| object-detection-on-coco | Cascade R-CNN (HRNetV2p-W48) | AP75: 48.6 APL: 56.3 APM: 47.3 APS: 26.0 |
| object-detection-on-coco-minival | HTC (HRNetV2p-W48) | APL: 62.2 APM: 50.3 APS: 28.8 box AP: 47.0 |
| object-detection-on-coco-minival | Cascade R-CNN (HRNetV2p-W18) | AP50: 59.2 AP75: 44.9 APL: 54.1 APM: 44.2 APS: 23.7 box AP: 41.3 |
| object-detection-on-coco-minival | Mask R-CNN (HRNetV2p-W32) | APM: 45.4 APS: 25.0 box AP: 42.3 |
| object-detection-on-coco-minival | Mask R-CNN (HRNetV2p-W32, cascade) | APM: 47.9 APS: 26.1 |
| object-detection-on-coco-minival | Faster R-CNN (HRNetV2p-W18) | AP50: 58.9 AP75: 41.5 APL: 49.6 APM: 40.8 APS: 22.6 box AP: 38.0 |
| object-detection-on-coco-minival | Cascade R-CNN (HRNetV2p-W48) | AP50: 62.7 AP75: 48.7 APL: 58.5 APM: 48.1 APS: 26.3 box AP: 44.6 |
| object-detection-on-coco-minival | Faster R-CNN (HRNetV2p-W48) | AP50: 62.8 AP75: 45.9 APL: 54.6 APM: 44.7 box AP: 41.8 |
| object-detection-on-coco-minival | Mask R-CNN (HRNetV2p-W48, cascade) | APL: 60.1 APS: 27.5 box AP: 46.0 |
| object-detection-on-coco-minival | Mask R-CNN (HRNetV2p-W18) | APL: 51.0 APM: 41.7 box AP: 39.2 |
| object-detection-on-coco-minival | Faster R-CNN (HRNetV2p-W32) | AP50: 61.8 AP75: 44.8 APL: 53.3 APM: 43.7 APS: 24.4 box AP: 40.9 |
| object-detection-on-coco-minival | Cascade R-CNN (HRNetV2p-W32) | AP50: 61.7 AP75: 47.7 APL: 57.4 APM: 46.5 APS: 25.6 box AP: 43.7 |
| object-detection-on-coco-minival | HTC (HRNetV2p-W32) | APL: 59.5 APM: 48.4 APS: 27.0 box AP: 45.3 |
| object-detection-on-coco-minival | HTC (HRNetV2p-W18) | APM: 46.0 APS: 26.6 box AP: 43.1 |
| semantic-segmentation-on-cityscapes | HRNetV2 (train+val) | Mean IoU (class): 81.6% |
| semantic-segmentation-on-cityscapes-val | HRNetV2 (HRNetV2-W40) | mIoU: 80.2 |
| semantic-segmentation-on-cityscapes-val | HRNetV2 (HRNetV2-W48) | mIoU: 81.1 |
| semantic-segmentation-on-dada-seg | HRNet (ACDC) | mIoU: 27.5 |
| semantic-segmentation-on-pascal-context | HRNetV2 (HRNetV2-W48) | mIoU: 54.0 |
| semantic-segmentation-on-pascal-context | CFNet (ResNet-101) | mIoU: 54.0 |
| semantic-segmentation-on-potsdam-1 | HRNet-18 | mIoU: 84.02 |
| semantic-segmentation-on-potsdam-1 | HRNet-48 | mIoU: 84.22 |
| semantic-segmentation-on-us3d-1 | HRNet-18 | mIoU: 60.33 |
| semantic-segmentation-on-us3d-1 | HRNet-48 | mIoU: 72.66 |
| semantic-segmentation-on-vaihingen | HRNet-48 | mIoU: 76.75 |
| semantic-segmentation-on-vaihingen | HRNet-18 | mIoU: 75.90 |
| thermal-image-segmentation-on-mfn-dataset | HRNet | mIOU: 51.7 |