Jingdong Wang; Ke Sun; Tianheng Cheng; Borui Jiang; Chaorui Deng; Yang Zhao; Dong Liu; Yadong Mu; Mingkui Tan; Xinggang Wang; Wenyu Liu; Bin Xiao

Abstract
High-resolution representations are essential for position-sensitive vision problems such as human pose estimation, semantic segmentation, and object detection. Existing state-of-the-art frameworks first encode the input image as a low-resolution representation through a subnetwork formed by connecting high-to-low resolution convolutions in series (e.g., ResNet, VGGNet), and then recover a high-resolution representation from the encoded low-resolution representation. In contrast, our proposed network, the High-Resolution Network (HRNet), maintains high-resolution representations throughout the whole process. The network has two key characteristics: (i) the high-to-low resolution convolution streams are connected in parallel, and (ii) information is repeatedly exchanged across resolutions. The benefit is that the resulting representations are semantically richer and spatially more precise. We demonstrate the superiority of the proposed HRNet in a broad range of applications, including human pose estimation, semantic segmentation, and object detection, suggesting that HRNet is a stronger backbone for computer vision problems. All code is available at https://github.com/HRNet.
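As a concrete illustration of the two characteristics above (parallel multi-resolution streams and repeated cross-resolution exchange), here is a minimal PyTorch sketch of a single two-stream stage. It assumes bilinear upsampling with a 1x1 convolution for the low-to-high path and a strided 3x3 convolution for the high-to-low path, in line with the paper's description; the class names `ExchangeUnit` and `TwoStreamStage`, and the channel widths, are illustrative rather than taken from the official repositories.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ExchangeUnit(nn.Module):
    """Fuses a high-resolution stream with a low-resolution stream.

    Illustrative sketch only: the actual HRNet generalizes this to N
    parallel streams and fuses every pair of resolutions.
    """
    def __init__(self, c_high: int, c_low: int):
        super().__init__()
        # low -> high: 1x1 conv to match channels, then bilinear upsampling
        self.low_to_high = nn.Conv2d(c_low, c_high, kernel_size=1)
        # high -> low: strided 3x3 conv halves the spatial resolution
        self.high_to_low = nn.Conv2d(c_high, c_low, kernel_size=3, stride=2, padding=1)

    def forward(self, x_high, x_low):
        up = F.interpolate(self.low_to_high(x_low), size=x_high.shape[-2:],
                           mode="bilinear", align_corners=False)
        down = self.high_to_low(x_high)
        # Each stream absorbs information from the other resolution.
        return x_high + up, x_low + down

class TwoStreamStage(nn.Module):
    """One stage: per-stream 3x3 convolutions, then cross-resolution fusion."""
    def __init__(self, c_high: int = 32, c_low: int = 64):
        super().__init__()
        self.high_branch = nn.Sequential(
            nn.Conv2d(c_high, c_high, 3, padding=1),
            nn.BatchNorm2d(c_high), nn.ReLU(inplace=True))
        self.low_branch = nn.Sequential(
            nn.Conv2d(c_low, c_low, 3, padding=1),
            nn.BatchNorm2d(c_low), nn.ReLU(inplace=True))
        self.fuse = ExchangeUnit(c_high, c_low)

    def forward(self, x_high, x_low):
        return self.fuse(self.high_branch(x_high), self.low_branch(x_low))

if __name__ == "__main__":
    stage = TwoStreamStage()
    x_high = torch.randn(1, 32, 64, 64)  # high-resolution stream
    x_low = torch.randn(1, 64, 32, 32)   # low-resolution stream at half size
    y_high, y_low = stage(x_high, x_low)
    print(y_high.shape, y_low.shape)     # both resolutions are preserved
```

Stacking several such stages, and spawning an additional lower-resolution stream at each stage boundary, gives the overall HRNet layout; the task head then reads either the high-resolution stream alone (HRNetV1) or the concatenation of all streams upsampled to the highest resolution (HRNetV2).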
Code Repositories
| Repository | Framework | Mentioned in GitHub |
|---|---|---|
| kingcong/gpu_HRNetW48_cls | mindspore | ✓ |
| PaddlePaddle/PaddleSeg | paddle | |
| Mind23-2/MindCode-4/tree/main/hrnet | mindspore | |
| open-mmlab/mmpose | pytorch | |
| eshaanagarwal/hr-net-implementation | tf | ✓ |
| shuuchen/HRNet | pytorch | ✓ |
| gox-ai/hrnet-pose-api | pytorch | ✓ |
| HRNet/HRNet-Object-Detection | pytorch | ✓ |
| PaddlePaddle/PaddleClas | paddle | |
| yangyucheng000/HRNetW48_cls | mindspore | |
| open-mmlab/mmdetection | pytorch | |
| HRNet/HRNet-Semantic-Segmentation | pytorch | ✓ |
| yukichou/PET | pytorch | ✓ |
| baoshengyu/deep-high-resolution-net.pytorch | pytorch | ✓ |
| open-mmlab/mmsegmentation | pytorch | |
| alililia/ascend_HRNetW48_cls | mindspore | ✓ |
| sdll/hrnet-pose-estimation | pytorch | ✓ |
| pikabite/segmentations_tf2 | tf | ✓ |
| HRNet/HRNet-MaskRCNN-Benchmark | pytorch | ✓ |
| w-sugar/prtr | pytorch | ✓ |
| mindspore-lab/mindone | mindspore | ✓ |
| HRNet/HRNet-Facial-Landmark-Detection | pytorch | ✓ |
| leoxiaobin/deep-high-resolution-net.pytorch | pytorch | ✓ |
| mlpc-ucsd/PRTR | pytorch | ✓ |
| sithu31296/pose-estimation | pytorch | ✓ |
| anshky/HR-NET | pytorch | ✓ |
| Calylyli/Mindsporehrnet | mindspore | |
| HRNet/HRNet-Image-Classification | pytorch | ✓ |
| 2023-MindSpore-1/ms-code-162 | mindspore | |
Benchmarks
| Benchmark | Method | Metrics |
|---|---|---|
| dichotomous-image-segmentation-on-dis-te1 | HRNet | E-measure: 0.797, HCE: 262, MAE: 0.088, S-Measure: 0.742, max F-Measure: 0.668, weighted F-measure: 0.579 |
| dichotomous-image-segmentation-on-dis-te2 | HRNet | E-measure: 0.840, HCE: 555, MAE: 0.087, S-Measure: 0.784, max F-Measure: 0.747, weighted F-measure: 0.664 |
| dichotomous-image-segmentation-on-dis-te3 | HRNet | E-measure: 0.869, HCE: 1049, MAE: 0.080, S-Measure: 0.805, max F-Measure: 0.784, weighted F-measure: 0.700 |
| dichotomous-image-segmentation-on-dis-te4 | HRNet | E-measure: 0.854, HCE: 3864, MAE: 0.092, S-Measure: 0.792, max F-Measure: 0.772, weighted F-measure: 0.687 |
| dichotomous-image-segmentation-on-dis-vd | HRNet | E-measure: 0.824, HCE: 1560, MAE: 0.095, S-Measure: 0.767, max F-Measure: 0.726, weighted F-measure: 0.641 |
| face-alignment-on-300w | HRNet | NME_inter-ocular (%, Challenge): 5.15, NME_inter-ocular (%, Common): 2.87, NME_inter-ocular (%, Full): 3.32 |
| face-alignment-on-cofw | HRNet | NME (inter-ocular): 3.45 |
| face-alignment-on-cofw-68 | HRNetV2-W18 | NME (inter-ocular): 5.06 |
| face-alignment-on-wflw | HRNet | NME (inter-ocular): 4.60 |
| instance-segmentation-on-bdd100k-val | HRNet | AP: 22.5 |
| instance-segmentation-on-coco-minival | HTC (HRNetV2p-W48) | mask AP: 41.0 |
| object-detection-on-coco | Mask R-CNN (HRNetV2p-W48 + cascade) | AP50: 64.0, AP75: 50.3, APL: 58.3, APM: 48.6, APS: 27.1, Hardware Burden: 15G, Operations per network pass: 61.8G, box mAP: 46.1 |
| object-detection-on-coco | Mask R-CNN (HRNetV2p-W32 + cascade) | AP50: 62.5, AP75: 48.6, APL: 56.3, Hardware Burden: 16G, Operations per network pass: 50.6G |
| object-detection-on-coco | CenterNet (HRNetV2-W48) | AP75: 46.5, APL: 57.8, APS: 22.2, Hardware Burden: 16G, Operations per network pass: 21.7G, box mAP: 43.5 |
| object-detection-on-coco | FCOS (HRNetV2p-W48) | AP50: 59.3, APL: 51.0, APM: 42.6, APS: 23.4, Hardware Burden: 16G, Operations per network pass: 27.3G, box mAP: 40.5 |
| object-detection-on-coco | Faster R-CNN (HRNetV2p-W48) | AP50: 63.6, AP75: 46.4, APL: 53.0, APM: 44.6, APS: 24.9, Hardware Burden: 16G, Operations per network pass: 20.8G, box mAP: 42.4 |
| object-detection-on-coco | HTC (HRNetV2p-W48) | AP50: 65.9, AP75: 51.2, APL: 59.8, APM: 49.7, APS: 28.0, Hardware Burden: 15G, Operations per network pass: 71.7G, box mAP: 47.3 |
| object-detection-on-coco | Cascade R-CNN (HRNetV2p-W48) | AP75: 48.6, APL: 56.3, APM: 47.3, APS: 26.0 |
| object-detection-on-coco-minival | HTC (HRNetV2p-W48) | APL: 62.2, APM: 50.3, APS: 28.8, box AP: 47.0 |
| object-detection-on-coco-minival | Cascade R-CNN (HRNetV2p-W18) | AP50: 59.2, AP75: 44.9, APL: 54.1, APM: 44.2, APS: 23.7, box AP: 41.3 |
| object-detection-on-coco-minival | Mask R-CNN (HRNetV2p-W32) | APM: 45.4, APS: 25.0, box AP: 42.3 |
| object-detection-on-coco-minival | Mask R-CNN (HRNetV2p-W32, cascade) | APM: 47.9, APS: 26.1 |
| object-detection-on-coco-minival | Faster R-CNN (HRNetV2p-W18) | AP50: 58.9, AP75: 41.5, APL: 49.6, APM: 40.8, APS: 22.6, box AP: 38.0 |
| object-detection-on-coco-minival | Cascade R-CNN (HRNetV2p-W48) | AP50: 62.7, AP75: 48.7, APL: 58.5, APM: 48.1, APS: 26.3, box AP: 44.6 |
| object-detection-on-coco-minival | Faster R-CNN (HRNetV2p-W48) | AP50: 62.8, AP75: 45.9, APL: 54.6, APM: 44.7, box AP: 41.8 |
| object-detection-on-coco-minival | Mask R-CNN (HRNetV2p-W48, cascade) | APL: 60.1, APS: 27.5, box AP: 46.0 |
| object-detection-on-coco-minival | Mask R-CNN (HRNetV2p-W18) | APL: 51.0, APM: 41.7, box AP: 39.2 |
| object-detection-on-coco-minival | Faster R-CNN (HRNetV2p-W32) | AP50: 61.8, AP75: 44.8, APL: 53.3, APM: 43.7, APS: 24.4, box AP: 40.9 |
| object-detection-on-coco-minival | Cascade R-CNN (HRNetV2p-W32) | AP50: 61.7, AP75: 47.7, APL: 57.4, APM: 46.5, APS: 25.6, box AP: 43.7 |
| object-detection-on-coco-minival | HTC (HRNetV2p-W32) | APL: 59.5, APM: 48.4, APS: 27.0, box AP: 45.3 |
| object-detection-on-coco-minival | HTC (HRNetV2p-W18) | APM: 46.0, APS: 26.6, box AP: 43.1 |
| semantic-segmentation-on-cityscapes | HRNetV2 (train+val) | Mean IoU (class): 81.6% |
| semantic-segmentation-on-cityscapes-val | HRNetV2 (HRNetV2-W40) | mIoU: 80.2 |
| semantic-segmentation-on-cityscapes-val | HRNetV2 (HRNetV2-W48) | mIoU: 81.1 |
| semantic-segmentation-on-dada-seg | HRNet (ACDC) | mIoU: 27.5 |
| semantic-segmentation-on-pascal-context | HRNetV2 (HRNetV2-W48) | mIoU: 54.0 |
| semantic-segmentation-on-pascal-context | CFNet (ResNet-101) | mIoU: 54.0 |
| semantic-segmentation-on-potsdam-1 | HRNet-18 | mIoU: 84.02 |
| semantic-segmentation-on-potsdam-1 | HRNet-48 | mIoU: 84.22 |
| semantic-segmentation-on-us3d-1 | HRNet-18 | mIoU: 60.33 |
| semantic-segmentation-on-us3d-1 | HRNet-48 | mIoU: 72.66 |
| semantic-segmentation-on-vaihingen | HRNet-48 | mIoU: 76.75 |
| semantic-segmentation-on-vaihingen | HRNet-18 | mIoU: 75.90 |
| thermal-image-segmentation-on-mfn-dataset | HRNet | mIoU: 51.7 |