Abstract
High-resolution representations are essential for position-sensitive vision problems, such as human pose estimation, semantic segmentation, and object detection. Existing state-of-the-art frameworks first encode the input image as a low-resolution representation through a subnetwork formed by connecting high-to-low resolution convolutions *in series* (e.g., ResNet, VGGNet), and then recover the high-resolution representation from the encoded low-resolution representation. Instead, our proposed network, named the High-Resolution Network (HRNet), maintains high-resolution representations through the whole process. There are two key characteristics: (i) connect the high-to-low resolution convolution streams *in parallel*; (ii) repeatedly exchange the information across resolutions. The benefit is that the resulting representation is semantically richer and spatially more precise. We show the superiority of the proposed HRNet in a wide range of applications, including human pose estimation, semantic segmentation, and object detection, suggesting that the HRNet is a stronger backbone for computer vision problems. All the code is available at https://github.com/HRNet.
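The two key characteristics above can be illustrated with a minimal NumPy sketch of one fusion step: each parallel stream keeps its own resolution, and every stream receives the other streams resampled to its resolution (average pooling to go down, nearest-neighbour repetition to go up). This is a deliberately simplified sketch, not the paper's implementation: real HRNet uses strided and 1x1 convolutions for resampling and channel matching, and each stream has a different channel width; here all streams share one channel so the tensors can be summed directly.

```python
import numpy as np

def downsample(x, factor):
    """Average-pool an (H, W, C) feature map by an integer factor."""
    h, w, c = x.shape
    return x.reshape(h // factor, factor, w // factor, factor, c).mean(axis=(1, 3))

def upsample(x, factor):
    """Nearest-neighbour upsample an (H, W, C) feature map by an integer factor."""
    return x.repeat(factor, axis=0).repeat(factor, axis=1)

def exchange(streams):
    """One HRNet-style exchange step over parallel streams.

    Stream i is assumed to hold features at resolution (H / 2**i, W / 2**i).
    Each output stream is the sum of all input streams resampled to its
    own resolution, so every resolution sees information from every other.
    """
    fused = []
    for i, target in enumerate(streams):
        acc = target.copy()
        for j, other in enumerate(streams):
            if j == i:
                continue
            factor = 2 ** abs(i - j)
            acc += downsample(other, factor) if j < i else upsample(other, factor)
        fused.append(acc)
    return fused

# Two parallel streams: an 8x8 high-resolution map and a 4x4 low-resolution map.
high = np.ones((8, 8, 1))
low = np.full((4, 4, 1), 2.0)
fused_high, fused_low = exchange([high, low])
print(fused_high.shape, fused_low.shape)  # (8, 8, 1) (4, 4, 1)
```

Note that both output streams keep their original resolution: the high-resolution stream is never collapsed to low resolution, which is the design point distinguishing HRNet from serial encode-then-recover backbones.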
Benchmarks
| Benchmark | Methodology | Metrics |
|---|---|---|
| dichotomous-image-segmentation-on-dis-te1 | HRNet | E-measure: 0.797 HCE: 262 MAE: 0.088 S-Measure: 0.742 max F-Measure: 0.668 weighted F-measure: 0.579 |
| dichotomous-image-segmentation-on-dis-te2 | HRNet | E-measure: 0.840 HCE: 555 MAE: 0.087 S-Measure: 0.784 max F-Measure: 0.747 weighted F-measure: 0.664 |
| dichotomous-image-segmentation-on-dis-te3 | HRNet | E-measure: 0.869 HCE: 1049 MAE: 0.080 S-Measure: 0.805 max F-Measure: 0.784 weighted F-measure: 0.700 |
| dichotomous-image-segmentation-on-dis-te4 | HRNet | E-measure: 0.854 HCE: 3864 MAE: 0.092 S-Measure: 0.792 max F-Measure: 0.772 weighted F-measure: 0.687 |
| dichotomous-image-segmentation-on-dis-vd | HRNet | E-measure: 0.824 HCE: 1560 MAE: 0.095 S-Measure: 0.767 max F-Measure: 0.726 weighted F-measure: 0.641 |
| face-alignment-on-300w | HRNet | NME_inter-ocular (%, Challenge): 5.15 NME_inter-ocular (%, Common): 2.87 NME_inter-ocular (%, Full): 3.32 |
| face-alignment-on-cofw | HRNet | NME (inter-ocular): 3.45 |
| face-alignment-on-cofw-68 | HRNetV2-W18 | NME (inter-ocular): 5.06 |
| face-alignment-on-wflw | HRNet | NME (inter-ocular): 4.60 |
| instance-segmentation-on-bdd100k-val | HRNet | AP: 22.5 |
| instance-segmentation-on-coco-minival | HTC (HRNetV2p-W48) | mask AP: 41.0 |
| object-detection-on-coco | Mask R-CNN (HRNetV2p-W48 + cascade) | AP50: 64.0 AP75: 50.3 APL: 58.3 APM: 48.6 APS: 27.1 Hardware Burden: 15G Operations per network pass: 61.8G box mAP: 46.1 |
| object-detection-on-coco | Mask R-CNN (HRNetV2p-W32 + cascade) | AP50: 62.5 AP75: 48.6 APL: 56.3 Hardware Burden: 16G Operations per network pass: 50.6G |
| object-detection-on-coco | CenterNet (HRNetV2-W48) | AP75: 46.5 APL: 57.8 APS: 22.2 Hardware Burden: 16G Operations per network pass: 21.7G box mAP: 43.5 |
| object-detection-on-coco | FCOS (HRNetV2p-W48) | AP50: 59.3 APL: 51.0 APM: 42.6 APS: 23.4 Hardware Burden: 16G Operations per network pass: 27.3G box mAP: 40.5 |
| object-detection-on-coco | Faster R-CNN (HRNetV2p-W48) | AP50: 63.6 AP75: 46.4 APL: 53.0 APM: 44.6 APS: 24.9 Hardware Burden: 16G Operations per network pass: 20.8G box mAP: 42.4 |
| object-detection-on-coco | HTC (HRNetV2p-W48) | AP50: 65.9 AP75: 51.2 APL: 59.8 APM: 49.7 APS: 28.0 Hardware Burden: 15G Operations per network pass: 71.7G box mAP: 47.3 |
| object-detection-on-coco | Cascade R-CNN (HRNetV2p-W48) | AP75: 48.6 APL: 56.3 APM: 47.3 APS: 26.0 |
| object-detection-on-coco-minival | HTC (HRNetV2p-W48) | APL: 62.2 APM: 50.3 APS: 28.8 box AP: 47.0 |
| object-detection-on-coco-minival | Cascade R-CNN (HRNetV2p-W18) | AP50: 59.2 AP75: 44.9 APL: 54.1 APM: 44.2 APS: 23.7 box AP: 41.3 |
| object-detection-on-coco-minival | Mask R-CNN (HRNetV2p-W32) | APM: 45.4 APS: 25.0 box AP: 42.3 |
| object-detection-on-coco-minival | Mask R-CNN (HRNetV2p-W32, cascade) | APM: 47.9 APS: 26.1 |
| object-detection-on-coco-minival | Faster R-CNN (HRNetV2p-W18) | AP50: 58.9 AP75: 41.5 APL: 49.6 APM: 40.8 APS: 22.6 box AP: 38.0 |
| object-detection-on-coco-minival | Cascade R-CNN (HRNetV2p-W48) | AP50: 62.7 AP75: 48.7 APL: 58.5 APM: 48.1 APS: 26.3 box AP: 44.6 |
| object-detection-on-coco-minival | Faster R-CNN (HRNetV2p-W48) | AP50: 62.8 AP75: 45.9 APL: 54.6 APM: 44.7 box AP: 41.8 |
| object-detection-on-coco-minival | Mask R-CNN (HRNetV2p-W48, cascade) | APL: 60.1 APS: 27.5 box AP: 46.0 |
| object-detection-on-coco-minival | Mask R-CNN (HRNetV2p-W18) | APL: 51.0 APM: 41.7 box AP: 39.2 |
| object-detection-on-coco-minival | Faster R-CNN (HRNetV2p-W32) | AP50: 61.8 AP75: 44.8 APL: 53.3 APM: 43.7 APS: 24.4 box AP: 40.9 |
| object-detection-on-coco-minival | Cascade R-CNN (HRNetV2p-W32) | AP50: 61.7 AP75: 47.7 APL: 57.4 APM: 46.5 APS: 25.6 box AP: 43.7 |
| object-detection-on-coco-minival | HTC (HRNetV2p-W32) | APL: 59.5 APM: 48.4 APS: 27.0 box AP: 45.3 |
| object-detection-on-coco-minival | HTC (HRNetV2p-W18) | APM: 46.0 APS: 26.6 box AP: 43.1 |
| semantic-segmentation-on-cityscapes | HRNetV2 (train+val) | Mean IoU (class): 81.6% |
| semantic-segmentation-on-cityscapes-val | HRNetV2 (HRNetV2-W40) | mIoU: 80.2 |
| semantic-segmentation-on-cityscapes-val | HRNetV2 (HRNetV2-W48) | mIoU: 81.1 |
| semantic-segmentation-on-dada-seg | HRNet (ACDC) | mIoU: 27.5 |
| semantic-segmentation-on-pascal-context | HRNetV2 (HRNetV2-W48) | mIoU: 54.0 |
| semantic-segmentation-on-pascal-context | CFNet (ResNet-101) | mIoU: 54.0 |
| semantic-segmentation-on-potsdam-1 | HRNet-18 | mIoU: 84.02 |
| semantic-segmentation-on-potsdam-1 | HRNet-48 | mIoU: 84.22 |
| semantic-segmentation-on-us3d-1 | HRNet-18 | mIoU: 60.33 |
| semantic-segmentation-on-us3d-1 | HRNet-48 | mIoU: 72.66 |
| semantic-segmentation-on-vaihingen | HRNet-48 | mIoU: 76.75 |
| semantic-segmentation-on-vaihingen | HRNet-18 | mIoU: 75.90 |
| thermal-image-segmentation-on-mfn-dataset | HRNet | mIOU: 51.7 |