[Paper Summary] The Devil is in the Details: Delving into Unbiased Data Processing for Human Pose Estimation
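This paper identifies biased data processing in human pose estimation pipelines and proposes Unbiased Data Processing (UDP), which delivers significant, model-agnostic accuracy gains without adding latency, as demonstrated on COCO and CrowdPose.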
Being a fundamental component in training and inference, data processing has not, to the best of our knowledge, been systematically considered in the human pose estimation community. In this paper, we focus on this problem and find that the devil of human pose estimation evolution is in the biased data processing. Specifically, by investigating the standard data processing in state-of-the-art approaches, mainly coordinate system transformation and keypoint format transformation (i.e., encoding and decoding), we find that the results obtained by the common flipping strategy are unaligned with the original ones in inference. Moreover, there is a statistical error in some keypoint format transformation methods. The two problems couple together, significantly degrade pose estimation performance, and thus lay a trap for the research community. This trap has given birth to many suboptimal remedies, which are always unreported, confusing but influential. By causing failures in reproduction and unfairness in comparison, the unreported remedies seriously impede technological development. To tackle this dilemma at the source, we propose Unbiased Data Processing (UDP), consisting of two technical aspects for the two aforementioned problems respectively (i.e., unbiased coordinate system transformation and unbiased keypoint format transformation). As a model-agnostic approach and a superior solution, UDP successfully pushes the performance boundary of human pose estimation and offers a higher and more reliable baseline for the research community. Code is publicly available at https://github.com/HuangJunJie2017/UDP-Pose
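Motivation and Objectives
- Explain the motivation for the work and the data processing biases that pose estimation systems have overlooked.
- Define unbiased coordinate system transformation and unbiased keypoint format transformation.
- Provide a model-agnostic UDP framework and analyze its effect on state-of-the-art methods.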
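Proposed Method
- Formalize coordinate system transformation by defining data in continuous space.
- Derive unbiased transformations for cropping, scaling, rotation, and flipping, and prove that the resulting pipeline is unbiased (Equations 2, 3, 9-13).
- Introduce an unbiased keypoint format transformation with unbiased encoding and decoding strategies (and their relationship to heatmaps).
- Diagnose how biased pipelines that measure image size in pixels cause misalignment, and demonstrate the correction.
- Evaluate UDP on COCO and CrowdPose to show accuracy and latency benefits for both top-down and bottom-up methods.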
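The flip misalignment mentioned in the list above can be illustrated with a small one-dimensional sketch. It assumes a biased pipeline that maps heatmap columns to input coordinates with the pixel-count ratio (for example 192 / 48), and an unbiased one that uses the unit-length ratio ((192 - 1) / (48 - 1)). The function name `flip_back_offset` and its parameters are hypothetical and are not taken from the paper's code.

```python
def flip_back_offset(w_in_px, w_hm_px, unbiased):
    """Return, for every heatmap column, the input-space gap between the
    position predicted from the flipped image (after flipping back) and
    the position predicted from the original image. 1-D sketch only."""
    if unbiased:
        # Assumption: size measured in unit length (pixel count minus one).
        scale = (w_in_px - 1) / (w_hm_px - 1)
    else:
        # Assumption: size measured in pixel count, as in the biased pipeline.
        scale = w_in_px / w_hm_px

    offsets = []
    for k in range(w_hm_px):
        x_orig = k * scale                  # input position of column k
        k_flipped = w_hm_px - 1 - k         # same content in the flipped heatmap
        x_in_flipped_image = k_flipped * scale
        # Flipping in input space mirrors around the image extent w_in_px - 1.
        x_back = (w_in_px - 1) - x_in_flipped_image
        offsets.append(x_back - x_orig)
    return offsets


biased = flip_back_offset(192, 48, unbiased=False)
fixed = flip_back_offset(192, 48, unbiased=True)
print(max(biased), max(abs(o) for o in fixed))
```

Under these assumptions the biased mapping leaves a constant gap of stride minus one input pixels between the original and the flipped-back prediction, while the unit-length mapping keeps the two aligned up to floating-point error.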
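Experimental Results
Research Questions
- RQ1: Which biases exist in the coordinate system transformations commonly used in pose estimation pipelines?
- RQ2: How can coordinate system and keypoint format transformations be designed to be unbiased?
- RQ3: Does UDP improve accuracy and/or speed on standard benchmarks across pose estimation paradigms (top-down and bottom-up)?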
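For the keypoint format side of RQ2, the following one-dimensional sketch contrasts a quantized encode/decode round trip with one that keeps the continuous coordinate. It is an illustration under stated assumptions, not the paper's exact encoder or decoder: the biased path rounds the keypoint before drawing the Gaussian and decodes with argmax, while the alternative draws the Gaussian at the continuous position and refines the argmax with a parabola fit on the log-heatmap. All function names are hypothetical.

```python
import math


def encode_gaussian(mu, width, sigma=2.0):
    """Sample a 1-D Gaussian heatmap centered at mu on integer positions."""
    return [math.exp(-((x - mu) ** 2) / (2 * sigma ** 2)) for x in range(width)]


def decode_argmax(heatmap):
    """Quantized decoding: return the index of the maximum response."""
    return float(max(range(len(heatmap)), key=heatmap.__getitem__))


def decode_log_parabola(heatmap):
    """Sub-pixel decoding: fit a parabola to the log-heatmap around the
    argmax. For an ideal sampled Gaussian the log is exactly quadratic,
    so the fit recovers the continuous center."""
    m = max(range(len(heatmap)), key=heatmap.__getitem__)
    if m == 0 or m == len(heatmap) - 1:
        return float(m)  # no neighbor on one side; fall back to argmax
    left, center, right = (math.log(heatmap[i]) for i in (m - 1, m, m + 1))
    return m + 0.5 * (left - right) / (left - 2 * center + right)


keypoint = 10.3
quantized = decode_argmax(encode_gaussian(round(keypoint), 48))
continuous = decode_log_parabola(encode_gaussian(keypoint, 48))
print(abs(quantized - keypoint), abs(continuous - keypoint))
```

In this idealized setting the quantized round trip loses the fractional part of the coordinate, whereas the continuous round trip returns it up to floating-point error. A real network's heatmaps are noisy, so the refinement would be approximate rather than exact.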
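Key Findings
- UDP improves the top-down SimpleBaseline from 70.2 to 71.7 AP (ResNet50-256×192) and from 71.9 to 72.9 AP (ResNet152-256×192).
- UDP improves HRNet W32-256×192 from 73.5 to 75.2 AP and W48-256×192 from 74.3 to 75.7 AP.
- With UDP, HRNet-W48-384×288 reaches 76.5 AP, setting a new state of the art for top-down pose estimation.
- For the bottom-up HRNet-W32-512×512, UDP yields a 2.7 AP gain and a 6.1× inference speedup; HigherHRNet also benefits, with lower latency.
- The CrowdPose experiments indicate that UDP generalizes beyond COCO.
- UDP served as the key baseline for the winning entry (UDP++) of the 2020 COCO Keypoint Detection Challenge.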
This summary was generated by AI and reviewed by human editors.