QUICK REVIEW

[Paper Review] Simple and Lightweight Human Pose Estimation

Zhe Zhang, Jie Tang | arXiv (Cornell University) | Nov 23, 2019

Human Pose and Action Recognition | References: 41 | Cited by: 37

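One-Sentence Summary

Summary: The paper proposes a Lightweight Pose Network (LPN) that combines a redesigned bottleneck block, an iterative training strategy, and Beta-Soft-Argmax post-processing to reach competitive COCO pose estimation results with a smaller model and faster CPU inference.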

ABSTRACT

Recent research on human pose estimation has achieved significant improvement. However, most existing methods tend to pursue higher scores using complex architecture or computationally expensive models on benchmark datasets, ignoring the deployment costs in practice. In this paper, we investigate the problem of simple and lightweight human pose estimation. We first redesign a lightweight bottleneck block with two non-novel concepts: depthwise convolution and attention mechanism. And then, based on the lightweight block, we present a Lightweight Pose Network (LPN) following the architecture design principles of SimpleBaseline. The model size (#Params) of our small network LPN-50 is only 9% of SimpleBaseline(ResNet50), and the computational complexity (FLOPs) is only 11%. To give full play to the potential of our LPN and get more accurate predicted results, we also propose an iterative training strategy and a model-agnostic post-processing function Beta-Soft-Argmax. We empirically demonstrate the effectiveness and efficiency of our methods on the benchmark dataset: the COCO keypoint detection dataset. Besides, we show the speed superiority of our lightweight network at inference time on a non-GPU platform. Specifically, our LPN-50 can achieve 68.7 in AP score on the COCO test-dev set, with only 2.7M parameters and 1.0 GFLOPs, while the inference speed is 17 FPS on an Intel i7-8700K CPU machine.

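Motivation and Goals

Argue that resource-constrained deployment environments need simple, lightweight human pose estimation (HPE) models.
Propose a lightweight bottleneck block and an overall architecture (LPN) that reduce parameter count and FLOPs while maintaining accuracy.
Propose training and post-processing strategies that maximize performance without large-scale pretraining or a complex pipeline.
Demonstrate LPN's efficiency and accuracy on the COCO dataset, including CPU inference performance.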

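Proposed Method

Redesign a lightweight bottleneck block using depthwise convolution and a global context (GC) attention block.
Build the Lightweight Pose Network (LPN) by replacing the standard bottlenecks in a SimpleBaseline-style backbone and simplifying the upsampling.
Introduce an iterative training strategy that better optimizes small networks by restarting training after resetting the learning rate.
Propose Beta-Soft-Argmax as a model-agnostic post-processing step that extracts continuous, more accurate keypoint coordinates from heatmaps.
Evaluate on COCO: compare parameter count, FLOPs, AP, and CPU inference speed against baseline architectures.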
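
To illustrate why swapping a standard 3x3 convolution for a depthwise one shrinks the bottleneck block, the following minimal sketch counts weights and multiply-accumulates for both variants. The channel width (64) and feature-map size (64 x 48) are hypothetical values chosen for illustration, and the helper names are not from the paper; the sketch does not reproduce the 9% and 11% figures, which depend on the full network.

```python
def conv_params(c_in, c_out, k, groups=1):
    """Weight count of a k x k convolution (bias omitted)."""
    assert c_in % groups == 0 and c_out % groups == 0
    return (c_in // groups) * c_out * k * k


def conv_macs(c_in, c_out, k, out_h, out_w, groups=1):
    """Multiply-accumulates for one forward pass over an out_h x out_w output map."""
    return conv_params(c_in, c_out, k, groups) * out_h * out_w


channels = 64            # hypothetical bottleneck width
out_h, out_w = 64, 48    # hypothetical feature-map size

# Standard 3x3 convolution: every output channel sees every input channel.
standard = conv_params(channels, channels, 3)
# Depthwise 3x3 convolution: groups == channels, one filter per channel.
depthwise = conv_params(channels, channels, 3, groups=channels)

print("standard params: ", standard)
print("depthwise params:", depthwise)
print("reduction factor:", standard // depthwise)
print("depthwise MACs:  ", conv_macs(channels, channels, 3, out_h, out_w, groups=channels))
```

For a layer with equal input and output widths, the reduction factor equals the channel count, which is why the 3x3 layer stops dominating the cost of the block once it is made depthwise.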

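Experimental Results

Research Questions

RQ1: Can a lightweight bottleneck block that combines depthwise convolution with GC attention maintain pose estimation performance while substantially reducing model size and computation?
RQ2: Does the iterative training strategy improve small-network performance more than conventional pretraining on a large dataset?
RQ3: Does Beta-Soft-Argmax improve keypoint localization accuracy across different backbones without changing the training procedure?
RQ4: How does LPN compare with state-of-the-art methods on COCO in terms of accuracy and CPU inference speed?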
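
The iterative training strategy asked about in RQ2 can be pictured as a learning-rate schedule that restarts at the beginning of every stage. The sketch below is one possible reading, assuming each stage continues from the previous stage's weights and uses step decay; the stage length, base learning rate, milestones, and decay factor are hypothetical placeholders, not values taken from the paper.

```python
def iterative_lr(epoch, stage_epochs=150, base_lr=1e-3, milestones=(100, 130), gamma=0.1):
    """Learning rate at a global epoch index when the schedule restarts every stage.

    All default values are hypothetical and only serve the illustration.
    """
    # Position inside the current stage; it wraps to 0 when a new stage starts.
    epoch_in_stage = epoch % stage_epochs
    # Number of step decays already applied within this stage.
    drops = sum(1 for m in milestones if epoch_in_stage >= m)
    return base_lr * gamma ** drops


# The rate decays within a stage and jumps back to base_lr at epoch 150.
for epoch in (0, 100, 130, 149, 150, 250):
    print(epoch, iterative_lr(epoch))
```

In a real training loop, the model weights would be kept across the restart and only the optimizer's learning rate would be reset.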

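Key Findings

LPN-50 reaches 68.7 to 69.1 AP across the validation/test settings, with only 2.7 to 2.9M parameters, about 1.0 GFLOPs, and 17 FPS on CPU.
Compared with SimpleBaseline-50, LPN-50 has only 9% of the parameters and 11% of the FLOPs, with an AP gap of about 1.3.
Adding the GC block to small networks yields a notable gain (for example, up to +2.5 AP for LPN-50).
The iterative training strategy consistently improves AP; across training stages, the cumulative gain for LPN-50 is about 2.0 AP.
Beta-Soft-Argmax gives a model-agnostic improvement (roughly 0.3 AP) and remains effective across different backbones at a beta value of about 160.
Beta-Soft-Argmax outperforms plain Argmax on multiple architectures, and the gain grows as backbone complexity increases.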
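
The difference between plain Argmax and Beta-Soft-Argmax can be sketched as follows. This assumes the common soft-argmax form, in which the heatmap is multiplied by beta, passed through a softmax, and used to weight the pixel coordinates; the function names and the toy heatmap are illustrative and not taken from the paper's code. Pure Python is used to keep the sketch dependency-free; a practical version would be vectorized with NumPy or PyTorch.

```python
import math


def beta_soft_argmax(heatmap, beta=160.0):
    """Return (x, y) as the softmax-weighted mean of pixel coordinates.

    heatmap: list of rows (H x W) of floats for a single keypoint.
    beta: sharpening factor; larger values approach plain argmax.
    """
    peak = max(max(row) for row in heatmap)
    total = x_sum = y_sum = 0.0
    for y, row in enumerate(heatmap):
        for x, value in enumerate(row):
            # Subtract the peak before exponentiating for numerical stability.
            weight = math.exp(beta * (value - peak))
            total += weight
            x_sum += weight * x
            y_sum += weight * y
    return x_sum / total, y_sum / total


def hard_argmax(heatmap):
    """Return the integer (x, y) of the first largest heatmap value."""
    cells = ((value, x, y) for y, row in enumerate(heatmap) for x, value in enumerate(row))
    _, x, y = max(cells, key=lambda cell: cell[0])
    return x, y


# Toy heatmap whose response is split evenly between two neighbouring pixels.
heatmap = [[0.0] * 8 for _ in range(6)]
heatmap[2][3] = 1.0
heatmap[2][4] = 1.0

print("argmax:          ", hard_argmax(heatmap))
print("beta-soft-argmax:", beta_soft_argmax(heatmap))
```

On this toy input, plain argmax has to pick one of the two pixels, while the soft version lands midway between them at x of about 3.5, which is the kind of sub-pixel estimate that integer argmax cannot express.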

This review was generated by AI and reviewed by a human editor.