QUICK REVIEW

[Paper Review] 3D Hand Pose Tracking and Estimation Using Stereo Matching

Jiawei Zhang, Jianbo Jiao|arXiv (Cornell University)|Oct 23, 2016

Advanced Vision and Imaging | References: 34 | Cited by: 122

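One-Sentence Summary

Proposes a passive-stereo framework for 3D hand pose tracking and estimation, with a dedicated online skin color model and constrained stereo matching, together with a hand pose benchmark of 18,000 stereo image pairs and 18,000 depth images.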

ABSTRACT

3D hand pose tracking/estimation will be very important in the next generation of human-computer interaction. Most of the currently available algorithms rely on low-cost active depth sensors. However, these sensors can be easily interfered by other active sources and require relatively high power consumption. As a result, they are currently not suitable for outdoor environments and mobile devices. This paper aims at tracking/estimating hand poses using passive stereo which avoids these limitations. A benchmark with 18,000 stereo image pairs and 18,000 depth images captured from different scenarios and the ground-truth 3D positions of palm and finger joints (obtained from the manual label) is thus proposed. This paper demonstrates that the performance of the state-of-the-art tracking/estimation algorithms can be maintained with most stereo matching algorithms on the proposed benchmark, as long as the hand segmentation is correct. As a result, a novel stereo-based hand segmentation algorithm specially designed for hand tracking/estimation is proposed. The quantitative evaluation demonstrates that the proposed algorithm is suitable for the state-of-the-art hand pose tracking/estimation algorithms and the tracking quality is comparable to the use of active depth sensors under different challenging scenarios.

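Motivation and Objectives

Advance outdoor- and mobile-friendly hand tracking by using passive stereo instead of active depth sensors.
Evaluate state-of-the-art hand tracking/estimation methods under passive stereo on a new benchmark dataset.
Develop a stereo-based segmentation method for robust hand tracking and estimation.
Propose an online-trained skin color model for reliable hand segmentation from color images.
Show that passive stereo can match the tracking/estimation performance of active sensors under challenging conditions.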

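Proposed Method

Run online foreground/background segmentation with an adaptive GMM to train a scene-specific skin color model for hand detection.
Compute skin color probability and hand likelihood, and combine them with depth information from the previous frame for robust hand segmentation.
Evaluate multiple stereo matching methods (local and global) with different matching costs and aggregation schemes to establish baseline performance.
Introduce a constrained stereo matching framework that uses skin color guidance and cost confidence to stabilize disparities near the hand region.
Adjust the stereo matching cost with an intermediate depth map to reduce background noise and improve hand depth estimation.
Integrate state-of-the-art hand pose tracking/estimation algorithms (PSO, ICPPSO, CHPR) to evaluate performance under passive stereo.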
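
The first two steps of the method (an online skin color model, then segmentation gated by the previous frame's depth) can be sketched roughly as below. This is a minimal illustration, not the paper's implementation: it swaps the adaptive GMM for a single running Gaussian over normalized chromaticity, and the class `OnlineSkinModel`, the function `segment_hand`, and the parameters `learning_rate`, `depth_tolerance`, and `min_probability` are all hypothetical names and values chosen for the example.

```python
import math


class OnlineSkinModel:
    """Running Gaussian over normalized (r, g) chromaticity.

    A one-component stand-in for an adaptive GMM, for illustration only.
    """

    def __init__(self, learning_rate=0.05, initial_variance=0.01):
        self.learning_rate = learning_rate
        self.mean = None  # set from the first training sample
        self.variance = [initial_variance, initial_variance]

    @staticmethod
    def chromaticity(rgb):
        """Map an (R, G, B) triple to brightness-normalized (r, g)."""
        red, green, blue = rgb
        total = red + green + blue
        if total == 0:
            return (1.0 / 3.0, 1.0 / 3.0)
        return (red / total, green / total)

    def update(self, rgb):
        """Blend one foreground (assumed skin) pixel into the model."""
        sample = self.chromaticity(rgb)
        if self.mean is None:
            self.mean = list(sample)
            return
        for i in range(2):
            delta = sample[i] - self.mean[i]
            self.mean[i] += self.learning_rate * delta
            blended = ((1.0 - self.learning_rate) * self.variance[i]
                       + self.learning_rate * delta * delta)
            self.variance[i] = max(blended, 1e-4)  # keep variance positive

    def skin_probability(self, rgb):
        """Unnormalized Gaussian score in [0, 1]; 1.0 at the model mean."""
        if self.mean is None:
            return 0.0
        sample = self.chromaticity(rgb)
        distance = sum((sample[i] - self.mean[i]) ** 2 / self.variance[i]
                       for i in range(2))
        return math.exp(-0.5 * distance)


def segment_hand(pixels, previous_depths, model, hand_depth,
                 depth_tolerance=150.0, min_probability=0.5):
    """Label a pixel as hand if it looks like skin AND its depth in the
    previous frame was close to the previous hand depth (assumed rule)."""
    mask = []
    for rgb, depth in zip(pixels, previous_depths):
        is_skin = model.skin_probability(rgb) >= min_probability
        is_near = abs(depth - hand_depth) <= depth_tolerance
        mask.append(is_skin and is_near)
    return mask
```

In this sketch the depth gate is what rejects skin-colored background (a face or a wooden desk, for instance): a pixel with the right color but the wrong previous-frame depth is dropped. The units of `hand_depth` and `depth_tolerance` are whatever the depth map uses; millimetres are assumed in the defaults.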

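Experimental Results

Research Questions

RQ1: Given accurate hand segmentation, how does passive stereo compare with active depth sensors for 3D hand pose tracking/estimation?
RQ2: Can a stereo segmentation method guided by an online skin color model achieve pose accuracy comparable to active sensors across different backgrounds and poses?
RQ3: How do different stereo matching costs, aggregation schemes, and disparity optimization affect hand pose tracking/estimation accuracy under passive stereo?
RQ4: Can a constrained stereo method tailored to hand tracking improve robustness in textureless regions or challenging indoor scenes?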
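
For readers unfamiliar with the stages named in RQ3, the following is a small sketch of a local stereo matcher on a single rectified scanline: a per-pixel matching cost, window aggregation, and winner-take-all disparity selection. It is a generic textbook baseline under stated assumptions (absolute-difference cost, box window, no global optimization), not any of the specific matchers evaluated in the paper, and `match_scanline` with its parameters is a hypothetical name.

```python
def match_scanline(left, right, max_disparity, radius=1):
    """Estimate a disparity per pixel for one rectified scanline.

    left, right: lists of pixel intensities from the same image row.
    A left pixel at x is compared with the right pixel at x - d.
    """
    width = len(left)
    invalid = float("inf")

    # 1) Matching cost: absolute intensity difference, indexed cost[d][x].
    cost = [
        [abs(left[x] - right[x - d]) if x - d >= 0 else invalid
         for x in range(width)]
        for d in range(max_disparity + 1)
    ]

    # 2) Aggregation: mean of the valid costs in a box window around x.
    aggregated = []
    for d in range(max_disparity + 1):
        row = []
        for x in range(width):
            window = cost[d][max(0, x - radius):min(width, x + radius + 1)]
            valid = [c for c in window if c != invalid]
            row.append(sum(valid) / len(valid) if valid else invalid)
        aggregated.append(row)

    # 3) Disparity selection: winner-take-all over the aggregated costs.
    return [
        min(range(max_disparity + 1), key=lambda d: aggregated[d][x])
        for x in range(width)
    ]
```

Changing step 1 (for example to a census or gradient cost), step 2 (for example to an edge-aware window), or replacing step 3 with a global optimizer corresponds to the three axes RQ3 varies.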

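Key Findings

Introduces a hand pose benchmark of 18,000 stereo image pairs and 18,000 depth images, with ground-truth 3D joint positions.
When hand segmentation is correct, most stereo matching methods achieve tracking/estimation performance comparable to active depth sensors.
The constrained stereo matching algorithm, using skin color guidance and cost confidence, improves hand tracking robustness in textureless regions.
The proposed stereo-based method with online skin color training reaches tracking accuracy close to that of active sensors across six backgrounds and two pose types.
Random forest/CHPR-based estimation achieves competitive joint accuracy when aided by the proposed stereo segmentation.
MeshStereo performs poorly on this task, while the robustness of PSO and ICPPSO varies with background and segmentation quality.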
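
Accuracy comparisons like those above rest on comparing predicted joints with the benchmark's ground-truth 3D joint positions. One common way to do this, assumed here rather than confirmed as the paper's exact metric, is the mean Euclidean distance over joints. The function name `mean_joint_error` is hypothetical.

```python
import math


def mean_joint_error(predicted, ground_truth):
    """Average Euclidean distance between matching 3D joints.

    predicted, ground_truth: equal-length sequences of (x, y, z) tuples,
    in the same units and the same joint order.
    """
    if len(predicted) != len(ground_truth):
        raise ValueError("joint lists must have the same length")
    if not predicted:
        raise ValueError("at least one joint is required")
    total = sum(math.dist(p, g) for p, g in zip(predicted, ground_truth))
    return total / len(predicted)
```

The result is in the units of the input coordinates, so joints given in millimetres yield an error in millimetres.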


This review was generated by AI and checked by a human editor.