QUICK REVIEW

[Paper Review] WaveMan: mmWave-Based Room-Scale Human Interaction Perception for Humanoid Robots

Yuxuan Hu, Kuangji Zuo | arXiv (Cornell University) | Jan 12, 2026

Indoor and Outdoor Localization Technologies | Cited by 0

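ONE-SENTENCE SUMMARY

WaveMan is a spatially adaptive mmWave perception system that combines geometric alignment, spectrogram enhancement, and attention-based fusion to recognize gestures robustly at unconstrained user positions.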

ABSTRACT

Reliable humanoid-robot interaction (HRI) in household environments is constrained by two fundamental requirements, namely robustness to unconstrained user positions and preservation of user privacy. Millimeter-wave (mmWave) sensing inherently supports privacy-preserving interaction, making it a promising modality for room-scale HRI. However, existing mmWave-based interaction-sensing systems exhibit poor spatial generalization at unseen distances or viewpoints. To address this challenge, we introduce WaveMan, a spatially adaptive room-scale perception system that restores reliable human interaction sensing across arbitrary user positions. WaveMan integrates viewpoint alignment and spectrogram enhancement for spatial consistency, with dual-channel attention for robust feature extraction. Experiments across five participants show that, under fixed-position evaluation, WaveMan achieves the same cross-position accuracy as the baseline with five times fewer training positions. In random free-position testing, accuracy increases from 33.00% to 94.33%, enabled by the proposed method. These results demonstrate the feasibility of reliable, privacy-preserving interaction for household humanoid robots across unconstrained user positions.

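RESEARCH MOTIVATION AND GOALS

Enable reliable, privacy-preserving humanoid-robot interaction in indoor spaces where user positions vary widely.
Develop a spatially adaptive perception pipeline that aligns observations from different viewpoints into a unified space.
Mitigate long-range sparsity and viewpoint-induced distortion through spectrogram enhancement and multi-domain representations.
Fuse geometric, spectral, and direction-aware features with an attention-based recognition network.
Demonstrate a real-world humanoid-robot deployment, evaluated at fixed, unseen, and random positions.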

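PROPOSED METHOD

Geometric alignment of radar point clouds to a canonical front-facing configuration, reducing viewpoint-induced distortion.
Unsupervised spectrogram enhancement: an Enhancer–Reducer pair trained with CycleGAN-style losses converts sparse long-range spectrograms into dense representations.
A dual-branch channel attention (DBCA) module that reweights feature channels for robust cross-position recognition.
Multi-domain spectrogram construction (RT, DT, HT, ET, XT/YT/ZT) that captures range, velocity, direction, and spatial-displacement dynamics.
A real-time perception pipeline running on an external workstation; recognized gestures are sent over UDP and mapped to humanoid-robot behaviors.
Online stream processing with a perception–action loop for stable interaction at unconstrained user positions.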
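
The multi-domain spectrogram construction listed above can be illustrated with a minimal sketch. It assumes each radar frame is an (N, 4) array of x, y, z, and radial velocity, and it builds only range-time (RT) and Doppler-time (DT) maps by per-frame histogramming. The function name `build_time_maps`, the bin count, and the range and speed limits are hypothetical choices for illustration; the paper's actual construction may differ.

```python
import numpy as np

def build_time_maps(frames, n_bins=32, max_range=6.0, max_speed=3.0):
    """Stack per-frame histograms into range-time and Doppler-time maps.

    frames: list of (N_i, 4) arrays with columns x, y, z, radial velocity
            (an assumed layout, not taken from the paper).
    Returns two float arrays of shape (n_bins, len(frames)).
    """
    rt = np.zeros((n_bins, len(frames)))
    dt = np.zeros((n_bins, len(frames)))
    for t, pts in enumerate(frames):
        pts = np.asarray(pts, dtype=float).reshape(-1, 4)
        # Distance of every point from the radar, assumed to sit at the origin.
        rng = np.linalg.norm(pts[:, :3], axis=1)
        rt[:, t], _ = np.histogram(rng, bins=n_bins, range=(0.0, max_range))
        dt[:, t], _ = np.histogram(pts[:, 3], bins=n_bins,
                                   range=(-max_speed, max_speed))
    return rt, dt
```

Each column of the returned maps is one frame, so a gesture shows up as a trace moving across range or velocity bins over time. A distant user yields fewer points per frame and therefore sparser columns, which is the long-range sparsity the enhancement stage is meant to address.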

Figure 1: Spatially adaptive room-scale interaction scenario. WaveMan aligns observations from different user spatial positions into a unified perception space to mitigate spatial inconsistencies.
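
The idea of aligning observations into a unified perception space might be sketched as a rotation of the point cloud about the vertical axis so that the user's centroid lies on the radar boresight. This is only an illustration under stated assumptions: the radar sits at the origin, +y is boresight, +z is up, and the helper `align_to_front` and its `ref_distance` parameter are hypothetical names. It corrects the azimuth offset (and optionally the distance) only, not the user's body orientation, and it should not be read as the authors' alignment procedure.

```python
import numpy as np

def align_to_front(points, ref_distance=None):
    """Rotate a point cloud about the z axis so its centroid lies on +y.

    points: (N, 3) array of x, y, z in the radar frame (assumed convention:
            radar at the origin, +y boresight, +z up).
    ref_distance: if given, also shift along y so the centroid sits at this
            horizontal distance from the radar.
    """
    pts = np.asarray(points, dtype=float)
    cx, cy = pts[:, :2].mean(axis=0)
    # Angle of the centroid away from +y, positive toward +x.
    azimuth = np.arctan2(cx, cy)
    c, s = np.cos(azimuth), np.sin(azimuth)
    # Rotation that maps the direction (sin a, cos a) onto (0, 1).
    rot = np.array([[c, -s, 0.0],
                    [s,  c, 0.0],
                    [0.0, 0.0, 1.0]])
    aligned = pts @ rot.T
    if ref_distance is not None:
        aligned[:, 1] += ref_distance - np.hypot(cx, cy)
    return aligned
```

Because the transform is a rigid rotation plus an optional shift, distances between points are preserved; only the position of the cloud relative to the radar changes.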

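EXPERIMENTAL RESULTS

Research Questions

RQ1: How can mmWave-based perception be made robust to varying user positions and viewpoints in room-scale humanoid-robot interaction?
RQ2: Can geometric alignment, spectrogram enhancement, and attention-based fusion achieve reliable gesture recognition under unseen spatial configurations?
RQ3: How do spatial alignment and spectrogram enhancement affect cross-position generalization and random-position (free-viewpoint) gesture recognition?
RQ4: Can the proposed system run in real time, as required for closed-loop perception–action with a humanoid robot?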

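Key Findings

WaveMan substantially improves accuracy at unseen positions: with only one training position it reaches 80.35%, versus 60.57% for the baseline.
With five training positions, WaveMan reaches 97.67% accuracy at unseen positions, while the baseline stays below 80% throughout.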
In random free-position testing, WaveMan reaches 94.33% accuracy, up from the baseline's 33.00%, a gain of 61.33 percentage points.
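The cross-position gains hold across multiple configurations, indicating strong generalization to unseen viewpoints and distances.
The end-to-end enhancement–recognition pipeline takes about 5.45 ms per sample, enabling real-time human-robot interaction on typical hardware.
The dataset comprises 12,000 samples collected in a room-scale indoor environment from five participants performing five gesture classes.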

Figure 2: Overview of the proposed spatially adaptive interaction framework. (a) Radar point-cloud data captured under diverse positional configurations are spatially aligned and transformed into spectrogram representations. (b) Sparse spectrograms are enhanced and fused with dense spectra to obtain

This review was generated by AI and reviewed by a human editor.