QUICK REVIEW

[Paper Review] WaveMan: mmWave-Based Room-Scale Human Interaction Perception for Humanoid Robots

Yuxuan Hu, Kuangji Zuo|arXiv (Cornell University)|Jan 12, 2026

Indoor and Outdoor Localization Technologies|Citations: 0

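One-Sentence Summary

WaveMan provides spatially adaptive mmWave perception for room-scale humanoid-robot interaction, combining geometric alignment, spectrogram enhancement, and attention-based fusion to achieve robust gesture recognition at unconstrained user positions.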

ABSTRACT

Reliable humanoid-robot interaction (HRI) in household environments is constrained by two fundamental requirements, namely robustness to unconstrained user positions and preservation of user privacy. Millimeter-wave (mmWave) sensing inherently supports privacy-preserving interaction, making it a promising modality for room-scale HRI. However, existing mmWave-based interaction-sensing systems exhibit poor spatial generalization at unseen distances or viewpoints. To address this challenge, we introduce WaveMan, a spatially adaptive room-scale perception system that restores reliable human interaction sensing across arbitrary user positions. WaveMan integrates viewpoint alignment and spectrogram enhancement for spatial consistency, with dual-channel attention for robust feature extraction. Experiments across five participants show that, under fixed-position evaluation, WaveMan achieves the same cross-position accuracy as the baseline with five times fewer training positions. In random free-position testing, accuracy increases from 33.00% to 94.33%, enabled by the proposed method. These results demonstrate the feasibility of reliable, privacy-preserving interaction for household humanoid robots across unconstrained user positions.

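Motivation and Objectives

Enable reliable, privacy-preserving humanoid-robot interaction in indoor spaces even when the user's position varies widely.
Develop a spatially adaptive perception pipeline that aligns observations from different viewpoints into a unified space.
Mitigate long-range sparsity and viewpoint-induced distortion through spectrogram enhancement and multi-domain representations.
Fuse geometric, spectral, and direction-aware features in an attention-based recognition network.
Demonstrate real-world deployment on a humanoid robot, with evaluation at fixed, unseen, and random positions.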

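Proposed Method

Geometric alignment of radar point clouds to a canonical front-facing configuration, reducing viewpoint-induced distortion.
Unsupervised spectrogram enhancement using an Enhancer–Reducer pair trained with a CycleGAN-style loss, which translates sparse long-range spectrograms into dense representations.
A Dual-Branch Channel Attention (DBCA) module that reweights feature channels for robust recognition across positions.
Multi-domain spectrogram construction (RT, DT, HT, ET, XT/YT/ZT) that captures range, velocity, direction, and spatial-displacement dynamics.
A real-time perception pipeline running on an external workstation, with recognized gestures mapped to humanoid-robot behaviors and sent over UDP.
Online streaming processing and a perception-action loop that keep interaction stable at unconstrained user positions.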
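The gesture-to-behavior link over UDP can be pictured with a minimal sketch. The review does not give the message format, the gesture labels, or the robot-side interface, so the JSON payload, the `GESTURE_TO_BEHAVIOR` table, and the function names below are all hypothetical and chosen only for illustration.

```python
import json
import socket

# Hypothetical gesture-to-behavior table; the labels and behavior names
# are illustrative and are not taken from the paper.
GESTURE_TO_BEHAVIOR = {
    "wave": "greet",
    "push": "step_back",
    "pull": "approach",
}


def encode_command(gesture, confidence):
    """Serialize one recognized gesture as a small JSON datagram (assumed format)."""
    behavior = GESTURE_TO_BEHAVIOR.get(gesture, "idle")  # unknown gestures map to idle
    message = {
        "gesture": gesture,
        "behavior": behavior,
        "confidence": round(confidence, 3),
    }
    return json.dumps(message).encode("utf-8")


def send_command(sock, robot_addr, gesture, confidence):
    """Send one datagram; UDP is connectionless, so delivery is not guaranteed."""
    return sock.sendto(encode_command(gesture, confidence), robot_addr)


def demo_loopback():
    """Loopback demo: a local socket stands in for the robot-side listener."""
    robot = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    robot.bind(("127.0.0.1", 0))  # port 0 lets the OS pick a free port
    robot.settimeout(1.0)
    workstation = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        send_command(workstation, robot.getsockname(), "wave", 0.97)
        payload, _ = robot.recvfrom(1024)
        return json.loads(payload)["behavior"]
    finally:
        workstation.close()
        robot.close()
```

Calling `demo_loopback()` sends one datagram to a local socket and returns the decoded behavior name. A datagram transport suits a streaming loop because a stale command is usually better dropped than retransmitted, though whether the authors chose UDP for that reason is not stated here.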

Figure 1: Spatially adaptive room-scale interaction scenario. WaveMan aligns observations from different user spatial positions into a unified perception space to mitigate spatial inconsistencies.
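The geometric alignment step can be illustrated with a small sketch. It assumes the radar sits at the origin with its boresight along +y, that an estimate of the user's floor-plane position is available, and that the canonical pose is directly in front of the radar at a fixed reference distance. None of these details are given in this review, and `align_to_canonical`, `user_xy`, and `ref_distance` are hypothetical names.

```python
import math


def align_to_canonical(points, user_xy, ref_distance=2.0):
    """Rotate and shift radar points so the user sits on the +y axis.

    points: iterable of (x, y, z) in the radar frame, in metres.
    user_xy: estimated user position (x, y) on the floor plane.
    ref_distance: assumed canonical distance; an illustrative value only.
    """
    ux, uy = user_xy
    # Angle of the user measured from the +y axis (assumed boresight).
    azimuth = math.atan2(ux, uy)
    cos_a, sin_a = math.cos(azimuth), math.sin(azimuth)
    # Offset along y that brings the user to the reference distance.
    shift = math.hypot(ux, uy) - ref_distance
    aligned = []
    for x, y, z in points:
        # Counter-clockwise rotation about the vertical axis by `azimuth`,
        # which moves the user's position onto the +y axis.
        xr = cos_a * x - sin_a * y
        yr = sin_a * x + cos_a * y
        aligned.append((xr, yr - shift, z))
    return aligned
```

For example, a point at the user's own position `(1.0, 1.0, 0.5)` with `user_xy=(1.0, 1.0)` lands at approximately `(0.0, 2.0, 0.5)`. This sketch normalizes position only; it does not correct for the user's body orientation, which a full viewpoint alignment would also need to handle.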

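Experimental Results

Research Questions

RQ1: How robust can mmWave-based sensing be made to varying user positions and viewpoints in room-scale humanoid-robot interaction?
RQ2: Do geometric alignment, spectrogram enhancement, and attention-based fusion yield reliable gesture recognition in unseen spatial configurations?
RQ3: How do spatial alignment and spectral enhancement affect cross-position generalization and random-position (free-viewpoint) gesture recognition?
RQ4: Can the proposed system run a closed perception-action loop with a humanoid robot in real time?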

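Key Findings

Accuracy at unseen positions improves substantially with WaveMan: with a single training position, for example, it reaches 80.35% versus 60.57% for the baseline.
With five training positions, WaveMan's unseen-position accuracy reaches 97.67%, while the baseline stays below 80%.
In random free-position testing, WaveMan achieves 94.33% accuracy, up 60.82 percentage points from the baseline's 33.51% (the abstract reports the baseline as 33.00%).
Cross-position performance holds across multiple configurations, indicating strong generalization to unseen viewpoints and distances.
The end-to-end Enhancer–Recognition pipeline runs in about 5.45 ms per sample, supporting real-time human-robot interaction on ordinary hardware.
The dataset comprises 12,000 samples collected from five participants performing five gesture classes in a room-scale indoor setting.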
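As a quick sanity check on the figures above, the reported gain and the throughput implied by the per-sample latency can be recomputed directly. The chance level assumes the five gesture classes are balanced, which this review does not state.

```python
waveman_free = 94.33   # % accuracy, random free-position test (reported)
baseline_free = 33.51  # % accuracy, baseline figure as given in the findings
latency_ms = 5.45      # reported end-to-end latency per sample
num_classes = 5        # gesture classes in the dataset

gain_points = round(waveman_free - baseline_free, 2)
throughput = 1000.0 / latency_ms   # samples per second implied by the latency
chance_level = 100.0 / num_classes  # assumes balanced classes

print(gain_points)           # 60.82
print(round(throughput, 1))  # 183.5
print(chance_level)          # 20.0
```

On these numbers the baseline sits only about 13.5 points above the assumed chance level in free-position testing, and the latency corresponds to roughly 183 samples per second.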

Figure 2: Overview of the proposed spatially adaptive interaction framework. (a) Radar point-cloud data captured under diverse positional configurations are spatially aligned and transformed into spectrogram representations. (b) Sparse spectrograms are enhanced and fused with dense spectra to obtain

This review was generated by AI and checked by a human editor.