[Paper Review] Gesture Recognition from Body-Worn RFID under Missing Data
The paper presents a gesture recognition system based on body-worn passive RFID tags. It handles missing data through interpolation, imputation, and a graph-attention CNN, and reports 98.13% accuracy on 21 gestures and 89.28% under leave-one-person-out cross-validation.
We explore hand-gesture recognition through the use of passive body-worn reflective tags. A data processing pipeline is proposed to address the issue of missing data. Specifically, missing information is recovered through linear and exponential interpolation and extrapolation. Furthermore, imputation and proximity-based inference are employed. We represent tags as nodes in a temporal graph, with edges formed based on correlations between received signal strength (RSS) and phase values across successive timestamps, and we train a graph-based convolutional neural network that exploits graph-based self-attention. The system outperforms state-of-the-art methods with an accuracy of 98.13% for the recognition of 21 gestures. We achieve 89.28% accuracy under leave-one-person-out cross-validation. We further investigate the contribution of various body locations on the recognition accuracy. Removing tags from the arms reduces accuracy by more than 10%, while removing the wrist tag only reduces accuracy by around 2%. Therefore, tag placements on the arms are more expressive for gesture recognition than on the wrist.
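As a rough illustration of the missing-data recovery step described in the abstract, the sketch below fills gaps in a single tag's RSS series by linear interpolation (interior gaps) and linear extrapolation (leading and trailing gaps). It is a minimal sketch under stated assumptions, not the authors' pipeline: the function name `fill_linear`, the use of `None` to mark a missed read, and the zero fallback for a fully missing series are all hypothetical choices, and the exponential variant mentioned in the paper is not shown.

```python
def fill_linear(values):
    """Fill None entries in a 1-D series (hypothetical helper, not from the paper).

    Interior gaps are linearly interpolated between the nearest observed
    neighbours; leading/trailing gaps are linearly extrapolated from the
    two nearest observed samples.
    """
    known = [i for i, v in enumerate(values) if v is not None]
    if not known:
        # Assumption: nothing was observed, so fall back to zero-padding.
        return [0.0] * len(values)
    if len(known) == 1:
        # Assumption: a single observation is held constant.
        return [float(values[known[0]])] * len(values)

    out = [None if v is None else float(v) for v in values]

    # Interior gaps: straight line between consecutive observed samples.
    for a, b in zip(known, known[1:]):
        step = (out[b] - out[a]) / (b - a)
        for i in range(a + 1, b):
            out[i] = out[a] + step * (i - a)

    # Leading gap: extend the slope of the first two observed samples backwards.
    first, second = known[0], known[1]
    slope = (out[second] - out[first]) / (second - first)
    for i in range(first):
        out[i] = out[first] - slope * (first - i)

    # Trailing gap: extend the slope of the last two observed samples forwards.
    prev, last = known[-2], known[-1]
    slope = (out[last] - out[prev]) / (last - prev)
    for i in range(last + 1, len(out)):
        out[i] = out[last] + slope * (i - last)

    return out


# Example: RSS in dBm with three missed reads.
rss = [None, -60.0, None, -56.0, None]
print(fill_linear(rss))  # [-62.0, -60.0, -58.0, -56.0, -54.0]
```

Linear extrapolation can drift when the edge gap is long, which may be one reason to combine it with other strategies such as the imputation the paper describes.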
Research Motivation and Objectives
- Motivate robust gesture recognition from body-worn RFID tags despite missed tag detections and data loss.
- Propose a data processing pipeline including interpolation, imputation, and normalization to recover missing information.
- Introduce a graph-based neural network that uses RSS and phase correlations across tags and time for classification.
- Evaluate the system across multiple environments, distances, and subjects to analyze placement and robustness.
Proposed Method
- Represent eight body-worn RFID tags as nodes in a temporal graph with edges based on RSS/phase correlations across timestamps.
- Apply phase unwrapping, normalization, and Savitzky–Golay and Gaussian smoothing to denoise signals.
- Use linear and exponential interpolation to fill sparse leading/trailing zero values, and zero-padding for missing samples.
- Perform within-class proximity-based imputation using Mean Euclidean Distance and spatial proximity (tag placement) to fill null dataframes.
- Construct a graph neural network with temporal-KNN graph construction and self-attention for message passing and aggregation.
- Structure input as a 4D tensor [B, T, N, D] with T=30, N=8, D=2 (RSS and phase) for graph learning.
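To make the input layout and the graph idea above more concrete, the following sketch builds one window of shape [T, N, D] (a batch [B, T, N, D] would be a list of such windows) and connects each tag to the k tags whose RSS series correlate most strongly with its own. This is only one plausible reading of "temporal-KNN graph construction": the function names `pearson` and `knn_edges`, the choice of absolute Pearson correlation as the similarity, and k=3 are assumptions for illustration, not details confirmed by the paper.

```python
import math

T, N, D = 30, 8, 2  # window length, number of tags, features (RSS, phase)


def pearson(x, y):
    """Pearson correlation of two equal-length series (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0  # assumption: treat a constant series as uncorrelated
    return cov / math.sqrt(vx * vy)


def knn_edges(window, k=3, feature=0):
    """Return directed edges (i, j) linking each tag i to its k most correlated tags.

    window is a nested list indexed as window[t][n][d]; feature selects
    which of the D channels is compared (0 = RSS in this sketch).
    """
    n_tags = len(window[0])
    # One time series per tag for the chosen feature.
    series = [[frame[n][feature] for frame in window] for n in range(n_tags)]
    edges = []
    for i in range(n_tags):
        scores = [
            (abs(pearson(series[i], series[j])), j)
            for j in range(n_tags)
            if j != i
        ]
        # Strongest correlation first; ties broken by tag index for determinism.
        scores.sort(key=lambda s: (-s[0], s[1]))
        edges.extend((i, j) for _, j in scores[:k])
    return edges


# Synthetic window: each tag sees a phase-shifted sinusoid as its RSS channel.
window = [
    [[math.sin(0.2 * t + 0.5 * n), 0.0] for n in range(N)]
    for t in range(T)
]
edges = knn_edges(window, k=3)
print(len(edges))  # 24, i.e. N * k directed edges
```

An edge list like this could then be handed to a graph neural network layer as the neighbourhood over which attention weights are computed; how the paper combines RSS and phase correlations when forming edges is not specified in this summary.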
Experimental Results
Research Questions
- RQ1: Can body-worn RFID backscatter signals from multiple tags reliably distinguish 21 hand gestures under missing data conditions?
- RQ2: How do data cleaning strategies (interpolation, imputation) influence recognition performance and robustness to tag loss?
- RQ3: What is the impact of tag placement on recognition accuracy and how can a graph-based model leverage correlations across tags and time?
- RQ4: Does a graph-based self-attention CNN outperform traditional RF/RSS-based classifiers under leave-one-person-out scenarios?
Key Findings
| Dataset (source order) | Method | Accuracy (%) | Precision (%) | Recall (%) | F1 (%) |
|---|---|---|---|---|---|
| 1 | RFC with SP | 83.56 | 83.72 | 83.56 | 83.42 |
| 1 | RFC with SWP | 86.26 | 86.48 | 86.20 | 86.07 |
| 1 | RFC with SPR | 95.25 | 95.35 | 95.25 | 95.23 |
| 1 | Early Fusion | 83.62 | 84.25 | 83.57 | 83.34 |
| 1 | Late Fusion | 87.13 | 88.32 | 87.08 | 86.91 |
| 1 | EUIGR | 80.40 | 78.94 | 80.07 | 78.73 |
| 1 | GRfid | 30.34 | 31.40 | 30.16 | 30.22 |
| 1 | Our model | 98.13 | 98.19 | 98.13 | 98.13 |
| 2 | RFC with SP | 85.07 | 86.29 | 85.02 | 85.13 |
| 2 | RFC with SWP | 85.71 | 86.72 | 85.71 | 85.72 |
| 2 | RFC with SPR | 93.52 | 94.35 | 93.52 | 93.59 |
| 2 | Early Fusion | 81.01 | 86.08 | 81.00 | 81.18 |
| 2 | Late Fusion | 89.41 | 90.18 | 89.39 | 89.39 |
| 2 | EUIGR | 80.37 | 74.75 | 80.33 | 75.97 |
| 2 | GRfid | 29.39 | 29.14 | 29.34 | 28.89 |
| 2 | Our model | 96.82 | 97.88 | 96.80 | 97.02 |
| 3 | RFC with SP | 84.12 | 84.67 | 84.07 | 83.96 |
| 3 | RFC with SWP | 81.26 | 82.39 | 81.26 | 80.93 |
| 3 | RFC with SPR | 93.80 | 94.08 | 93.78 | 93.71 |
| 3 | Early Fusion | 91.13 | 94.16 | 91.13 | 91.56 |
| 3 | Late Fusion | 90.15 | 91.13 | 90.15 | 90.15 |
| 3 | EUIGR | 87.30 | 83.75 | 87.27 | 84.55 |
| 3 | GRfid | 34.43 | 36.30 | 33.87 | 34.33 |
| 3 | Our model | 98.41 | - | - | - |
- Achieves 98.13% accuracy on 21 gestures in within-user testing and 89.28% with leave-one-person-out cross-validation.
- Single-hand gesture accuracy reaches 98.27% for some gestures, with 16 of 21 gestures at 100% in within-user tests.
- Removing the arm-mounted tags reduces accuracy by more than 10%, whereas removing the wrist tag reduces it by only about 2%.
- All eight tags contribute to high performance; certain tags (T4 and T8) have the largest impact when omitted.
- The proposed graph-attention framework outperforms the baselines (RFC with SP/SWP/SPR, Early/Late Fusion, EUIGR, GRfid) across three datasets at distances of 3 m and 1.5 m.
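The gap between the within-user and leave-one-person-out figures above comes from how the data is split: in the latter, every sample from one participant is held out for testing while the model trains on everyone else. A minimal sketch of such a split follows; the function name `leave_one_person_out` and the example subject IDs are hypothetical, and how the paper aggregates per-fold results is not stated in this summary.

```python
def leave_one_person_out(subject_ids):
    """Yield (held_out, train_indices, test_indices) once per distinct subject.

    subject_ids[i] is the participant who produced sample i.
    """
    for held_out in sorted(set(subject_ids)):
        train = [i for i, s in enumerate(subject_ids) if s != held_out]
        test = [i for i, s in enumerate(subject_ids) if s == held_out]
        yield held_out, train, test


# Example: five samples recorded by three participants.
subjects = ["p1", "p1", "p2", "p3", "p3"]
for held_out, train, test in leave_one_person_out(subjects):
    print(held_out, train, test)
# p1 [2, 3, 4] [0, 1]
# p2 [0, 1, 3, 4] [2]
# p3 [0, 1, 2] [3, 4]
```

Because no sample from the test participant is ever seen during training, this protocol measures generalization to unseen users rather than to unseen repetitions by known users.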
This review was generated by AI and checked by a human editor.