QUICK REVIEW

[Paper Review] Human activity recognition based on time series analysis using U-Net

Yong Zhang, Yu Zhang|arXiv (Cornell University)|Sep 20, 2018

Context-Aware Activity Recognition Systems | References: 29 | Cited by: 26

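One-Sentence Summary

The paper proposes a U-Net-based deep learning framework for human activity recognition (HAR) that treats triaxial accelerometer time series as image input with a single pixel column and multiple channels. By labeling activities at the pixel level (one label per sampling point) without manual feature extraction, the method achieves the highest accuracy and F1-score on every dataset tested (the self-collected Sanitation dataset and four open datasets), outperforming SVM, kNN, DT, QDA, CNN, and FCN while keeping recognition fast once training is complete.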

ABSTRACT

Traditional human activity recognition (HAR) based on time series adopts sliding window analysis method. This method faces the multi-class window problem which mistakenly labels different classes of sampling points within a window as a class. In this paper, a HAR algorithm based on U-Net is proposed to perform activity labeling and prediction at each sampling point. The activity data of the triaxial accelerometer is mapped into an image with the single pixel column and multi-channel which is input into the U-Net network for training and recognition. Our proposal can complete the pixel-level gesture recognition function. The method does not need manual feature extraction and can effectively identify short-term behaviors in long-term activity sequences. We collected the Sanitation dataset and tested the proposed scheme with four open data sets. The experimental results show that compared with Support Vector Machine (SVM), k-Nearest Neighbor (kNN), Decision Tree (DT), Quadratic Discriminant Analysis (QDA), Convolutional Neural Network (CNN) and Fully Convolutional Networks (FCN) methods, our proposal has the highest accuracy and F1-score in each dataset, and has stable performance and high robustness. At the same time, after the U-Net has finished training, our proposal can achieve fast enough recognition speed.

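Research Motivation and Objectives

Solve the multi-class window problem of traditional sliding-window HAR methods, in which sampling points of different classes inside one window all receive the same label.
Achieve end-to-end, pixel-level activity labeling at every sampling point of a long sequence.
Eliminate manual feature engineering by applying deep learning directly to the raw time series.
Improve recognition accuracy and robustness over traditional machine learning and deep learning baselines.
Achieve fast inference after training, suitable for real-time HAR applications.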
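The multi-class window problem named in the first objective can be illustrated with a small sketch. The activity names, window size, and majority-vote labeling rule below are assumptions chosen for illustration; the paper's own windowing setup is not described in this summary.

```python
from collections import Counter


def window_labels(labels, window, step):
    """Give each window the majority label of the samples it covers."""
    out = []
    for start in range(0, len(labels) - window + 1, step):
        chunk = labels[start:start + window]
        majority = Counter(chunk).most_common(1)[0][0]
        out.append((start, majority))
    return out


def mislabeled_fraction(labels, window, step):
    """Fraction of samples whose true label differs from their window's label."""
    wrong = 0
    total = 0
    for start, majority in window_labels(labels, window, step):
        for true in labels[start:start + window]:
            total += 1
            if true != majority:
                wrong += 1
    return wrong / total


# Hypothetical ground truth: a short "sit" burst inside a long "walk" sequence.
truth = ["walk"] * 7 + ["sit"] * 2 + ["walk"] * 3

windows = window_labels(truth, window=6, step=6)
fraction = mislabeled_fraction(truth, window=6, step=6)
print(windows)   # both windows are labeled "walk"; the "sit" burst disappears
print(fraction)  # share of sampling points that received the wrong label
```

Under these assumptions the short behavior is absorbed by the surrounding activity, which is the failure mode that per-sampling-point labeling is meant to avoid.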

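Proposed Method

Maps the triaxial accelerometer time series to a single-column, multi-channel image representation that serves as input to the U-Net architecture.
Trains the U-Net model end to end to predict an activity class label at every time step (pixel-level prediction).
Uses an encoder-decoder architecture with skip connections to preserve spatial and temporal context in the sequence data.
Applies convolutional layers with batch normalization and ReLU activations for feature learning.
Trains with supervision using a cross-entropy loss against the label of each sampling point.
Exploits U-Net's ability to handle long sequences by treating time steps as spatial positions in a one-dimensional image.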
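The input mapping and the per-sampling-point loss described above can be sketched in plain Python. This is a minimal illustration, not the authors' implementation: the function names, the height-by-width-by-channel layout, and the sample values are assumptions, and the U-Net itself is omitted.

```python
import math


def to_single_column_image(ax, ay, az):
    """Map three axis signals of length T to a T x 1 x 3 nested list
    (height = time steps, width = one pixel column, channels = x, y, z)."""
    if not (len(ax) == len(ay) == len(az)):
        raise ValueError("all three axes must have the same length")
    return [[[x, y, z]] for x, y, z in zip(ax, ay, az)]


def softmax(logits):
    """Numerically stable softmax over one time step's class scores."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def per_sample_cross_entropy(logits_per_step, labels):
    """Mean cross-entropy over time steps, with one class index per step."""
    losses = [-math.log(softmax(scores)[y])
              for scores, y in zip(logits_per_step, labels)]
    return sum(losses) / len(losses)


# Hypothetical accelerometer readings for four sampling points.
ax = [0.1, 0.2, 0.3, 0.4]
ay = [0.0, -0.1, -0.2, -0.3]
az = [9.8, 9.7, 9.9, 9.8]

image = to_single_column_image(ax, ay, az)
shape = (len(image), len(image[0]), len(image[0][0]))
print(shape)  # (time steps, 1 column, 3 channels)

# Hypothetical network output: uniform scores over two classes at each step.
logits = [[0.0, 0.0] for _ in range(4)]
labels = [0, 0, 1, 1]
loss = per_sample_cross_entropy(logits, labels)
print(loss)  # equals log(2) for uniform two-class predictions
```

Because the loss is averaged over sampling points rather than windows, every time step contributes its own label to training, which is what makes the pixel-level labeling possible.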

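Experimental Results

Research Questions

RQ1: Does the U-Net-based model outperform traditional sliding-window methods in human activity recognition?
RQ2: Does mapping time series to image-like input enable more accurate, pixel-level activity labeling without manual feature extraction?
RQ3: How does the method perform in terms of accuracy and F1-score across diverse HAR datasets?
RQ4: Can the U-Net model maintain high robustness and fast inference speed in real-time applications?
RQ5: What relative advantages does U-Net have over SVM, kNN, DT, QDA, CNN, and FCN in HAR tasks?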

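Key Findings

The proposed U-Net-based HAR method achieves the highest accuracy and F1-score on every dataset tested (the self-collected Sanitation dataset and four open datasets), outperforming SVM, kNN, DT, QDA, CNN, and FCN.
The model shows stable performance and high robustness across diverse activity sequences, including short-term behaviors within long-term data.
By labeling each sampling point individually, the method removes the multi-class window problem and avoids misclassification inside mixed-class windows.
Once trained, the model recognizes activities quickly enough for real-time human activity recognition.
The method needs no manual feature extraction and relies entirely on end-to-end learning from raw accelerometer data.
The single-column, multi-channel image representation preserves temporal dependencies and supports accurate sequence modeling.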
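Accuracy and F1-score, the two metrics reported above, can be computed per sampling point as in the sketch below. The labels are hypothetical, and macro-averaging the per-class F1 is an assumption: this summary does not state which averaging the paper uses.

```python
def accuracy(y_true, y_pred):
    """Share of sampling points whose predicted label matches the truth."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 scores (macro-averaging is assumed)."""
    classes = sorted(set(y_true) | set(y_pred))
    scores = []
    for c in classes:
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); define it as 0 when the class never occurs.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Hypothetical per-sampling-point labels; one "sit" sample is predicted as "walk".
y_true = ["walk", "walk", "sit", "sit", "walk", "walk"]
y_pred = ["walk", "walk", "sit", "walk", "walk", "walk"]

acc = accuracy(y_true, y_pred)
f1 = macro_f1(y_true, y_pred)
print(acc)  # 5 of 6 sampling points are correct
print(f1)   # mean of F1("sit") = 2/3 and F1("walk") = 8/9
```

In this example the macro F1 is lower than the accuracy because the single error falls on the rarer class, which is why F1 is informative for short behaviors embedded in long sequences.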

This review was generated by AI and reviewed by a human editor.