3D Point Cloud from Millimeter-wave Radar for Human Action Recognition: Dataset and Method

JIN Biao, SUN Kangsheng, WU Hao, LI Zixuan, ZHANG Zhenkai, CAI Yan, LI Rongmin, ZHANG Xiangqun, DU Genyuan

Citation: JIN Biao, SUN Kangsheng, WU Hao, et al. 3D point cloud from millimeter-wave radar for human action recognition: Dataset and method[J]. Journal of Radars, 2025, 14(1): 73–89. doi: 10.12000/JR24195


DOI: 10.12000/JR24195
Funds: The National Natural Science Foundation of China (61701416), the Natural Science Foundation of Jiangsu Province (BK20211341), the Key Research and Development Project of Henan Province (241111212500), and the Postgraduate Research & Practice Innovation Program of Jiangsu Province (SJCX24_2605)
    Biographies:

    JIN Biao, Ph.D., Associate Professor. Main research interests: cognitive radar target tracking and millimeter-wave radar information sensing

    SUN Kangsheng, master's student. Main research interest: millimeter-wave radar information sensing

    WU Hao, master's student. Main research interest: millimeter-wave radar information sensing

    LI Zixuan, master's student. Main research interest: underwater acoustic target detection

    ZHANG Zhenkai, Ph.D., Professor. Main research interests: radar target tracking and radar resource management

    CAI Yan, M.S., Engineer. Main research interests: millimeter-wave radar information sensing and its industrialization

    LI Rongmin, M.S., Engineer. Main research interests: millimeter-wave radar information sensing and its industrialization

    ZHANG Xiangqun, Ph.D., Professor. Main research interest: millimeter-wave radar information sensing

    DU Genyuan, Ph.D., Professor. Main research interests: remote sensing image processing and millimeter-wave radar information sensing

    Corresponding author:

    JIN Biao, biaojin@just.edu.cn

  • Corresponding Editor: JIN Tian
  • CLC number: TN958.94

  • Abstract: With its excellent environmental adaptability, high resolution, and privacy preservation, millimeter-wave radar has broad application prospects in smart homes, intelligent elderly care, and security surveillance. The 3D point cloud produced by millimeter-wave radar is an important form of spatial data representation and is highly valuable for recognizing human behavior and posture. However, the strong sparsity of millimeter-wave radar point clouds makes accurate and fast human action recognition a major challenge. To address this problem, this paper releases mmWave-3DPCHM-1.0, a millimeter-wave radar 3D point cloud dataset of human actions, together with the corresponding data processing methods and a human action recognition model. The dataset was collected with TI's IWR1443-ISK and Vayyar's vBlu RF imaging module, and covers 12 common human actions, including walking, waving, standing, and falling. On the modeling side, this paper combines edge convolution (EdgeConv) with the Transformer and proposes a network for long-sequence 3D point clouds, the Point EdgeConv and Transformer (PETer) network. PETer applies edge convolution to each point cloud frame to build a local directed neighborhood graph and extract the spatial geometric features of the single-frame cloud, and stacks multiple Transformer encoder modules to extract temporal relationships across frames. Experimental results show that PETer achieves average recognition accuracies of 98.77% and 99.51% on the constructed TI and Vayyar datasets, respectively, about 5% higher than the best conventional baseline networks, while the network size is only 1.09 MB, making it well suited for deployment on storage-constrained edge devices.
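    To make the two-stage design described in the abstract concrete, the following is a minimal PyTorch sketch of the PETer idea, not the authors' released implementation: per-frame EdgeConv features (layer widths loosely follow Table 6) feed stacked Transformer encoders across frames. All names (`PETerSketch`, `FrameEdgeConv`, `knn_edge_features`) and hyperparameters such as `nhead=4` and `num_layers=2` are illustrative assumptions.

    ```python
    # Sketch of the PETer idea: EdgeConv per frame + Transformer over frames.
    import torch
    import torch.nn as nn

    def knn_edge_features(x, k=10):
        """x: (B, C, N) per-frame point cloud. Builds a K-NN graph on the xyz
        coordinates and returns EdgeConv features (B, 2C, N, k):
        concat(center point, neighbor - center)."""
        B, C, N = x.shape
        pts = x[:, :3].transpose(1, 2)                                            # (B, N, 3), xyz only
        idx = torch.cdist(pts, pts).topk(k + 1, largest=False).indices[..., 1:]   # (B, N, k), drop self
        nbrs = torch.gather(x.unsqueeze(2).expand(B, C, N, N), 3,
                            idx.unsqueeze(1).expand(B, C, N, k))                  # (B, C, N, k)
        center = x.unsqueeze(3).expand(B, C, N, k)
        return torch.cat([center, nbrs - center], dim=1)                          # (B, 2C, N, k)

    class FrameEdgeConv(nn.Module):
        """Per-frame spatial encoder: 8 -> 64 over edges, max over k, 64 -> 128, max over N."""
        def __init__(self, in_ch=4, k=10):
            super().__init__()
            self.k = k
            self.conv1 = nn.Sequential(nn.Conv2d(2 * in_ch, 64, 1), nn.ReLU())
            self.conv2 = nn.Sequential(nn.Conv1d(64, 128, 1), nn.ReLU())

        def forward(self, x):                      # x: (B, 4, N)
            e = knn_edge_features(x, self.k)       # (B, 8, N, k)
            f = self.conv1(e).max(dim=3).values    # (B, 64, N), max over neighbors
            return self.conv2(f).max(dim=2).values # (B, 128),   max over points

    class PETerSketch(nn.Module):
        """EdgeConv frame features -> stacked Transformer encoders -> action logits."""
        def __init__(self, num_classes=12, num_layers=2):
            super().__init__()
            self.frame_enc = FrameEdgeConv()
            layer = nn.TransformerEncoderLayer(d_model=128, nhead=4, batch_first=True)
            self.temporal = nn.TransformerEncoder(layer, num_layers=num_layers)
            self.head = nn.Linear(128, num_classes)

        def forward(self, clouds):                 # clouds: (B, T, 4, N)
            B, T = clouds.shape[:2]
            f = self.frame_enc(clouds.flatten(0, 1)).view(B, T, 128)  # per-frame spatial features
            h = self.temporal(f)                   # temporal relations across frames
            return self.head(h.mean(dim=1))        # pool over time, then classify

    logits = PETerSketch()(torch.randn(2, 30, 4, 100))  # e.g. 30 frames of 100 points each
    ```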

     

  • Figure 1. TI’s IWR1443-ISK millimeter-wave radar

    Figure 2. Vayyar’s vBlu RF imaging module

    Figure 3. 3D point cloud generation process for millimeter-wave radar

    Figure 4. Data acquisition scenario

    Figure 5. Data collection area

    Figure 6. Types of human actions

    Figure 7. Example of point cloud data for the open-arms action

    Figure 8. Human action recognition process based on millimeter-wave radar point cloud data

    Figure 9. Dynamic interference filtering

    Figure 10. Effect of multi-frame fusion and clustering on the TI radar point cloud
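    Figures 9 and 10 refer to the preprocessing steps of interference filtering and multi-frame fusion with clustering. Below is a minimal sketch of that step, assuming DBSCAN (Ref. [22], here via scikit-learn) with illustrative `eps`/`min_samples` values rather than the paper's actual settings.

    ```python
    # Sketch: multi-frame fusion + DBSCAN denoising for sparse radar point clouds.
    import numpy as np
    from sklearn.cluster import DBSCAN

    def fuse_and_cluster(frames, eps=0.3, min_samples=5):
        """frames: list of (Ni, 4) arrays [x, y, z, intensity] from consecutive frames.
        Concatenates frames to densify the sparse cloud, then keeps only clustered
        points; DBSCAN labels outliers (isolated noise points) as -1."""
        fused = np.concatenate(frames, axis=0)                 # multi-frame fusion
        labels = DBSCAN(eps=eps, min_samples=min_samples).fit_predict(fused[:, :3])
        return fused[labels != -1]                             # drop noise points
    ```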

    Figure 11. Structure of the PETer network

    Figure 12. Construction of a local directed neighborhood graph using edge convolution

    Figure 13. Recognition accuracy with different numbers of fused frames

    Figure 14. Recognition accuracy with different sample divisions

    Figure 15. Recognition accuracy with different numbers of sampling points

    Figure 16. Recognition accuracy with different K values

    Figure 17. Accuracy and loss curves of the PETer network with different learning rates on the TI dataset

    Figure 18. Accuracy and loss curves of the PETer network with different learning rates on the Vayyar dataset

    Figure 19. Confusion matrices of the PETer network

    Dataset 1. Release webpage of the 3D point cloud dataset from millimeter-wave radar for human action recognition (mmWave-3DPCHM-1.0)

    Table 1. Parameter configuration of the TI millimeter-wave radar

    Parameter                      Value
    Antennas                       3 Tx, 4 Rx
    Operating band                 77 GHz
    Waveform                       FMCW
    Signal bandwidth               3.4 GHz (4 GHz maximum)
    Frame rate                     30 frames/s
    Range resolution               4.4 cm
    Azimuth angular resolution     15°

    Table 2. Parameter configuration of the Vayyar millimeter-wave radar

    Parameter                               Value
    Antennas                                24 Tx, 22 Rx
    Operating band                          60 GHz
    Waveform                                FMCW
    Signal bandwidth                        480 MHz (up to 2.5 GHz)
    Frame rate                              10 frames/s
    Range resolution                        31.25 cm
    Azimuth/elevation angular resolution
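    As a consistency check (an added note, not part of the original tables), the range resolutions listed in Tables 1 and 2 follow directly from the standard FMCW relation between range resolution and sweep bandwidth:

    $$ \Delta R = \frac{c}{2B}, \qquad \Delta R_{\mathrm{TI}} = \frac{3\times10^{8}}{2\times3.4\times10^{9}}\,\mathrm{m} \approx 4.4\ \mathrm{cm}, \qquad \Delta R_{\mathrm{Vayyar}} = \frac{3\times10^{8}}{2\times4.8\times10^{8}}\,\mathrm{m} = 31.25\ \mathrm{cm} $$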

    Table 3. Volunteer information

    Person    Height (cm)   Weight (kg)   Age   Gender   Scene S1
    People1   183           90            23
    People2   160           45            23
    People3   178           80            24
    People4   173           65            25
    People5   188           80            25
    People6   176           65            25
    People7   172           75            24

    Table 4. File names

    Folder (action)    Example file names
    Box                people1_box_1.xlsx, people1_box_2.xlsx, ···, people7_box_3.xlsx
    Fall               people1_fall_1.xlsx, people1_fall_2.xlsx, ···, people7_fall_3.xlsx
    Jump               people1_jump_1.xlsx, people1_jump_2.xlsx, ···, people7_jump_3.xlsx
    Left hand wave     people1_left hand wave_1.xlsx, people1_left hand wave_2.xlsx, ···, people7_left hand wave_3.xlsx
    Left forerake      people1_left forerake_1.xlsx, people1_left forerake_2.xlsx, ···, people7_left forerake_3.xlsx
    Open arms          people1_open arms_1.xlsx, people1_open arms_2.xlsx, ···, people7_open arms_3.xlsx
    Right hand wave    people1_right hand wave_1.xlsx, people1_right hand wave_2.xlsx, ···, people7_right hand wave_3.xlsx
    Right forerake     people1_right forerake_1.xlsx, people1_right forerake_2.xlsx, ···, people7_right forerake_3.xlsx
    Sit                people1_sit_1.xlsx, people1_sit_2.xlsx, ···, people7_sit_3.xlsx
    Squat              people1_squat_1.xlsx, people1_squat_2.xlsx, ···, people7_squat_3.xlsx
    Stand              people1_stand_1.xlsx, people1_stand_2.xlsx, ···, people7_stand_3.xlsx
    Walk               people1_walk_1.xlsx, people1_walk_2.xlsx, ···, people7_walk_3.xlsx

    Table 5. Data format

    Frame   Point number   x             y             z             Intensity
    0       1              1.515625000   0.291015625   0.177734375   20.53078461
    1       1              1.605468750   1.562500000   0.298828125   27.83903503
    1       2              1.634765625   0.367187500   0.058593750   31.11934280
    1       3              1.507812500   0.339843700   0.164062500   22.17483902
    2       1              1.683593750   0.494140625   0.126953125   25.19828033
    2       2              1.677734375   0.576171875   0.490234375   27.63427925
    ⋮       ⋮              ⋮             ⋮             ⋮             ⋮
    5399    4              1.765625      0.40234375    0.326171875   26.33468437
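    Given the folder layout of Table 4 and the per-row format of Table 5, one plausible way to read a recording into per-frame arrays is sketched below (assuming pandas and the column names shown in Table 5; the path is illustrative):

    ```python
    # Sketch: load one recording (Table 5 format) and group points by frame.
    import pandas as pd

    def load_recording(path):
        """Returns a list of (Ni, 4) frames [x, y, z, Intensity], ordered by frame index."""
        df = pd.read_excel(path)        # columns: Frame, Point number, x, y, z, Intensity
        cols = ["x", "y", "z", "Intensity"]
        return [g[cols].to_numpy() for _, g in df.groupby("Frame", sort=True)]

    frames = load_recording("Walk/people1_walk_1.xlsx")   # e.g. one walking recording
    ```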

    Table 6. Parameter configuration of the edge convolution module

    Layer type     Num×kernel_size, stride   Output size (Batch, Channel, Length, d)
    Input          --                        (32, 4, 100, -)
    KNN graph-1    K=10                      (32, 8, 100, 10)
    Conv2d-2       64×(1, 1), 1              (32, 64, 100, 10)
    MaxPool1d-3    1×10                      (32, 64, 100, -)
    Conv2d-4       128×(1, 1), 1             (32, 128, 100, -)
    MaxPool1d-5    1×100                     (32, 128, -, -)
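    The shape bookkeeping in Table 6 (a "-" marks a dimension that has already been pooled away) can be traced with a few tensor operations. This sketch reproduces only the shape flow under the table's parameters; it is not the released code:

    ```python
    # Shape trace of Table 6: (Batch, Channel, Length=points, d=neighbors).
    import torch

    x = torch.randn(32, 4, 100)                      # Input: (32, 4, 100, -)
    edge = torch.randn(32, 8, 100, 10)               # KNN graph, K=10: center ⊕ (neighbor − center) doubles 4 -> 8 channels
    f = torch.nn.Conv2d(8, 64, kernel_size=1)(edge)  # Conv2d-2: (32, 64, 100, 10)
    f = f.max(dim=3).values                          # MaxPool1d-3 over K=10 neighbors: (32, 64, 100, -)
    f = torch.nn.Conv1d(64, 128, kernel_size=1)(f)   # Conv2d-4 equivalent (as Conv1d here): (32, 128, 100, -)
    f = f.max(dim=2).values                          # MaxPool1d-5 over 100 points: (32, 128, -, -)
    print(f.shape)                                   # torch.Size([32, 128]) — one 128-D feature per frame
    ```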

    Table 7. Recognition accuracy for different combinations of network modules (%)

    Method                           TI dataset   Vayyar dataset
    PointNet + Transformer           94.12        95.23
    PointNet++ + Transformer         96.13        97.78
    EdgeConv + LSTM                  94.26        99.51
    EdgeConv + GRU                   92.42        94.45
    EdgeConv + RNN                   94.26        96.23
    PETer (EdgeConv + Transformer)   98.73        99.51

    Table 8. Action recognition accuracy of different network models on the TI and Vayyar datasets (%)

    Dataset Model Box Fall Jump Left forerake Left hand wave Open arms Right forerake Right hand wave Sit Squat Stand Walk Average
    TI PointNet 75.97 99.84 68.95 70.48 60.76 72.92 55.52 62.54 84.48 78.57 85.40 75.75 74.26
    PointNet++ 87.88 98.98 74.01 75.88 77.22 72.99 57.11 66.91 88.55 87.56 94.78 83.36 80.44
    P4Transformer 98.23 98.04 81.76 80.50 82.66 87.36 85.30 88.93 96.85 92.99 96.78 89.93 89.94
    SequentialPointNet 97.02 98.74 92.87 93.52 92.29 98.28 85.36 89.86 98.80 97.60 94.01 86.91 93.77
    PETer (ours) 99.40 99.06 97.25 99.90 97.85 99.20 99.36 96.41 99.53 99.80 98.96 98.49 98.77
    Vayyar PointNet 88.83 98.28 72.13 76.74 66.59 81.13 55.84 68.09 90.14 74.74 80.75 78.21 77.62
    PointNet++ 94.30 99.68 71.30 64.24 78.08 86.26 69.20 73.53 91.38 83.84 94.53 84.47 82.57
    P4Transformer 98.41 98.87 79.07 92.55 88.73 94.20 93.13 80.89 98.80 98.54 94.56 97.64 92.95
    SequentialPointNet 99.81 99.71 86.20 97.15 92.45 98.06 94.23 87.43 98.74 98.12 97.28 97.83 95.58
    PETer (ours) 99.97 100.00 100.00 99.63 97.25 99.66 99.60 99.43 100.00 100.00 98.76 99.83 99.51

    Table 9. Comparison of computational load and complexity of different network models

    Model                Size (MB)   GFLOPs   Parameters (M)
    PointNet             13.21       28.83    3.46
    PointNet++           5.60        55.87    1.47
    P4Transformer        161.51      443.65   42.34
    SequentialPointNet   11.41       219.23   2.99
    PETer (ours)         1.09        4.62     0.35
  • [1] AHMAD T, JIN Lianwen, ZHANG Xin, et al. Graph convolutional neural network for human action recognition: A comprehensive survey[J]. IEEE Transactions on Artificial Intelligence, 2021, 2(2): 128–145. doi: 10.1109/TAI.2021.3076974.
    [2] JIN Tian, SONG Yongkun, DAI Yongpeng, et al. UWB-HA4D-1.0: An ultra-wideband radar human activity 4D imaging dataset[J]. Journal of Radars, 2022, 11(1): 27–39. doi: 10.12000/JR22008.
    [3] ANGUITA D, GHIO A, ONETO L, et al. A public domain dataset for human activity recognition using smartphones[C]. European Symposium on Artificial Neural Networks, Computational Intelligence and Machine Learning, Bruges, Belgium, 2013: 437–442.
    [4] SOOMRO K, ZAMIR A R, and SHAH M. UCF101: A dataset of 101 human actions classes from videos in the wild[R]. CRCV-TR-12-01, 2012.
    [5] AMIRI S M, POURAZAD M T, NASIOPOULOS P, et al. Non-intrusive human activity monitoring in a smart home environment[C]. 2013 IEEE 15th International Conference on e-Health Networking, Applications and Services (Healthcom 2013), Lisbon, Portugal, 2013: 606–610. doi: 10.1109/HealthCom.2013.6720748.
    [6] BLOOM V, MAKRIS D, and ARGYRIOU V. G3D: A gaming action dataset and real time action recognition evaluation framework[C]. 2012 IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops, Providence, USA, 2012: 7–12. doi: 10.1109/CVPRW.2012.6239175.
    [7] LIU Jun, SHAHROUDY A, PEREZ M, et al. NTU RGB+D 120: A large-scale benchmark for 3D human activity understanding[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2020, 42(10): 2684–2701. doi: 10.1109/TPAMI.2019.2916873.
    [8] RAEIS H, KAZEMI M, and SHIRMOHAMMADI S. Human activity recognition with device-free sensors for well-being assessment in smart homes[J]. IEEE Instrumentation & Measurement Magazine, 2021, 24(6): 46–57. doi: 10.1109/MIM.2021.9513637.
    [9] DING Chuanwei, LIU Zhilin, ZHANG Li, et al. Tangential human posture recognition with sequential images based on MIMO radar[J]. Journal of Radars, 2025, 14(1): 151–167. doi: 10.12000/JR24116.
    [10] JIN Tian, HE Yuan, LI Xinyu, et al. Advances in human activity sensing using ultra-wideband radar[J]. Journal of Electronics & Information Technology, 2022, 44(4): 1147–1155. doi: 10.11999/JEIT211044.
    [11] JIN Biao, MA Xiao, HU Bojun, et al. Gesture-mmWAVE: Compact and accurate millimeter-wave radar-based dynamic gesture recognition for embedded devices[J]. IEEE Transactions on Human-Machine Systems, 2024, 54(3): 337–347. doi: 10.1109/THMS.2024.3385124.
    [12] ZHANG Yushu, JI Junhao, WEN Wenying, et al. Understanding visual privacy protection: A generalized framework with an instance on facial privacy[J]. IEEE Transactions on Information Forensics and Security, 2024, 19: 5046–5059. doi: 10.1109/TIFS.2024.3389572.
    [13] HASCH J, TOPAK E, SCHNABEL R, et al. Millimeter-wave technology for automotive radar sensors in the 77 GHz frequency band[J]. IEEE Transactions on Microwave Theory and Techniques, 2012, 60(3): 845–860. doi: 10.1109/TMTT.2011.2178427.
    [14] JIN Biao, MA Xiao, ZHANG Zhenkai, et al. Interference-robust millimeter-wave radar-based dynamic hand gesture recognition using 2-D CNN-transformer networks[J]. IEEE Internet of Things Journal, 2024, 11(2): 2741–2752. doi: 10.1109/JIOT.2023.3293092.
    [15] JIN Biao, PENG Yu, KUANG Xiaofei, et al. Robust dynamic hand gesture recognition based on millimeter wave radar using atten-TsNN[J]. IEEE Sensors Journal, 2022, 22(11): 10861–10869. doi: 10.1109/JSEN.2022.3170311.
    [16] SENGUPTA A, JIN Feng, ZHANG Renyuan, et al. mm-Pose: Real-time human skeletal posture estimation using mmWave radars and CNNs[J]. IEEE Sensors Journal, 2020, 20(17): 10032–10044. doi: 10.1109/JSEN.2020.2991741.
    [17] YU Zheqi, TAHA A, TAYLOR W, et al. A radar-based human activity recognition using a novel 3-D point cloud classifier[J]. IEEE Sensors Journal, 2022, 22(19): 18218–18227. doi: 10.1109/JSEN.2022.3198395.
    [18] QI C R, SU Hao, MO Kaichun, et al. PointNet: Deep learning on point sets for 3D classification and segmentation[C]. 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, USA, 2017: 77–85. doi: 10.1109/CVPR.2017.16.
    [19] QI C R, YI Li, SU Hao, et al. PointNet++: Deep hierarchical feature learning on point sets in a metric space[C]. 31st International Conference on Neural Information Processing Systems, Long Beach, USA, 2017: 5105–5114.
    [20] FAN Hehe, YANG Yi, and KANKANHALLI M. Point 4D transformer networks for spatio-temporal modeling in point cloud videos[C]. 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Nashville, USA, 2021: 14199–14208. doi: 10.1109/CVPR46437.2021.01398.
    [21] PÜTZ S, WIEMANN T, and HERTZBERG J. Tools for visualizing, annotating and storing triangle meshes in ROS and RViz[C]. 2019 European Conference on Mobile Robots (ECMR), Prague, Czech Republic, 2019: 1–6. doi: 10.1109/ECMR.2019.8870953.
    [22] DENG Dingsheng. DBSCAN clustering algorithm based on density[C]. 2020 7th International Forum on Electrical Engineering and Automation (IFEEA), Hefei, China, 2020: 949–953. doi: 10.1109/IFEEA51475.2020.00199.
    [23] LIN Y P, YEH Y M, CHOU Yuchen, et al. Attention EdgeConv for 3D point cloud classification[C]. 2021 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC), Tokyo, Japan, 2021: 2018–2022.
    [24] DOSOVITSKIY A, BEYER L, KOLESNIKOV A, et al. An image is worth 16×16 words: Transformers for image recognition at scale[C]. 9th International Conference on Learning Representations, 2021: 1–11.
    [25] HENDRYCKS D and GIMPEL K. Gaussian error linear units (GELUs)[EB/OL]. https://doi.org/10.48550/arXiv.1606.08415, 2016.
    [26] LI Xing, HUANG Qian, WANG Zhijian, et al. Real-time 3-D human action recognition based on hyperpoint sequence[J]. IEEE Transactions on Industrial Informatics, 2023, 19(8): 8933–8942. doi: 10.1109/TII.2022.3223225.
Publication history
  • Received: 2024-09-29
  • Revised: 2025-01-04
  • Published online: 2025-01-15
