
靳标 孙康圣 吴昊 李子璇 张贞凯 蔡焱 李荣民 张向群 杜根远

  • Corresponding Editor: JIN Tian
  • CLC number: TN958.94

3D Point Cloud from Millimeter-wave Radar for Human Action Recognition: Dataset and Method

Funds: The National Natural Science Foundation of China (61701416), Natural Science Foundation of Jiangsu Province of China (BK20211341), Key Research and Development Project of Henan Province (241111212500), Postgraduate Research & Practice Innovation Program of Jiangsu Province (SJCX24_2605)
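  • Abstract: Millimeter-wave radar has broad application prospects in smart homes, smart elderly care, and security monitoring, owing to its strong environmental adaptability, high resolution, and privacy protection. The 3D point cloud from millimeter-wave radar is an important form of spatial data representation and is highly valuable for recognizing human actions and postures. However, millimeter-wave radar point clouds are extremely sparse, which makes accurate and fast recognition of human actions very challenging. To address this problem, this paper releases a 3D point cloud dataset of human actions from millimeter-wave radar, mmWave-3DPCHM-1.0, and proposes corresponding data processing methods and a human action recognition model. The dataset was collected separately with TI's IWR1443-ISK and Vayyar's vBlu RF imaging module, and covers 12 common human actions, such as walking, waving, standing, and falling. For the network model, this paper combines edge convolution (EdgeConv) with a Transformer and proposes a network for long time-series 3D point clouds, named the Point EdgeConv and Transformer (PETer) network. The network uses edge convolution to build a local directed neighborhood graph for each frame of the 3D point cloud, which extracts the spatial geometric features of single-frame point clouds, and uses a Transformer module with multiple stacked encoders to extract the temporal relationships across frames. Experimental results show that the proposed PETer network achieves average recognition accuracies of 98.77% and 99.51% on the constructed TI and Vayyar datasets, respectively, about 5% higher than the best conventional baseline network, with a model size of only 1.09 MB, which makes it suitable for deployment on storage-constrained edge devices.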


  • Figure 1. TI’s IWR1443-ISK millimeter-wave radar

    Figure 2. Vayyar’s vBlu RF imaging module

    Figure 3. 3D point cloud generation process for millimeter-wave radar

    Figure 4. Data acquisition scenario

    Figure 5. Data collection area

    Figure 6. Types of human actions

    Figure 7. Example of point cloud data for the open arms action

    Figure 8. Human action recognition process based on 3D point cloud data from millimeter-wave radar

    Figure 9. Dynamic interference filtering

    Figure 10. Comparison of the TI radar point cloud before and after multi-frame fusion and clustering

    Figure 11. Structure diagram of the PETer network

    Figure 12. Construction of a local directed neighborhood graph using edge convolution

    Figure 13. Recognition accuracy with different numbers of fused frames

    Figure 14. Recognition accuracy with different sample divisions

    Figure 15. Recognition accuracy with different numbers of sampling points

    Figure 16. Recognition accuracy with different K values

    Figure 17. Accuracy and loss curves of the PETer network with different learning rates on the TI dataset

    Figure 18. Accuracy and loss curves of the PETer network with different learning rates on the Vayyar dataset

    Figure 19. Confusion matrix of the PETer network

    1. Release webpage of the 3D point cloud dataset from millimeter-wave radar for human action recognition (mmWave-3DPCHM-1.0)

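Table 1. Parameter configuration of the TI millimeter-wave radar

| Parameter | Value |
| --- | --- |
| Number of antennas | 3 Tx, 4 Rx |
| Operating band | 77 GHz |
| Waveform | FMCW |
| Signal bandwidth | 3.4 GHz (maximum bandwidth 4 GHz) |
| Frame rate | 30 frames/s |
| Range resolution | 4.4 cm |
| Azimuth angular resolution | 15° |
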
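Table 2. Parameter configuration of the Vayyar millimeter-wave radar

| Parameter | Value |
| --- | --- |
| Number of antennas | 24 Tx, 22 Rx |
| Operating band | 60 GHz |
| Waveform | FMCW |
| Signal bandwidth | 480 MHz (up to 2.5 GHz) |
| Frame rate | 10 frames/s |
| Range resolution | 31.25 cm |
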
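
The range resolutions listed for the two radars are consistent with the standard FMCW relation ΔR = c / (2B), where B is the swept bandwidth. The short sketch below reproduces both table values; the function name is illustrative, and the speed of light is rounded to 3 × 10^8 m/s, which is an assumption made here so that the results match the tabulated figures.

```python
# Sketch: FMCW range resolution from bandwidth, delta_R = c / (2 * B).
# Assumption: c is rounded to 3e8 m/s; the function name is illustrative.

SPEED_OF_LIGHT_M_S = 3.0e8


def range_resolution_cm(bandwidth_hz: float) -> float:
    """Return the theoretical range resolution in centimetres."""
    return SPEED_OF_LIGHT_M_S / (2.0 * bandwidth_hz) * 100.0


ti_res = range_resolution_cm(3.4e9)      # TI IWR1443-ISK, 3.4 GHz bandwidth
vayyar_res = range_resolution_cm(480e6)  # Vayyar vBlu, 480 MHz bandwidth

print(f"TI: {ti_res:.2f} cm, Vayyar: {vayyar_res:.2f} cm")
```

With the bandwidths from the tables, this gives roughly 4.4 cm for the TI radar and 31.25 cm for the Vayyar module.
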
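Table 3. Information of volunteers

| People | Height (cm) | Weight (kg) | Age | Gender | S1 scene |
| --- | --- | --- | --- | --- | --- |
| People1 | 183 | 90 | 23 | | |
| People2 | 160 | 45 | 23 | | |
| People3 | 178 | 80 | 24 | | |
| People4 | 173 | 65 | 25 | | |
| People5 | 188 | 80 | 25 | | |
| People6 | 176 | 65 | 25 | | |
| People7 | 172 | 75 | 24 | | |
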
Table 4. File names

| Folder name (action) | Example file names |
| --- | --- |
| Box | people1_box_1.xlsx, people1_box_2.xlsx, ···, people7_box_3.xlsx |
| Fall | people1_fall_1.xlsx, people1_fall_2.xlsx, ···, people7_fall_3.xlsx |
| Jump | people1_jump_1.xlsx, people1_jump_2.xlsx, ···, people7_jump_3.xlsx |
| Left hand wave | people1_left hand wave_1.xlsx, people1_left hand wave_2.xlsx, ···, people7_left hand wave_3.xlsx |
| Left forerake | people1_left forerake_1.xlsx, people1_left forerake_2.xlsx, ···, people7_left forerake_3.xlsx |
| Open arms | people1_open arms_1.xlsx, people1_open arms_2.xlsx, ···, people7_open arms_3.xlsx |
| Right hand wave | people1_right hand wave_1.xlsx, people1_right hand wave_2.xlsx, ···, people7_right hand wave_3.xlsx |
| Right forerake | people1_right forerake_1.xlsx, people1_right forerake_2.xlsx, ···, people7_right forerake_3.xlsx |
| Sit | people1_sit_1.xlsx, people1_sit_2.xlsx, ···, people7_sit_3.xlsx |
| Squat | people1_squat_1.xlsx, people1_squat_2.xlsx, ···, people7_squat_3.xlsx |
| Stand | people1_stand_1.xlsx, people1_stand_2.xlsx, ···, people7_stand_3.xlsx |
| Walk | people1_walk_1.xlsx, people1_walk_2.xlsx, ···, people7_walk_3.xlsx |

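
The naming pattern in the file-name table can be enumerated programmatically, for example to check a local copy of the dataset for missing recordings. The sketch below is only illustrative: it assumes that every volunteer has exactly three recordings per action (the table elides the middle of each list), that each file sits directly inside its action folder, and that `/` is the path separator. The function name `expected_files` is hypothetical.

```python
# Sketch: enumerate file names following the pattern
# "<Action>/people<N>_<action in lower case>_<trial>.xlsx".
# Assumptions (not confirmed by the source): 7 volunteers, 3 trials each,
# files stored directly inside the action folder.

ACTIONS = [
    "Box", "Fall", "Jump", "Left hand wave", "Left forerake", "Open arms",
    "Right hand wave", "Right forerake", "Sit", "Squat", "Stand", "Walk",
]


def expected_files(num_people: int = 7, trials: int = 3) -> list[str]:
    """Return the relative paths implied by the naming pattern."""
    paths = []
    for action in ACTIONS:
        for person in range(1, num_people + 1):
            for trial in range(1, trials + 1):
                name = f"people{person}_{action.lower()}_{trial}.xlsx"
                paths.append(f"{action}/{name}")
    return paths


files = expected_files()
print(len(files), files[0], files[-1])
```

Under these assumptions the pattern yields 12 × 7 × 3 = 252 files; the actual number of recordings in the released dataset may differ.
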
Table 5. Data format

| Frame | Point number | x | y | z | Intensity |
| --- | --- | --- | --- | --- | --- |
| 0 | 1 | 1.515625000 | 0.291015625 | 0.177734375 | 20.53078461 |
| 1 | 1 | 1.605468750 | 1.562500000 | 0.298828125 | 27.83903503 |
| 1 | 2 | 1.634765625 | 0.367187500 | 0.058593750 | 31.11934280 |
| 1 | 3 | 1.507812500 | 0.339843700 | 0.164062500 | 22.17483902 |
| 2 | 1 | 1.683593750 | 0.494140625 | 0.126953125 | 25.19828033 |
| 2 | 2 | 1.677734375 | 0.576171875 | 0.490234375 | 27.63427925 |
| ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ |
| 5399 | 4 | 1.765625 | 0.40234375 | 0.326171875 | 26.33468437 |

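
Because each row of the data-format table is one point tagged with its frame index, a natural first step is to group rows into per-frame point lists. The sketch below does this for the sample rows shown in the table and then merges a few consecutive frames. The helper names are hypothetical, and fusing frames by simple concatenation is an assumption for illustration, not necessarily the fusion procedure used in the paper.

```python
# Sketch: group (frame, point number, x, y, z, intensity) rows by frame,
# then fuse consecutive frames by concatenating their points.
# Assumption: fusion is plain concatenation; helper names are illustrative.
from collections import defaultdict

# Sample rows copied from the data-format table.
ROWS = [
    (0, 1, 1.515625000, 0.291015625, 0.177734375, 20.53078461),
    (1, 1, 1.605468750, 1.562500000, 0.298828125, 27.83903503),
    (1, 2, 1.634765625, 0.367187500, 0.058593750, 31.11934280),
    (1, 3, 1.507812500, 0.339843700, 0.164062500, 22.17483902),
    (2, 1, 1.683593750, 0.494140625, 0.126953125, 25.19828033),
    (2, 2, 1.677734375, 0.576171875, 0.490234375, 27.63427925),
]


def group_by_frame(rows):
    """Map each frame index to its list of (x, y, z, intensity) points."""
    frames = defaultdict(list)
    for frame, _point_no, x, y, z, intensity in rows:
        frames[frame].append((x, y, z, intensity))
    return dict(frames)


def fuse_frames(frames, start, count):
    """Concatenate the points of `count` consecutive frames from `start`."""
    fused = []
    for index in range(start, start + count):
        fused.extend(frames.get(index, []))  # a frame may contain no points
    return fused


frames = group_by_frame(ROWS)
fused = fuse_frames(frames, start=0, count=3)
print({k: len(v) for k, v in frames.items()}, len(fused))
```

The per-frame point counts vary (1, 3, and 2 points in the sample), which illustrates the sparsity that motivates merging several frames before feature extraction.
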
Table 6. Parameter configuration of the edge convolution module

| Type | Num × kernel_size, stride | Output size (Batch, Channel, Length, d) |
| --- | --- | --- |
| Input | -- | (32, 4, 100, -) |
| KNN graph-1 | K=10 | (32, 8, 100, 10) |
| Conv2d-2 | 64×(1, 1), 1 | (32, 64, 100, 10) |
| MaxPool1d-3 | 1×10 | (32, 64, 100, -) |
| Conv2d-4 | 128×(1, 1), 1 | (32, 128, 100, -) |
| MaxPool1d-5 | 1×100 | (32, 128, -, -) |

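
The first rows of the edge convolution table can be read as follows: a K-nearest-neighbor graph with K = 10 turns 100 points with 4 channels into 100 × 10 edges with 8 channels. The pure-Python sketch below illustrates one common way to obtain that doubling, by concatenating each point's features with the difference to each neighbor, as in the usual EdgeConv formulation. It is a sketch under assumptions: the neighbor search uses only the x, y, z coordinates, the learned 1 × 1 convolutions are omitted, and all function names are hypothetical.

```python
# Sketch: KNN graph and EdgeConv-style edge features [x_i, x_j - x_i].
# Assumptions: neighbours are found in xyz space only; the learned Conv2d
# layers from the table are left out; names are illustrative.
import math
import random


def knn_indices(points, k):
    """For each point, indices of its k nearest other points (by xyz)."""
    result = []
    for i, p in enumerate(points):
        dists = sorted(
            (math.dist(p[:3], q[:3]), j)
            for j, q in enumerate(points) if j != i
        )
        result.append([j for _, j in dists[:k]])
    return result


def edge_features(points, k):
    """Shape (N, K, 2C): centre features followed by neighbour offsets."""
    feats = []
    for i, nbrs in enumerate(knn_indices(points, k)):
        xi = points[i]
        feats.append([
            list(xi) + [b - a for a, b in zip(xi, points[j])] for j in nbrs
        ])
    return feats


def max_pool_over_neighbours(feats):
    """Channel-wise maximum over the K neighbours: (N, K, C) -> (N, C)."""
    return [[max(ch) for ch in zip(*edges)] for edges in feats]


rng = random.Random(0)
points = [
    (rng.uniform(0, 2), rng.uniform(-1, 1), rng.uniform(0, 2), rng.uniform(15, 35))
    for _ in range(100)
]  # 100 synthetic points: x, y, z, intensity
feats = edge_features(points, k=10)
pooled = max_pool_over_neighbours(feats)
print(len(feats), len(feats[0]), len(feats[0][0]))  # 100 10 8
print(len(pooled), len(pooled[0]))                  # 100 8
```

Each edge points from a center point to one of its neighbors, so the graph is directed. In the table, the pooling over the 10 neighbors is applied after a 64-channel convolution rather than directly on the 8-channel edge features as done here.
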
Table 7. Recognition accuracy for different combinations of network modules (%)

| Method | TI dataset accuracy | Vayyar dataset accuracy |
| --- | --- | --- |
| PointNet + Transformer | 94.12 | 95.23 |
| (PointNet++) + Transformer | 96.13 | 97.78 |
| EdgeConv + LSTM | 94.26 | 99.51 |
| EdgeConv + GRU | 92.42 | 94.45 |
| EdgeConv + RNN | 94.26 | 96.23 |
| PETer (EdgeConv + Transformer) | 98.73 | 99.51 |

Table 8. Action recognition accuracy of different network models on the TI and Vayyar datasets (%)

| Dataset | Model | Box | Fall | Jump | Left forerake | Left hand wave | Open arms | Right forerake | Right hand wave | Sit | Squat | Stand | Walk | Average |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| TI | PointNet | 75.97 | 99.84 | 68.95 | 70.48 | 60.76 | 72.92 | 55.52 | 62.54 | 84.48 | 78.57 | 85.40 | 75.75 | 74.26 |
| TI | PointNet++ | 87.88 | 98.98 | 74.01 | 75.88 | 77.22 | 72.99 | 57.11 | 66.91 | 88.55 | 87.56 | 94.78 | 83.36 | 80.44 |
| TI | P4Transformer | 98.23 | 98.04 | 81.76 | 80.50 | 82.66 | 87.36 | 85.30 | 88.93 | 96.85 | 92.99 | 96.78 | 89.93 | 89.94 |
| TI | SequentialPointNet | 97.02 | 98.74 | 92.87 | 93.52 | 92.29 | 98.28 | 85.36 | 89.86 | 98.80 | 97.60 | 94.01 | 86.91 | 93.77 |
| TI | PETer (ours) | 99.40 | 99.06 | 97.25 | 99.90 | 97.85 | 99.20 | 99.36 | 96.41 | 99.53 | 99.80 | 98.96 | 98.49 | 98.77 |
| Vayyar | PointNet | 88.83 | 98.28 | 72.13 | 76.74 | 66.59 | 81.13 | 55.84 | 68.09 | 90.14 | 74.74 | 80.75 | 78.21 | 77.62 |
| Vayyar | PointNet++ | 94.30 | 99.68 | 71.30 | 64.24 | 78.08 | 86.26 | 69.20 | 73.53 | 91.38 | 83.84 | 94.53 | 84.47 | 82.57 |
| Vayyar | P4Transformer | 98.41 | 98.87 | 79.07 | 92.55 | 88.73 | 94.20 | 93.13 | 80.89 | 98.80 | 98.54 | 94.56 | 97.64 | 92.95 |
| Vayyar | SequentialPointNet | 99.81 | 99.71 | 86.20 | 97.15 | 92.45 | 98.06 | 94.23 | 87.43 | 98.74 | 98.12 | 97.28 | 97.83 | 95.58 |
| Vayyar | PETer (ours) | 99.97 | 100.00 | 100.00 | 99.63 | 97.25 | 99.66 | 99.60 | 99.43 | 100.00 | 100.00 | 98.76 | 99.83 | 99.51 |

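
The final value in each row of the per-action accuracy table appears to be the unweighted mean of the 12 per-action accuracies. The sketch below checks this for the two PETer rows and computes the margin over the strongest baseline average (SequentialPointNet). The numbers are copied from the table; treating the average as an unweighted mean is an assumption that happens to reproduce the reported values.

```python
# Sketch: recompute the average column as an unweighted mean over actions.
# Values are copied from the PETer and SequentialPointNet rows of the table.

PETER_TI = [99.40, 99.06, 97.25, 99.90, 97.85, 99.20,
            99.36, 96.41, 99.53, 99.80, 98.96, 98.49]
PETER_VAYYAR = [99.97, 100.00, 100.00, 99.63, 97.25, 99.66,
                99.60, 99.43, 100.00, 100.00, 98.76, 99.83]
BASELINE_TI = 93.77      # SequentialPointNet average, TI dataset
BASELINE_VAYYAR = 95.58  # SequentialPointNet average, Vayyar dataset


def mean_accuracy(values):
    """Unweighted mean of per-action accuracies (%)."""
    return sum(values) / len(values)


ti_mean = mean_accuracy(PETER_TI)
vayyar_mean = mean_accuracy(PETER_VAYYAR)
ti_gain = ti_mean - BASELINE_TI
vayyar_gain = vayyar_mean - BASELINE_VAYYAR

print(f"TI mean {ti_mean:.4f}, gain {ti_gain:.4f}")
print(f"Vayyar mean {vayyar_mean:.4f}, gain {vayyar_gain:.4f}")
```

The recomputed means agree with the reported 98.77% and 99.51% to within rounding. The margin over SequentialPointNet is about 5 percentage points on the TI dataset and about 4 on the Vayyar dataset.
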
Table 9. Comparison of computational cost and complexity of different network models

| Model | Size (MB) | GFLOPS | Parameters (M) |
| --- | --- | --- | --- |
| PointNet | 13.21 | 28.83 | 3.46 |
| PointNet++ | 5.60 | 55.87 | 1.47 |
| P4Transformer | 161.51 | 443.65 | 42.34 |
| SequentialPointNet | 11.41 | 219.23 | 2.99 |
| PETer (ours) | 1.09 | 4.62 | 0.35 |

Figures (20) / Tables (9)
