Human Contour Restoration and Action Recognition in Ultra-wideband Radar Imaging Based on Spatio-temporal Features

SONG Yongkun, YOU Wenjie, YUE Lei, YANG Qingrong, DAI Yongpeng, JIN Tian

Citation: SONG Yongkun, YOU Wenjie, YUE Lei, et al. Human contour restoration and action recognition in ultra-wideband radar imaging based on spatio-temporal features[J]. Journal of Radars, in press. doi: 10.12000/JR25218


DOI: 10.12000/JR25218 CSTR: 32380.14.JR25218
    Biographies:

    SONG Yongkun, Ph.D., associate professor. Research interests: novel radar technology and intelligent human activity sensing.

    YOU Wenjie, master's student. Research interests: human contour restoration and action recognition.

    YUE Lei, Ph.D., lecturer. Research interests: signal processing and deep learning.

    YANG Qingrong, master's student. Research interests: action recognition and pose reconstruction.

    DAI Yongpeng, Ph.D., lecturer. Research interests: MIMO array radar imaging and image enhancement.

    JIN Tian, Ph.D., professor. Research interests: novel radar systems and intelligent sensing and processing.

    Corresponding author: SONG Yongkun, songyk1118@163.com

    Corresponding Editor: GUO Shisheng

  • CLC number: TN957


Funds: The National Natural Science Foundation of China (62401086), the Youth Program of the Hunan Provincial Natural Science Foundation (2024JJ6065), and the Outstanding Youth Project of the Hunan Provincial Department of Education (25B0222)
  • Abstract: Ultra-wideband (UWB) MIMO radar shows great potential for intelligent human sensing owing to its good resolution, strong penetration, privacy preservation, and insensitivity to illumination; however, its low image resolution blurs human contours and makes actions hard to distinguish. Against this background, this paper proposes a joint human contour restoration and action recognition framework that fuses spatio-temporal features (STWTnet). The method adopts a multi-task network framework: Res2Net and wavelet downsampling extract fine-grained spatial features from the radar images, a Transformer models spatio-temporal dependencies, and multi-task learning lets contour restoration and action recognition share common features while avoiding feature conflicts, so the two tasks complement each other. Experiments on a self-built synchronized UWB-optical dataset show that STWTnet achieves a high action recognition rate and significantly surpasses existing methods in contour accuracy, offering a new route to privacy-friendly, all-weather human behavior understanding.
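To make the pipeline described in the abstract concrete, below is a minimal PyTorch sketch, not the authors' implementation: the stand-in convolutional encoder, layer sizes, and the `num_actions` and `d_model` values are illustrative assumptions (the paper's encoder uses Res2Net blocks with wavelet downsampling, sketched separately after the figure list).

```python
# Minimal sketch of the STWTnet idea from the abstract (assumptions noted):
# per-frame spatial encoding -> Transformer over frames -> two task heads.
import torch
import torch.nn as nn

class STWTnetSketch(nn.Module):
    def __init__(self, num_actions: int = 10, d_model: int = 256):
        super().__init__()
        # Stand-in encoder; the paper uses Res2Net blocks and Haar-wavelet
        # downsampling here instead of plain strided convolutions.
        self.encoder = nn.Sequential(
            nn.Conv2d(1, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(64, 128, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(128, d_model, 3, stride=2, padding=1), nn.ReLU(),
        )
        # Transformer encoder over the frame sequence (temporal tokens).
        layer = nn.TransformerEncoderLayer(d_model, nhead=8, batch_first=True)
        self.temporal = nn.TransformerEncoder(layer, num_layers=2)
        # Contour-restoration head: back to input resolution.
        self.seg_head = nn.Sequential(
            nn.Upsample(scale_factor=8, mode="bilinear", align_corners=False),
            nn.Conv2d(d_model, 1, 1),
        )
        # Action-recognition head on pooled temporal features.
        self.cls_head = nn.Linear(d_model, num_actions)

    def forward(self, x: torch.Tensor):
        # x: (B, T, 1, H, W) radar image sequence, H and W divisible by 8.
        b, t, _, h, w = x.shape
        f = self.encoder(x.flatten(0, 1))                      # (B*T, D, H/8, W/8)
        tokens = self.temporal(f.mean((2, 3)).view(b, t, -1))  # (B, T, D)
        masks = self.seg_head(f).view(b, t, 1, h, w)           # per-frame contours
        logits = self.cls_head(tokens.mean(1))                 # sequence-level action
        return masks, logits

# Shape check: masks (2, 4, 1, 128, 128), logits (2, 10).
masks, logits = STWTnetSketch()(torch.rand(2, 4, 1, 128, 128))
```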

  • Figure 1. Overall block diagram of the algorithm

    Figure 2. Basic structure of Res2Net (sketched in code after this caption list)

    Figure 3. Structure of the wavelet decomposition (sketched in code after this caption list)

    Figure 4. Structure of the Transformer encoder

    Figure 5. Action recognition branch of STWTnet

    Figure 6. Feature flow in multi-task learning

    Figure 7. UWB MIMO radar system

    Figure 8. Data collection scenarios

    Figure 9. Confusion matrices of the ablation experiments

    Figure 10. Comparison of human contour restoration across methods

    Figure 11. Multi-task prediction results

    Figure 12. Measurements in occluded scenes

    Figure 13. Test of scenes with people occluding one another
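For Figures 2 and 3, the two encoder building blocks can be sketched as below. Both follow the standard published formulations (Res2Net's hierarchical channel splitting, and Haar wavelet downsampling that trades spatial size for channels); the block width and scale count are assumptions, and this is an illustration rather than the paper's code.

```python
import torch
import torch.nn as nn

class Res2NetBlockSketch(nn.Module):
    """Hierarchical multi-scale block in the spirit of Fig. 2 (Res2Net)."""
    def __init__(self, channels: int = 64, scales: int = 4):
        super().__init__()
        assert channels % scales == 0
        width = channels // scales
        self.scales = scales
        self.convs = nn.ModuleList(
            nn.Conv2d(width, width, 3, padding=1) for _ in range(scales - 1)
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        xs = torch.chunk(x, self.scales, dim=1)  # split channels into scales
        ys = [xs[0]]                             # first split passes through
        y = None
        for i, conv in enumerate(self.convs):
            # Each split also receives the previous split's output,
            # enlarging the receptive field scale by scale.
            y = xs[i + 1] if y is None else xs[i + 1] + y
            y = torch.relu(conv(y))
            ys.append(y)
        return torch.cat(ys, dim=1) + x          # residual connection

def haar_downsample(x: torch.Tensor) -> torch.Tensor:
    """Fig. 3: 2x2 Haar transform, (B, C, H, W) -> (B, 4C, H/2, W/2).
    Halves resolution losslessly by stacking the four sub-bands as channels."""
    a = x[:, :, 0::2, 0::2]   # top-left of each 2x2 block
    b = x[:, :, 0::2, 1::2]   # top-right
    c = x[:, :, 1::2, 0::2]   # bottom-left
    d = x[:, :, 1::2, 1::2]   # bottom-right
    ll = (a + b + c + d) / 2  # approximation sub-band
    lh = (a + b - c - d) / 2  # horizontal detail
    hl = (a - b + c - d) / 2  # vertical detail
    hh = (a - b - c + d) / 2  # diagonal detail
    return torch.cat([ll, lh, hl, hh], dim=1)
```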

    Table 1. Parameters of the UWB MIMO radar

    Parameter          Value
    Operating band     2.5–3.5 GHz
    Signal bandwidth   1 GHz
    Waveform           Frequency-modulated continuous wave (FMCW)
    MIMO array         12 transmit channels + 8 receive channels
    Penetrable media   Plastic, wood boards, brick walls, etc.
    System size        84 cm × 84 cm
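Purely for illustration, the Table 1 specification can be carried as a single config object; the class and field names below are hypothetical, not part of the paper or any radar SDK.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class UwbMimoRadarConfig:
    """Table 1 parameters of the UWB MIMO radar (illustrative container)."""
    band_ghz: tuple = (2.5, 3.5)  # operating band
    bandwidth_ghz: float = 1.0    # signal bandwidth
    waveform: str = "FMCW"        # frequency-modulated continuous wave
    tx_channels: int = 12         # transmit channels
    rx_channels: int = 8          # receive channels
    size_cm: tuple = (84, 84)     # system aperture size

cfg = UwbMimoRadarConfig()
assert cfg.band_ghz[1] - cfg.band_ghz[0] == cfg.bandwidth_ghz
```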

    Table 2. Performance comparison under different training methods

    Training method                    mIoU     Accuracy (%)
    Single-task contour restoration    0.8716   -
    Single-task action recognition     -        97.4
    Single-stage joint training        0.7615   97.6
    Two-stage joint training           0.8721   98.5

    Table 3. Performance comparison under different loss weight parameters

    $ \lambda_{\mathrm{seg}} $   $ \lambda_{\mathrm{action}} $   mIoU     Accuracy (%)
    0.1                          0.9                             0.8699   98.6
    0.3                          0.7                             0.8712   98.6
    0.5                          0.5                             0.8721   98.5
    0.7                          0.3                             0.8720   98.2
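Table 3 sweeps the two task weights. Assuming the standard weighted-sum multi-task objective, which is consistent with the table's symbols but not a formula quoted from the paper:

$$ \mathcal{L}_{\mathrm{total}} = \lambda_{\mathrm{seg}}\,\mathcal{L}_{\mathrm{seg}} + \lambda_{\mathrm{action}}\,\mathcal{L}_{\mathrm{action}}, \qquad \lambda_{\mathrm{seg}} + \lambda_{\mathrm{action}} = 1. $$

Every row of Table 3 satisfies the convex-combination constraint, and performance is nearly flat across the sweep, with mIoU peaking at $ \lambda_{\mathrm{seg}} = 0.5 $.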

    Table 4. Performance and parameter comparison of different methods under single-frame input

    Network     mIoU    Parameters (MB)   FLOPs (GB)
    UNet        0.724   13.4              31.0
    RPSNet      0.748   123.4             137.3
    DeepLabv3   0.736   39.6              40.8
    TransUNet   0.765   115.1             15.0
    SegNeXt     0.705   3.5               7.4
    SegFormer   0.734   35.7              10.6
    STWTnet     0.867   57.4              12.5

    Table 5. Core-module ablation results

    Network                     mIoU    Accuracy (%)
    STWTnet (Conv downsample)   0.817   96.52
    STWTnet (ResNet)            0.798   94.42
    STWTnet (LSTM)              0.821   96.01
    STWTnet                     0.872   98.56
    Note: STWTnet(X) denotes the full STWTnet model with the corresponding original module replaced by the conventional module X.

    Table 6. Performance comparison with different input frame lengths

    Frame length   mIoU     Accuracy (%)   FLOPs (GB)
    1              0.8670   -              12.46
    4              0.8721   98.56          48.54
    8              0.8723   98.72          77.44
    12             0.8720   97.44          106.34

    Table 7. Performance comparison at different test distances

    Test distance (m)   mIoU     Accuracy (%)
    1–3                 0.8686   98.12
    3–5                 0.8913   98.57
    5–7                 0.8527   99.01

    Table 8. Performance comparison in different scenarios

    Test scenario   mIoU     Accuracy (%)
    Scenario 1      0.8720   98.56
    Scenario 2      0.8104   98.42
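All result tables report mIoU. As a reference only, here is a minimal sketch of the usual binary-mask IoU computation; this is an assumed, standard definition, not the paper's evaluation code.

```python
import torch

def mask_iou(pred: torch.Tensor, gt: torch.Tensor, eps: float = 1e-6) -> torch.Tensor:
    """Binary {0,1} masks of shape (N, H, W); returns per-sample IoU."""
    pred, gt = pred.bool(), gt.bool()
    inter = (pred & gt).flatten(1).sum(1).float()  # |pred AND gt|
    union = (pred | gt).flatten(1).sum(1).float()  # |pred OR gt|
    return inter / (union + eps)

# mIoU over a test set is the mean of the per-sample values:
# miou = mask_iou(pred_masks, gt_masks).mean().item()
```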
Publication history
  • Received: 2025-10-31
  • Revised: 2026-01-16
