FANG Chao, WANG Yong, ZHOU Mu, et al. End-to-end cross-person gesture recognition via mamba fusion network and millimeter-wave radar[J]. Journal of Radars, in press. doi: 10.12000/JR25260

End-to-end Cross-person Gesture Recognition Via Mamba Fusion Network and Millimeter-wave Radar

DOI: 10.12000/JR25260 CSTR: 32380.14.JR25260
Funds: National Natural Science Foundation of China (52302059, 62571074, 62501100); Chongqing Major Project of Technological Innovation and Application Development (CSTB2025TIAD-STX0022); Science and Technology Research Program of Chongqing Municipal Education Commission (KJQN202400616); New Chongqing Youth Innovation Talent Project (CSTB2025YITP-QCRCX0100)
  • Corresponding author: WANG Yong, yongwang@cqupt.edu.cn
  • Received Date: 2025-12-03
  • Revised Date: 2026-06-23
  • Available Online: 2026-06-25
Abstract: As a noninvasive and contactless sensing technology, millimeter-wave radar has attracted considerable attention because of its broad application potential in human-computer interaction, smart homes, and virtual reality. Existing deep learning models achieve strong performance in recognizing gestures from trained users owing to their powerful feature extraction capabilities; however, their recognition accuracy degrades significantly when applied to new users with different gesture habits and hand sizes. To improve model generalization in cross-user scenarios, this paper proposes a millimeter-wave radar gesture recognition network that integrates end-to-end learning with a state space model. The proposed method directly processes raw radar data cubes and incorporates a Mamba module to capture long-range spatiotemporal dependencies. This enables the adaptive extraction and robust representation of user-independent gesture features. Experimental results show that the proposed end-to-end architecture effectively captures discriminative gesture patterns that are invariant across users. On the cross-user test set, the proposed method achieved an average recognition accuracy of 94.28% with a standard deviation of 2.55% across 11 folds, while the highest single-fold accuracy reached 97.50%. These results substantially outperform those of conventional deep learning methods and validate the generalization capability of the proposed method in cross-user application scenarios.
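The cross-user figures above (mean, standard deviation, and best fold over 11 folds) can be illustrated with a small evaluation sketch. This is not the authors' code. It assumes that each fold holds out all samples from one user, which the abstract does not state explicitly, and that the reported deviation is a sample standard deviation. The names `leave_one_user_out` and `summarize` are hypothetical helpers introduced only for this example.

```python
from statistics import mean, stdev


def leave_one_user_out(user_ids):
    """Yield (held_out_user, train_idx, test_idx), one fold per distinct user.

    Assumption: a "fold" means holding out every sample of a single user,
    so the test user is never seen during training.
    """
    for user in sorted(set(user_ids)):
        train_idx = [i for i, u in enumerate(user_ids) if u != user]
        test_idx = [i for i, u in enumerate(user_ids) if u == user]
        yield user, train_idx, test_idx


def summarize(fold_accuracies):
    """Return (mean, sample standard deviation, best) over per-fold accuracies."""
    return mean(fold_accuracies), stdev(fold_accuracies), max(fold_accuracies)


if __name__ == "__main__":
    # Toy user labels for five samples; real data would carry one label per
    # recorded gesture sample.
    toy_user_ids = [0, 0, 1, 1, 2]
    for user, train_idx, test_idx in leave_one_user_out(toy_user_ids):
        print(f"held-out user {user}: train={train_idx}, test={test_idx}")
```

In practice, a model would be trained on `train_idx` and scored on `test_idx` in each fold, and the resulting list of per-fold accuracies would be passed to `summarize` to obtain the kind of statistics the abstract reports.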

     
