Waveform Selection Method of Cognitive Radar Target Tracking Based on Reinforcement Learning
Keywords: Target tracking; Cognitive radar; Waveform selection; Criterion-Based Optimization (CBO); Entropy Reward Q-Learning (ERQL)
Abstract: Cognitive radar continuously interacts with its environment and learns from experience, adjusting its waveform, parameters, and illumination strategy according to the knowledge it acquires in order to achieve robust target tracking in complex and changing scenarios. Waveform design has therefore received sustained attention as a means of improving tracking performance. In this paper, we propose a cognitive radar waveform selection framework for tracking highly maneuvering targets. The framework combines the Constant Velocity (CV), Constant Acceleration (CA), and Coordinated Turn (CT) motion models. Based on this framework, we design Criterion-Based Optimization (CBO) and Entropy Reward Q-Learning (ERQL) methods for optimal waveform selection. The approach integrates the radar and the target into a closed loop in which the transmitted waveform is updated in real time as the target state changes, so as to achieve the best tracking performance. Numerical results show that the proposed ERQL method needs far less processing time than CBO to obtain the optimal waveform while achieving tracking performance close to that of CBO, and that it greatly improves the tracking accuracy for maneuvering targets compared with the fixed-parameter (Fixed-P) method.
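As a rough illustration of the three motion models named in the abstract, the sketch below builds their discrete-time state-transition matrices in plain Python. The state layouts (`[x, vx, y, vy]` for CV and CT, `[p, v, a]` per axis for CA), the sampling interval `T`, the turn rate `omega`, and all function names are assumptions made for this example; the paper's own state definitions may differ.

```python
import math

def cv_matrix(T):
    """Constant-velocity transition for the assumed state [x, vx, y, vy]."""
    return [
        [1.0, T, 0.0, 0.0],
        [0.0, 1.0, 0.0, 0.0],
        [0.0, 0.0, 1.0, T],
        [0.0, 0.0, 0.0, 1.0],
    ]

def ca_axis_matrix(T):
    """Constant-acceleration transition for one axis, assumed state [p, v, a]."""
    return [
        [1.0, T, 0.5 * T * T],
        [0.0, 1.0, T],
        [0.0, 0.0, 1.0],
    ]

def ct_matrix(T, omega):
    """Coordinated-turn transition for [x, vx, y, vy] with a known, nonzero turn rate omega (rad/s)."""
    s, c = math.sin(omega * T), math.cos(omega * T)
    return [
        [1.0, s / omega, 0.0, -(1.0 - c) / omega],
        [0.0, c, 0.0, -s],
        [0.0, (1.0 - c) / omega, 1.0, s / omega],
        [0.0, s, 0.0, c],
    ]

def step(F, x):
    """Propagate state x one sampling interval: returns F @ x."""
    return [sum(f * xi for f, xi in zip(row, x)) for row in F]
```

An IMM filter would run one filter per model and blend their outputs by model probability; that mixing logic is not shown here.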
Table 1. CBO/ERQL algorithm
Input: the state estimate $\hat{\boldsymbol{x}}_{k-1|k-1}$ and covariance ${\boldsymbol{P}}_{k-1|k-1}$ at time $k-1$, and the measurement ${\boldsymbol{z}}_k$ at time $k$.
Output: the optimal transmit waveform parameters ${\boldsymbol{\theta}}_{k+1}$.
(1) Using the interaction (mixing) input and model filtering steps of the IMM filter, compute each model's estimate at time $k$: $\hat{\boldsymbol{x}}_{k|k}^{\rm{CV}}$, ${\boldsymbol{P}}_{k|k}^{\rm{CV}}$; $\hat{\boldsymbol{x}}_{k|k}^{\rm{CA}}$, ${\boldsymbol{P}}_{k|k}^{\rm{CA}}$; $\hat{\boldsymbol{x}}_{k|k}^{\rm{CT}}$, ${\boldsymbol{P}}_{k|k}^{\rm{CT}}$.
(2) Compute the predicted probability $\bar c_k^{(i)}$ and the predicted state estimation error covariance ${\boldsymbol{P}}_{k+1|k+1}^{(i)}$ of each model using Eqs. (8), (10), (11), and (13).
(3) Obtain $\breve{\boldsymbol{P}}_{k+1|k+1}$ by the weighted fusion in Eq. (37).
(4) if (CBO)
(5)     Find the optimal waveform parameters ${\boldsymbol{\theta}}_{k+1}$ of Eq. (30) or Eq. (34) by grid search.
(6) else (ERQL)
(7)     Compute the predicted reward ${r_{k+1}}$ from Eqs. (38) and (39) and update the Q-table for each waveform using Eq. (35); repeat this step until the required number of one-step predictions has been completed or the Q-table has converged.
(8)     Select the policy with the largest Q-value in the Q-table as the waveform selection policy $\pi_{k+1}^{*}(s)$ for time $k+1$.
(9)     Select the waveform parameters ${\boldsymbol{\theta}}_{k+1}$ according to the waveform selection policy $\pi_{k+1}^{*}(s)$.
(10) end if
(11) Transmit the optimal waveform according to the waveform parameters ${\boldsymbol{\theta}}_{k+1}$.
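The ERQL branch of Table 1 (steps 6 to 9) can be sketched as a small tabular Q-learning loop. This is a toy under stated assumptions, not the paper's implementation. The waveform library `WAVEFORMS`, the diagonal covariances, the reward form (Gaussian entropy reduction for a Kalman update with an identity measurement matrix), and the values of `alpha`, `gamma`, and `n_sweeps` are all hypothetical stand-ins, since Eqs. (35), (38), and (39) are not reproduced in this section.

```python
import math

# Hypothetical waveform library: (range variance, range-rate variance) per waveform.
WAVEFORMS = [(100.0, 1.0), (25.0, 4.0), (4.0, 25.0)]

def entropy_reward(p_pred, r_meas):
    """Entropy reduction of a Gaussian state after a Kalman update,
    assuming H = I and diagonal covariances (stand-in for Eqs. (38)-(39))."""
    return 0.5 * sum(math.log(1.0 + p / r) for p, r in zip(p_pred, r_meas))

def select_waveform(p_pred, n_sweeps=40, alpha=0.5, gamma=0.5):
    """Sweep the Q-table n_sweeps times and return the greedy policy and the table.

    q[s][a]: s is the index of the current waveform, a the index of the next one.
    """
    n = len(WAVEFORMS)
    q = [[0.0] * n for _ in range(n)]
    for _ in range(n_sweeps):
        for s in range(n):
            for a in range(n):
                reward = entropy_reward(p_pred, WAVEFORMS[a])
                # Standard Q-learning target; the next state is the chosen waveform.
                target = reward + gamma * max(q[a])
                q[s][a] += alpha * (target - q[s][a])
    policy = [max(range(n), key=lambda a: q[s][a]) for s in range(n)]
    return policy, q

# Large predicted range uncertainty, small range-rate uncertainty.
policy, q_table = select_waveform((400.0, 1.0))
```

In this toy the sweep count plays the role of the "number of one-step predictions" in step (7), so names such as ERQL-10 and ERQL-40 in the tables would correspond to different settings of `n_sweeps`, if that reading of the labels is correct.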
Table 2. ARMSE comparison results of different methods
| Method | ${\bar X_{\rm{pos}}}$ (m) | ${\bar Y_{\rm{pos}}}$ (m) | ${\bar X_{\rm{vel}}}$ (m/s) | ${\bar Y_{\rm{vel}}}$ (m/s) |
| Fixed-P | 18.05 | 20.47 | 2.88 | 4.10 |
| Min-MSE | 13.83 | 15.55 | 1.50 | 1.93 |
| Max-MI | 14.44 | 15.79 | 1.46 | 1.92 |
| ERQL-10 | 15.40 | 17.98 | 1.87 | 2.55 |
| ERQL-40 | 14.25 | 15.95 | 1.71 | 2.32 |
Table 3. Tracking performance improvement and CPU time of the CBO and ERQL methods relative to the Fixed-P method (%)
| Method | ${X_{\rm{pos}}}$ | ${Y_{\rm{pos}}}$ | ${X_{\rm{vel}}}$ | ${Y_{\rm{vel}}}$ | CPU time |
| Min-MSE | 23.38 | 24.04 | 47.92 | 52.93 | 8619 |
| Max-MI | 20.61 | 22.86 | 49.13 | 53.17 | 7893 |
| ERQL-10 | 14.68 | 12.16 | 34.84 | 37.80 | 283 |
| ERQL-20 | 16.01 | 16.76 | 37.28 | 40.73 | 545 |
| ERQL-40 | 21.05 | 22.08 | 40.63 | 43.41 | 1081 |
| ERQL-80 | 15.51 | 15.68 | 41.11 | 47.07 | 2016 |
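The improvement percentages in Table 3 appear to be the relative ARMSE reduction with respect to Fixed-P, that is, 100 × (baseline − value) / baseline. The short check below applies that assumed formula to the Fixed-P and Min-MSE rows of Table 2; the function and dictionary names are illustrative only. The CPU time column is left out because its baseline is not given in this section.

```python
def improvement_pct(baseline, value):
    """Relative reduction of an error metric with respect to a baseline, in percent."""
    return 100.0 * (baseline - value) / baseline

# ARMSE values copied from Table 2 (positions in m, velocities in m/s).
FIXED_P = {"x_pos": 18.05, "y_pos": 20.47, "x_vel": 2.88, "y_vel": 4.10}
MIN_MSE = {"x_pos": 13.83, "y_pos": 15.55, "x_vel": 1.50, "y_vel": 1.93}

min_mse_gain = {
    key: round(improvement_pct(FIXED_P[key], MIN_MSE[key]), 2) for key in FIXED_P
}
```

Under this formula `min_mse_gain` matches the Min-MSE row of Table 3 (23.38, 24.04, 47.92, 52.93).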