Active Stealth Method based on an Intelligent Electromagnetic Jamming Strategy for Airborne Platforms
Keywords: active stealth; radar network detection; cognitive electronic warfare; reinforcement learning; intelligent electromagnetic jamming strategy generation
Abstract: Current airborne platforms rely primarily on passive stealth techniques, such as shape optimization and radar-absorbing material coatings, to reduce their radar signatures. However, technical bottlenecks constrain their stealth performance across multiple directions and wide frequency bands, so active stealth has gradually become a research focus as a complementary approach. To improve the stealth performance of airborne platforms against distributed radar network detection systems, this paper draws on the characteristics of cognitive electronic warfare and proposes an active stealth method based on a self-defense/escort intelligent electromagnetic jamming strategy, with the goal of substantially reducing radar receivers' ability to perceive both the electromagnetic jamming and the airborne target. Building on flexible jamming beam steering and multiband jamming coverage, the method constructs an adaptive electromagnetic jamming strategy to achieve an equivalent reduction of the target Radar Cross-Section (RCS) over multiple directions and wide frequency bands. Specifically, a reinforcement learning mechanism is introduced to construct an electronic warfare strategy generation framework. First, the cognitive electronic warfare system on the platform itself or on an escort platform senses, in real time, the electromagnetic signals radiated by the external radar network detection system, and a highly complete observation space is built by combining these measurements with prior information. Then, an action space is constructed from jamming parameters such as bandwidth, power, and radiation direction, and a multilevel reward function is designed from the perspectives of influencing radar working states and reducing the risk of jamming exposure. Finally, a reinforcement learning algorithm is used to guide agent training and select the intelligent jamming strategy.
Simulation results show that, compared with conventional passive stealth techniques and fixed jamming strategies, the proposed method effectively reduces both the detection range of the radar network against the airborne platform and its ability to perceive the electromagnetic jamming. The average equivalent RCS reduction for multiband radar stations reaches up to 9.4 dB, and the proportion of time the jamming remains concealed is no lower than 97.83%. The jamming parameters under the generated strategy adjust dynamically to changes in the external electromagnetic environment, which effectively improves the radar stealth performance of airborne platforms and can serve as a reference for the future development of active stealth techniques.
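Table 1. Characteristics of the main radar working states

| Radar working state | Transmitted-signal characteristics |
| --- | --- |
| Search without jamming | Within the main airspace, the set of search waveforms is relatively fixed and generally uses low-PRF signals. Pulse width and PRI may occur in several combinations to meet the energy needs of different beam positions. Beam dwell time can be adjusted flexibly with the area-search task, and multiple beams may search in parallel to improve efficiency. Beam dwell time is short overall, the number of pulses per dwell is fairly fixed, the operating frequency is fixed, and the beam scan order is fixed. |
| Search under jamming | Compared with search without jamming, the radar tends to use medium/high-PRF signals and longer beam dwell times to increase echo energy accumulation. The number of pulses per dwell increases, and anti-jamming measures such as frequency agility, waveform agility, and PRI jitter are used. |
| Tracking without jamming | Compared with search, more waveform types are used (for example, phase coding and frequency coding), with higher range and velocity resolution to meet the tracking needs of different target types. Medium/high-PRF signals are generally used. To keep tracking stable, beam dwell time is long overall and the number of pulses per dwell is large. The operating frequency is fixed, and beam pointing is adjusted in real time according to target maneuvers. |
| Tracking under jamming | Compared with tracking without jamming, beam dwell time is increased to improve echo energy accumulation and the number of pulses per dwell increases. Anti-jamming measures such as frequency agility and waveform agility are used, and multibeam synthesis is applied to improve tracking stability. |
| Identification without jamming | Large-bandwidth linear frequency modulated or specially coded signals are used to pursue maximum resolution, with a long coherent processing interval. Beam dwell time is long and stable overall, showing a "staring" characteristic, and the number of pulses per dwell is large. |
| Identification under jamming | Compared with identification without jamming, anti-jamming measures such as frequency agility and waveform agility are used to maintain the signal-to-interference-plus-noise ratio needed for target identification. |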
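Table 2. Constraint rules of the action space

| Jamming parameter | Constraints based on expert experience |
| --- | --- |
| Jamming type | (1) For radar signals with large pulse width and high power, when noise suppression jamming cannot meet the signal-to-interference-plus-noise ratio requirement at the radar receiver, repeater deception jamming is generally used to obtain high signal-processing gain. For radar signals with small pulse width and low power, noise suppression jamming is generally used to avoid missed pulses during reconnaissance and incomplete time-domain jamming coverage. (2) When the opposing radar is in the search state, wideband suppression jamming is preferred so that false-target signals do not expose the platform prematurely. In the tracking state, if the opposing radar uses wide-range frequency hopping, wideband suppression jamming is preferred to reduce the probability of missed pulses; otherwise repeater deception jamming is preferred to obtain signal-processing gain. |
| Jamming bandwidth | Noise suppression jamming: wideband. Repeater deception jamming: narrowband, equal to the radar signal bandwidth. |
| Jamming power | Limited by the maximum jamming power, the dynamic range, and the minimum step size. |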
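The expert rules in Table 2 can be read as a small decision procedure. The sketch below is illustrative only: the class `RadarObservation`, its field names, the string labels, and in particular the order in which the rules are checked are assumptions made here, since the table does not state how rule (1) and rule (2) are prioritized.

```python
from dataclasses import dataclass

NOISE = "wideband_noise_suppression"
REPEATER = "repeater_deception"


@dataclass
class RadarObservation:
    """Hypothetical summary of what the jammer has sensed about one radar."""
    state: str                    # "search" or "track" (labels assumed here)
    wide_frequency_hopping: bool  # radar hops over a wide frequency range
    short_low_power_pulse: bool   # small pulse width, low power signal


def choose_jamming_type(obs: RadarObservation) -> str:
    # Rule (2): in the search state, prefer wideband suppression so that
    # false targets do not expose the platform early.
    if obs.state == "search":
        return NOISE
    # Rule (2): in the tracking state with wide-range hopping, prefer
    # wideband suppression to reduce missed pulses.
    if obs.wide_frequency_hopping:
        return NOISE
    # Rule (1): short, low-power pulses are also covered with noise.
    if obs.short_low_power_pulse:
        return NOISE
    # Otherwise use repeater deception to gain signal-processing gain.
    return REPEATER


def jamming_bandwidth_mhz(jam_type: str, radar_bw_mhz: float,
                          wideband_mhz: float) -> float:
    # Repeater deception matches the radar bandwidth; noise jamming uses
    # the wideband value chosen by the caller.
    return radar_bw_mhz if jam_type == REPEATER else wideband_mhz
```

In a learning setup, a filter of this kind would typically be applied to mask or correct actions proposed by the agent, but how the paper applies the constraints is not specified in this section.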
Table 3. Mapping between observation space and radar working states
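| Observation element | Corresponding radar working-state analysis |
| --- | --- |
| Signal data rate $ \text{Df} $ | (1) When the number of radar pulses collected per unit sampling time from the same radar rises, that is, $ \text{Df} $ increases, the radar may have switched from search to tracking or identification, or to search under jamming. (2) When $ \text{Df} $ decreases or even drops to 0, the radar is illuminating the platform less often and may have returned to search without jamming. |
| Variance of signal frequency variation $ D\left(f\right) $ | (1) When the frequency of the radar signals collected per unit sampling time varies more, that is, $ D\left(f\right) $ increases, the radar may have adopted frequency agility because it perceives clear jamming, and may be in search, tracking, or identification under jamming. (2) When $ D\left(f\right) $ decreases or even drops to 0, the radar may be in search, tracking, or identification without jamming. |
| Signal pulse repetition interval $ \text{PRI} $ | Related to the signal data rate. (1) When $ \text{PRI} $ decreases, the radar may be increasing the number of pulses accumulated within a single coherent processing interval and raising the signal data rate, and may have switched from search to tracking or identification, or to search under jamming. (2) When $ \text{PRI} $ increases, the radar may have returned to search without jamming. |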
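The qualitative mapping in Table 3 can be expressed as a function from observed changes to state hints. This is a minimal sketch under assumptions: the function name, the use of signed differences between consecutive sampling windows, and the hint labels are all hypothetical, and the table itself only says what each trend "may" indicate.

```python
def radar_state_hints(d_df: float, d_freq_var: float, d_pri: float) -> dict:
    """Map changes in Df, D(f) and PRI to the hints listed in Table 3.

    Each argument is the change of the quantity between two consecutive
    sampling windows (new value minus old value). A zero change gives no hint.
    """
    hints = {}

    # Df up: search -> tracking/identification, or search under jamming.
    if d_df > 0:
        hints["Df"] = "toward_track_identify_or_jammed_search"
    elif d_df < 0:
        hints["Df"] = "toward_unjammed_search"

    # D(f) up: radar likely perceives jamming and uses frequency agility.
    if d_freq_var > 0:
        hints["D(f)"] = "jamming_perceived"
    elif d_freq_var < 0:
        hints["D(f)"] = "no_jamming_perceived"

    # PRI down: more pulses per coherent processing interval.
    if d_pri < 0:
        hints["PRI"] = "toward_track_identify_or_jammed_search"
    elif d_pri > 0:
        hints["PRI"] = "toward_unjammed_search"

    return hints
```

For example, a rising data rate together with a rising frequency variance and a falling PRI would yield three hints that all point away from undisturbed search, which is the situation the reward design aims to avoid.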
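Table 4. Comparison of evaluation results for mainstream algorithms

| Algorithm | Training time (s) | Episodes to convergence | Mean episode reward over the last K episodes (K=200) | Standard deviation of episode reward over the last K episodes (K=200) |
| --- | --- | --- | --- | --- |
| DDPG | 564 | 532 | 394.3 | 12.5 |
| PPO | 342 | 457 | 443.5 | 10.52 |
| TD3 | 436 | 436 | 471.5 | 10.18 |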
Algorithm 1. Training pseudocode of the TD3-based active stealth strategy agent

Input: randomly initialized environment parameters, mainly the positions of the ground radars and the aircraft formation, together with prior information such as the operating parameters of the ground radars and the RCS characteristics of the aircraft formation; Actor network $ \pi_{\phi} $, Critic_1 network $ Q_{\theta_1} $, Critic_2 network $ Q_{\theta_2} $, target Actor network $ \pi_{\phi'} $, target Critic_1 network $ Q_{\theta_1'} $, target Critic_2 network $ Q_{\theta_2'} $, experience replay buffer D, and so on.
Output: active stealth strategy model $ \pi_{\phi} $.

For episode = 1 to M do
  Initialize the airborne platform state $ s_0 $ and the observation feature vector $ \mathbf{ob}_0 $, where $ \mathbf{ob}_0 $ is the one-dimensional vector obtained by flattening $ \mathbf{OB} $, $ \mathbf{ob}_0 \in R^{14\times 1} $
  For time step t = 1 to T do
    $ \pi_{\phi} $ outputs the action $ \mathbf{ac} $ from the current state $ \left(s,\mathbf{ob}\right) $: $ \mathbf{ac}\leftarrow \pi_{\phi}\left(s\right)+\varepsilon $, $ \varepsilon \sim N\left(0,\delta \right) $, where $ \mathbf{ac}\in R^{3\times 1} $ and $ \varepsilon $ is Gaussian noise with mean 0 and variance $ \delta $
    Execute $ \mathbf{ac} $ in the environment, observe the reward $ Rw $ and the next state $ \left(s',\mathbf{ob}'\right) $, and store $ \left(s,\mathbf{ac},Rw,s'\right) $ in the replay buffer D
    Randomly sample N transitions from D for mini-batch training (because the RCS varies strongly as the airborne platform moves, prioritized experience replay is not used here, which helps training converge), and update $ \pi_{\phi} $, $ Q_{\theta_1} $, $ Q_{\theta_2} $, $ \pi_{\phi'} $, $ Q_{\theta_1'} $, $ Q_{\theta_2'} $ as follows
    Compute the output action of the target Actor network $ \pi_{\phi'} $ in state $ s' $: $ \overline{\mathbf{ac}}\leftarrow \pi_{\phi'}\left(s'\right)+\varepsilon' $, $ \varepsilon'\sim \mathrm{clip}\left(N\left(0,\delta \right),-l,l\right) $, where $ l $ is the noise clipping bound
    Use the target Critic networks $ Q_{\theta_1'} $, $ Q_{\theta_2'} $ to compute the target Q value $ y $: $ y\leftarrow Rw+\gamma \min \left(Q_{\theta_1'}\left(s',\overline{\mathbf{ac}}\right),Q_{\theta_2'}\left(s',\overline{\mathbf{ac}}\right)\right) $, where $ Q_{\theta_1'}\left(s',\overline{\mathbf{ac}}\right) $ and $ Q_{\theta_2'}\left(s',\overline{\mathbf{ac}}\right) $ are the action-value estimates of $ Q_{\theta_1'} $ and $ Q_{\theta_2'} $ for $ \overline{\mathbf{ac}} $ in state $ s' $
    Update the Critic networks $ Q_{\theta_1} $, $ Q_{\theta_2} $: $ \theta_i\leftarrow \arg \min_{\theta_i}\left[N^{-1}\sum \left(y-Q_{\theta_i}\left(s,\mathbf{ac}\right)\right)^{2}\right],\ i=1,2 $, where $ Q_{\theta_i}\left(s,\mathbf{ac}\right) $ is the action-value estimate of $ Q_{\theta_1} $, $ Q_{\theta_2} $ for $ \mathbf{ac} $ in state $ s $
    If t mod T = 0 then
      Update $ \pi_{\phi} $ with $ \nabla_{\phi}J\left(\phi \right)=N^{-1}\sum \nabla_{\mathbf{ac}}Q_{\theta_1}\left(s,\mathbf{ac}\right)\big|_{\mathbf{ac}=\pi_{\phi}\left(s\right)}\nabla_{\phi}\pi_{\phi}\left(s\right) $
      Soft-update the parameters of the target networks $ \pi_{\phi'} $, $ Q_{\theta_1'} $, $ Q_{\theta_2'} $: $ \theta_i'\leftarrow \tau \cdot \theta_i+\left(1-\tau \right)\theta_i' $, $ \phi'\leftarrow \tau \cdot \phi +\left(1-\tau \right)\phi' $, where $ \tau $ is the soft-update parameter
    End if
  End for
End for
Output the intelligent jamming strategy model $ \pi_{\phi} $
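The three numerical ingredients of the pseudocode above (clipped target-action noise, the minimum of the two target critics, and the soft update) can be sketched in plain Python. This is an illustration, not the authors' implementation: networks are replaced by plain numbers and lists, the function names are hypothetical, and $ \delta $ is treated as a variance, so the standard deviation passed to the random generator is its square root.

```python
import math
import random


def smoothed_target_action(mu_next, delta, l, rng):
    """Target action = target-actor output + Gaussian noise clipped to [-l, l]."""
    out = []
    for a in mu_next:
        noise = rng.gauss(0.0, math.sqrt(delta))  # delta is a variance
        noise = max(-l, min(l, noise))            # clip(N(0, delta), -l, l)
        out.append(a + noise)
    return out


def td3_target(rw, q1_next, q2_next, gamma):
    """y = Rw + gamma * min(Q1'(s', ac_bar), Q2'(s', ac_bar))."""
    return rw + gamma * min(q1_next, q2_next)


def soft_update(target_params, online_params, tau):
    """theta' <- tau * theta + (1 - tau) * theta', element by element."""
    return [tau * w + (1.0 - tau) * wt
            for w, wt in zip(online_params, target_params)]


if __name__ == "__main__":
    rng = random.Random(0)
    ac_bar = smoothed_target_action([0.2, -0.1, 0.5], delta=0.04, l=0.5, rng=rng)
    y = td3_target(rw=1.0, q1_next=2.0, q2_next=3.0, gamma=0.9)
    new_target = soft_update([0.0, 2.0], [1.0, 2.0], tau=0.1)
    print(ac_bar, y, new_target)
```

Taking the minimum of the two critics is what limits overestimation of the action value; the three-element action in the usage lines mirrors $ \mathbf{ac}\in R^{3\times 1} $, while the numeric values are arbitrary.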
Table 5. The main typical parameters of radar stations
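| Parameter | Radar station 1 | Radar station 2 | Radar station 3 | Radar station 4 |
| --- | --- | --- | --- | --- |
| Transmitter peak power | 40 kW | 100 kW | 100 kW | 50 kW |
| Antenna main-lobe gain | 20 dB | 30 dB | 30 dB | 20 dB |
| Operating band | L band | C band | C band | S band |
| SNR required by the receiver for target detection | 12 dB | 21 dB | 21 dB | 13 dB |
| Transmit/receive loss | 3 dB/3 dB | 3 dB/3 dB | 3 dB/3 dB | 3 dB/3 dB |
| Search signal pulse width/bandwidth/PRI | 150 μs/5 MHz/2000 μs | 30 μs/5 MHz/500 μs | 30 μs/5 MHz/500 μs | 100 μs/5 MHz/1000 μs |
| Tracking signal pulse width/bandwidth/PRI | Combination 1: 100 μs/5 MHz/1500 μs; Combination 2: 50 μs/5 MHz/1000 μs | Combination 1: 15 μs/5 MHz/200 μs; Combination 2: 7 μs/5 MHz/100 μs | Combination 1: 15 μs/5 MHz/200 μs; Combination 2: 7 μs/5 MHz/100 μs | Combination 1: 50 μs/5 MHz/334 μs; Combination 2: 25 μs/5 MHz/167 μs |
| Pulses accumulated per coherent processing interval (search/tracking) | 32/64 | 16/32 | 16/32 | 8/16 |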
Table 6. The main typical parameters of airborne electronic warfare systems
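| Parameter | Value |
| --- | --- |
| Effective radiated power | 100 W |
| Dynamic range | 70 dB |
| Step precision | 1 dB |
| Jamming type | Wideband suppression jamming (noise jamming); repeater deception jamming (slice jamming) |
| Jamming bandwidth | Wideband suppression jamming: 6 bandwidths, [50 MHz, 100 MHz, 150 MHz, 200 MHz, 250 MHz, 300 MHz]; repeater deception jamming: equal to the radar signal bandwidth |
| Receive loss | 4 dB |
| Polarization mismatch loss | 3 dB |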
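The power and bandwidth entries in Table 6 imply a discrete grid of admissible jamming settings. The sketch below enumerates that grid under one assumption that the table does not state: the 70 dB dynamic range is taken to extend downward from the maximum effective radiated power of 100 W (20 dBW) in 1 dB steps. All names are hypothetical.

```python
import math

MAX_ERP_W = 100.0          # effective radiated power from Table 6
DYNAMIC_RANGE_DB = 70      # dynamic range from Table 6
STEP_DB = 1                # step precision from Table 6
NOISE_BANDWIDTHS_MHZ = [50, 100, 150, 200, 250, 300]


def power_levels_dbw():
    """All admissible ERP levels in dBW, highest first (assumed layout)."""
    max_dbw = 10.0 * math.log10(MAX_ERP_W)
    n_steps = DYNAMIC_RANGE_DB // STEP_DB
    return [max_dbw - k * STEP_DB for k in range(n_steps + 1)]


def snap_power_dbw(requested_dbw):
    """Clamp a requested power to the grid and round to the nearest step."""
    levels = power_levels_dbw()
    clamped = max(levels[-1], min(levels[0], requested_dbw))
    k = round((levels[0] - clamped) / STEP_DB)
    return levels[k]


if __name__ == "__main__":
    levels = power_levels_dbw()
    print(len(levels), levels[0], levels[-1])
    print(len(levels) * len(NOISE_BANDWIDTHS_MHZ), "noise power/bandwidth pairs")
    print(snap_power_dbw(12.4))
```

A continuous actor output, as produced by TD3, would need to be snapped to this grid before being applied; whether the paper quantizes in this way is not stated in this section.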
Table 7. The working state proportions of each radar station and radar network detection system without jamming strategy (%)
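| Type | Search without jamming | Tracking without jamming |
| --- | --- | --- |
| Radar station 1 | 1.63 | 98.37 |
| Radar station 2 | 37.13 | 62.87 |
| Radar station 3 | 35.77 | 64.23 |
| Radar station 4 | 22.22 | 77.78 |
| Radar network detection system | 0 | 100 |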
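Table 8. The working state proportions of each radar station and radar network detection system with fixed jamming strategy (%)

| Type | Search without jamming | Search under jamming | Tracking without jamming | Tracking under jamming |
| --- | --- | --- | --- | --- |
| Radar station 1 | 1.08 | 98.92 | 0.00 | 0.00 |
| Radar station 2 | 1.08 | 98.92 | 0.00 | 0.00 |
| Radar station 3 | 1.08 | 98.92 | 0.00 | 0.00 |
| Radar station 4 | 1.08 | 98.92 | 0.00 | 0.00 |
| Radar network detection system | 1.08 | 98.92 | 0.00 | 0.00 |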
Table 9. The working state proportions of each radar station and radar network detection system with intelligent jamming strategy (%)
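| Type | Search without jamming | Search under jamming | Tracking without jamming | Tracking under jamming |
| --- | --- | --- | --- | --- |
| Radar station 1 | 95.93 | 0.00 | 1.90 | 2.17 |
| Radar station 2 | 99.19 | 0.00 | 0.81 | 0.00 |
| Radar station 3 | 88.00 | 0.00 | 10.40 | 1.60 |
| Radar station 4 | 97.02 | 0.54 | 1.63 | 0.81 |
| Radar network detection system | 79.95 | 0.54 | 14.63 | 4.88 |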
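Table 10. Equivalent RCS reduction for each radar station (dB)

| Type | Maximum equivalent RCS reduction | Minimum equivalent RCS reduction | Average equivalent RCS reduction |
| --- | --- | --- | --- |
| Radar station 1 | 23.67 | -21.64 | 9.40 |
| Radar station 2 | 18.43 | -33.13 | 2.91 |
| Radar station 3 | 17.52 | -26.78 | 2.84 |
| Radar station 4 | 20.51 | -31.30 | 3.24 |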
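To give the decibel values in Table 10 a physical scale, the sketch below converts an equivalent RCS reduction into a linear ratio and into a detection-range factor. The range factor relies on the standard noise-limited radar range equation, in which maximum range scales with the fourth root of RCS; this is an assumption introduced here for illustration and is not necessarily how the paper's simulation computes detection range under jamming. The function names are hypothetical.

```python
def db_to_ratio(db):
    """Convert a decibel value to a linear power ratio."""
    return 10.0 ** (db / 10.0)


def detection_range_factor(rcs_reduction_db):
    """Relative detection range after an equivalent RCS reduction.

    Assumes R_max is proportional to RCS ** 0.25 (noise-limited radar
    range equation), so a reduction of x dB scales range by 10 ** (-x / 40).
    """
    return 10.0 ** (-rcs_reduction_db / 40.0)


if __name__ == "__main__":
    # Average equivalent RCS reductions from Table 10, in dB.
    table_10_avg = {"Radar station 1": 9.40, "Radar station 2": 2.91,
                    "Radar station 3": 2.84, "Radar station 4": 3.24}
    for name, db in table_10_avg.items():
        print(name, round(db_to_ratio(db), 2),
              round(detection_range_factor(db), 3))
```

Under this assumption, the 9.40 dB average for radar station 1 corresponds to an equivalent RCS roughly 8.7 times smaller and a detection range of roughly 0.58 times the original. A negative entry, such as the minimum values in the table, corresponds to an equivalent RCS that is larger than without jamming.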
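Table 11. The jamming stealth ratio for each radar station and radar network detection system with intelligent jamming strategy (%)

| Type | Jamming exposure ratio | Jamming concealment ratio |
| --- | --- | --- |
| Radar station 1 | 2.17 | 97.83 |
| Radar station 2 | 0.00 | 100.00 |
| Radar station 3 | 1.60 | 98.40 |
| Radar station 4 | 1.35 | 98.65 |
| Radar network detection system | 5.42 | 94.58 |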
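The values in Table 11 appear to follow from Table 9: for every row, the exposure ratio equals the share of time spent in the two "under jamming" states, and the concealment ratio is the remainder. The sketch below checks this reading. The relationship is inferred here from the numbers and is not stated explicitly in this section; the names are hypothetical.

```python
# Rows of Table 9: (search without jamming, search under jamming,
#                   tracking without jamming, tracking under jamming), in %.
TABLE_9 = {
    "Radar station 1": (95.93, 0.00, 1.90, 2.17),
    "Radar station 2": (99.19, 0.00, 0.81, 0.00),
    "Radar station 3": (88.00, 0.00, 10.40, 1.60),
    "Radar station 4": (97.02, 0.54, 1.63, 0.81),
    "Radar network detection system": (79.95, 0.54, 14.63, 4.88),
}


def exposure_and_concealment(row):
    """Return (exposure %, concealment %) from one Table 9 row."""
    _, search_jammed, _, track_jammed = row
    exposure = search_jammed + track_jammed
    return round(exposure, 2), round(100.0 - exposure, 2)


if __name__ == "__main__":
    for name, row in TABLE_9.items():
        print(name, exposure_and_concealment(row))
```

Read this way, the 97.83% figure quoted in the abstract is the lowest per-station concealment ratio (radar station 1), while the network-level value in Table 11 is 94.58%.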