2023 Vol. 12, No. 3
2023, 12(3): 471-499.
Multi-Radar Collaborative Surveillance (MRCS) technology enables a geographically distributed detection configuration through the linkage of multiple radars, which can fully exploit detection gains from spatial and frequency diversity, thereby enhancing the detection performance and viability of radar systems in complex electromagnetic environments. MRCS is one of the key development directions in radar technology and has received extensive attention in recent years. Considerable research on MRCS has been conducted, and numerous achievements in system architecture design, signal processing, and resource scheduling for MRCS have been accumulated. This paper first summarizes the concept of MRCS technology, elaborates on the signal processing-based closed-loop mechanism of cognitive collaboration, and analyzes the challenges faced in implementing MRCS. Then, the paper focuses on cognitive tracking and resource scheduling algorithms and provides a technical summary of the defining characteristics, system configuration, tracking model, information fusion, performance evaluation, resource scheduling algorithms, optimization criteria, and cognitive process of cognitive tracking. The relationship between multi-radar cognitive tracking and its system resource scheduling is further analyzed. Subsequently, the recent research trends of cognitive tracking and resource scheduling algorithms are identified and summarized in terms of five aspects: radar resource elements, information fusion architectures, tracking performance indicators, resource scheduling models, and complex task scenarios. Finally, the full text is summarized and future technical directions in this field are discussed to provide a reference for subsequent research on related technologies.
2023, 12(3): 500-515.
Compared with single-radar systems, spatially separated networked radar usually achieves better detection performance owing to spatial and frequency diversity. Most current fusion detection methods for networked radar systems rely only on the echo amplitude of the target and ignore the Doppler information that a coherent radar system can obtain to assist target detection. Intuitively, the spatial position and radial velocity of a target observed by the different radars in a networked radar system should satisfy certain physical constraints, under which true and false targets can be distinguished. Based on this consideration, this paper proposes a Doppler-information-aided fusion detection algorithm for networked radar. First, a set of inequalities is constructed from the coupling between the azimuth and Doppler velocity of the same target as observed by multiple radar stations. Then, the two-phase method, an algorithm from operations research, is used to determine whether the inequalities have a feasible solution, and the target is declared present or absent accordingly. Finally, simulations show that the proposed algorithm effectively improves the fusion detection performance of the networked radar system. Additionally, the influence of radar station location and target position on the fusion detection performance of the proposed algorithm is analyzed.
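As a rough illustration of the feasibility test described in this abstract, the Python sketch below checks whether the Doppler readings of several stations can be explained by one common target velocity. It is a minimal sketch under stated assumptions, not the paper's method: the target position is taken as known, the geometry is two-dimensional, Doppler velocity is defined as the projection of the velocity onto the line of sight, and the tolerance `tol` and the names `line_of_sight`, `doppler_constraints`, and `feasible` are hypothetical. A brute-force vertex enumeration stands in for the two-phase simplex method, which is practical only because there are just two unknowns.

```python
import math
from itertools import combinations


def line_of_sight(radar, target):
    """Unit vector pointing from a radar station to the target (2-D)."""
    dx, dy = target[0] - radar[0], target[1] - radar[1]
    r = math.hypot(dx, dy)
    return dx / r, dy / r


def doppler_constraints(radars, target, dopplers, tol):
    """Build half-plane constraints a*vx + b*vy <= c on the unknown velocity.

    Each station contributes |u . v - d| <= tol, written as two inequalities.
    """
    rows = []
    for radar, d in zip(radars, dopplers):
        ux, uy = line_of_sight(radar, target)
        rows.append((ux, uy, d + tol))      #  u . v <= d + tol
        rows.append((-ux, -uy, tol - d))    # -u . v <= tol - d
    return rows


def feasible(rows, eps=1e-9):
    """Return True if some velocity satisfies every inequality.

    Assumes at least two non-parallel lines of sight, so the feasible
    region is bounded and, if nonempty, has a vertex where two
    constraint boundaries intersect.
    """
    for (a1, b1, c1), (a2, b2, c2) in combinations(rows, 2):
        det = a1 * b2 - a2 * b1
        if abs(det) < 1e-12:
            continue  # parallel boundaries, no intersection point
        vx = (c1 * b2 - c2 * b1) / det
        vy = (a1 * c2 - a2 * c1) / det
        if all(a * vx + b * vy <= c + eps for a, b, c in rows):
            return True
    return False
```

A set of Doppler readings generated from one true velocity passes the test, whereas readings in which one station reports a value far from what the others imply fail it, which is the intuition behind using feasibility to reject false targets.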
2023, 12(3): 516-528.
Multistation cooperative radar target recognition aims to enhance recognition performance by utilizing the complementarity between multistation information. Conventional multistation cooperative target recognition methods do not explicitly consider interstation data differences and typically adopt relatively simple fusion strategies, which makes it difficult to obtain accurate and robust recognition performance. In this study, we propose an angle-guided transformer fusion network for multistation radar High-Resolution Range Profile (HRRP) target recognition. A feature extraction network with a transformer as its main structure extracts the local and global features of each single-station HRRP. Furthermore, three new auxiliary modules are created to facilitate fusion learning: the angle-guided module, the prefeature interaction module, and the deep attention feature fusion module. First, the angle-guided module enhances the robustness and consistency of features by modeling data differences between multiple stations and reinforces individual features associated with the observation perspective. Second, the fusion approach is optimized, and the multilevel hierarchical fusion of multistation features is achieved by combining the prefeature interaction module and the deep attention feature fusion module. Finally, experiments are conducted on simulated multistation scenarios built from measured data, and the results demonstrate that our approach can effectively enhance the performance of target recognition in multistation coordination.
2023, 12(3): 529-540.
To reduce the probability of an Unmanned Aerial Vehicle (UAV) being destroyed during a reconnaissance mission, this study proposes a path planning algorithm that reduces the target threat. First, high-resolution airborne radar is used for robust tracking and estimation of multiple extended targets. Subsequently, the targets are classified based on the threat degree calculated via fuzzy TOPSIS (Technique for Order Preference by Similarity to an Ideal Solution). Next, path planning of the UAV is performed using the joint optimization of multiple task decisions (the joint evaluation of the target threat degree and target tracking performance) as the evaluation criterion. The simulation results indicate that the fuzzy threat assessment method is effective in multiple extended target tracking, and the proposed UAV path planning algorithm is reasonable. Thus, the target threat is efficiently reduced without loss of tracking accuracy.
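To make the threat-ranking step more concrete, here is a minimal Python sketch of classical (crisp) TOPSIS. It is an illustration only: the paper uses a fuzzy variant whose membership functions are not given in the abstract, so plain numeric attributes are swapped in, and the function name `topsis` and the example attributes (speed and range) are hypothetical.

```python
import math


def topsis(matrix, weights, benefit):
    """Rank alternatives by closeness to the ideal solution (crisp TOPSIS).

    matrix  : one row per target, one column per attribute
    weights : attribute weights
    benefit : True where a larger value means a higher threat score
    Assumes no column is all zeros and the rows are not all identical.
    """
    n_crit = len(weights)
    # Vector-normalize each column, then apply the attribute weights.
    norms = [math.sqrt(sum(row[j] ** 2 for row in matrix)) for j in range(n_crit)]
    v = [[weights[j] * row[j] / norms[j] for j in range(n_crit)] for row in matrix]
    cols = list(zip(*v))
    best = [max(c) if benefit[j] else min(c) for j, c in enumerate(cols)]
    worst = [min(c) if benefit[j] else max(c) for j, c in enumerate(cols)]
    scores = []
    for row in v:
        d_best = math.dist(row, best)    # distance to the ideal solution
        d_worst = math.dist(row, worst)  # distance to the anti-ideal solution
        scores.append(d_worst / (d_best + d_worst))
    return scores


# Hypothetical targets described by (speed in m/s, range in km):
# faster and closer targets are treated as more threatening.
targets = [[300.0, 20.0], [150.0, 60.0], [250.0, 40.0]]
threat = topsis(targets, weights=[0.5, 0.5], benefit=[True, False])
```

A score of 1 means the target coincides with the ideal (most threatening) profile on every attribute, and a score of 0 means it coincides with the least threatening one.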
2023, 12(3): 541-549.
Most traditional multi-aircraft flight path optimization methods are oriented toward area coverage, use static optimization models, and face the challenge of model mismatch in complex dynamic environments. Therefore, this study proposes a flight path optimization method for dynamic area coverage based on multi-aircraft radars. First, we introduce an attenuation factor to characterize the actual coverage effect of airborne radar in a dynamic environment, and we take the area coverage rate under dynamic area coverage as the objective function. After integrating the constraints on the multi-dimensional flight path control parameters to be optimized, we build a mathematical model for dynamic area coverage flight path optimization based on multi-aircraft radars. Then, a stochastic optimization method is used to solve the flight path optimization problem of dynamic area coverage. Finally, the simulation results show that the proposed method significantly improves the coverage performance in dynamic areas compared with the search mode using preset flight paths. Compared with the traditional flight path optimization method oriented to static environments, the dynamic coverage performance of the proposed method is improved by approximately 6% on average.
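The role of the attenuation factor can be illustrated with a small Python sketch. This is a toy model under stated assumptions, not the paper's formulation: the area is a grid of cells, a cell's coverage value is reset to 1 whenever it lies within the radar radius of some aircraft and is otherwise multiplied by the hypothetical factor `decay` at each time step, and the objective is the time-averaged mean coverage value. The function name `dynamic_coverage_rate` is also hypothetical.

```python
def dynamic_coverage_rate(paths, grid, radius, decay):
    """Time-averaged coverage of a grid whose coverage values decay.

    paths  : one waypoint list [(x, y), ...] per aircraft, equal lengths
    grid   : (width, height) in cells
    radius : radar coverage radius in cells
    decay  : attenuation factor in [0, 1]; 1.0 gives static coverage
    """
    width, height = grid
    value = [[0.0] * width for _ in range(height)]
    steps = len(paths[0])
    total = 0.0
    for t in range(steps):
        for y in range(height):
            for x in range(width):
                value[y][x] *= decay  # previously gathered information ages
                if any((x - p[t][0]) ** 2 + (y - p[t][1]) ** 2 <= radius ** 2
                       for p in paths):
                    value[y][x] = 1.0  # freshly observed cell
        total += sum(map(sum, value)) / (width * height)
    return total / steps


# One aircraft over a 4-cell strip: hovering versus sweeping.
hover = dynamic_coverage_rate([[(0, 0)] * 4], (4, 1), radius=0, decay=0.5)
sweep = dynamic_coverage_rate([[(0, 0), (1, 0), (2, 0), (3, 0)]], (4, 1),
                              radius=0, decay=0.5)
```

In this toy setting the sweeping path scores higher than hovering, and a stochastic optimizer would search over candidate paths to maximize the same quantity.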
2023, 12(3): 550-562.
This paper proposes a utility maximization-based multiradar online task planning algorithm for the real-time multitask planning problem. With the maximization of the task utility function as the objective, multiradar task planning is formulated as an integer programming-based mixed multivariable optimization problem. Then, two algorithms, namely heuristic greedy search and convex relaxation-based two-step decoupling, are proposed to solve the resulting NP-hard optimization problem in polynomial time. Simulation experiments demonstrate that, compared with the optimal exhaustive search algorithm, the proposed algorithms can effectively reduce the computing time or improve solution efficiency such that the real-time requirement of online task planning can be satisfied.
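The flavor of a heuristic greedy search for this kind of assignment problem can be sketched in Python as follows. This is a simplified stand-in, not the paper's algorithm: each task is assumed to have a fixed utility and duration, each radar a single time budget, and tasks are taken in order of utility per unit time and given to the radar with the most remaining time. The names `greedy_plan`, `tasks`, and `budgets` are hypothetical.

```python
def greedy_plan(tasks, budgets):
    """Greedy task-to-radar assignment.

    tasks   : list of (name, utility, duration)
    budgets : time available on each radar
    Returns (plan, total_utility), where plan maps task name -> radar index.
    """
    remaining = list(budgets)
    plan = {}
    total = 0.0
    # Consider the most "profitable" tasks first (utility per unit time).
    for name, utility, duration in sorted(tasks, key=lambda t: t[1] / t[2],
                                          reverse=True):
        radar = max(range(len(remaining)), key=lambda i: remaining[i])
        if remaining[radar] >= duration:
            remaining[radar] -= duration
            plan[name] = radar
            total += utility
        # Otherwise the task is dropped in this planning interval.
    return plan, total


tasks = [("track_A", 9.0, 3.0), ("search_B", 4.0, 4.0),
         ("track_C", 5.0, 2.0), ("search_D", 1.0, 5.0)]
plan, total = greedy_plan(tasks, budgets=[5.0, 4.0])
```

The sort dominates the cost, so the run time grows as O(n log n) in the number of tasks, in contrast to exhaustive search over all assignments; the price is that the result is not guaranteed to be optimal.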
2023, 12(3): 563-575.
The joint optimization problem of transmit power and dwell time for asynchronous multi-target tracking in heterogeneous multiple radar networks with imperfect detection is investigated. First, all the asynchronous measurements from different radar nodes in each fusion sampling interval are fused into a composite measurement. The analytical Bayesian Cramér-Rao Lower Bound (BCRLB) of the asynchronous target tracking error, as a function of radar node selection, transmit power, and dwell time under imperfect detection, is then derived and used as the asynchronous target tracking accuracy measure. Based on this, a joint optimization model of transmit power and dwell time for asynchronous multi-target tracking in heterogeneous multiple radar networks with imperfect detection is established. With the objective of minimizing the asynchronous multi-target tracking error under the given system transmit resource constraints, the radar node selection, transmit power, and dwell time in the different radar networks are designed adaptively to improve the asynchronous multi-target tracking accuracy of the heterogeneous multiple radar network system. Finally, a four-step decomposition algorithm combining the Sequential Quadratic Programming (SQP) algorithm and the cyclic minimization method is used to solve the optimization problem. Simulation results demonstrate that the proposed algorithm achieves higher asynchronous multi-target tracking accuracy for the heterogeneous multiple radar networks than existing algorithms.
2023, 12(3): 576-589.
This paper establishes a hybrid distributed Phased-Array Multiple-Input Multiple-Output (PA-MIMO) radar system model, which combines coherent processing gain and spatial diversity gain to synergistically improve the target detection performance. We derive a Likelihood Ratio Test (LRT) detector based on the Neyman-Pearson (NP) criterion for the hybrid distributed PA-MIMO radar system. The coherent processing gain and spatial diversity gain are jointly optimized by implementing subarray-level and array-element-level optimal configurations at the transceiver and transmitter ends. Moreover, a Quantum Particle Swarm Optimization-based Stochastic Rounding (SR-QPSO) algorithm is proposed for the integer programming-based configuration model. This algorithm ensures that the optimal array-element configuration strategy is obtained with fewer iterations and achieves the joint optimization of the subarray and array-element levels. Finally, simulations verify that the proposed optimal configuration offers substantial improvements compared to other typical radar systems, with a detection probability of 0.98 and an effective range of 1166.3 km, as well as a considerably improved detection performance.
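The abstract does not spell out how stochastic rounding is applied within SR-QPSO, so the following Python sketch only shows the usual definition, under the assumption that it is used to map a particle's real-valued position onto integer element counts: a value is rounded up with probability equal to its fractional part, which keeps the rounding unbiased on average. The names `stochastic_round` and `round_position` are hypothetical.

```python
import math
import random


def stochastic_round(x, rng=random):
    """Round x down or up at random; the expected value equals x."""
    low = math.floor(x)
    return low + (1 if rng.random() < x - low else 0)


def round_position(position, rng=random):
    """Map a real-valued particle position to integer element counts."""
    return [stochastic_round(x, rng) for x in position]


# Example: a continuous particle position turned into integer counts.
rng = random.Random(0)
counts = round_position([3.0, 7.25, 12.9], rng)
```

Compared with always rounding to the nearest integer, this lets particles that sit between two integer configurations explore both, which is one plausible reason to pair it with a swarm optimizer.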
2023, 12(3): 590-601.
For the resource allocation problem of multitarget tracking in a spectral coexistence environment, this study proposes a joint transmit power and dwell time allocation algorithm for radar networks. First, the predicted Bayesian Cramér-Rao Lower Bound (BCRLB) with the variables of radar node selection, transmit power, and dwell time is derived as the performance metric for multitarget tracking accuracy. On this basis, a joint optimization model of transmit power and dwell time allocation for multitarget tracking in radar networks under spectral coexistence is built to collaboratively optimize the radar node selection, transmit power, and dwell time of radar networks. This joint optimization model aims to minimize the multitarget tracking BCRLB while satisfying the given transmit resources of radar networks and the predetermined maximum allowable interference energy threshold of the communication base station. Subsequently, a two-step decomposition method is used to decompose the optimization problem into multiple convex subproblems, which are solved by combining the Semi-Definite Programming (SDP) and cyclic minimization algorithms. The simulation results show that, compared with the existing algorithms, the proposed algorithm can effectively improve the multitarget tracking accuracy of radar networks while ensuring that the communication base station works properly.
2023, 12(3): 602-615.
This study proposes a fast power allocation algorithm under a low interception background for a collocated MIMO radar that simultaneously tracks multiple maneuvering targets. First, the target maneuver process is modeled as an Adaptive Current Statistical (ACS) model, and a particle filter is used to estimate the state of each target. Second, the Predicted Conditional Cramér-Rao Lower Bound (PC-CRLB) is derived, and a comprehensive target threat assessment model is constructed based on the target motion and electromagnetic characteristics. Subsequently, an optimization model with respect to transmit power is established, with the weighted sum of the target tracking error evaluation index and the probability of the radar not being intercepted as the optimization objective. Thereafter, a solving algorithm based on sequence relaxation is proposed, which exploits the monotonically decreasing property of the objective function. Finally, a simulation is conducted to verify the effectiveness and timeliness of the proposed algorithm. The results indicate that the proposed algorithm can effectively improve the target tracking accuracy and low interception performance of the radar system. Further, it runs nearly 50% faster than the interior point method.
2023, 12(3): 616-628.
In this study, a real-time dwell scheduling algorithm based on pulse interleaving is proposed for a distributed radar network system. A time pointer vector is introduced to indicate the moment when the dwell task with the highest synthetic priority should be chosen. This task is then allocated to the radar node with the lowest interleaving time utilization ratio, effectively reducing the time gaps during scheduling. Meanwhile, the pulse interleaving analysis determines whether the assigned dwell task can be scheduled successfully on the corresponding radar node. The time slot occupation matrix and energy consumption matrix are introduced to indicate the time and energy resource consumption of radar nodes, which not only simplifies the pulse interleaving analysis process but also enables pulse interleaving among tasks with different pulse repetition intervals and pulse numbers. Furthermore, to improve the efficiency of dwell scheduling, a threshold on the interleaving time utilization ratio is set to adaptively choose the sliding step of the time pointer. The simulation results reveal that the proposed algorithm can execute real-time dwell scheduling for a distributed radar network system and achieve better scheduling performance than the existing dwell scheduling algorithm.
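The idea of using a time slot occupation record to test whether a dwell can be interleaved can be sketched in Python as follows. This is a minimal single-node illustration under stated assumptions, not the paper's algorithm: time is discretized into slots, each pulse occupies a transmit burst and a receive window with a free waiting period in between, the pulse repetition interval is at least as long as one pulse, and energy constraints are ignored. The names `pulse_slots` and `try_schedule` are hypothetical.

```python
def pulse_slots(start, pri, n_pulses, tx_len, wait_len, rx_len):
    """Slots occupied by a dwell: per pulse, a transmit burst and a receive
    window; the waiting period between them stays free for other tasks."""
    slots = []
    for k in range(n_pulses):
        base = start + k * pri
        slots.extend(range(base, base + tx_len))
        rx_start = base + tx_len + wait_len
        slots.extend(range(rx_start, rx_start + rx_len))
    return slots


def try_schedule(occupied, start, pri, n_pulses, tx_len, wait_len, rx_len):
    """Place the dwell at `start` if none of its slots collide.

    occupied : list of booleans, one per time slot of a single radar node
    Returns True and marks the slots on success, False otherwise.
    """
    slots = pulse_slots(start, pri, n_pulses, tx_len, wait_len, rx_len)
    if max(slots) >= len(occupied) or any(occupied[s] for s in slots):
        return False
    for s in slots:
        occupied[s] = True
    return True


timeline = [False] * 20
task = dict(pri=6, n_pulses=2, tx_len=1, wait_len=3, rx_len=1)
first = try_schedule(timeline, 0, **task)    # takes slots 0, 4, 6, 10
second = try_schedule(timeline, 1, **task)   # fits into the waiting gaps
clash = try_schedule(timeline, 0, **task)    # collides with the first dwell
```

Because the check works slot by slot, dwells with different pulse repetition intervals or pulse counts can be tested against the same occupation record without special cases.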
2023, 12(3): 629-641.
For the Multi-Target Tracking (MTT) of distributed netted phased array radars, this paper proposes a joint beam and dwell time allocation algorithm driven by dynamic threats. First, a Bayesian Cramér-Rao Lower Bound (BCRLB) that incorporates the beam and dwell time allocation variables is derived. Then, a comprehensive threat evaluation scale is constructed based on the real-time motion state of the target, and a utility function based on the tracking accuracy reference threshold and contributed weights is designed for targets with different threats to measure the relationship of resource allocation prioritization among multiple targets. Afterward, an optimal distribution model of the joint beam and dwell time driven by the dynamic threat of the target is established; the utility function is combined with the resources of the netted phased array radar system. Finally, the problem is solved using a reward-based iterative descent search algorithm, and the effectiveness of the algorithm is verified via simulation. The simulation results show that the proposed algorithm can determine the tracking accuracy requirements of different targets and allocate tracking resources based on the multi-target threat assessment results, thereby improving the comprehensive tracking accuracy of netted phased array radars.
2023, 12(3): 642-656.
Traditional networked radar power allocation is typically optimized for a given jamming model, while jammer resource allocation is optimized for a given radar power allocation method; such research lacks game-theoretic interaction between the two sides. Given the increasingly intense confrontation between radars and jammers in combat scenarios, this study proposes a deep game problem of networked radar power allocation under escort suppression jamming, in which intelligent target jamming is trained using Deep Reinforcement Learning (DRL). First, the jammer and the networked radar are modeled as two agents. Based on the jamming model and the radar detection model, the target detection model of the networked radar under suppression jamming and the objective function for maximizing the target detection probability are established. For the networked radar agent, the radar power allocation vector is generated by a Proximal Policy Optimization (PPO) policy network. For the jammer agent, a hybrid policy network is designed to generate beam selection and power allocation actions simultaneously. Domain knowledge is introduced to construct more effective reward functions. Three kinds of domain knowledge, namely the target detection model, the equal power allocation strategy, and the greedy interference power allocation strategy, are employed to produce guided rewards for the networked radar agent and the jammer agent, respectively. Consequently, the learning efficiency and performance of the agents are improved. Lastly, alternating training is used to learn the policy network parameters of both agents. The experimental results show that when the jammer adopts the DRL-based resource allocation strategy, the DRL-based networked radar power allocation significantly outperforms the particle swarm-based and artificial fish swarm-based networked radar power allocation in both target detection probability and run time.
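To make the idea of a guided reward concrete, the Python sketch below scores a radar power allocation against the equal power allocation baseline mentioned in the abstract. Everything in it is an assumption for illustration rather than the paper's model: the per-radar detection probability uses the single-pulse Swerling I relation Pd = Pfa^(1/(1+SINR)), the radars are fused with an OR rule, and the names `fused_detection_prob`, `guided_reward`, `gain`, and `jam` are hypothetical.

```python
def fused_detection_prob(power, gain, jam, noise=1.0, pfa=1e-6):
    """Network detection probability under an assumed OR-rule fusion.

    power[i] : transmit power assigned to radar i
    gain[i]  : assumed lumped propagation/RCS gain for radar i
    jam[i]   : assumed jamming power received by radar i
    """
    miss = 1.0
    for p, g, j in zip(power, gain, jam):
        sinr = p * g / (noise + j)        # signal-to-interference-plus-noise ratio
        pd = pfa ** (1.0 / (1.0 + sinr))  # single-pulse Swerling I detection probability
        miss *= 1.0 - pd                  # the target is missed only if every radar misses
    return 1.0 - miss


def guided_reward(power, gain, jam):
    """Reward = gain in detection probability over equal power allocation."""
    n = len(power)
    equal = [sum(power) / n] * n          # baseline with the same total power
    return fused_detection_prob(power, gain, jam) - fused_detection_prob(equal, gain, jam)


if __name__ == "__main__":
    gain = [10.0, 10.0, 10.0]
    jam = [8.0, 0.5, 0.5]                 # the first radar is heavily jammed
    print(guided_reward([0.5, 1.25, 1.25], gain, jam))
```

A positive reward indicates that the candidate allocation beats the equal power baseline under this toy model, which is one simple way a baseline strategy could be used to shape the reward signal for a policy network.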