Most Downloaded

1
Deep learning is primarily used for target detection in Synthetic Aperture Radar (SAR) images; however, its performance heavily relies on large-scale labeled datasets. The detection performance of deep learning models degrades when applied to SAR data with varying distributions, hindering their real-world applicability. In addition, manual labeling of SAR data is costly. Hence, cross-domain learning strategies based on multisource information are being explored to address these challenges. These strategies can assist detection models in realizing cross-domain knowledge migration by integrating prior information from optical remote sensing images or heterogeneous SAR images acquired from different sensors.
This paper focuses on cross-domain learning technologies within the deep learning framework. In addition, it provides a systematic overview of the latest research progress in this field and analyzes the core issues, advantages, and applicable scenarios of existing technologies from a methodological perspective. It outlines future research directions based on the law of technological evolution, aiming to offer theoretical support and methodological references to enhance the generalizability of target detection in SAR images.
2
This study addresses the issue of fine-grained feature extraction and classification for Low-Slow-Small (LSS) targets, such as birds and drones, by proposing a multi-band multi-angle feature fusion classification method. First, data from five types of rotorcraft drones and bird models were collected at multiple angles using K-band and L-band frequency-modulated continuous-wave radars, forming a dataset for LSS target detection.
Second, to capture the periodic vibration characteristics of the L-band target signals, empirical mode decomposition was applied to extract high-frequency features and reduce noise interference. For the K-band echo signals, short-time Fourier transform was applied to obtain high-resolution micro-Doppler features from various angles. Based on these features, a Multi-band Multi-angle Feature Fusion Network (MMFFNet) was designed, incorporating an improved convolutional long short-term memory network for temporal feature extraction, along with an attention fusion module and a multiscale feature fusion module. The proposed architecture improves target classification accuracy by integrating features from both bands and angles. Validation using a real-world dataset showed that compared with methods relying on single radar features, the proposed approach improved the classification accuracy for seven types of LSS targets by 3.1% under a high Signal-to-Noise Ratio (SNR) of 5 dB and by 12.3% under a low SNR of −3 dB.
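The short-time Fourier transform step mentioned above can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation: the function name `stft`, the Hann window, the frame length, and the hop size are all assumptions chosen for illustration, and a synthetic complex tone stands in for a real K-band echo.

```python
import cmath
import math


def stft(signal, frame_len, hop):
    """Return one DFT spectrum (list of complex bins) per windowed frame."""
    # Periodic Hann window (an assumed choice; the abstract names no window).
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / frame_len)
              for n in range(frame_len)]
    frames = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        seg = [signal[start + n] * window[n] for n in range(frame_len)]
        # Direct DFT of the windowed frame; slow but dependency-free.
        spectrum = [
            sum(seg[n] * cmath.exp(-2j * math.pi * k * n / frame_len)
                for n in range(frame_len))
            for k in range(frame_len)
        ]
        frames.append(spectrum)
    return frames


# Synthetic stand-in for an echo: a complex tone centred on bin 4 of a 16-point frame.
tone = [cmath.exp(2j * math.pi * 4 * n / 16) for n in range(64)]
spec = stft(tone, frame_len=16, hop=8)
# Strongest frequency bin in each time frame.
peak_bins = [max(range(16), key=lambda k: abs(frame[k])) for frame in spec]
```

Each row of `spec` is the spectrum of one time frame, so plotting the magnitudes over frames gives the kind of time-frequency map from which micro-Doppler signatures are read. A practical pipeline would use an FFT routine rather than the direct sum shown here.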
3
With the increasing demands on imaging accuracy, efficiency, and robustness in modern three-Dimensional (3D) Synthetic Aperture Radar (SAR) imaging systems, the performance of traditional 3D imaging methods, such as matched filtering and compressed sensing, has become limited in these aspects.
In recent years, the rapid development of Deep Learning (DL) technology has provided new theoretical solutions for SAR 3D imaging by enabling the integration of neural networks with physical radar imaging models, leading to the emergence of a learning-based imaging paradigm that combines data-driven and model-driven approaches. This paper systematically reviews recent research progress in DL-based SAR 3D imaging. Focusing on two core issues, namely super-resolution imaging and enhanced imaging, this paper discusses current research advances and hotspots in SAR 3D imaging. These include super-resolution 3D imaging methods based on feedforward neural networks and deep unfolding networks, as well as 3D enhancement techniques such as multichannel data preprocessing and point cloud post-processing. This paper also summarizes publicly available datasets for SAR 3D imaging. In addition, this paper explores current research challenges in DL SAR 3D imaging, including high-generalization and high-precision DL SAR super-resolution 3D imaging technology, DL SAR elevation dimension disambiguation technology, integrated study of DL SAR 3D imaging and image enhancement, and the construction of DL SAR 3D imaging datasets. This paper provides an outlook on future development trends, aiming to offer research references and technical guidance for scholars in related fields.
4
Synthetic Aperture Radar (SAR) is widely used in military and civilian applications, with intelligent target interpretation of SAR images being a crucial component of SAR applications. Vision-Language Models (VLMs) play an important role in SAR target interpretation. By incorporating natural language understanding, VLMs effectively address the challenges posed by large intraclass variability in target characteristics and the scarcity of high-quality labeled samples, thereby advancing the field from purely visual interpretation toward semantic understanding of targets.
Drawing upon our team’s extensive research experience in SAR target interpretation theory, algorithms, and applications, this paper provides a comprehensive review of intelligent SAR target interpretation based on VLMs. We provide an in-depth analysis of existing challenges and tasks, summarize the current state of research, and compile available open-source datasets. Furthermore, we systematically outline the evolution, ranging from task-specific VLMs to contrastive-, conversational-, and generative-based VLMs and foundational models. Finally, we discuss the latest challenges and future outlooks in SAR target interpretation by VLMs.
5
Marine target detection and recognition depend on the characteristics of marine targets and sea clutter. Therefore, understanding the essential features of marine targets based on the measured data is crucial for advancing target detection and recognition technology. To address the issue of insufficient data on the scattering characteristics of marine targets, the Sea-Detecting Radar Data-Sharing Program (SDRDSP) was upgraded to obtain data on marine targets and their environment under different polarizations and sea states. This upgrade expanded the physical dimension of radar target observation and improved radar and auxiliary data acquisition capabilities.
Furthermore, a dual-polarized multistate scattering characteristic dataset of marine targets was constructed, and the statistical distribution characteristics, time and space correlation, and Doppler spectrum were analyzed, supporting the data usage. In the future, the types and quantities of maritime targets will continue to accumulate, providing data support for improving marine target detection and recognition performance and intelligence.
6
This study proposes a Synthetic Aperture Radar (SAR) aircraft detection and recognition method combined with scattering perception to address the problem of target discreteness and false alarms caused by strong background interference in SAR images. The global information is enhanced through a context-guided feature pyramid module, which suppresses strong disturbances in complex images and improves the accuracy of detection and recognition.
Additionally, scatter key points are used to locate targets, and a scatter-aware detection module is designed to realize the fine correction of the regression boxes to improve target localization accuracy. This study generates and presents a high-resolution SAR-AIRcraft-1.0 dataset to verify the effectiveness of the proposed method and promote the research on SAR aircraft detection and recognition. The images in this dataset are obtained from the satellite Gaofen-3, which contains 4,368 images and 16,463 aircraft instances, covering seven aircraft categories, namely A220, A320/321, A330, ARJ21, Boeing737, Boeing787, and other. We apply the proposed method and common deep learning algorithms to the constructed dataset. The experimental results demonstrate the excellent effectiveness of our method combined with scattering perception. Furthermore, we establish benchmarks for the performance indicators of the dataset in different tasks such as SAR aircraft detection, recognition, and integrated detection and recognition.
7
To address issues such as insufficient feature extraction, limited spatiotemporal correlation modeling, and poor classification performance in radar classification of Low, Slow, and Small (LSS) targets, this paper investigates graph network-based feature extraction and classification methods. First, focusing on digital array ubiquitous radar, a radar detection dataset for LSS targets, named LSS-DAUR-1.0, is constructed; it contains Doppler and track data for six types of targets: passenger ships, speedboats, helicopters, rotor drones, birds, and fixed-wing drones. Second, based on this dataset, the multidomain and multidimensional characteristics of the targets are analyzed, and the complementarity between Doppler and physical motion features is verified through correlation and cosine similarity analyses. On this basis, a Graph Convolutional Network with Dynamic Graph Construction (DG-GCN) classification method fusing dual features is proposed. An adaptive window adjustment, a hybrid attenuation function, and a dynamic threshold mechanism are designed to construct an adaptive dynamic graph based on spatiotemporal correlation. Combined with graph convolution-based feature learning and classification modules, this approach achieves refined classification of LSS targets. Validation on the LSS-DAUR-1.0 dataset shows that the DG-GCN achieves 99.66% classification accuracy, which is 6.78% and 17.97% higher than that of ResNet and Transformer models, respectively. The total processing time is only 4.98 ms, which is more than 80% lower than that of the aforementioned comparison models. Hence, the DG-GCN achieves both high accuracy and efficiency. In addition, noise environment tests show good robustness. Ablation experiments verify that the dynamic edge weight mechanism compensates for the lack of spatial feature correlation in purely temporal connections and improves the model’s generalizability.
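The cosine similarity analysis mentioned above can be illustrated with a short sketch. The function name and the example vectors are hypothetical; the abstract does not state which feature vectors are compared or how the resulting values are interpreted, so this only shows the measure itself.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length, non-zero feature vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_u * norm_v)


# Hypothetical feature vectors, for illustration only.
orthogonal = cosine_similarity([1.0, 0.0], [0.0, 1.0])
parallel = cosine_similarity([1.0, 2.0, 3.0], [2.0, 4.0, 6.0])
```

A value near 1 means the two vectors point in nearly the same direction, whereas a value near 0 means they share little linear structure. Low similarity between two feature sets is one plausible indication that they carry complementary information.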
8
Detection of small, slow-moving targets, such as Unmanned Aerial Vehicles (UAVs), poses considerable challenges to radar target detection and recognition technology. There is an urgent need to establish relevant datasets to support the development and application of techniques for detecting small, slow-moving targets.
This paper presents a dataset for detecting low-speed and small-size targets using a multiband Frequency Modulated Continuous Wave (FMCW) radar. The dataset utilizes Ku-band and L-band FMCW radar to collect echo data from six UAV types and exhibits diverse temporal and frequency domain resolutions and measurement capabilities by modulating radar cycles and bandwidth, generating an LSS-FMCWR-1.0 dataset (Low Slow Small, LSS). To further enhance the capability for extracting micro-Doppler features from UAVs, this paper proposes a method for UAV micro-Doppler extraction and parameter estimation based on the local maximum synchroextracting transform. Based on the Short Time Fourier Transform (STFT), this method extracts values at the maximum energy point in the time-frequency domain to retain useful signals and refine the time-frequency energy representation. Validation and analysis using the LSS-FMCWR-1.0 dataset demonstrate that this approach reduces entropy by 5.3 dB on average and decreases estimation errors in rotor blade length by 27.7% compared with traditional time-frequency methods. Moreover, the proposed method provides the foundation for subsequent target recognition efforts because it balances high time-frequency resolution and parameter estimation capabilities.
9
Research on target recognition using radar High-Resolution Range Profiles (HRRPs) is extensive and diverse in methodology. In particular, the application and development of deep learning to radar HRRP target recognition have enabled efficient, precise target perception directly from radar echoes. However, deep learning-based recognition networks rely on large amounts of training data. For non-cooperative targets, due to limited radar system parameters and rapid target attitude variations, acquiring adequate HRRP training samples that comprehensively cover target attitudes in advance is difficult in practice. Consequently, deep recognition networks are prone to overfitting and exhibit considerably degraded generalization capability. To address these issues, and given the ease of obtaining full-attitude electromagnetic simulation data for the target, this paper leverages simulated data as auxiliary information to mitigate the small-sample-size problem through data augmentation and cross-domain knowledge-transfer learning. For data augmentation, based on the analysis of differences in mean and variance between simulated and measured HRRPs within a given attitude-angle range, a linear transformation is applied to a set of simulated HRRPs spanning the same angular domain as a small set of measured HRRPs. This adjustment ensures that the simulated data’s mean and variance match the characteristics of the measured HRRPs, thereby achieving data augmentation that approximates the true distributional properties of HRRPs. Meanwhile, for cross-domain knowledge transfer learning, the proposed method introduces a domain alignment strategy based on generative adversarial constraints and a class alignment strategy based on contrastive learning constraints. 
These approaches draw the domain features of the full-attitude simulation, which offer strong discriminability and generalizability, closer to the measured domain features on a class-by-class basis, thereby further aiding learning from the measured domain data and leading to substantial improvements in few-shot recognition performance. Experimental results based on electromagnetic simulated and measured HRRP data for three and ten types of aircraft and ground vehicle targets, respectively, demonstrate that the proposed method yields superior recognition robustness compared with existing few-shot recognition methods.
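The mean-and-variance matching used for data augmentation can be sketched as a simple linear rescaling. This is a minimal illustration under stated assumptions, not the paper's procedure: the function name `match_mean_variance` is hypothetical, the samples are treated as a flat list of scalar amplitudes, and the abstract does not say whether the statistics are computed globally or per range cell.

```python
from statistics import fmean, pstdev


def match_mean_variance(simulated, measured):
    """Linearly rescale `simulated` so its mean and standard deviation equal those of `measured`."""
    mu_s, sigma_s = fmean(simulated), pstdev(simulated)
    mu_m, sigma_m = fmean(measured), pstdev(measured)
    if sigma_s == 0:
        raise ValueError("simulated samples have zero variance")
    scale = sigma_m / sigma_s
    # y = scale * (x - mu_s) + mu_m is linear in x and has mean mu_m and std sigma_m.
    return [(x - mu_s) * scale + mu_m for x in simulated]


# Made-up amplitudes, for illustration only.
simulated = [0.1, 0.4, 0.9, 0.3, 0.2]
measured = [1.0, 2.5, 4.0, 2.0]
adjusted = match_mean_variance(simulated, measured)
```

Because the mapping is linear, the shape of each simulated profile is preserved while its first- and second-order statistics move to those of the measured set.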
10
The bistatic Synthetic Aperture Radar (SAR) system, which employs spatially separated transmitting and receiving platforms, provides high-resolution imaging of terrestrial and maritime scenes and targets in complex environments. Its advantages include flexible configuration, strong concealment capabilities, high interference resistance, and comprehensive target information acquisition, making it valuable in high-precision remote sensing mapping, covert imaging, and precision strikes. Image processing is critical for obtaining high-resolution Bistatic SAR (BiSAR) images.
However, the echo model and characteristics of BiSAR substantially differ from those of traditional monostatic SAR, necessitating specialized image processing methods tailored to various operational modes and configurations. This study examines key challenges and solutions for several BiSAR configurations, including airborne BiSAR, BiSAR with high-speed and highly maneuverable platforms, spaceborne heterogeneous BiSAR, and spaceborne homogeneous BiSAR. This study also addresses motion compensation approaches and moving target imaging in BiSAR systems, reviews relevant domestic and international research advancements, and provides an outlook on future trends in BiSAR image processing.
11
Geosynchronous Orbit (GEO) Synthetic Aperture Radar (SAR) detection ensures persistent, wide-area surveillance.
However, this ship-detection method faces significant technical challenges, such as imaging defocusing, low Signal-to-Clutter Ratio (SCR), and large position offsets, due to the long detection distance, long synthetic aperture time, clutter accumulation within a large field of view, and nonplanar observation geometry. To address these challenges, this paper proposes a novel integrated detection-tracking-localization framework for moving-ship targets in GEO SAR. First, a GEO SAR observation signal model is established for moving ships, after which their echo characteristics within the ultra-long synthetic aperture time are analyzed in depth. On this basis, the model realizes target-image detection and long-term tracking localization via optimal subaperture processing. Using an improved back-projection imaging algorithm tailored for moving ships, effective energy accumulation and focusing of noncooperative ships under low SCR are achieved within the aperture. In addition, the relationship between the offset position of moving targets and the Range-Doppler (RD) parameters under GEO SAR nonplanar geometric observation is obtained. Second, under the assumption of short-term uniform ship motion, a bidirectional smoothing filter is applied to track the multisubaperture detection results. The velocity estimation of moving ships is obtained from the long-term tracking results, and the relocation of moving ships is realized using the RD relationship between the offset position and the actual position. Finally, the proposed framework is validated using simulation data and on-orbit GEO SAR satellite test data.
12
A large-scale Vision-Language Model (VLM) pre-trained on massive image-text datasets performs well when processing natural images. However, there are two major challenges in applying it to Synthetic Aperture Radar (SAR) images: (1) The high cost of high-quality text annotation limits the construction of SAR image-text paired datasets, and (2) The considerable differences in image features between SAR images and optical natural images increase the difficulty of cross-domain knowledge transfer. To address these problems, this study developed a knowledge transfer method for VLM tailored to SAR images. First, this study leveraged paired SAR and optical remote sensing images and employed a generative VLM to automatically produce textual descriptions of the optical images, thereby indirectly constructing a low-cost SAR-text paired dataset. Second, a two-stage transfer strategy was designed to address the large domain discrepancy between natural and SAR images, reducing the difficulty of each transfer stage. Finally, experimental validation was conducted through the zero-shot scene classification, image retrieval, and object recognition of SAR images. The results demonstrated that the proposed method enables effective knowledge transfer from a large-scale VLM to the SAR image domain.
13
In recent years, the rapid development of Multimodal Large Language Models (MLLMs) and their applications in earth observation have garnered significant attention. Earth observation MLLMs achieve deep integration of multimodal information, including optical imagery, Synthetic Aperture Radar (SAR) imagery, and textual data, through the design of bridging mechanisms between large language models and vision models, combined with joint training strategies. This integration facilitates a paradigm shift in intelligent earth observation interpretation, from shallow semantic matching to higher-level understanding based on world knowledge. In this study, we systematically review the research progress in the applications of MLLMs in earth observation, specifically examining the development of Earth Observation MLLMs (EO-MLLMs), which provides a foundation for future research directions. Initially, we discuss the concept of EO-MLLMs and review their development in chronological order. Subsequently, we provide a detailed analysis and statistical summary of the proposed architectures, training methods, applications, and corresponding benchmark datasets, along with an introduction to Earth Observation Agents (EO-Agents). Finally, we summarize the research status of EO-MLLMs and discuss future research directions.
14
Passive radar plays an important role in early warning detection and Low Slow Small (LSS) target detection. Because the radiation sources used by passive radar are uncontrollable, target characteristics are more complex, which makes target detection and identification extremely difficult. In this paper, a passive radar LSS detection dataset (LSS-PR-1.0) is constructed, which contains the radar echo signals of four typical sea and air targets, namely helicopters, unmanned aerial vehicles, speedboats, and passenger ships, as well as sea clutter data at low and high sea states. It provides data support for radar research. For target feature extraction and analysis, the singular-value-decomposition sea-clutter-suppression method is first adopted to remove the influence of the strong Bragg peak of sea clutter on the target echo. On this basis, ten multi-domain feature extraction and analysis methods in four categories are proposed, including time-domain features (relative average amplitude), frequency-domain features (spectral features, Doppler waterfall plot, and range-Doppler features), time-frequency-domain features, and motion features (heading difference, trajectory parameters, speed variation interval, speed variation coefficient, and acceleration). Based on the actual measurement data, a comparative analysis is conducted on the characteristics of the four types of sea and air targets, summarizing the patterns of various target characteristics and laying the foundation for subsequent target recognition.
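As an illustration of the motion features named in the abstract above, the following is a minimal Python sketch that derives heading difference, speed variation interval, speed variation coefficient, and acceleration from a short track. The abstract does not define these features, so every definition here is an assumption: the function name `motion_features`, the `(t, x, y)` track format, the variation coefficient as standard deviation divided by mean speed, and the interval as the (min, max) speed pair are all hypothetical and are not taken from LSS-PR-1.0.

```python
import math
from statistics import mean, pstdev


def motion_features(track):
    """Summarise a track given as [(t_s, x_m, y_m), ...] with increasing t.

    At least three points are needed so that one heading change and one
    acceleration sample exist. All feature definitions are assumptions.
    """
    if len(track) < 3:
        raise ValueError("need at least three track points")

    speeds, headings, mid_times = [], [], []
    for (t0, x0, y0), (t1, x1, y1) in zip(track, track[1:]):
        dx, dy = x1 - x0, y1 - y0
        speeds.append(math.hypot(dx, dy) / (t1 - t0))
        headings.append(math.atan2(dx, dy))  # radians, clockwise from +y
        mid_times.append(0.5 * (t0 + t1))

    # Absolute heading change between consecutive segments, wrapped to [0, pi].
    heading_diffs = [
        abs(math.atan2(math.sin(b - a), math.cos(b - a)))
        for a, b in zip(headings, headings[1:])
    ]
    # Finite-difference acceleration between segment midpoints.
    accels = [
        (v1 - v0) / (m1 - m0)
        for v0, v1, m0, m1 in zip(speeds, speeds[1:], mid_times, mid_times[1:])
    ]

    mean_speed = mean(speeds)
    return {
        "mean_heading_difference": mean(heading_diffs),
        "speed_variation_interval": (min(speeds), max(speeds)),
        "speed_variation_coefficient": (
            pstdev(speeds) / mean_speed if mean_speed > 0 else 0.0
        ),
        "mean_acceleration": mean(accels),
    }
```

A target cruising in a straight line at constant speed yields zero heading difference, zero variation coefficient, and zero acceleration under these definitions, whereas a maneuvering target yields nonzero values.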
15

Flying birds and Unmanned Aerial Vehicles (UAVs) are typical “low, slow, and small” targets with low observability. Effective monitoring and identification of these two target types has become an urgent need for ensuring the safety of air routes and urban areas. Flying birds and UAVs come in many types and are characterized by low flying heights, strong maneuverability, small radar cross sections, and complicated detection environments, all of which pose great challenges for target detection worldwide. Methods and technologies that make such targets “visible” (high detection ability) and “clear-cut” (high recognition probability) must be developed to finely describe and recognize UAVs, flying birds, and other “low-slow-small” targets. This paper reviews recent progress in detection and recognition technologies for rotor UAVs and flying birds in complex scenes and discusses effective methods for detecting and recognizing birds and drones, including echo modeling and recognition of micro-motion characteristics, the enhancement and extraction of maneuvering features in ubiquitous observation mode, distributed multi-view feature fusion, differences in motion trajectories, and intelligent classification via deep learning. Lastly, the problems of existing research approaches are summarized, and the future development prospects of target detection and recognition technologies for flying birds and UAVs in complex scenarios are considered.

16
Synthetic Aperture Radar (SAR) offers all-weather, all-day maritime surveillance capabilities. Direct ship detection in the Range Compressed Domain (RCD) eliminates computationally intensive imaging steps (such as range cell migration correction and azimuth compression), thereby considerably improving processing efficiency for near-real-time or real-time applications. However, current detection methods face inherent limitations: traditional constant false alarm rate detectors rely on fixed statistical models and often underperform in complex sea clutter environments. In addition, deep learning approaches heavily rely on annotated data and do not fully leverage phase information; moreover, they exhibit weak interpretability. To address these issues, this paper proposes a self-supervised reinforcement learning framework for ship target detection in the SAR RCD. This framework integrates the physical principles of radar signals with deep reinforcement learning, achieving enhanced detection performance while improving model interpretability and generalization. The framework has the following characteristics: (1) It introduces a reward signal-generation mechanism constrained by statistical scattering models, achieving self-supervised learning without the need for manual annotation; (2) It designs a dual-modal feature-fusion module that can jointly represent amplitude and phase information, effectively retaining the Doppler characteristics of ships; and (3) It adopts a lightweight agent module that integrates a lightweight Q-network, an adaptive feature enhancement module, and a discriminator network; this module reduces computational complexity, meets real-time processing requirements, and enhances the robustness of the model through adversarial training. Experimental results demonstrate that the proposed method achieves an average inference time of only 31.75 s on a large-scale SAR RCD dataset of 20k × 20k pixels, with a computational load of only 23.81% compared with a two-dimensional convolutional neural network. On a complex-valued RCD dataset, the method attains F1 and recall scores of 50.72% and 54.28%, respectively, outperforming mainstream self-supervised methods by 8.76% and 10.45%, respectively. This study pioneers the application of reinforcement learning to ship detection using SAR RCD, offering a novel approach to robust maritime surveillance by integrating signal modeling and data-driven learning.
17
The development of maritime target detection and identification technologies relies on large-scale, high-quality multi-sensor measurement data. Therefore, the Sea Detection Radar Data Sharing Program (SDRDSP) was upgraded to the Maritime Target Data Sharing Program (MTDSP), integrating multiple observation modalities, such as HH-polarized radar, VV-polarized radar, electro-optical devices, and Automatic Identification System (AIS) equipment, to conduct multisource observation experiments on maritime vessel targets. The program collects various data types, including radar intermediate frequency/video echo slice data, visible and infrared imagery, AIS static and dynamic messages, and meteorological and hydrological data, covering representative sea conditions and multiple vessel types. A comprehensive multisource observation dataset was constructed, enabling the matching and annotation of multimodal data for the same target. Moreover, an automated data management system was implemented to support data storage, conditional retrieval, and batch export, providing a solid foundation for the automated acquisition, long-term accumulation, and efficient use of maritime target characteristic data. Based on this system and measured data, the time/frequency domain features of the same and different vessel targets under different sea states, attitudes, and polarization conditions are compared and analyzed, and statistical conclusions on the changes in target features are obtained.
18
Millimeter-wave radar is increasingly being adopted for smart home systems, elder care, and surveillance monitoring, owing to its adaptability to environmental conditions, high resolution, and privacy-preserving capabilities. A key factor in effectively utilizing millimeter-wave radar is the analysis of point clouds, which are essential for recognizing human postures. However, the sparse nature of these point clouds poses significant challenges for accurate and efficient human action recognition. To overcome these issues, we present a 3D point cloud dataset tailored for human actions captured using millimeter-wave radar (mmWave-3DPCHM-1.0). This dataset is enhanced with advanced data processing techniques and cutting-edge human action recognition models. Data collection is conducted using Texas Instruments (TI)’s IWR1443-ISK and Vayyar’s vBlu radio imaging module, covering 12 common human actions, including walking, waving, standing, and falling. At the core of our approach is the Point EdgeConv and Transformer (PETer) network, which integrates edge convolution with transformer models. For each 3D point cloud frame, PETer constructs a locally directed neighborhood graph through edge convolution to extract spatial geometric features effectively. The network then leverages a series of Transformer encoding models to uncover temporal relationships across multiple point cloud frames. Extensive experiments reveal that the PETer network achieves exceptional recognition rates of 98.77% on the TI dataset and 99.51% on the Vayyar dataset, outperforming the traditional optimal baseline model by approximately 5%. With a compact model size of only 1.09 MB, PETer is well-suited for deployment on edge devices, providing an efficient solution for real-time human action recognition in resource-constrained environments. 
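The per-frame graph construction mentioned in the abstract above can be sketched as follows. This is a minimal pure-Python illustration of the input to a generic edge convolution, assuming a k-nearest-neighbor graph with edge features of the form (x_i, x_j - x_i) as in the commonly used EdgeConv formulation. The function name `knn_edge_features` and the choice of k are hypothetical, and the abstract does not state how PETer builds its neighborhoods or which learned layers follow.

```python
import math


def knn_edge_features(points, k):
    """Build directed k-NN edges for one point cloud frame.

    points: list of equal-length coordinate tuples, e.g. (x, y, z).
    Returns, for each point p_i, a list of k pairs (p_i, p_j - p_i),
    ordered from nearest to farthest neighbour. A learned layer would
    normally map each pair to a feature and pool over the neighbours.
    """
    if not 0 < k < len(points):
        raise ValueError("k must satisfy 0 < k < number of points")

    features = []
    for i, p in enumerate(points):
        # Distances from p_i to every other point; ties break on index.
        ranked = sorted(
            (math.dist(p, q), j) for j, q in enumerate(points) if j != i
        )
        edges = []
        for _, j in ranked[:k]:
            offset = tuple(qc - pc for pc, qc in zip(p, points[j]))
            edges.append((p, offset))
        features.append(edges)
    return features
```

The edges are directed: point i may list j as a neighbor without j listing i, which is why sparse radar point clouds still give every point exactly k edges.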
19
Low-altitude targets, represented by rotor unmanned aerial vehicles, can typically adopt a slow-cruise mode. As a result, their echoes fall within the Doppler Blind Zone (DBZ) and evade radar detection and tracking. The cluttered low-altitude environment adds further complexity. To address this issue, this study proposes a method grounded in the random finite set framework and designed for tracking slow-moving targets with a low-altitude surveillance radar. Inspired by the Bayesian occupancy filter, the proposed method first models the radar Field of View (FoV) as a grid map. The map is uniformly partitioned along the angle and range axes, ensuring that each cell captures a specific segment of the FoV. Then, adaptive filtering parameter modules are designed by leveraging the distinct dynamic characteristics of slow-moving targets and ground clutter. Subsequently, a probability hypothesis density filter is deployed to conduct unified filtering on the grid map situated within the DBZ. The final step uses clustering methods to extract information about the targets of interest. Simulation results validate the effectiveness, robustness, and superior performance of the proposed method across typical surveillance scenarios involving multiple slow-moving targets, noise, and clutter.
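The uniform angle-range partition described in the abstract above can be sketched as below. This is only an illustration of the grid-map step under stated assumptions: the function names `cell_index` and `count_detections`, the use of degrees, and all grid parameters are hypothetical, and the probability hypothesis density filtering and clustering stages of the paper are not reproduced here.

```python
def cell_index(range_m, azimuth_deg, *, r_max, n_range, az_min, az_max, n_az):
    """Map a polar measurement to its (range_cell, azimuth_cell) index.

    The field of view [0, r_max) x [az_min, az_max) is split into
    n_range x n_az equal cells. Returns None outside the field of view.
    """
    if not (0.0 <= range_m < r_max and az_min <= azimuth_deg < az_max):
        return None
    # min() guards against floating-point rounding at the upper edge.
    i_r = min(int(range_m * n_range / r_max), n_range - 1)
    i_a = min(int((azimuth_deg - az_min) * n_az / (az_max - az_min)), n_az - 1)
    return i_r, i_a


def count_detections(detections, **grid):
    """Accumulate (range_m, azimuth_deg) detections into per-cell counts."""
    counts = [[0] * grid["n_az"] for _ in range(grid["n_range"])]
    for range_m, azimuth_deg in detections:
        idx = cell_index(range_m, azimuth_deg, **grid)
        if idx is not None:
            counts[idx[0]][idx[1]] += 1
    return counts
```

For example, with `grid = dict(r_max=1000.0, n_range=20, az_min=-60.0, az_max=60.0, n_az=12)`, each cell spans 50 m in range and 10 degrees in azimuth, and `count_detections(measurements, **grid)` gives the per-cell detection counts that a grid-based filter could then consume.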
20
Synthetic Aperture Radar (SAR) and optical imagery are two key remote-sensing modalities in Earth observation, and cross-modal image matching between them is widely applied in tasks such as image fusion, joint interpretation, and high-precision geolocation. In recent years, with the rapid growth of Earth-observation data, the importance of cross-modal image matching between SAR and optical data has become increasingly prominent, and related studies have achieved notable progress. In particular, Deep Learning (DL)-based methods, owing to their strengths in cross-modal feature representation and high-level semantic extraction, have demonstrated excellent matching accuracy and adaptability across varying imaging conditions. However, most publicly available datasets are limited to small image patches and lack complete full-scene image pairs that cover realistic large-scale scenarios, making it difficult to comprehensively evaluate the performance of matching algorithms in practical remote-sensing settings and constraining advances in the training and generalization of DL models. To address these issues, this study develops and releases OSDataset2.0, a large-scale benchmark dataset for SAR-optical image matching. The dataset comprises two parts: a patch-level subset and a scene-level subset. The patch-level subset is composed of 6,476 registered 512 × 512 image pairs covering 14 countries (Argentina, Australia, Poland, Germany, Russia, France, Qatar, Malaysia, the United States, Japan, Türkiye, Singapore, India, and China); the scene-level subset consists of one pair of full-scene optical and SAR images. For full-scene images, high-precision, uniformly distributed ground-truth correspondences are provided, extracted under the principle of imaging-mechanism consistency, together with a general evaluation codebase that supports quantitative analysis of registration accuracy for arbitrary matching algorithms. To further assess the dataset’s effectiveness and challenge level, a systematic evaluation of 11 representative optical-SAR matching methods on OSDataset2.0 is conducted, covering traditional feature-based approaches and mainstream DL models. Experimental results show that the dataset not only supports effective algorithmic comparisons but also provides reliable training resources and a unified evaluation benchmark for subsequent research.
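To make the idea of evaluating registration accuracy against ground-truth correspondences concrete, here is a minimal Python sketch. It is not the evaluation codebase released with OSDataset2.0: the function names `registration_errors` and `summarize`, the pair format, and the 3-pixel threshold are all assumptions chosen for illustration.

```python
import math


def registration_errors(gt_pairs, transform):
    """Per-point residuals of an estimated optical-to-SAR mapping.

    gt_pairs:  list of ((x_opt, y_opt), (x_sar, y_sar)) ground-truth pairs.
    transform: callable mapping an optical pixel (x, y) to SAR pixel (x, y),
               e.g. the output of any matching algorithm under test.
    """
    errors = []
    for src, dst in gt_pairs:
        px, py = transform(src)
        errors.append(math.hypot(px - dst[0], py - dst[1]))
    return errors


def summarize(errors, threshold_px=3.0):
    """Return (RMSE in pixels, fraction of points within threshold_px)."""
    if not errors:
        raise ValueError("no correspondences to evaluate")
    rmse = math.sqrt(sum(e * e for e in errors) / len(errors))
    within = sum(e <= threshold_px for e in errors) / len(errors)
    return rmse, within
```

Because the transform is passed in as a plain callable, the same two functions can score a feature-based method and a DL model alike, which mirrors the abstract's goal of a unified evaluation for arbitrary matching algorithms.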