2022 Vol. 11, No. 1

Theory and System of Synthetic Aperture Radar Microwave Vision
Three-Dimensional (3D) Synthetic Aperture Radar (SAR) imaging has considerable application potential in steep-terrain mapping and target recognition in complex environments and is an important development direction in the current SAR field. To promote the development and application of 3D SAR imaging technology, the Aerospace Information Research Institute, Chinese Academy of Sciences designed and developed an unmanned aerial vehicle-borne Microwave-Vision 3D SAR (MV3DSAR) experimental system, which provides an experimental platform for the research and verification of related technologies. Currently, the single-polarization version of the system has been developed, and the first flight experiment has been conducted in Tianjin. This study introduces the structure, performance, key technologies, and data processing of the system. It also presents the implementation and preliminary data processing results of the first experiment, which verify the basic performance and 3D imaging capability of the system. The MV3DSAR system provides an experimental and verification platform for analyzing 3D SAR imaging algorithms and constructing 3D SAR imaging datasets.
Conceptually speaking, Synthetic Aperture Radar (SAR) microwave vision 3D imaging refers to fusing visual semantics into the SAR 3D imaging process to enhance the 3D imaging quality. For SAR Tomography (TomoSAR), it specifically means reducing the number of required observations by fully exploiting SAR visual semantics. However, what is meant by visual semantics? From the viewpoint of visual perception, 3D structural information can be perceived from either a monocular image or binocular images, and the same scene can be perceived differently by different people. From the viewpoint of neurophysiology, depth perception from binocular vision and from monocular vision rely on fundamentally different mechanisms. Besides, visual illusions are omnipresent in daily life. Hence, what kinds of visual semantics could be helpful for SAR 3D imaging from the computational point of view? What could be learnt from the computer vision community to extract useful visual semantics from SAR images? This short note presents some preliminary discussions on such issues and offers a purely personal view on these vast topics.
A radar human behavior perception system can detect targets through obstacles, which gives it wide application prospects in fields such as security, rescue, and medical treatment. Although the development of deep learning technology has promoted radar sensor research in human behavior perception, it has also made the need for available datasets more pressing. This paper provides a four-dimensional imaging dataset of human activity using ultra-wideband radar, UWB-HA4D, which uses a three-dimensional ultra-wideband multiple-input multiple-output radar as the detection sensor to capture the range-azimuth-height-time four-dimensional activity data of a human target. The dataset contains 2757 groups of activity data for 11 human targets, covering 10 common activities, such as walking, waving, and boxing. It also contains penetration and nonpenetration detection experimental scenarios. The radar system parameters, data generation process, data distribution, and other information of the dataset are introduced in detail herein. Meanwhile, several deep learning algorithms that are based on the PaddlePaddle framework and are widely used in the computer vision field are applied to this dataset for human activity recognition. The experimental comparison results provide a reference for scholars and facilitate further research on this basis.
Synthetic Aperture Radar Signal Processing
Conventional Synthetic Aperture Radar (SAR) can only obtain Two-Dimensional (2-D) azimuth-range images, which do not accurately reflect the Three-Dimensional (3-D) scattering structure of the targets. SAR Tomography (TomoSAR), however, is a multi-baseline interferometric measurement mode that extends the synthetic aperture principle into the elevation direction, making it possible to recover the true height of the target and thereby achieve 3-D imaging. Moreover, Differential SAR Tomography (D-TomoSAR) extends the synthetic aperture principle into the elevation and time directions simultaneously. Thus, it can obtain the 3-D scattering structure of the observed target along with its deformation velocity. GaoFen-3 (GF-3) is the first C-band multi-polarization 1 m resolution SAR satellite of China. It has several advantages, such as high resolution, large swath width, and multiple imaging modes, which are crucial to the development of high-resolution earth observation technology for China. Presently, GF-3 data are mainly used in the image processing field, for tasks such as target identification, and the phase information of the SAR images is not yet fully utilized. Moreover, because high-dimensional imaging capability was not considered when the system was designed, existing SAR images acquired by GF-3 have spatial and temporal decorrelation problems, which makes it difficult to use the images in subsequent interferometric series processing. To address these problems, this study achieved 3-D and Four-Dimensional (4-D) imaging of buildings around Yanqi Lake in Beijing, based on seven complex SAR images. We obtained the 3-D scattering structure information of buildings and achieved millimeter-level high-precision monitoring of building deformation. The preliminary experimental results demonstrate the application potential of GF-3 SAR data and provide technical support for the further application of the GF-3 SAR satellite in urban sensing and monitoring.
This paper proposes a three-dimensional tomographic SAR imaging method based on combined sparse and low-rank structures. Traditional Compressed Sensing (CS)-based tomographic SAR imaging methods only utilize the sparse representation and reconstruct along the elevation axis of a given azimuth-range cell. Considering that the target distributions in cities, forests, and other cases are relatively similar, the elevation backscattering patterns of adjacent azimuth-range cells (pixels) are expected to be highly correlated. The proposed method introduces the Karhunen-Loeve transform to characterize the low-rank structures of the elevation of the target areas and constructs a tomographic SAR imaging model that combines sparse and low-rank structures. The Alternating Direction Method of Multipliers (ADMM) is applied to solve the imaging model: the complex original optimization problem is decomposed into several relatively simple sub-problems, and the tomographic SAR imaging results are obtained by the alternating projection of the optimization variables. This method improves the reconstruction accuracy when only a few interferograms or channels are available. Simulations and real data experiments show that the reconstruction method can effectively separate the scatterers, ensure the accuracy of the reconstructed energy, maintain good imaging performance when the number of interferograms or channels is reduced, and effectively suppress artifacts.
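The decomposition into simple sub-problems mentioned above can be illustrated with the two proximal operators that ADMM-type solvers commonly alternate between. This is a minimal sketch, not the paper's algorithm: it assumes an L1 penalty for sparsity and a nuclear-norm penalty as a generic stand-in for the low-rank term (the paper characterizes the low-rank structure with the Karhunen-Loeve transform, which is not reproduced here). The function names and the threshold `tau` are hypothetical.

```python
import numpy as np

def soft_threshold(X, tau):
    """Proximal operator of tau * ||X||_1 for complex entries.

    Shrinks each magnitude by tau and keeps the phase unchanged.
    """
    mag = np.abs(X)
    return X * np.maximum(1.0 - tau / np.maximum(mag, 1e-12), 0.0)

def singular_value_threshold(X, tau):
    """Proximal operator of tau * ||X||_* (nuclear norm).

    Shrinks the singular values by tau, which lowers the rank of X.
    """
    U, s, Vh = np.linalg.svd(X, full_matrices=False)
    return (U * np.maximum(s - tau, 0.0)) @ Vh

# Toy inputs: a complex vector and a 2 x 2 matrix with singular values 3 and 1.
sparse_part = soft_threshold(np.array([3.0 + 4.0j, 0.5]), 1.0)
low_rank_part = singular_value_threshold(np.diag([3.0, 1.0]), 2.0)
```

In a full solver, each ADMM iteration would apply one such operator per auxiliary variable and then update the dual variables; stacking the elevation profiles of neighboring azimuth-range cells as columns of `X` is what would make a low-rank penalty meaningful.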
Synthetic Aperture Radar (SAR) Tomography (TomoSAR) is a novel technique that enables Three-Dimensional (3-D) imaging using multi-baseline Two-Dimensional (2-D) data. In essence, TomoSAR solves a one-dimensional spectral estimation problem. Compressed Sensing (CS)-based algorithms can retrieve solutions with only a few non-uniform acquisitions and have gradually become the main imaging method; in particular, $L_1$ minimization has been proven to be effective in TomoSAR imaging. In the conventional processing flow of CS algorithms, the continuous elevation axis is divided into a fixed, pre-set grid, and the scatterers are assumed to lie exactly on the grid. However, this assumption generally does not hold and leads to a problem called "basis mismatch", which is rarely discussed in TomoSAR. In this letter, we first discuss the model of off-grid TomoSAR and then propose an additive perturbation model to compensate for the errors caused by the grid effect. We use the local optimization thresholding algorithm to solve the complex-valued $L_1$ minimization problem of TomoSAR. We conducted experiments on both simulated data and actual airborne flight data. The simulation results indicate that the proposed method estimates the positions of the scatterers more accurately, which leads to better recovery of the original signal. The reconstruction results on actual data verify that the impact of grid mismatch can be mostly eliminated.
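As an illustration of the conventional on-grid formulation described above, the following sketch solves the complex-valued $L_1$-regularized problem with plain iterative soft thresholding (ISTA). ISTA is swapped in here as a generic solver; it is not the local optimization thresholding algorithm used in the letter, and the perturbation model is not reproduced. The steering-vector sign convention, the uniform baselines, and all numerical values are assumptions chosen only so that the example runs.

```python
import numpy as np

def steering_matrix(baselines, elevations, wavelength, slant_range):
    """Build A[n, k] = exp(j * 4*pi * b_n * s_k / (wavelength * r)) (assumed convention)."""
    phase = 4.0 * np.pi * np.outer(baselines, elevations) / (wavelength * slant_range)
    return np.exp(1j * phase)

def ista(A, y, lam, n_iter=500):
    """Minimize 0.5 * ||A x - y||_2^2 + lam * ||x||_1 over complex x."""
    step = 1.0 / np.linalg.norm(A, 2) ** 2      # 1 / Lipschitz constant of the gradient
    x = np.zeros(A.shape[1], dtype=complex)
    for _ in range(n_iter):
        z = x - step * (A.conj().T @ (A @ x - y))  # gradient step on the data term
        mag = np.abs(z)
        # Complex soft thresholding: shrink magnitudes, keep phases.
        x = z * np.maximum(1.0 - lam * step / np.maximum(mag, 1e-12), 0.0)
    return x

# Hypothetical geometry: 15 baselines, elevation grid inside the unambiguous extent.
baselines = np.linspace(-7.0, 7.0, 15)          # metres
elevations = np.linspace(-7.0, 7.0, 29)         # metres, 0.5 m grid spacing
A = steering_matrix(baselines, elevations, wavelength=0.03, slant_range=1000.0)
y = A[:, 20]                                    # one noise-free scatterer exactly on the grid
x_hat = ista(A, y, lam=0.5)
peak = int(np.argmax(np.abs(x_hat)))            # grid index of the strongest estimate
```

When the true elevation falls between two grid points, the measurement is no longer a column of `A`, and the estimate spreads over neighboring cells; this is the basis mismatch that an off-grid model is meant to compensate.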
The tomographic technique has attracted much attention because of its ability to separate overlapping scatterers in urban Synthetic Aperture Radar (SAR) images. The general method of SAR Tomography (TomoSAR) imaging combines two aspects: estimating the distribution of the scatterers in the elevation direction and determining the number of strong scatterers in an overlapped pixel. This study applied several sophisticated spectral estimation methods (e.g., Orthogonal Matching Pursuit, Sparse Learning via Iterative Minimization, and Multiple Signal Classification) and model order selection approaches (e.g., the Bayesian information criterion and the generalized likelihood ratio test) with high technical potential to recover simulated overlapping scatterers. The simulation experiment is based on the parameters of the AIRCAS X-band TomoSAR data from Emei, Sichuan, China. The Cramér-Rao Lower Bound (CRLB) and recovery probability are used to evaluate the performance of the different methods in separating overlapped scatterers. The experimental results revealed the following: (1) the standard deviation of the estimate obtained using second-order statistics is smaller than that obtained from a single observation vector, especially when the number of acquisitions is very limited; (2) the amplitude ratio, phase difference, and elevation spacing between overlapping scatterers have a significant impact on the different kinds of algorithms; and (3) the phase difference between overlapping scatterers biases the phase center estimates of greedy algorithms and spectral estimation algorithms.
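Of the estimators named above, Orthogonal Matching Pursuit is the simplest to sketch. The version below is a generic textbook form, not the study's implementation; it takes the number of scatterers as an input, which in practice would come from a model order selection step. For a deterministic check, the dictionary is an orthonormal DFT matrix, an idealized stand-in for a TomoSAR steering matrix; the amplitudes and indices are made up.

```python
import numpy as np

def omp(A, y, n_scatterers):
    """Greedy sparse recovery: pick one dictionary column per iteration."""
    residual = y.copy()
    support = []
    coeffs = np.zeros(0, dtype=complex)
    for _ in range(n_scatterers):
        corr = np.abs(A.conj().T @ residual)
        if support:
            corr[support] = 0.0                 # never select the same column twice
        support.append(int(np.argmax(corr)))
        # Least-squares refit on all selected columns, then update the residual.
        coeffs, *_ = np.linalg.lstsq(A[:, support], y, rcond=None)
        residual = y - A[:, support] @ coeffs
    x = np.zeros(A.shape[1], dtype=complex)
    x[support] = coeffs
    return x, sorted(support)

# Idealized dictionary with orthonormal columns (assumption for the demo).
A = np.fft.fft(np.eye(16)) / 4.0
# Two overlapping scatterers with an amplitude ratio of 2 and a 90 degree phase difference.
y = 1.0 * A[:, 3] + 0.5j * A[:, 10]
x_hat, support = omp(A, y, n_scatterers=2)
```

With a realistic, highly coherent steering matrix the greedy selection is no longer guaranteed to pick the correct columns, which is consistent with the sensitivity to amplitude ratio, phase difference, and elevation spacing reported in the abstract.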
In defocused SAR ship images, the defocusing of some ship targets is space-variant along the range direction. In this context, a novel autofocus algorithm combining the adaptive moment estimation (Adam) optimizer and a space-variant minimum entropy criterion is proposed to refocus these ship targets. The algorithm can directly process complex images and compensate for phase errors of any order. The effectiveness of the proposed method is demonstrated by experimental results on simulated data and GF-3 data. Moreover, the entropy and contrast of the refocused image are improved, and the focusing speed of the algorithm is greatly enhanced.
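The minimum entropy criterion can be made concrete with a small sketch. It only evaluates the cost that such an autofocus method minimizes; the Adam-based optimization and the space-variant treatment described above are not reproduced. The phase error is assumed to act along axis 0 in the Fourier domain, and the quadratic error and image size are hypothetical.

```python
import numpy as np

def image_entropy(img):
    """Shannon entropy of the normalized image power; lower means better focus."""
    power = np.abs(img) ** 2
    p = power / power.sum()
    p = p[p > 0]                                # skip empty pixels to avoid log(0)
    return float(-(p * np.log(p)).sum())

def apply_azimuth_phase(img, phase):
    """Multiply the axis-0 spectrum of a complex image by exp(j * phase)."""
    spectrum = np.fft.fft(img, axis=0)
    return np.fft.ifft(spectrum * np.exp(1j * phase)[:, None], axis=0)

# A single ideal point target as a stand-in for a focused ship scatterer.
focused = np.zeros((64, 64), dtype=complex)
focused[32, 32] = 1.0

# Hypothetical quadratic phase error that smears the target along axis 0.
phase_error = 20.0 * np.linspace(-1.0, 1.0, 64) ** 2
defocused = apply_azimuth_phase(focused, phase_error)
refocused = apply_azimuth_phase(defocused, -phase_error)

entropy_focused = image_entropy(focused)
entropy_defocused = image_entropy(defocused)
entropy_refocused = image_entropy(refocused)
```

An autofocus routine would treat the compensating phase as the unknown and adjust it, for example with a gradient-based optimizer, until the entropy of the compensated image stops decreasing.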
SAR images are affected to some extent by range ambiguity, which arises from the antenna sidelobe characteristics and the pulsed operating mode. Work on range ambiguity suppression focuses on SAR system design and signal processing. One class of approaches modifies the transmit and receive scheme to block the reception of ambiguous energy, for example, by using multiple elevation beams or azimuth phase coding. The other class consists of algorithms that use signal processing to reduce the range ambiguity energy in the echo and image domains. This paper proposes a range ambiguity suppression method that combines sparse reconstruction and matched filtering. The method performs sparse reconstruction of the ambiguity area, estimates the ambiguity area signal using the ambiguity area image and the reconstruction matrix, separates it from the echo signal to obtain the main area signal after range ambiguity suppression, and uses matched filtering to obtain the main area image. In this method, sparse reconstruction ensures the accuracy of the ambiguous signal estimation, and matched filtering ensures the efficiency of the imaging processing. Simulation results show that the proposed method can effectively suppress range ambiguity by 10 dB or more while preserving weak targets and the details of the main image.
Synthetic Aperture Radar Information Extraction

Deep learning technologies have developed rapidly in Synthetic Aperture Radar (SAR) image interpretation. Current data-driven methods neglect the latent physical characteristics of SAR; thus, their predictions are highly dependent on the training data and may even violate physical laws. Deep integration of theory-driven and data-driven approaches is therefore of vital importance for SAR image interpretation. Data-driven methods specialize in automatically discovering patterns from large amounts of data and thus serve as effective complements to physical processes, whereas integrated interpretable physical models improve the explainability of deep learning algorithms and alleviate the data-hungry problem. This study aims to develop physically explainable deep learning for SAR image interpretation in terms of signals, scattering mechanisms, semantics, and applications. Strategies for blending theory-driven and data-driven methods in SAR interpretation are proposed based on physics machine learning to develop novel learnable and explainable paradigms for SAR image interpretation. Further, recent studies on hybrid methods are reviewed, covering SAR signal processing, physical characteristics, and semantic image interpretation. Challenges and future perspectives are also discussed on the basis of the research status and related studies in other fields, which can serve as inspiration.

High-resolution SAR images contain rich information about targets and their surroundings, but the complex electromagnetic scattering mechanism makes intuitive interpretation difficult, which makes it an important research topic in SAR image interpretation. This paper summarizes the typical geometric-primitive modeling methods for high-frequency scattering, reviewing surface, wedge, and vertex scattering in detail. Besides the classical expressions of these typical scattering mechanisms, some simulation results are presented. The difficulties in characterizing typical scattering mechanisms and the key scientific problems in applying them to SAR image interpretation are analyzed. Furthermore, this paper proposes a complete and extensive system for characterizing scattering characteristics through the combination and interaction of scattering primitives based on the corresponding geometric primitives. Finally, the feasibility of developing a scattering mechanism dictionary for interpreting SAR image scattering information is discussed.
This study proposes a progressive building facade detection method based on structure priors to effectively detect building facades from massive, noisy array InSAR spatial points. First, the proposed method projects the initial array InSAR Three-Dimensional (3D) points onto the ground to produce connected regions that correspond to building facades and then progressively detects potential line segments in each connected region under the guidance of structure priors. The method then generates building facades according to the detected line segments and their corresponding 3D points. In this process, the line segment detection space of the current connected region is constructed from the line segments detected in its neighboring connected regions, thereby improving the overall efficiency and reliability of line segment detection in the current region. Experimental results confirm that the proposed method can efficiently produce more reliable building facades from massive, noisy array InSAR 3D points, overcoming several difficulties (such as low efficiency and inferior reliability) encountered in traditional multi-model fitting methods.
In recent years, high-resolution Synthetic Aperture Radar (SAR) images have been widely applied to intelligent interpretation tasks such as urban mapping and change detection. Unlike the case for optical images, the acquisition approach and object geometry of SAR images limit the interpretation performance of existing deep learning methods. This paper proposes a novel building footprint generation method for high-resolution SAR images. The method is based on supervised contrastive learning regularization, which aims to increase the similarity between intra-class pixels and the dissimilarity between inter-class pixels. This makes the deep learning model focus on distinguishing building pixels from non-building pixels in the latent space and thereby improves classification accuracy. On the public SpaceNet6 data, the proposed method improves segmentation performance by 1% compared with other state-of-the-art methods, which validates its effectiveness on real data. The method can be used for building segmentation in urban areas with complex scene backgrounds and can be extended to other types of land-cover segmentation using SAR images.
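The abstract does not give the exact form of the regularizer, so the following is only a sketch of the widely used supervised contrastive loss, applied to a handful of pixel embeddings. The function name `supcon_loss`, the temperature value, and the toy embeddings are assumptions for illustration. For each anchor pixel, same-class pixels are treated as positives and all other pixels form the contrast set.

```python
import math

def l2_normalize(v):
    """Scale a vector to unit length (left unchanged if it is all zeros)."""
    norm = math.sqrt(sum(x * x for x in v)) or 1.0
    return [x / norm for x in v]

def supcon_loss(embeddings, labels, temperature=0.1):
    """Supervised contrastive loss averaged over anchors that have a positive."""
    z = [l2_normalize(v) for v in embeddings]
    n = len(z)
    total, anchors = 0.0, 0
    for i in range(n):
        others = [j for j in range(n) if j != i]
        positives = [j for j in others if labels[j] == labels[i]]
        if not positives:
            continue  # an anchor without a same-class partner contributes nothing
        sims = {j: sum(a * b for a, b in zip(z[i], z[j])) / temperature
                for j in others}
        # Log-sum-exp over the contrast set, shifted by the max for stability.
        m = max(sims.values())
        log_denom = m + math.log(sum(math.exp(s - m) for s in sims.values()))
        total += -sum(sims[p] - log_denom for p in positives) / len(positives)
        anchors += 1
    return total / anchors if anchors else 0.0

# Toy pixel embeddings (assumed data): two tight groups in a 2D latent space.
pixels = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
loss_aligned = supcon_loss(pixels, labels=[1, 1, 0, 0])  # classes match the groups
loss_mixed = supcon_loss(pixels, labels=[1, 0, 1, 0])    # classes cut across groups
```

The loss is lower when building and non-building labels agree with the clusters in latent space. Used as a regularizer, it would be added to the segmentation loss with a weighting factor, which the abstract does not specify.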
Convolutional Neural Networks (CNNs) are widely used in optical image classification. For Synthetic Aperture Radar (SAR) images, obtaining sufficient training examples for CNNs is challenging because data annotation is difficult and costly. Meanwhile, with the advancement of SAR image simulation technology, generating a large number of annotated simulated SAR images is not difficult. However, because of the inevitable differences between simulated and real SAR images, simulated images often cannot directly support real SAR image classification. This study therefore proposes a simulation-assisted SAR target classification method based on unsupervised domain adaptation. The proposed method integrates Multi-Kernel Maximum Mean Discrepancy (MK-MMD) with domain adversarial training to address the domain shift encountered when transferring from simulated to real-world SAR image classification. Furthermore, Layer-wise Relevance Propagation (LRP) and Contrastive Layer-wise Relevance Propagation (CLRP) are used to explore how the proposed method influences the model decision. The experimental results show that, by modifying the focus areas of the model to obtain domain-invariant features for classification, the proposed method can significantly improve classification accuracy.
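To make the MK-MMD term concrete, the sketch below computes a biased estimate of the squared maximum mean discrepancy between two feature sets, using an equally weighted mixture of Gaussian kernels. The bandwidths, the equal kernel weights, the function names, and the toy features are assumptions, not values from the paper, and the domain adversarial branch is not shown.

```python
import math

def multi_kernel(a, b, bandwidths):
    """Equally weighted mixture of Gaussian kernels evaluated on two feature vectors."""
    d2 = sum((x - y) ** 2 for x, y in zip(a, b))
    return sum(math.exp(-d2 / (2.0 * s * s)) for s in bandwidths) / len(bandwidths)

def mk_mmd2(source, target, bandwidths=(0.5, 1.0, 2.0)):
    """Biased estimate of squared MMD between source and target feature sets."""
    def mean_kernel(set_a, set_b):
        total = sum(multi_kernel(a, b, bandwidths) for a in set_a for b in set_b)
        return total / (len(set_a) * len(set_b))
    return (mean_kernel(source, source)
            + mean_kernel(target, target)
            - 2.0 * mean_kernel(source, target))

# Toy features (assumed data): "simulated" features and a shifted "real" copy.
simulated = [[0.0, 0.0], [0.2, 0.1], [0.1, 0.3], [0.3, 0.2]]
real_shifted = [[x + 1.5, y + 1.5] for x, y in simulated]

mmd_same = mk_mmd2(simulated, simulated)        # identical sets
mmd_shifted = mk_mmd2(simulated, real_shifted)  # distributions differ
```

In a training loop, a term like this would be evaluated on network features of simulated and real batches and minimized alongside the classification loss, so that the two domains become harder to tell apart in feature space.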