
Citation: WANG Zhirui, KANG Yuzhuo, ZENG Xuan, et al. SAR-AIRcraft-1.0: High-resolution SAR aircraft detection and recognition dataset[J]. Journal of Radars, 2023, 12(4): 906–922. doi: 10.12000/JR23043
Synthetic Aperture Radar (SAR) is an active microwave imaging system capable of all-day, all-weather ground observation, unaffected by conditions such as lighting, clouds, and fog. SAR has become a crucial information acquisition platform in the field of remote sensing [1]. In recent years, with the rapid development of remote sensing imaging technology and the increasing number of in-orbit SAR satellites, the quantity and quality of data obtained by SAR systems have considerably improved, which in turn promotes SAR’s development and application in relevant fields [2]. The vast amount of high-resolution data now available provides a robust foundation for the detailed interpretation of SAR images [3, 4].
Target detection and recognition are critical steps in the intelligent interpretation of SAR images. As a typical target in SAR images, aircraft have high observational value because of their abundance and variety [5]. Detection and recognition of aircraft in SAR images enable the extraction of information such as aircraft model, type, location, and status. These processes effectively support applications like dynamic monitoring of key areas, situational analysis, and emergency rescue. Therefore, the detection and recognition of aircraft in high-resolution SAR images hold important research value [6].
In recent years, advances in deep learning theory and technology have led to remarkable progress in target detection and recognition in SAR images using Convolutional Neural Networks (CNNs) [7, 8]. For aircraft detection and recognition in SAR images, Zhao et al. [9] proposed a multi-branch atrous convolutional feature pyramid method that uses dense connections to reduce redundant information and highlight essential features of aircraft. Ref. [10] designed an attention module to refine and integrate low-level texture and high-level semantic features, and the module further improves aircraft detection accuracy. For SAR ship detection and recognition tasks, Refs. [11, 12] reduced the dependency on predefined box hyperparameters by directly learning the location of bounding boxes, and they achieved fine-grained ship recognition. Sea vessels often form strong double-bounce reflections with the water surface, and they appear as a complete, highly connected set of strong scatterers in SAR images. In contrast, land-based aircraft are smaller, their features are harder to extract, and their scatterers are more dispersed [13, 14]. As such, accurate localization and recognition of land-based aircraft are more challenging.
In practical scenarios, SAR aircraft detection and recognition present major challenges. First, as shown in Fig. 1(a), aircraft are prone to interference from surrounding backgrounds such as terminals and aprons. Objects with similar scattering properties can be misidentified as aircraft, leading to false alarms and missed detections [15]. Second, as shown in Fig. 1(b), aircraft in SAR images appear as a series of discrete, irregularly scattered bright spots with inconsistencies in scattering strength, causing targets to be segmented into multiple discrete parts [16]. Consequently, SAR aircraft detection results are incomplete. Third, as shown in Fig. 1(c), the same target exhibits different visual features under various imaging angles, leading to substantial intra-class differences and increased difficulty in aircraft recognition [17].
To address the issues of severe background interference and dispersed aircraft scatterers in SAR images, this study proposes a Scattering-Aware Network (SA-Net) for detecting and recognizing aircraft targets in complex SAR images. On the one hand, a Context-Guided Feature Pyramid Network (CG-FPN) enhances global information, suppresses strong interference in complex scenes, and improves the discriminative features of the targets, thereby increasing detection and recognition accuracy. On the other hand, scattering keypoints are utilized for target localization, and a Scattering-Aware detection Head (SA-Head) module is designed to integrate the distribution characteristics of keypoints with the neural network, refining the bounding boxes and enhancing localization accuracy. To validate the effectiveness of SA-Net, this study constructs SAR-AIRcraft-1.0, a SAR aircraft dataset for large-scale complex scenes. Based on this dataset, a series of comparative detection and recognition experiments are conducted between SA-Net and several commonly used deep neural network models. The experimental results indicate that SA-Net achieves an mAP 0.5 of 77.7%, a notable improvement over the other methods, demonstrating the effectiveness of the scattering-aware approach. The SAR-AIRcraft-1.0 dataset has been publicly released on the Journal of Radars website.
Most publicly available datasets for SAR detection and recognition tasks focus on ship targets, as shown in Tab. 1. These datasets include MSTAR (The Moving and Stationary Target Acquisition and Recognition) [18], OpenSARShip [19], SSDD (SAR Ship Detection Dataset) [20], SAR-Ship-Dataset [21], AIR-SARShip-1.0 [22], HRSID (High-Resolution SAR Images Dataset) [23], and FUSAR-Ship [24]. SAR datasets for aircraft target detection and recognition are relatively limited. Publicly reported datasets include the SAR Aircraft Detection Dataset (SADD) [25] and MSAR-1.0 (large-scale Multi-class SAR image target detection dataset-1.0) [26]. SADD is an aircraft detection dataset collected from the TerraSAR-X satellite that contains 2,966 image patches. MSAR-1.0 includes four target types: aircraft, oil tanks, bridges, and ships. These datasets have advanced neural network development for SAR aircraft target detection. However, SADD and MSAR-1.0 only contain positional information for aircraft targets. Both lack fine-grained category annotations for aircraft, which limits further development in SAR aircraft recognition.
Dataset | Category | Instance | Image | Size | Release year | Task |
MSTAR | 10 | 5,950 | 5,950 | 128×128 | 1998 | Vehicle identification |
OpenSARShip | 17 | 11,346 | 11,346 | 256×256 | 2017 | Ship detection and recognition |
SSDD | 1 | 2,456 | 1,160 | 190~668 | 2017 | Ship detection |
SAR-Ship-Dataset | 1 | 59,535 | 43,819 | 256×256 | 2019 | Ship detection |
AIR-SARShip-1.0 | 1 | 461 | 31 | 3000×3000 | 2019 | Ship detection |
HRSID | 1 | 16,951 | 5,604 | 800×800 | 2020 | Ship detection and segmentation |
FUSAR-Ship | 15 | 16,144 | 16,144 | 512×512 | 2020 | Ship detection and recognition |
SADD | 1 | 7,835 | 2,966 | 224×224 | 2022 | Aircraft detection |
MSAR-1.0 | 4 | 60,396 | 28,449 | 256~2048 | 2022 | Aircraft, oil tank, bridge, and ship detection |
SAR-AIRcraft-1.0 | 7 | 16,463 | 4,368 | 800~1500 | 2023 | Aircraft detection and recognition |
To validate the effectiveness of the SA-Net method and further advance research in SAR aircraft target detection and recognition, this study develops SAR-AIRcraft-1.0, a large-scale SAR aircraft target dataset. The dataset has a resolution of 1 m and includes 4,368 images covering seven fine-grained aircraft categories. SAR-AIRcraft-1.0 is characterized by complex scenes, diverse categories, dense targets, noise interference, varied tasks, and multi-scale properties. The dataset is now publicly accessible on the Journal of Radars website for free use in further studies.
All images in the SAR-AIRcraft-1.0 dataset are collected by the Gaofen-3 satellite in spotlight imaging mode, with single polarization and a spatial resolution of 1 m. The dataset primarily includes imagery from three civilian airports: Shanghai Hongqiao Airport, Beijing Capital Airport, and Taiwan Taoyuan Airport, selected by considering the size of the airport and the number of parked aircraft. The dataset contains images in four different sizes: 800 × 800, 1000 × 1000, 1200 × 1200, and 1500 × 1500 pixels.
(1) Complex Scenes: The dataset includes images from multiple time phases of civilian airports. It covers large areas with background features such as terminals, vehicles, and buildings. This coverage increases the complexity of the scenes.
(2) Rich Categories: Unlike typical SAR aircraft datasets, SAR-AIRcraft-1.0 contains fine-grained category information for aircraft targets. The similar scattering characteristics across different categories make aircraft recognition more challenging.
(3) Dense Targets: Each image patch includes multiple aircraft targets. As shown in Fig. 1(a), several aircraft are parked near terminals in close proximity. Such proximity creates dense distributions where targets interfere with one another, affecting detection and recognition accuracy.
(4) Noise Interference: Owing to SAR imaging characteristics, the images contain speckle noise interference, which makes accurate detection and recognition of aircraft targets challenging.
(5) Varied Tasks: This dataset supports not only detection tasks but also fine-grained recognition because it includes category information. By cropping the aircraft targets, multi-class target patches can be generated, enabling fine-grained recognition. Moreover, with position and category data, SAR-AIRcraft-1.0 supports unified detection-recognition tasks.
(6) Multi-Scale Characteristics: The aircraft target patches in this dataset vary considerably in size. As shown in Fig. 4, some targets are under 50 × 50 pixels, whereas others exceed 100 × 100 pixels, reflecting a broad multi-scale distribution across targets.
For instance annotations, all target instances in the SAR-AIRcraft-1.0 dataset are labeled using horizontal bounding boxes. The Pascal VOC format is followed in labeling. Fig. 5(a) shows an example with annotated targets, where orange rectangles represent the bounding boxes. Each box has the target’s category displayed in the top left corner. Each image has a corresponding XML file, as shown in Fig. 5(b), which contains detailed information such as the image size and instance attributes like category and bounding box coordinates.
In the XML file, “size” represents the width and height of the image patch, “name” indicates the aircraft category, and “bndbox” provides the coordinate information for each bounding box. With the top-left corner of the image as the origin, “xmin” and “xmax” denote the minimum and maximum X coordinates, whereas “ymin” and “ymax” denote the minimum and maximum Y coordinates, respectively.
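As a minimal illustration, the following Python sketch reads one such annotation file with the standard-library XML parser. The field names follow the Pascal VOC convention described above; the helper name and file path are illustrative, not part of the dataset release.

```python
# Minimal sketch: reading one SAR-AIRcraft-1.0 annotation file (Pascal VOC format).
import xml.etree.ElementTree as ET

def load_voc_annotation(xml_path):
    """Return the image size and a list of (category, xmin, ymin, xmax, ymax)."""
    root = ET.parse(xml_path).getroot()
    size = root.find("size")
    width, height = int(size.find("width").text), int(size.find("height").text)
    instances = []
    for obj in root.iter("object"):
        name = obj.find("name").text                      # aircraft category, e.g. "A330"
        box = obj.find("bndbox")
        coords = tuple(int(box.find(tag).text)
                       for tag in ("xmin", "ymin", "xmax", "ymax"))
        instances.append((name, *coords))
    return (width, height), instances
```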
During actual training, the images in the SAR-AIRcraft-1.0 dataset are divided into training, validation, and test sets in a 7:1:2 ratio. The dataset includes multi-temporal images from various airports, encompassing large areas with complex backgrounds. Additionally, because of the imaging mechanism of SAR, images of the same scene taken from different angles exhibit substantial variations, further increasing the complexity of the scenes. Consequently, this dataset presents considerable challenges for detection and recognition tasks.
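One simple way to realize such a 7:1:2 partition is to shuffle the image identifiers with a fixed seed and cut the list proportionally; the sketch below is illustrative and is not claimed to reproduce the authors' exact split.

```python
# Illustrative 7:1:2 train/validation/test split; seed and helper name are assumptions.
import random

def split_dataset(image_ids, seed=0):
    ids = list(image_ids)
    random.Random(seed).shuffle(ids)
    n_train, n_val = int(0.7 * len(ids)), int(0.1 * len(ids))
    return ids[:n_train], ids[n_train:n_train + n_val], ids[n_train + n_val:]
```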
To address the issue of strong scattering interference in the background, this study proposes an integrated scattering-aware SAR image aircraft target detection and recognition method. The overall framework, shown in Fig. 6, is based on an anchor-free algorithm structure and consists of two parts: the CG-FPN and SA-Head module.
In the feature extraction network, to mitigate the effect of background interference on target features, this study proposes an improved feature pyramid module to enhance global information and reduce false alarms. The CG-FPN effectively combines contextual information around the target by adaptively adjusting the size of the receptive field, thereby enhancing the saliency features of the target.
In the target localization phase, this study designs a cascade regression module that combines scattering awareness in a two-stage process to improve the accuracy of the regression boxes. The SA-Head module first detects the scattering keypoints of the target and utilizes their positional information to obtain a rough regression box. Subsequently, the SA-Head module refines the rough regression box to generate more precise detection boxes.
The overall structure of the CG-FPN is shown in Fig. 7. Suppose the input image is denoted as I \in {\mathbb{R}}^{W \times H \times 3} , where W and H represent the width and height of the input image, respectively. By downsampling the input image, features from different layers are obtained. Each layer has a size of \left( W/s_l \right) \times \left( H/s_l \right) \times 256 , where s_l = 2^l represents the downsampling rate of the lth layer \left( l = 3, 4, 5 \right) , and the channel number is set to 256. To obtain the deepest layer features, the features \left\{ P_3, P_4, P_5, P_6 \right\} are resized to the uniform size of P_7 and concatenated (concat) along the channel dimension.
As shown in Fig. 7, CG-FPN applies dilated convolutions with varying dilation rates (rate = 3, 6, 12, 18, 24) on the fused deep features to aggregate multi-scale semantic information through dense connections at each feature level. Each output from the dilated convolution is added to the copied input feature and integrated with the previous layer’s feature before being input into the next layer’s dilated convolution. Finally, the original features are fused with the output features from the dilated convolutions after upsampling. The original features are retained to help the network recall prior information, thus resulting in a feature map that combines shallow detail with deep semantic information.
Besides integrating features across different layers, CG-FPN seeks to introduce interactive fusion across different channel features. Inspired by the channel attention mechanism of SENet [27], global average pooling [28] is first used to compress spatial dimensions and obtain global information. The weights of each feature channel are then adaptively adjusted to reflect inter-channel relationships. Each weight coefficient is multiplied by the corresponding original feature to yield refined features.
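This reweighting step can be pictured as a standard squeeze-and-excitation block. The PyTorch sketch below is a generic rendering, with the reduction ratio of 16 borrowed from SENet as an assumption rather than a value reported here.

```python
# SE-style channel attention: squeeze spatial dims, learn per-channel weights, reweight.
import torch
import torch.nn as nn

class ChannelAttention(nn.Module):
    def __init__(self, channels=256, reduction=16):   # reduction ratio is an assumption
        super().__init__()
        self.fc = nn.Sequential(
            nn.Linear(channels, channels // reduction), nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels), nn.Sigmoid())

    def forward(self, x):
        w = self.fc(x.mean(dim=(2, 3)))                # global average pooling -> weights
        return x * w[:, :, None, None]                 # reweight each channel
```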
To obtain additional semantic and global information, CG-FPN concatenates the attention feature map A with the feature P_7 . Based on this, low-resolution features are fused with the corresponding features from the previous layer to generate information-rich features. Finally, a 3 × 3 convolution layer outputs the final feature map T_l \in {\mathbb{R}}^{\left( W/s_l \right) \times \left( H/s_l \right) \times 256} . The process is calculated as follows:
\left\{ \begin{aligned} & I_7 = {\mathrm{Concat}}\left( {\mathrm{Conv}}_{1 \times 1}\left( P_7 \right),\; A \right) \\ & I_l = {\mathrm{Upsample}}\left( I_{l + 1} \right) + {\mathrm{Conv}}_{1 \times 1}\left( P_l \right),\quad l = 3,4,5,6 \\ & T_l = {\mathrm{Conv}}_{3 \times 3}\left( I_l \right),\quad l = 3,4,5,6,7 \end{aligned} \right. | (1)
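The fusion in Eq. (1) can be sketched in PyTorch as below. One bookkeeping assumption is made that the equation leaves implicit: the concatenated I7 is folded back to 256 channels with a 1 × 1 convolution so that the top-down addition is dimensionally consistent. This is a sketch, not the authors' implementation.

```python
# Sketch of Eq. (1): feats = [P3, P4, P5, P6, P7], each with 256 channels;
# attn is the channel-attention map A at the resolution of P7.
import torch
import torch.nn as nn
import torch.nn.functional as F

class TopDownFusion(nn.Module):
    def __init__(self, c=256):
        super().__init__()
        self.lateral = nn.ModuleList([nn.Conv2d(c, c, 1) for _ in range(5)])  # Conv1x1, l = 3..7
        self.reduce = nn.Conv2d(2 * c, c, 1)   # fold Concat(Conv(P7), A) back to c channels
        self.smooth = nn.ModuleList([nn.Conv2d(c, c, 3, padding=1) for _ in range(5)])  # Conv3x3

    def forward(self, feats, attn):
        inner = [None] * 5
        inner[4] = self.reduce(torch.cat([self.lateral[4](feats[4]), attn], dim=1))  # I7
        for i in range(3, -1, -1):                                                   # I6 .. I3
            up = F.interpolate(inner[i + 1], scale_factor=2, mode="nearest")
            inner[i] = up + self.lateral[i](feats[i])
        return [self.smooth[i](inner[i]) for i in range(5)]                          # T3 .. T7
```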
This study proposes an SA-Head module that leverages the distribution relationships of scattering points to address discreteness issues. The module consists of localization and classification branches, as shown in Fig. 8, with separate convolution layers for each branch. The output features T_l \in {\mathbb{R}}^{\left( W/s_l \right) \times \left( H/s_l \right) \times 256} from layer l \left( l = 3, 4, 5, 6, 7 \right) of the feature extraction network are used as the input features for both branches.
In the localization branch, three 3 × 3 convolutional layers are first applied to T_l to obtain intermediate features {\boldsymbol{T}}_{\mathrm{mid}} . Subsequently, these features pass through a 3 × 3 convolutional layer with 256 channels and a 1 × 1 convolutional layer with 18 channels to generate the offset field {\mathrm{OF}}_1 \in {\mathbb{R}}^{\left( W/s_l \right) \times \left( H/s_l \right) \times 18} . Inspired by DenseBox [29], the first prediction of the scattering keypoints S^1 is obtained from the offsets relative to the center point, where their locations are given by
S^1 = \left\{ p_i^1 \right\}_{i = 1}^n = \left\{ p_{\mathrm{center}} + \Delta p_i^1 \right\}_{i = 1}^n | (2)
Here, \left\{ \Delta p_i^1 \right\}_{i = 1}^n represents the predicted offsets from the center point p_{\mathrm{center}} , taking into account the receptive field size, with n set to 9. p_i^1 = \left( x_i^1, y_i^1 \right) denotes the coordinates of the ith point. After the scattering keypoints are predicted, their coordinates are used to determine the minimum enclosing rectangle {\boldsymbol{B}}^1 in the horizontal and vertical directions, which serves as an initial coarse regression box identifying the target’s location.
x_{\min }^1 = \min_i x_i^1,\;\; y_{\min }^1 = \min_i y_i^1,\;\; x_{\max }^1 = \max_i x_i^1,\;\; y_{\max }^1 = \max_i y_i^1 | (3)
{\boldsymbol{B}}^1 = \left( x_{\min }^1,\;y_{\min }^1,\;x_{\max }^1,\;y_{\max }^1 \right) | (4)
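Decoding Eqs. (2)–(4) amounts to adding the predicted offsets to the grid-cell center and taking the axis-aligned extremes, as in this small sketch (function name illustrative):

```python
# Decode n = 9 predicted keypoints into the coarse box B1 of Eqs. (2)-(4).
import numpy as np

def keypoints_to_box(center, offsets):
    """center: (x, y) of p_center; offsets: (9, 2) array of predicted offsets."""
    pts = np.asarray(center) + np.asarray(offsets)   # S1 = {p_center + dp_i}
    x_min, y_min = pts.min(axis=0)
    x_max, y_max = pts.max(axis=0)
    return x_min, y_min, x_max, y_max                # B1
```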
To capture the positional relationships between the aircraft’s scattering points, a supervised learning approach is applied to update the spatial distribution of scattering keypoints through regression. The ground-truth coordinates for these scattering keypoints are obtained as follows [17]. First, the Harris corner detector [30] is used to identify high-intensity points that reflect structural contours. Second, the k-means algorithm [31] clusters these points into 9 clusters, yielding 9 key cluster points with positional offsets relative to the center of the aircraft patch. This process removes redundant points, resulting in a more regular structure. Aircraft patches are cropped based on the ground-truth bounding box (orange box) shown in Fig. 5, and the corresponding XML files contain the bounding box coordinates. Thus, the coordinates of the aircraft patches can be derived from the XML files, and the cluster keypoint coordinates can be calculated using the XML and positional offsets. The nine cluster centers obtained through this method are treated as the ground-truth coordinates of the scattering keypoints. These points reflect the aircraft’s scattering intensity and structural feature distribution, and they provide valuable information for target discrimination.
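A plausible rendering of this recipe, assuming OpenCV's Harris detector and scikit-learn's k-means with an illustrative response threshold, is the following sketch:

```python
# Sketch of the ground-truth keypoint recipe: Harris corners on an aircraft patch,
# then k-means to 9 cluster centers. Threshold and detector parameters are
# illustrative assumptions; at least 9 corner candidates are assumed to survive.
import cv2
import numpy as np
from sklearn.cluster import KMeans

def scattering_keypoints(patch, n_points=9, rel_thresh=0.1):
    """patch: single-channel SAR aircraft chip; returns offsets from the patch center."""
    response = cv2.cornerHarris(np.float32(patch), blockSize=2, ksize=3, k=0.04)
    ys, xs = np.where(response > rel_thresh * response.max())   # strong-scattering corners
    candidates = np.stack([xs, ys], axis=1).astype(np.float32)
    centers = KMeans(n_clusters=n_points, n_init=10).fit(candidates).cluster_centers_
    patch_center = np.array([patch.shape[1] / 2, patch.shape[0] / 2])
    return centers - patch_center                               # offsets relative to center
```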
During detection, the initial keypoint coordinates of the target are obtained to determine its rough location. However, owing to the scattering mechanism, components within the target that exhibit lower scattering density are often overlooked by the coarse regression box. This oversight results in an imprecise detection box. To address this issue, the SA-Head module employs fine localization to achieve an accurate regression box.
In fine localization, the first offset group \left\{ {\Delta p_i^1} \right\}_{i = 1}^n and a deformable convolution [32] are used to reconstruct the feature {{\boldsymbol{T}}_{{\mathrm{mid}}}} into a new feature map {\tilde{\boldsymbol{T}}}_{{\mathrm{mid}}} \in {\mathbb{R}}^{\left( {W/{s_l}} \right) \times \left( {H/{s_l}} \right) \times 256} :
{\tilde{\boldsymbol{T}}}_{\mathrm{mid}}\left( p \right) = {\Omega}_{3 \times 3}\left( {\boldsymbol{T}}_{\mathrm{mid}},\;\left\{ \Delta p_i^1 \right\} \right) = \sum\limits_{i = 1}^{9} \omega\left( \Delta p_i^1 \right) \cdot {\boldsymbol{T}}_{\mathrm{mid}}\left( p + \Delta p_i^1 \right) | (5)
Here, \omega represents a series of weight parameters learned by the network, and {\Omega}_{3 \times 3} denotes the 3 × 3 convolution operation. The calculated offsets may contain fractional values, so this module draws on bilinear interpolation to produce the continuous feature {\boldsymbol{T}}_{\mathrm{mid}}\left( \tilde p \right) = \sum\nolimits_a \delta\left( a, \tilde p \right) \cdot {\boldsymbol{T}}_{\mathrm{mid}}\left( a \right) . In this context, a = \left( x_a, y_a \right) represents the integer sampling points, and \delta\left( a, \tilde p \right) is the bilinear interpolation weight between point a and position \tilde p = \left( x_{\tilde p}, y_{\tilde p} \right) .
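The sampling step can be written out directly. The sketch below evaluates one fractional location on a single-channel feature map, assuming the location lies inside the map:

```python
# Bilinear sampling at a fractional location: a weighted sum of the four
# integer neighbors, with weights given by the fractional parts.
import numpy as np

def bilinear_sample(feat, x, y):
    """feat: (H, W) array; (x, y): fractional coordinates inside the map."""
    x0, y0 = int(np.floor(x)), int(np.floor(y))
    x1, y1 = min(x0 + 1, feat.shape[1] - 1), min(y0 + 1, feat.shape[0] - 1)
    wx, wy = x - x0, y - y0
    return ((1 - wx) * (1 - wy) * feat[y0, x0] + wx * (1 - wy) * feat[y0, x1] +
            (1 - wx) * wy * feat[y1, x0] + wx * wy * feat[y1, x1])
```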
After obtaining the scattering-reconstructed feature {\tilde {\boldsymbol{T}}}_{{\mathrm{mid}}} , this feature is passed through a 1 × 1 convolution layer with an output of 18 channels to produce a new offset field {{\mathrm{OF}}_2} . The second set of predicted scattering keypoints {S^2} is obtained as follows:
{S^2} = \left\{ {\left( {p_i^2} \right)} \right\}_{i = 1}^n = \left\{ {\left( {x_i^2,y_i^2} \right)} \right\}_{i = 1}^n = \left\{ {p_i^1 + \Delta p_i^2} \right\}_{i = 1}^n | (6) |
Here, \left\{ {\Delta p_i^2} \right\}_{i = 1}^n represents the predicted offset of the second set of scattering keypoints relative to the first set of keypoints. Using the coordinates of the points, an accurate box position {\boldsymbol{B}}^2 = \left( x_{\min }^2,\;y_{\min }^2, \;x_{\max }^2, \;y_{\max }^2 \right) can be further obtained. Both sets of offsets share the same scale, so the proposed method is unaffected by issues related to the scale parameters of the regression boxes.
In the classification branch, feature {\boldsymbol{T}}_l first passes through three 3 × 3 convolutional layers to extract high-level semantic information from the original features. Similar to the localization branch, it is then processed by a deformable convolution layer with offset {{\mathrm{OF}}_1} and a 1 × 1 convolutional layer to optimize and correct the target’s class representation. This process places greater emphasis on the important scattering components of SAR aircraft targets, enhancing their saliency and improving the recognition ability of the classification branch. Overall, the SA-Head module integrates the characteristics of the anchor-free framework and utilizes keypoint decoding to obtain the target boxes.
The overall training loss function can be divided into four parts:
L = {L_{{\mathrm{loc1}}}} + {L_{{\mathrm{loc2}}}} + {L_{{\mathrm{keypoints}}}} + {L_{{\mathrm{cls}}}} | (7)
where {L_{{\text{loc1}}}} and {L_{{\text{loc2}}}} represent the losses of the first and second predicted boxes, respectively, compared with the ground-truth target box. The calculation of {L_{{\mathrm{loc}}}} is as follows:
{L_{{\text{loc}}}} = \frac{1}{N}\sum\limits_{i = 1}^N {{\text{smoot}}{{\text{h}}_{L1}}\left( {{S_i} - {{\hat S}_i}} \right)} | (8) |
Here, N denotes the number of targets, {S_i} and {\hat S_i} represent the predicted box and the ground-truth box, respectively, and {{\mathrm{smooth}}_{L1}} is the smoothed L1 loss function.
The assumption is that the ground-truth coordinates of the scattering keypoints of target Q are {\hat S_Q} = \left\{ {{{\hat p}_{iQ}}} \right\}_{i = 1}^n , and the predicted coordinates of the scattering keypoints are S_Q = \left\{ p_{jQ} \right\}_{j = 1}^m . Hence, the loss between the predicted and ground-truth coordinates of the scattering keypoints is calculated using the Chamfer loss [33]:
{L_{\mathrm{keypoints}}} = \frac{1}{N}\sum\limits_{Q = 1}^{N} \left( \frac{1}{18}\sum\limits_{m = 1}^{9} \min\limits_{n} \left\| p_{mQ}^1 - {\hat p}_{nQ} \right\|_2 + \frac{1}{18}\sum\limits_{n = 1}^{9} \min\limits_{m} \left\| p_{mQ}^1 - {\hat p}_{nQ} \right\|_2 \right) | (9)
Here, n represents the ground truth keypoints of target Q, and m represents the predicted keypoints. As the loss function continues to converge, the model achieves high training accuracy.
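For a single target, the symmetric Chamfer term in Eq. (9) reduces to nearest-neighbor matching in both directions over the 9 × 9 pairwise distance matrix, as in this sketch:

```python
# Per-target Chamfer term of Eq. (9): match each predicted keypoint to its
# nearest ground-truth keypoint and vice versa, then normalize by 18.
import numpy as np

def chamfer_loss(pred, gt):
    """pred, gt: (9, 2) arrays of keypoint coordinates for one target."""
    d = np.linalg.norm(pred[:, None, :] - gt[None, :, :], axis=-1)  # (9, 9) distances
    return (d.min(axis=1).sum() + d.min(axis=0).sum()) / 18.0
```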
The classification loss {L_{{\mathrm{cls}}}} uses the focal loss [34] function to reduce sample imbalance by adjusting the weights of positive and negative samples:

{\mathrm{FocalLoss}}\left( {{c_t}} \right) = - {\mu _t}{\left( {1 - {c_t}} \right)^\gamma }\log \left( {{c_t}} \right) | (10)

In this function, {\mu _t} = 0.25 and \gamma = 2 are parameters, where {\left( {1 - {c_t}} \right)^\gamma } serves as a modulation factor, and {c_t} is the corresponding classification score.
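Written out with the stated parameters, Eq. (10) is only a few lines; the binary-form PyTorch sketch below treats the weight μt as a constant, exactly as the formula is given above (frameworks also ship equivalent built-ins):

```python
# Focal loss of Eq. (10) with mu_t = 0.25 and gamma = 2 for binary targets.
import torch

def focal_loss(scores, targets, mu_t=0.25, gamma=2.0):
    """scores: predicted probabilities in (0, 1); targets: 0/1 labels."""
    c_t = torch.where(targets == 1, scores, 1 - scores)   # score of the true class
    return (-mu_t * (1 - c_t) ** gamma * torch.log(c_t)).mean()
```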
In this section, the proposed method is compared with state-of-the-art approaches across various tasks to validate the effectiveness of SA-Net and provide benchmark metrics for the dataset. These tasks include SAR aircraft detection, fine-grained recognition, and integrated detection-recognition. Ablation studies are also conducted to further examine the proposed method. Additionally, this section presents a detailed analysis of the experimental outcomes and an outlook on future work.
ResNet-50 [35], pretrained on the ImageNet dataset, is selected to initialize the model in this study. The batch size is set to 8 for each training iteration, and the model is trained using the stochastic gradient descent algorithm. The initial learning rate is set to 0.001 and decays in stages during training.
To quantitatively evaluate the performance of the algorithm, the metrics include precision (P) and recall (R), as defined in Eq. (11) and Eq. (12):
P = \frac{{{{{N}}_{{\text{TP}}}}}}{{{{{N}}_{{\text{TP}}}} + {{{N}}_{{\text{FP}}}}}} | (11) |
R = \frac{{{{{N}}_{{\text{TP}}}}}}{{{{{N}}_{{\text{TP}}}}{\text{ + }}{{{N}}_{{\text{FN}}}}}} | (12) |
Here, {N_{{\mathrm{TP}}}} represents the number of correctly detected targets. {N_{{\mathrm{FP}}}} refers to the number of false positives, where the detection result is positive but the true label is negative (false alarm), and {N_{{\mathrm{FN}}}} denotes the number of false negatives, where the detection result is negative but the true label is positive (missed detection). The F1-score is used to provide a comprehensive evaluation of the algorithm’s performance, defined as follows:
{\text{F1}} = \frac{{2 \times P \times R}}{{P + R}} | (13) |
In addition, this study utilizes the Precision-Recall Curve (PRC) and Average Precision (AP). The PRC is plotted by sorting predictions in descending order of confidence and calculating precision-recall pairs at successive thresholds. AP reflects the shape of the PRC and provides a comprehensive evaluation of the algorithm’s performance. AP is defined as the mean of the highest precision values over a set of recall rates S = {0, 0.01, 0.02, ···, 1}, with the specific calculation given by
{\text{AP}} = \frac{1}{{101}}\sum\limits_{R \in S} {\tilde P\left( R \right)} | (14) |
Here, \tilde P\left( R \right) = \max _{R':R' \ge R} P\left( {R'} \right) is the precision corresponding to each recall R, and P\left( {R'} \right) is the precision corresponding to recall R'. After calculating the AP for each category, the mean Average Precision (mAP) is obtained as the average of AP values across all categories. Typically, AP is calculated with an Intersection-over-Union (IoU) threshold of 0.5, denoted as AP 0.5.
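The 101-point computation in Eq. (14) can be sketched as follows, where the monotone envelope \tilde P\left( R \right) = \max_{R' \ge R} P\left( R' \right) is taken over the measured precision-recall pairs (array names illustrative):

```python
# 101-point interpolated AP of Eq. (14).
import numpy as np

def average_precision(precisions, recalls):
    """precisions, recalls: PR-curve samples of equal length."""
    precisions, recalls = np.asarray(precisions), np.asarray(recalls)
    grid = np.linspace(0.0, 1.0, 101)                      # S = {0, 0.01, ..., 1}
    interp = [precisions[recalls >= r].max() if np.any(recalls >= r) else 0.0
              for r in grid]                               # max precision at recall >= R
    return float(np.mean(interp))
```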
Target detection is a fundamental task in SAR image interpretation. This study utilizes the SAR-AIRcraft-1.0 dataset to train and test several widely used benchmark detection methods, including Faster R-CNN [36] and Cascade R-CNN [37], both two-stage CNNs. Recently, anchor-free, single-stage detection methods have been designed to greatly reduce sensitivity to anchor-related parameters. Consequently, the classic anchor-free method RepPoints [38] and a SAR-specific target detection method, SKG-Net [1], are also compared.
In terms of data utilization, all aircraft targets are categorized into a single class and treated as positive samples, with the background as negative samples. Tab. 2 presents the precision, recall, F1-score, AP 0.5, and AP 0.75 metrics for aircraft targets across different detectors. The results indicate that SA-Net achieves the highest AP at both IoU thresholds, demonstrating the effectiveness of the proposed method.
Detection methods | P | R | F1 | AP 0.5 | AP 0.75 |
Faster R-CNN | 77.6 | 78.1 | 77.8 | 71.6 | 53.6 |
Cascade R-CNN | 89.0 | 79.5 | 84.0 | 77.8 | 59.1 |
RepPoints | 62.7 | 88.7 | 81.2 | 80.3 | 52.9 |
SKG-Net | 57.6 | 88.8 | 69.9 | 79.8 | 51.0 |
SA-Net | 87.5 | 82.2 | 84.8 | 80.4 | 61.4 |
In the detection methods discussed above, most anchor boxes are redundant because of the sparse distribution of SAR aircraft targets. Anchor-free detection algorithms achieve better results on AP 0.5, with RepPoints and SKG-Net reaching 80.3% and 79.8%, respectively. This improvement may be due to anchor-free methods reducing background clutter within bounding boxes, thereby clarifying the semantic information of the targets. However, RepPoints and SKG-Net produce more false positives (false alarms) compared with anchor-based methods, reducing detection accuracy.
Among two-stage detection methods, the cascade structure of Cascade R-CNN further improves AP and various metrics compared with Faster R-CNN. Fig. 9 shows visual comparisons of the test results between the proposed method and other advanced methods. The figure illustrates that Faster R-CNN, RepPoints, and Cascade R-CNN all exhibit false positives (yellow) and false negatives (blue boxes). By contrast, SA-Net effectively reduces these false positives and negatives, validating its superior detection performance.
In this study, instance targets are cropped according to the annotation boxes in the SAR-AIRcraft-1.0 dataset, resulting in a series of instance samples. The specific quantities are shown in Tab. 3. For the fine-grained recognition experiments, seven distinct aircraft labels are selected: A330, A320/321, A220, ARJ21, Boeing 737, Boeing 787, and other.
Category | Training set number | Test set number | Total |
A330 | 278 | 31 | 309 |
A320/321 | – | 52 | – |
A220 | – | 460 | – |
ARJ21 | 825 | 362 | 1,187 |
Boeing737 | 2,007 | 550 | 2,557 |
Boeing787 | – | 454 | – |
other | – | – | – |
Total | – | – | – |
This study uses recognition accuracy as the evaluation metric to quantify the performance of the fine-grained recognition task. The corresponding calculation is given by Eq. (15):
{\text{Acc}} = \frac{{\displaystyle\sum\nolimits_i {{N_{{C_i}}}} }}{{{N_{{\text{all}}}}}} | (15) |
where {N_{{C_i}}} represents the number of correctly identified samples for class {C_i} , and {N_{{\text{all}}}} represents the total number of samples.
This study conducts experimental comparison using ResNet-50, ResNet-101, ResNeXt-50, ResNeXt-101 [39], and Swin Transformer [40] on the SAR-AIRcraft-1.0 dataset. Fifty percent of the data from the training set are selected for model training. The results of the fine-grained recognition are presented in Tab. 4, which shows that ResNet-101 outperforms ResNet-50. The ResNeXt series of models achieve excellent performance in terms of top-1 recognition accuracy. The Swin Transformer not only exhibits the highest performance in top-3 accuracy but also achieves the best recognition capability in most categories. This outcome demonstrates its outstanding feature learning ability.
Methods | Acc (top-1/top-3) | A330 | A320/321 | A220 | ARJ21 | Boeing737 | Boeing787 | Other |
ResNet-50 | 75.59/89.19 | 74.19 | 90.38 | 78.04 | 73.76 | 61.64 | 78.63 | 80.50 |
ResNet-101 | 78.58/90.37 | 93.55 | 98.08 | 76.96 | 73.76 | 71.82 | 74.67 | 84.82 |
ResNeXt-50 | 80.61/89.46 | 83.87 | 94.23 | 78.91 | 74.86 | 73.27 | 83.04 | 85.40 |
ResNeXt-101 | 82.20/91.83 | 87.10 | 100 | 80.87 | 79.83 | 71.09 | 83.92 | 87.70 |
Swin Transformer | 81.29/92.51 | 77.42 | 100 | 80.87 | 74.59 | 73.82 | 86.12 | 84.82 |
To further quantitatively assess the model’s performance and display additional details of the recognition results, this study presents a confusion matrix of the algorithm models to show the performance of different network architectures. As illustrated in Fig. 10, the probabilities along the diagonal represent the recognition accuracy for each category. Among them, the identification of aircraft targets such as A330, ARJ21, and Boeing 737 proves to be challenging, reflected by their relatively low recognition accuracy. The images of Boeing 737 and Boeing 787 are quite similar, resulting in confusion in the recognition results. This situation highlights the challenges posed by the SAR-AIRcraft-1.0 dataset.
This study selects four distinct methods for comparative experiments on integrated detection and recognition to validate the performance of different deep learning algorithms. These four methods are Faster R-CNN, Cascade R-CNN, RepPoints, and SKG-Net, encompassing both anchor-based and anchor-free approaches.
During the experiments, different categories of aircraft are treated as separate classes. No data augmentation techniques are employed, to maintain the original characteristics of the data. The detection performance of each algorithm is displayed in Tab. 5. For the Faster R-CNN method, the mAP 0.5 across categories is 76.1%, and the mAP 0.75 is 62.2%. Thus, the SAR-AIRcraft-1.0 dataset presents certain detection challenges. First, different categories of SAR aircraft targets share similar structures and sizes, so distinguishing between target classes is difficult. Furthermore, owing to the scattering characteristics of SAR images and variations in imaging conditions, targets of the same category may yield different imaging results, further complicating the recognition process.
Category | Faster R-CNN | Cascade R-CNN | RepPoints | SKG-Net | SA-Net |
A330 | 85.0 | 87.4 | 89.8 | 79.3 | 88.6 |
A320/321 | 97.2 | 97.5 | 97.9 | 78.2 | 94.3 |
A220 | 78.5 | 74.0 | 71.4 | 66.4 | 80.3 |
ARJ21 | 74.0 | 78.0 | 73.0 | 65.0 | 78.6 |
Boeing737 | 55.1 | 54.5 | 55.7 | 65.1 | 59.7 |
Boeing787 | 72.9 | 68.3 | 51.8 | 69.6 | 70.8 |
other | 70.1 | 69.1 | 68.4 | 71.4 | 71.3 |
mAP | 76.1 | 75.7 | 72.6 | 70.7 | 77.7 |
Additionally, this study selects a more stringent metric, AP at the IoU threshold of 0.75 (AP 0.75), to evaluate the model, as shown in Tab. 6. Given the integration of global contextual features and scattering information, the proposed SA-Net achieves an mAP 0.75 of 62.8%. However, the detection accuracy varies among different categories. For instance, compared with other categories, the A320/321 demonstrates the best performance in AP 0.5 and AP 0.75 across various algorithms. This outcome is primarily because the A320/321 has a distinctive size, with a fuselage length of over 40 m, making it easy to differentiate. For ARJ21 and A220, their relatively small size and insufficient detail features lower detection accuracy.
Category | Faster R-CNN | Cascade R-CNN | RepPoints | SKG-Net | SA-Net |
A330 | 85.0 | 87.4 | 66.4 | 66.4 | 88.6 |
A320/321 | 87.7 | 73.9 | 84.9 | 49.6 | 86.6 |
A220 | 58.7 | 49.1 | 49.4 | 29.8 | 55.0 |
ARJ21 | 55.2 | 59.0 | 50.9 | 37.7 | 59.7 |
Boeing737 | 42.8 | 39.1 | 36.6 | 48.7 | 41.8 |
Boeing787 | 60.5 | 57.6 | 41.8 | 51.6 | 60.4 |
other | 45.4 | 46.1 | 43.1 | 41.1 | 47.7 |
mAP | 62.2 | 58.9 | 53.3 | 46.4 | 62.8 |
To intuitively compare various methods, this study plots the F1 curves of different methods at various thresholds, as shown in Fig. 11. The figure depicts that SA-Net consistently achieves the highest F1 score across different confidence levels compared with other advanced methods. This finding indicates that the proposed method exhibits good robustness, achieving a strong balance between precision and recall.
This study combines FCOS [41] with deformable convolution as the baseline network. Ablation experiments are conducted on the SAR-AIRcraft-1.0 dataset with different modules, and the results are shown in Tab. 7. The proposed modules contribute to varying degrees of improvement in detection performance. Compared with the baseline, the CG-FPN module improves the AP 0.5 metric by 0.8%. The AP 0.5 and AP 0.75 of SA-Net are 0.8% and 0.7% higher than those of the baseline, respectively, achieving more accurate localization of the targets.
Methods | P | R | F1 | AP 0.5 | AP 0.75 |
Baseline | 88.1 | 81.2 | 84.5 | 79.6 | 60.7 |
Baseline+SA-Head | 88.2 | 82.1 | 85.0 | 80.3 | 60.8 |
Baseline+CG-FPN | 88.6 | 81.9 | 85.1 | 80.4 | 60.4 |
SA-Net | 87.5 | 82.2 | 84.8 | 80.4 | 61.4 |
To visually compare different modules, Fig. 12 and Fig. 13 present the corresponding F1 curves and PR curves. Fig. 12 depicts that SA-Net achieves optimal results in AP 0.5 and AP 0.75, demonstrating the best performance in the high-confidence range of the F1 curve. This study introduces the SA-Head module to achieve more accurate localization of detection boxes. Fig. 13 demonstrates that the PR curve with the SA-Head module (orange curve) shows substantial improvement in AP 0.5 and AP 0.75 compared with the baseline (blue curve), indicating that the SA-Head module can enhance the network’s detection performance.
Additionally, this study introduces CG-FPN to strengthen global features and suppress scattering interference in the background. Fig. 14 displays detection results and visualization effects, where green rectangles and yellow circles represent detected targets and false alarms, respectively. As shown in Fig. 14(a), the baseline produces false alarms because of similar buildings in the background. To address this issue, CG-FPN enhances the contextual connections of features by assigning different weights to channel layers. The feature map from the last layer of the classification branch is visualized for a direct comparison. Fig. 14(c) and Fig. 14(d) show that after adding this module, the aircraft targets receive more attention. The experimental results prove that CG-FPN effectively enhances the saliency of targets and reduces false alarms in complex backgrounds.
This study conducts a series of experiments using different detection algorithms on the SAR-AIRcraft-1.0 dataset. The results demonstrate that the proposed SA-Net method exhibits superior performance. Detection results are shown in Fig. 15. Green rectangles, yellow circles, blue circles, and red circles represent detection results, false alarms, missed detections, and incorrectly identified targets, respectively. Most targets in the SA-Net method are accurately detected, but false alarms and missed detections still exist. The false alarms are primarily due to scattering representations similar to aircraft near complex backgrounds, such as terminals. Additionally, the variability in scattering conditions leads to weaker scattering of certain aircraft components. This scattering thus affects the semantic integrity of target features and results in missed detections.
In addition to these issues, Fig. 15 shows incorrectly identified instances marked with red circles. Owing to the small size of the targets and the lack of semantic features, some aircraft are misidentified as other categories. The absence of prior information, such as aircraft length, makes correctly distinguishing between different categories more challenging. Overall, detecting and identifying targets in the SAR-AIRcraft-1.0 dataset is a difficult task. The current algorithms still exhibit unsatisfactory performance, indicating further room for improvement. In future work, incorporating SAR imaging mechanisms and scattering features into deep CNNs may further enhance the detection and identification performance of the SAR-AIRcraft-1.0 dataset.
This study proposes a SAR aircraft detection and recognition method that incorporates scattering perception. It utilizes a CG-FPN to enhance global information and suppress strong interference in complex scenes, thereby achieving effective feature fusion and reducing false alarms and missed detections. The method also employs scattering keypoints to refine and correct detection boxes, improving localization accuracy. To validate the effectiveness of the proposed method, this study publicly releases the high-resolution SAR-AIRcraft-1.0 dataset, which contains various categories of aircraft targets and is characterized by complex scenes, diverse categories, dense targets, noise interference, varied tasks, and multi-scale features. The dataset provides rich data for model training and facilitates research in SAR aircraft detection and recognition. Experiments conducted on the SAR-AIRcraft-1.0 dataset demonstrate the effectiveness of the proposed approach compared with other deep learning algorithms. In future work, integrating scattering feature information into deep CNNs can further enhance detection and recognition performance.
SAR-AIRcraft-1.0: The high-resolution SAR aircraft detection and recognition dataset is released on the official website of the Journal of Radars, where the data and usage instructions are available on the “SAR-AIRcraft-1.0: High-Resolution SAR Aircraft Detection and Recognition Dataset” page (see Fig. 1).
[1] FU Kun, FU Jiamei, WANG Zhirui, et al. Scattering-keypoint-guided network for oriented ship detection in high-resolution and large-scale SAR images[J]. IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, 2021, 14: 11162–11178. doi: 10.1109/JSTARS.2021.3109469.
[2] GUO Qian, WANG Haipeng, and XU Feng. Scattering enhanced attention pyramid network for aircraft detection in SAR images[J]. IEEE Transactions on Geoscience and Remote Sensing, 2021, 59(9): 7570–7587. doi: 10.1109/TGRS.2020.3027762.
[3] SHAHZAD M, MAURER M, FRAUNDORFER F, et al. Buildings detection in VHR SAR images using fully convolution neural networks[J]. IEEE Transactions on Geoscience and Remote Sensing, 2019, 57(2): 1100–1116. doi: 10.1109/TGRS.2018.2864716.
[4] ZHANG Zhimian, WANG Haipeng, XU Feng, et al. Complex-valued convolutional neural network and its application in polarimetric SAR image classification[J]. IEEE Transactions on Geoscience and Remote Sensing, 2017, 55(12): 7177–7188. doi: 10.1109/TGRS.2017.2743222.
[5] FU Kun, DOU Fangzheng, LI Hengchao, et al. Aircraft recognition in SAR images based on scattering structure feature and template matching[J]. IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, 2018, 11(11): 4206–4217. doi: 10.1109/JSTARS.2018.2872018.
[6] DU Lan, DAI Hui, WANG Yan, et al. Target discrimination based on weakly supervised learning for high-resolution SAR images in complex scenes[J]. IEEE Transactions on Geoscience and Remote Sensing, 2020, 58(1): 461–472. doi: 10.1109/TGRS.2019.2937175.
[7] CUI Zongyong, LI Qi, CAO Zongjie, et al. Dense attention pyramid networks for multi-scale ship detection in SAR images[J]. IEEE Transactions on Geoscience and Remote Sensing, 2019, 57(11): 8983–8997. doi: 10.1109/TGRS.2019.2923988.
[8] ZHANG Jinsong, XING Mengdao, and XIE Yiyuan. FEC: A feature fusion framework for SAR target recognition based on electromagnetic scattering features and deep CNN features[J]. IEEE Transactions on Geoscience and Remote Sensing, 2021, 59(3): 2174–2187. doi: 10.1109/TGRS.2020.3003264.
[9] ZHAO Yan, ZHAO Lingjun, LI Chuyin, et al. Pyramid attention dilated network for aircraft detection in SAR images[J]. IEEE Geoscience and Remote Sensing Letters, 2021, 18(4): 662–666. doi: 10.1109/LGRS.2020.2981255.
[10] ZHAO Yan, ZHAO Lingjun, LIU Zhong, et al. Attentional feature refinement and alignment network for aircraft detection in SAR imagery[J]. IEEE Transactions on Geoscience and Remote Sensing, 2022, 60: 5220616. doi: 10.1109/TGRS.2021.3139994.
[11] FU Jiamei, SUN Xian, WANG Zhirui, et al. An anchor-free method based on feature balancing and refinement network for multiscale ship detection in SAR images[J]. IEEE Transactions on Geoscience and Remote Sensing, 2021, 59(2): 1331–1344. doi: 10.1109/TGRS.2020.3005151.
[12] SUN Yuanrui, WANG Zhirui, SUN Xian, et al. SPAN: Strong scattering point aware network for ship detection and classification in large-scale SAR imagery[J]. IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, 2022, 15: 1188–1204. doi: 10.1109/JSTARS.2022.3142025.
[13] GUO Qian, WANG Haipeng, and XU Feng. Research progress on aircraft detection and recognition in SAR imagery[J]. Journal of Radars, 2020, 9(3): 497–513. doi: 10.12000/JR20020.
[14] LYU Yixuan, WANG Zhirui, WANG Peijin, et al. Scattering information and meta-learning based SAR images interpretation for aircraft target recognition[J]. Journal of Radars, 2022, 11(4): 652–665. doi: 10.12000/JR22044.
[15] KANG Yuzhuo, WANG Zhirui, FU Jiamei, et al. SFR-Net: Scattering feature relation network for aircraft detection in complex SAR images[J]. IEEE Transactions on Geoscience and Remote Sensing, 2022, 60: 5218317. doi: 10.1109/TGRS.2021.3130899.
[16] CHEN Jiehong, ZHANG Bo, and WANG Chao. Backscattering feature analysis and recognition of civilian aircraft in TerraSAR-X images[J]. IEEE Geoscience and Remote Sensing Letters, 2015, 12(4): 796–800. doi: 10.1109/LGRS.2014.2362845.
[17] SUN Xian, LV Yixuan, WANG Zhirui, et al. SCAN: Scattering characteristics analysis network for few-shot aircraft classification in high-resolution SAR images[J]. IEEE Transactions on Geoscience and Remote Sensing, 2022, 60: 5226517. doi: 10.1109/TGRS.2022.3166174.
[18] KEYDEL E R, LEE S W, and MOORE J T. MSTAR extended operating conditions: A tutorial[C]. The SPIE 2757, Algorithms for Synthetic Aperture Radar Imagery III, Orlando, USA, 1996: 228–242.
[19] HUANG Lanqing, LIU Bin, LI Boying, et al. OpenSARShip: A dataset dedicated to Sentinel-1 ship interpretation[J]. IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, 2018, 11(1): 195–208. doi: 10.1109/JSTARS.2017.2755672.
[20] LI Jianwei, QU Changwen, and SHAO Jiaqi. Ship detection in SAR images based on an improved faster R-CNN[C]. 2017 SAR in Big Data Era: Models, Methods and Applications (BIGSARDATA), Beijing, China, 2017: 1–6.
[21] WANG Yuanyuan, WANG Chao, ZHANG Hong, et al. A SAR dataset of ship detection for deep learning under complex backgrounds[J]. Remote Sensing, 2019, 11(7): 765. doi: 10.3390/rs11070765.
[22] SUN Xian, WANG Zhirui, SUN Yuanrui, et al. AIR-SARShip-1.0: High-resolution SAR ship detection dataset[J]. Journal of Radars, 2019, 8(6): 852–862. doi: 10.12000/JR19097.
[23] WEI Shunjun, ZENG Xiangfeng, QU Qizhe, et al. HRSID: A high-resolution SAR images dataset for ship detection and instance segmentation[J]. IEEE Access, 2020, 8: 120234–120254. doi: 10.1109/ACCESS.2020.3005861.
[24] HOU Xiyue, AO Wei, SONG Qian, et al. FUSAR-Ship: Building a high-resolution SAR-AIS matchup dataset of Gaofen-3 for ship detection and recognition[J]. Science China Information Sciences, 2020, 63(4): 140303. doi: 10.1007/s11432-019-2772-5.
[25] ZHANG Peng, XU Hao, TIAN Tian, et al. SEFEPNet: Scale expansion and feature enhancement pyramid network for SAR aircraft detection with small sample dataset[J]. IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, 2022, 15: 3365–3375. doi: 10.1109/JSTARS.2022.3169339.
[26] CHEN Jie, HUANG Zhixiang, XIA Runfan, et al. Large-scale multi-class SAR image target detection dataset-1.0[J/OL]. Journal of Radars. https://radars.ac.cn/web/data/getData?dataType=MSAR, 2022.
[27] HU Jie, SHEN Li, and SUN Gang. Squeeze-and-excitation networks[C]. 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition, Salt Lake City, USA, 2018: 7132–7141.
[28] SUN Yuanrui, SUN Xian, WANG Zhirui, et al. Oriented ship detection based on strong scattering points network in large-scale SAR images[J]. IEEE Transactions on Geoscience and Remote Sensing, 2021, 60: 5218018. doi: 10.1109/TGRS.2021.3130117.
[29] HUANG Lichao, YANG Yi, DENG Yafeng, et al. DenseBox: Unifying landmark localization with end to end object detection[J]. arXiv preprint arXiv:1509.04874, 2015.
[30] MIKOLAJCZYK K and SCHMID C. Scale & affine invariant interest point detectors[J]. International Journal of Computer Vision, 2004, 60(1): 63–86. doi: 10.1023/B:VISI.0000027790.02288.f2.
[31] OLUKANMI P O, NELWAMONDO F, and MARWALA T. K-means-MIND: An efficient alternative to repetitive k-means runs[C]. 2020 7th International Conference on Soft Computing & Machine Intelligence (ISCMI), Stockholm, Sweden, 2020: 172–176.
[32] DAI Jifeng, QI Haozhi, XIONG Yuwen, et al. Deformable convolutional networks[C]. 2017 IEEE International Conference on Computer Vision, Venice, Italy, 2017: 764–773.
[33] FAN Haoqiang, SU Hao, and GUIBAS L. A point set generation network for 3D object reconstruction from a single image[C]. 2017 IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, USA, 2017: 2463–2471.
[34] LIN T Y, GOYAL P, GIRSHICK R, et al. Focal loss for dense object detection[C]. 2017 IEEE International Conference on Computer Vision, Venice, Italy, 2017: 2999–3007.
[35] HE Kaiming, ZHANG Xiangyu, REN Shaoqing, et al. Deep residual learning for image recognition[C]. 2016 IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, USA, 2016: 770–778.
[36] GIRSHICK R. Fast R-CNN[C]. 2015 IEEE International Conference on Computer Vision, Santiago, Chile, 2015: 1440–1448.
[37] CAI Zhaowei and VASCONCELOS N. Cascade R-CNN: Delving into high quality object detection[C]. 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition, Salt Lake City, USA, 2018: 6154–6162.
[38] YANG Ze, LIU Shaohui, HU Han, et al. RepPoints: Point set representation for object detection[C]. 2019 IEEE/CVF International Conference on Computer Vision, Seoul, Korea, 2019: 9656–9665.
[39] XIE Saining, GIRSHICK R, DOLLÁR P, et al. Aggregated residual transformations for deep neural networks[C]. 2017 IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, USA, 2017: 5987–5995.
[40] LIU Ze, LIN Yutong, CAO Yue, et al. Swin transformer: Hierarchical vision transformer using shifted windows[C]. 2021 IEEE/CVF International Conference on Computer Vision, Montreal, Canada, 2021: 9992–10002.
[41] TIAN Zhi, SHEN Chunhua, CHEN Hao, et al. FCOS: Fully convolutional one-stage object detection[C]. 2019 IEEE/CVF International Conference on Computer Vision, Seoul, Korea, 2019: 9626–9635.