Fast Detection of Ship Targets for Large-scale Remote Sensing Image Based on a Cascade Convolutional Neural Network
Abstract: To detect ships quickly in large-scale remote sensing images, this paper proposes a cascade convolutional neural network detection framework that chains two fully convolutional networks (FCNs): a prescreening FCN (P-FCN) and a detection FCN (D-FCN). The P-FCN is a lightweight image classification network that rapidly prescreens possible ship areas in large-scale images. It has few layers and is simple to train, and the region proposals it generates have little redundancy, which reduces the computational burden on the D-FCN. The D-FCN is an improved U-Net that adds target mask and ship orientation estimation layers to the traditional U-Net structure for multitask learning, allowing it to precisely localize arbitrarily oriented ships. The framework was tested on TerraSAR-X synthetic aperture radar (SAR) images and on optical remote sensing images obtained from the 91 Weitu (91卫图) satellite map software and the DOTA dataset. The detection accuracy was 0.928 for the SAR images and 0.926 for the optical images, comparable to the traditional sliding window method, while the running time was only about 1/3 of that of the sliding window method. The cascade convolutional neural network therefore significantly improves detection efficiency while maintaining detection accuracy, enabling rapid detection of ship targets in large-scale remote sensing images.
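The two-stage idea in the abstract can be sketched as a simple gating loop: a cheap prescreening score decides which regions are passed to the more expensive detector. The sketch below is illustrative only and is not the authors' implementation. The tiling scheme, the 0.5 threshold, the box format, and the functions `tile_grid`, `cascade_detect`, `toy_prescreen`, and `toy_detect` are all hypothetical stand-ins; the real P-FCN and D-FCN are trained networks, and a fully convolutional prescreener may score regions in a single pass rather than tile by tile.

```python
from typing import Callable, List, Tuple

Tile = Tuple[int, int, int, int]          # (row, col, height, width) of a tile
Box = Tuple[int, int, int, int, float]    # hypothetical box: (row, col, length, width, angle_deg)
Image = List[List[float]]


def tile_grid(height: int, width: int, tile: int, stride: int) -> List[Tile]:
    """Split a height x width scene into tiles; edge tiles are clipped."""
    tiles = []
    for r in range(0, height, stride):
        for c in range(0, width, stride):
            tiles.append((r, c, min(tile, height - r), min(tile, width - c)))
    return tiles


def cascade_detect(
    scene: Image,
    prescreen: Callable[[Image], float],   # stand-in for P-FCN: tile -> ship score
    detect: Callable[[Image], List[Box]],  # stand-in for D-FCN: tile -> oriented boxes
    tile: int = 4,
    stride: int = 4,
    threshold: float = 0.5,                # assumed cut-off, not from the paper
) -> Tuple[List[Box], int]:
    """Run the detector only on tiles that pass prescreening."""
    height, width = len(scene), len(scene[0])
    detections: List[Box] = []
    detector_calls = 0
    for r, c, th, tw in tile_grid(height, width, tile, stride):
        patch = [row[c:c + tw] for row in scene[r:r + th]]
        if prescreen(patch) < threshold:
            continue  # rejected cheaply; the detector never sees this tile
        detector_calls += 1
        for dr, dc, length, box_width, angle in detect(patch):
            # Shift tile-local coordinates back into scene coordinates.
            detections.append((r + dr, c + dc, length, box_width, angle))
    return detections, detector_calls


def toy_prescreen(patch: Image) -> float:
    """Toy score: the brightest pixel value in the tile."""
    return max(max(row) for row in patch)


def toy_detect(patch: Image) -> List[Box]:
    """Toy detector: one unit box at the brightest pixel, angle 0."""
    _, r, c = max((v, r, c) for r, row in enumerate(patch) for c, v in enumerate(row))
    return [(r, c, 1, 1, 0.0)]


demo_scene = [[0.0] * 8 for _ in range(8)]
demo_scene[5][6] = 1.0  # a single bright "ship" pixel
boxes, calls = cascade_detect(demo_scene, toy_prescreen, toy_detect)
print(boxes, calls)
```

In this toy scene only one of the four tiles passes prescreening, so the detector runs once instead of four times. That is the source of the speed-up the abstract describes, under the assumption that prescreening is much cheaper than detection.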
Table 1. Basic information of the TerraSAR-X data

| Satellite | Polarization | Resolution (rg×az) (m) | Pixel spacing (rg×az) (m) |
| TerraSAR-X | HH | 1.03×1.17 | 1.25×1.25 |

Table 2. Test results of the cascade network, the sliding window method, and YOLO3

| Method | Precision | Recall | Detection speed (s per 1000×1000 pixels) |
| Cascade network | 0.952 | 0.928 | 0.142 |
| Sliding window | 0.927 | 0.931 | 0.334 |
| YOLO3 | 0.922 | 0.753 | 0.041 |

Table 3. Detection results for the type 1 image

| Method | TP | FN | FP | Precision | Recall | Detection time (s) |
| Cascade | 382 | 31 | 14 | 0.965 | 0.925 | 18.882 |
| Sliding window | 388 | 25 | 29 | 0.930 | 0.939 | 64.859 |
| YOLO | 322 | 91 | 28 | 0.920 | 0.780 | 5.860 |

Table 4. Detection results for the type 2 image

| Method | TP | FN | FP | Precision | Recall | Detection time (s) |
| Cascade | 259 | 19 | 16 | 0.942 | 0.932 | 19.853 |
| Sliding window | 256 | 22 | 19 | 0.931 | 0.921 | 62.208 |
| YOLO | 210 | 68 | 18 | 0.921 | 0.755 | 5.321 |

Table 5. Comparison of the sliding window and cascade methods

| Method | Recall | Detection time | Time ratio |
| Cascade | 0.926 | 0.273 | 3.34 |
| Sliding window | 0.918 | 0.911 | - |

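The derived columns in Tables 3 to 5 can be checked against the raw counts. The sketch below assumes the standard definitions precision = TP / (TP + FP) and recall = TP / (TP + FN), and assumes the time ratio in Table 5 is the sliding window time divided by the cascade time. The page does not state these definitions, but they reproduce the tabulated values. The function names are illustrative.

```python
def precision(tp: int, fp: int) -> float:
    """Fraction of reported detections that are true ships."""
    return tp / (tp + fp)


def recall(tp: int, fn: int) -> float:
    """Fraction of true ships that were detected."""
    return tp / (tp + fn)


# (TP, FN, FP) counts copied from Tables 3 and 4.
counts = {
    ("type 1", "cascade"): (382, 31, 14),
    ("type 1", "sliding window"): (388, 25, 29),
    ("type 1", "YOLO"): (322, 91, 28),
    ("type 2", "cascade"): (259, 19, 16),
    ("type 2", "sliding window"): (256, 22, 19),
    ("type 2", "YOLO"): (210, 68, 18),
}

for (image, method), (tp, fn, fp) in counts.items():
    print(f"{image:6s} {method:14s} "
          f"precision={precision(tp, fp):.3f} recall={recall(tp, fn):.3f}")

# Table 5: sliding window time divided by cascade time (assumed definition).
time_ratio = 0.911 / 0.273
print(f"time ratio = {time_ratio:.2f}")
```

Under these assumptions every precision and recall value in Tables 3 and 4 is reproduced to three decimal places, and the Table 5 ratio comes out at 3.34, consistent with the abstract's statement that the cascade method needs about 1/3 of the sliding window time.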