Maritime Multimodal Data Resource System: Infrared-visible Dual-modal Dataset for Ship Detection

XU Cong'an; GAO Long; ZHANG Chi; WANG Jinpeng; WANG Fei; TANG Xiaoming; CAI Zhuoran

doi:10.12000/JR25144

CSTR: 32380.14.JR25144

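XU Cong'an¹, GAO Long¹, ZHANG Chi², WANG Jinpeng², WANG Fei³, TANG Xiaoming⁴, CAI Zhuoran⁵

1. Naval Aeronautical University, Yantai 264000, China
2. Harbin Engineering University, Harbin 150000, China
3. Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100190, China
4. Yantai Sanhang Radar Service & Technology Research Institute, Yantai 264000, China
5. Yantai University, Yantai 264000, China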

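Funds: National Natural Science Foundation of China (62271499); Postdoctoral Fellowship Program of the China Postdoctoral Science Foundation (GZC20233554); Command and Control Science and Engineering Teaching Project (2024-XKJS-j05); Taishan Scholars Young Expert Program (tsqn202312258)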

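Author biographies:
XU Cong'an, Ph.D., associate professor. Main research interests: multi-source information fusion, intelligent information processing and situation generation, and intelligence processing for air- and space-based early-warning detection.

GAO Long, Ph.D., lecturer. Main research interests: machine learning, object detection, and ship recognition.

ZHANG Chi, master's student. Main research interests: remote sensing image processing and object detection.

WANG Jinpeng, Ph.D. student. Main research interests: object detection and visible-infrared fusion object detection.

WANG Fei, Ph.D., associate researcher. Main research interests: spatiotemporal big data analysis and mining, and situation computing systems.

TANG Xiaoming, Ph.D., senior engineer. Main research interests: multi-system artificial intelligence, overall design of sea-surveillance radar, and radar data analysis.

CAI Zhuoran, Ph.D., associate professor. Main research interests: spectrum sensing and wireless signal pattern recognition.

Corresponding author: GAO Long, gaolong14@nudt.edu.cn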

Corresponding Editor: WANG Zhirui

CLC number: TN911.73
Publication history
- Received: 2025-07-31
- Revised: 2026-01-24


Abstract

A maritime multimodal data resource system underpins collaborative detection by multiple sensors, including radar, Synthetic Aperture Radar (SAR), and electro-optical sensors, and thereby enables fine-grained target perception. It is important for moving detection algorithms into practical use and for improving maritime target surveillance. To this end, a sea area near a port in the Bohai Sea was taken as the test area, and multisource data on maritime targets were collected with SAR, radar, visible-light cameras, and infrared cameras mounted on shore-based and airborne platforms. The data were annotated by combining automatic association and registration with manual correction, and several multimodal associated datasets were compiled according to the characteristics of different tasks, with the aim of building a task-oriented maritime multimodal data resource system. The Dual-Modal Ship Detection dataset (DMSD) released in this paper is an important part of that system. It contains 2163 pairs of visible-light and infrared images covering conditions such as cloud, rain, fog, and backlighting, and the two modalities are registered by an affine transformation. The dataset was evaluated with YOLO, CFT, and other algorithms. The mAP50 is approximately 0.65 with YOLOv8 and approximately 0.63 with CFT, which indicates that the dataset can support research on optimizing dual-modal fusion strategies and improving robustness in complex scenes.

Keywords:
- Ship dataset
- Public dataset
- Multimodal data resource system
- Ship detection
- Deep learning
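
The abstract states that the visible-light and infrared images are registered by an affine transformation. As a rough illustration only, the sketch below fits the six affine parameters to matched control points by least squares and maps an infrared pixel into the visible-light frame. How the authors actually estimated the transform is not described here; the function names (`fit_affine`, `apply_affine`), the control points, and the `true_params` values are hypothetical and chosen purely for the demonstration.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]
Row = Tuple[float, float, float]
Affine = Tuple[Row, Row]  # ((a, b, c), (d, e, f))


def _solve3(a: List[List[float]], b: List[float]) -> List[float]:
    """Solve a 3x3 system a x = b by Gauss-Jordan elimination with partial pivoting."""
    m = [list(row) + [rhs] for row, rhs in zip(a, b)]
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-9:
            raise ValueError("degenerate control points (e.g. collinear)")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(3):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [v - factor * p for v, p in zip(m[r], m[col])]
    return [m[i][3] / m[i][i] for i in range(3)]


def fit_affine(src: Sequence[Point], dst: Sequence[Point]) -> Affine:
    """Least-squares fit of u = a*x + b*y + c and v = d*x + e*y + f."""
    if len(src) != len(dst) or len(src) < 3:
        raise ValueError("need at least three matched point pairs")
    # Normal equations (A^T A) p = A^T t, where each row of A is [x, y, 1].
    ata = [[0.0] * 3 for _ in range(3)]
    atu = [0.0] * 3
    atv = [0.0] * 3
    for (x, y), (u, v) in zip(src, dst):
        row = (x, y, 1.0)
        for i in range(3):
            for j in range(3):
                ata[i][j] += row[i] * row[j]
            atu[i] += row[i] * u
            atv[i] += row[i] * v
    a, b, c = _solve3(ata, atu)
    d, e, f = _solve3(ata, atv)
    return (a, b, c), (d, e, f)


def apply_affine(params: Affine, point: Point) -> Point:
    """Map one (x, y) pixel coordinate through the affine transform."""
    (a, b, c), (d, e, f) = params
    x, y = point
    return (a * x + b * y + c, d * x + e * y + f)


# Hypothetical transform and control points in a 640x512 infrared frame.
true_params: Affine = ((2.5, 0.02, 160.0), (-0.01, 2.5, -100.0))
ir_points = [(0.0, 0.0), (640.0, 0.0), (0.0, 512.0), (640.0, 512.0), (320.0, 256.0)]
vis_points = [apply_affine(true_params, p) for p in ir_points]

fitted = fit_affine(ir_points, vis_points)
print(apply_affine(fitted, (100.0, 200.0)))
```

With noise-free synthetic points the fit recovers the generating parameters. With real matched features the residuals would indicate how well a single affine model describes the offset between the two cameras.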


Figure 1. Visible-light (RGB) and infrared images of ships

Figure 2. Images under different sea conditions and target conditions

Figure 3. Image acquisition equipment

Figure 4. Infrared images before and after registration

Figure 5. Interface of the labelImg annotation software

Figure 6. Two label formats

Figure 7. Dataset folder structure
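
Figure 6 refers to two label formats without naming them in this section. Assuming they are the Pascal VOC style (absolute corner coordinates) and the YOLO style (normalized center and size), both of which labelImg can write, the conversion between them can be sketched as follows. The function names, the example box, and the class index 0 are illustrative assumptions, not values taken from the dataset.

```python
from typing import Tuple

Box = Tuple[float, float, float, float]


def voc_to_yolo(box: Box, img_w: int, img_h: int) -> Box:
    """Convert (xmin, ymin, xmax, ymax) in pixels to normalized (x_center, y_center, w, h)."""
    xmin, ymin, xmax, ymax = box
    x_center = (xmin + xmax) / 2.0 / img_w
    y_center = (ymin + ymax) / 2.0 / img_h
    width = (xmax - xmin) / img_w
    height = (ymax - ymin) / img_h
    return (x_center, y_center, width, height)


def yolo_to_voc(box: Box, img_w: int, img_h: int) -> Box:
    """Convert normalized (x_center, y_center, w, h) back to pixel corner coordinates."""
    x_center, y_center, width, height = box
    xmin = (x_center - width / 2.0) * img_w
    ymin = (y_center - height / 2.0) * img_h
    xmax = (x_center + width / 2.0) * img_w
    ymax = (y_center + height / 2.0) * img_h
    return (xmin, ymin, xmax, ymax)


def yolo_line(class_id: int, box: Box) -> str:
    """Format one annotation as a line of a YOLO-style text label file."""
    return f"{class_id} " + " ".join(f"{v:.6f}" for v in box)


# Hypothetical ship box in a 640x512 infrared image.
voc_box = (160.0, 128.0, 480.0, 384.0)
yolo_box = voc_to_yolo(voc_box, 640, 512)
print(yolo_line(0, yolo_box))  # 0 0.500000 0.500000 0.500000 0.500000
```

Because the YOLO form is normalized by image size, the same normalized box describes the same relative region in a 1920×1080 visible-light image and a 640×512 infrared image only if the pair has already been registered and cropped to a common field of view.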

Table 1. Statistics of selected publicly available datasets

Dataset	Instances	Images	Image resolution (pixels)	Modality	Target type
HRSC2016	2976	1070	300×300 to 1500×900	Visible light	Ships
ShipRSImageNet	3435	3435	930×930	Visible light	Ships
FGSD	5634	2612	930×930	Visible light	Ships
ISDD	3061	1284	768×512 to 5056×5056	Infrared	Ships
MassMIND	22364	2900	640×512	Infrared	Ships
TNO Image Fusion Dataset	261	261 pairs	640×480 or 720×576	Dual-modal	Urban scenes
LLVIP	approx. 14000	30976	1920×1080, 1280×720	Dual-modal	Pedestrians
DMSD (this paper)	19567	2163 pairs	1920×1080, 640×512	Dual-modal	Ships

Table 2. Parameters of the visible-light and thermal imaging cameras

Device type	Visible-light camera	Thermal imaging camera
Resolution (pixels)	1920×1080	640×512
Field of view (°)	66.6–4	40.6
Frame rate (fps)	30	30
Lens focal length (mm)	6.83–119.94	13.5
Lens aperture	f/2.8–f/11	f/1.0
Wavelength range (μm)	Visible light	8–14

Table 3. Experimental results of different algorithms

Model	Precision	Recall	mAP50	mAP50-95	Inference speed (fps)
YOLOv5n	0.657	0.509	0.532	0.190	26.88
CFT(YOLOv5l)	0.705	0.644	0.635	0.216	17.25
CFT(YOLOv5s)	0.725	0.502	0.624	0.263	25.75
FFODNet	0.730	0.536	0.636	0.269	16.25
SuperYOLO	0.699	0.627	0.604	0.203	17.50
YOLOv8n	0.753	0.529	0.654	0.280	23.25
YOLOv8x	0.724	0.549	0.646	0.279	18.75
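
The mAP50 column in Table 3 is conventionally the average precision (AP) computed with an intersection-over-union (IoU) threshold of 0.5, averaged over classes. The sketch below shows one common way to compute IoU and a single-class AP using the all-point area under the precision-recall curve. It is not the evaluation code used for the table: detection toolkits differ in interpolation and matching details, and the boxes and scores in the example are made up.

```python
from typing import Dict, List, Tuple

Box = Tuple[float, float, float, float]  # (xmin, ymin, xmax, ymax)
Detection = Tuple[str, float, Box]       # (image_id, confidence, box)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def average_precision(detections: List[Detection],
                      ground_truth: Dict[str, List[Box]],
                      iou_thr: float = 0.5) -> float:
    """Single-class AP: greedy matching by confidence, then area under the PR curve."""
    n_gt = sum(len(boxes) for boxes in ground_truth.values())
    if n_gt == 0:
        return 0.0
    matched = {img: [False] * len(boxes) for img, boxes in ground_truth.items()}
    tp_flags: List[bool] = []
    for img, _score, box in sorted(detections, key=lambda d: d[1], reverse=True):
        best_iou, best_j = 0.0, -1
        for j, gt_box in enumerate(ground_truth.get(img, [])):
            value = iou(box, gt_box)
            if value > best_iou:
                best_iou, best_j = value, j
        # A detection is a true positive if it hits an unmatched ground-truth box.
        if best_j >= 0 and best_iou >= iou_thr and not matched[img][best_j]:
            matched[img][best_j] = True
            tp_flags.append(True)
        else:
            tp_flags.append(False)

    precisions: List[float] = []
    recalls: List[float] = []
    tp = 0
    for k, flag in enumerate(tp_flags, start=1):
        tp += int(flag)
        precisions.append(tp / k)
        recalls.append(tp / n_gt)
    # Make precision non-increasing in recall before integrating.
    for i in range(len(precisions) - 2, -1, -1):
        precisions[i] = max(precisions[i], precisions[i + 1])
    ap, prev_recall = 0.0, 0.0
    for p, r in zip(precisions, recalls):
        ap += (r - prev_recall) * p
        prev_recall = r
    return ap


# Toy example: two images, one ground-truth ship each, three detections.
toy_gt = {"img_a": [(0.0, 0.0, 10.0, 10.0)], "img_b": [(0.0, 0.0, 10.0, 10.0)]}
toy_dets = [
    ("img_a", 0.9, (0.0, 0.0, 10.0, 10.0)),    # exact match
    ("img_a", 0.8, (20.0, 20.0, 30.0, 30.0)),  # false positive
    ("img_b", 0.7, (0.0, 0.0, 10.0, 5.0)),     # IoU = 0.5, accepted at the threshold
]
print(round(average_precision(toy_dets, toy_gt), 4))  # 0.8333
```

mAP50-95 follows the same procedure but averages the AP over IoU thresholds from 0.5 to 0.95, which is why it is consistently lower than mAP50 in the table.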

Table 4. Experimental results of different modalities under the same algorithm

Model	Visible-light mAP50	Infrared mAP50	Dual-modal mAP50
YOLOv5n	0.489	0.602	0.532
YOLOv8n	0.531	0.657	0.654
SuperYOLO	0.450	0.607	0.604
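
To make the comparison in Table 4 easier to read, the short script below computes the difference between the dual-modal mAP50 and each single-modality mAP50 for every model. The numbers are copied from the table; the dictionary layout and the name `dual_modal_gain` are only illustrative.

```python
from typing import Dict

# mAP50 values copied from Table 4.
table4: Dict[str, Dict[str, float]] = {
    "YOLOv5n": {"visible": 0.489, "infrared": 0.602, "dual": 0.532},
    "YOLOv8n": {"visible": 0.531, "infrared": 0.657, "dual": 0.654},
    "SuperYOLO": {"visible": 0.450, "infrared": 0.607, "dual": 0.604},
}


def dual_modal_gain(row: Dict[str, float]) -> Dict[str, float]:
    """Dual-modal mAP50 minus each single-modality mAP50, rounded to 3 decimals."""
    return {m: round(row["dual"] - row[m], 3) for m in ("visible", "infrared")}


gains = {model: dual_modal_gain(row) for model, row in table4.items()}
for model, gain in gains.items():
    print(f"{model}: vs visible {gain['visible']:+.3f}, vs infrared {gain['infrared']:+.3f}")
```

For these three models the dual-modal result is higher than the visible-light result (by 0.043 to 0.154) but slightly lower than the infrared-only result (by 0.003 to 0.070), which is consistent with the abstract's suggestion that fusion strategies on this dataset leave room for optimization.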
