Data Separability Metric to Evaluate Radar Target Recognition

JIANG Weidong, XUE Lingyan, ZHANG Xinyu

Citation: JIANG Weidong, XUE Lingyan, and ZHANG Xinyu. Data separability metric to evaluate radar target recognition[J]. Journal of Radars, 2023, 12(4): 860–881. doi: 10.12000/JR23125

DOI: 10.12000/JR23125
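Funding: National Natural Science Foundation of China (61921001)
    Author biographies:

    JIANG Weidong, Ph.D., research professor and doctoral supervisor. Main research interests: radar systems, radar signal processing, and radar target recognition.

    XUE Lingyan, master's student. Main research interests: radar target recognition and machine learning.

    ZHANG Xinyu, Ph.D., associate professor and master's supervisor. Main research interests: radar signal processing and radar target recognition.

    Corresponding author:

    ZHANG Xinyu zhangxinyu90111@163.com

  • Corresponding Editor: DU Lan
  • CLC number: TN956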

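  • Abstract: The performance of machine-learning-based radar target recognition models is determined jointly by the model and the data. Current evaluation of radar target recognition relies on accuracy metrics and lacks metrics that assess how data quality affects recognition performance. Data separability describes the degree to which samples from different classes are mixed. Because a data separability metric is independent of the model's recognition process, introducing it into recognition evaluation can quantify how difficult the data are to recognize and provide a benchmark for judging recognition results in advance. This paper therefore proposes a data separability measure based on rate-distortion theory, and simulated data verify that the proposed measure can rank the separability of multidimensional Gaussian-distributed data. Combined with a Gaussian mixture model, the measure overcomes the limitation of the rate-distortion function, captures local characteristics of the data, and improves the accuracy with which overall data separability is evaluated. The proposed measure is then applied to assessing the recognition difficulty of real measured data, where its strong correlation with the average recognition rate is verified. In experiments evaluating the effectiveness of convolutional neural network modules, the separability trend of the features extracted by each convolutional module is first analyzed quantitatively in the test phase; in the training phase, the proposed measure is then used as a feature separability loss in network optimization, guiding the network to extract more separable features. From the perspective of feature separability, this paper offers a new approach to evaluating and improving the recognition performance of neural networks.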
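As a rough illustration of the rate-distortion idea mentioned in the abstract, the sketch below computes a lossy coding rate for a sample set, using the form $R = \frac{1}{2}\log_2\det\left(I + \frac{d}{n\varepsilon^2}\sum_i (x_i-\mu)(x_i-\mu)^{\rm T}\right)$ known from the lossy-coding literature (Ma et al., 2007), and compares the class-wise rate with the rate of the pooled data. The score `rate_ratio`, the distortion `eps`, the helper names, and the toy Gaussian data are all assumptions made for illustration; this is not the paper's measure, whose definition is not reproduced on this page.

```python
import math
import random


def log2_det_spd(a):
    """log2(det(a)) for a symmetric positive-definite matrix, via Cholesky."""
    d = len(a)
    low = [[0.0] * d for _ in range(d)]
    total = 0.0
    for i in range(d):
        for j in range(i + 1):
            s = a[i][j] - sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                low[i][j] = math.sqrt(s)
                total += 2.0 * math.log2(low[i][j])
            else:
                low[i][j] = s / low[j][j]
    return total


def coding_rate(samples, eps=0.5):
    """Rate for coding `samples` about their own mean up to distortion eps."""
    n, d = len(samples), len(samples[0])
    mean = [sum(x[k] for x in samples) / n for k in range(d)]
    scale = d / (n * eps ** 2)
    # Start from the identity and add the scaled scatter matrix.
    a = [[1.0 if i == j else 0.0 for j in range(d)] for i in range(d)]
    for x in samples:
        c = [x[k] - mean[k] for k in range(d)]
        for i in range(d):
            for j in range(d):
                a[i][j] += scale * c[i] * c[j]
    return 0.5 * log2_det_spd(a)


def rate_ratio(class_a, class_b, eps=0.5):
    """Hypothetical score: class-wise coding rate over pooled coding rate."""
    n_a, n_b = len(class_a), len(class_b)
    within = n_a * coding_rate(class_a, eps) + n_b * coding_rate(class_b, eps)
    pooled = (n_a + n_b) * coding_rate(class_a + class_b, eps)
    return within / pooled


def gaussian_cloud(center, n, rng):
    """Toy 2-D isotropic Gaussian samples around `center`."""
    return [[rng.gauss(center[0], 1.0), rng.gauss(center[1], 1.0)] for _ in range(n)]


rng = random.Random(0)
base = gaussian_cloud((0.0, 0.0), 200, rng)
separated = rate_ratio(base, gaussian_cloud((10.0, 0.0), 200, rng))
overlapping = rate_ratio(base, gaussian_cloud((0.0, 0.0), 200, rng))
print(f"separated: {separated:.3f}, overlapping: {overlapping:.3f}")
```

In this toy setup the ratio stays close to 1 when the two classes are heavily mixed and drops as they move apart, because pooling well-separated classes costs noticeably more bits than coding each class on its own.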

     

  • Figure 1. Schematic of hierarchical evaluation of model recognition performance
    Figure 2. Schematic of rate-distortion coding for 2D data
    Figure 3. Schematic of data separability under different singular values
    Figure 4. Construction process of the data separability measure
    Figure 5. Schematic of constructing the data separability measure for non-Gaussian data
    Figure 6. Feature maps of 2D data
    Figure 7. Gaussian datasets with different degrees of class overlap
    Figure 8. Non-Gaussian datasets with different degrees of class overlap
    Figure 9. Separability measure results for datasets with different class overlap and dimensions
    Figure 10. Trends of the overall and intra-class distribution divergence
    Figure 11. Recognition accuracy of the models under different signal-to-noise ratio conditions
    Figure 12. Separability measure results for noisy data
    Figure 13. Correlation matrix between the separability measures and the average recognition error rate
    Figure 14. Curves of the separability measures versus the average recognition error rate
    Figure 15. Feature mapping and its separability analysis process
    Figure 16. Recognition accuracy of the network on the MSTAR dataset
    Figure 17. Feature separability measure results for the MSTAR dataset
    Figure 18. Convolutional neural network with feature separability constraints
    Figure 19. Cross-entropy loss on the training set
    Figure 20. Recognition accuracy on the test set
    Figure 21. Cosine similarity matrix between feature samples output by the final layer
    Figure 22. t-SNE visualization of the features output by the final layer

    Algorithm 1. Computation of the average data separability measure

     1. Input: data $\boldsymbol{X} = \{\boldsymbol{X}^c\}_{c=1}^{k}$, numbers of clusters $\boldsymbol{k} = \{k^c\}_{c=1}^{k}$
     2. for $c = 1:k$ do
     3.   run $\text{GMM}(\boldsymbol{X}^c, k^c)$ to obtain the Gaussian sub-classes
          $\{\boldsymbol{X}^{1c^1}\}_{c^1=1}^{k^1}$, $\{\boldsymbol{X}^{2c^2}\}_{c^2=1}^{k^2}$, ..., $\{\boldsymbol{X}^{kc^k}\}_{c^k=1}^{k^k}$;
     4. end for
     5. for $c = 1:k-1$ do
     6.   for $t = c+1:k$ do
     7.     for $i = 1:k^c$ do
     8.       for $j = 1:k^t$ do
     9.         compute $M_{citj} = M(\boldsymbol{X}^{ci}, \boldsymbol{X}^{tj})$ (according to Eq. (15));
     10.        compute $N_{citj} = |\boldsymbol{X}^{ci}| + |\boldsymbol{X}^{tj}|$ ($|\boldsymbol{X}|$ denotes the number of samples in the set $\boldsymbol{X}$);
     11.      end for
     12.    end for
     13.  end for
     14. end for
     15. compute $\bar M = \text{sum}(M_{citj} N_{citj}) / \text{sum}(N_{citj})$ (according to Eq. (16));
     16. Output: $\bar M$
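Steps 5 to 15 of the procedure above can be sketched in Python as follows. This is a minimal sketch under stated assumptions: the GMM step is assumed to have been run already (for example with a Gaussian mixture fit per class), so the function takes the resulting sub-classes as input, and `pair_metric` is a hypothetical callable standing in for Eq. (15), which is not given on this page. The names `average_separability` and `toy_metric` are likewise illustrative.

```python
from itertools import combinations


def average_separability(subclasses, pair_metric):
    """Sample-size-weighted mean of pairwise sub-class measures.

    subclasses[c] holds the Gaussian sub-classes of class c; each sub-class
    is a sequence of samples. pair_metric(a, b) is a placeholder for Eq. (15).
    """
    weighted_sum = 0.0
    total_weight = 0
    # Only sub-classes that belong to different classes are compared.
    for c, t in combinations(range(len(subclasses)), 2):
        for sub_c in subclasses[c]:
            for sub_t in subclasses[t]:
                m = pair_metric(sub_c, sub_t)  # M_citj
                n = len(sub_c) + len(sub_t)    # N_citj
                weighted_sum += m * n
                total_weight += n
    if total_weight == 0:
        raise ValueError("need at least two classes with non-empty sub-classes")
    return weighted_sum / total_weight


def toy_metric(a, b):
    """Hypothetical stand-in for Eq. (15): close 1-D means give a value near 1."""
    gap = abs(sum(a) / len(a) - sum(b) / len(b))
    return 1.0 / (1.0 + gap)


# Class 0 has two sub-classes, class 1 has one (1-D toy samples).
subclasses = [
    [[0.0, 0.2, 0.1], [1.0, 1.2]],
    [[5.0, 5.2, 5.1, 4.9]],
]
m_bar = average_separability(subclasses, toy_metric)
print(round(m_bar, 4))
```

Weighting each pair by its combined sample count means that large sub-class pairs dominate the average, so a few small, poorly separated sub-classes do not swamp the overall score.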

    Table 1. Separability measure results for typical feature maps

    | Measure | Two-cluster | Two-moon | Ring | XOR | Spiral | Mixed |
    |---|---|---|---|---|---|---|
    | DSI | 0.0008 | 0.6459 | 0.5532 | 0.7775 | 0.9420 | 0.9941 |
    | N2 | 0.0110 | 0.0246 | 0.0461 | 0.0598 | 0.0569 | 0.4928 |
    | Density | 0.5379 | 0.8322 | 0.8661 | 0.8857 | 0.8703 | 0.9282 |
    | LSC | 0.5012 | 0.8362 | 0.9134 | 0.9148 | 0.9727 | 0.9994 |
    | $\bar M$ (${k^1} = {k^2} = 1$) | 0.0371 | 0.3305 | 0.9999 | 0.9995 | 0.9669 | 0.9990 |
    | $\bar M$ (${k^1} = {k^2} = 3$) | 0.0175 | 0.0592 | 0.1748 | 0.2216 | 0.5601 | 0.4823 |
    | $\bar M$ (${k^1} = {k^2} = 5$) | 0.0114 | 0.0262 | 0.0633 | 0.1449 | 0.3388 | 0.3461 |
    | $\bar M$ (${k^1} = {k^2} = 8$) | 0.0087 | 0.0164 | 0.0334 | 0.0911 | 0.2044 | 0.2482 |

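    Table 2. Recognition accuracy of the recognition models

    | Dataset | Samples (positive/negative) | Features | SVM (RBF) | KNN | LSVM | LR | Average recognition rate ($\overline{\text{Acc}}$) |
    |---|---|---|---|---|---|---|---|
    | Banknote | 1372 (762/610) | 5 | 1.0000 | 1.0000 | 0.9879× | 0.9879× | 0.9939 |
    | Wisconsin | 683 (444/239) | 9 | 0.9512× | 0.9561 | 0.9561 | 0.9854√ | 0.9622 |
    | WDBC | 569 (357/212) | 30 | 0.9356 | 0.9591√ | 0.9591√ | 0.9181× | 0.9430 |
    | Fire | 244 (138/106) | 10 | 0.9189× | 0.9190 | 0.9730√ | 0.9190 | 0.9324 |
    | Ionosphere | 351 (225/126) | 33 | 0.9623√ | 0.8585× | 0.8773 | 0.8773 | 0.8939 |
    | Spambase | 4601 (2788/1813) | 57 | 0.6799 | 0.7820 | 0.8639√ | 0.6618× | 0.7469 |
    | Sonar | 208 (111/97) | 60 | 0.8254 | 0.8413√ | 0.8095 | 0.6984× | 0.7936 |
    | Risk | 776 (486/290) | 17 | 0.7425 | 0.8412 | 0.9184√ | 0.6867× | 0.7972 |
    | Mammographic | 830 (427/403) | 5 | 0.7590 | 0.8032√ | 0.7349× | 0.7389 | 0.7590 |
    | Magic | 19020 (12332/6688) | 10 | 0.8274√ | 0.8098 | 0.6972 | 0.6008× | 0.7338 |
    | Hill valley | 606 (305/301) | 100 | 0.4780× | 0.5385 | 0.9670√ | 0.4780× | 0.6154 |
    | Blood | 748 (570/178) | 4 | 0.7378√ | 0.7022 | 0.7333 | 0.6000× | 0.6933 |
    | ILPD | 583 (416/167) | 10 | 0.7314 | 0.6971 | 0.7428√ | 0.6343× | 0.7014 |
    | Haberman | 306 (225/81) | 6 | 0.7174 | 0.7826√ | 0.7065× | 0.7391 | 0.7364 |

    Note: "√" marks the highest recognition rate in the corresponding row; "×" marks the lowest.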
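The last column and the row marks of Table 2 follow from simple arithmetic, sketched below for the Wisconsin row. The helper name `summarize_row` is an assumption introduced for illustration; the four accuracies are copied from the table.

```python
def summarize_row(accuracies):
    """Return (mean accuracy, best models, worst models) for one dataset row."""
    mean_acc = sum(accuracies.values()) / len(accuracies)
    best = max(accuracies.values())
    worst = min(accuracies.values())
    best_models = [m for m, a in accuracies.items() if a == best]
    worst_models = [m for m, a in accuracies.items() if a == worst]
    return mean_acc, best_models, worst_models


wisconsin = {"SVM(RBF)": 0.9512, "KNN": 0.9561, "LSVM": 0.9561, "LR": 0.9854}
mean_acc, best_models, worst_models = summarize_row(wisconsin)
print(f"{mean_acc:.4f}", best_models, worst_models)
# prints: 0.9622 ['LR'] ['SVM(RBF)']
```

Averaging over four different classifiers is what makes the last column a property of the data rather than of any single model, which is the quantity the separability measures are compared against.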

    Table 3. Separability measure results for real measured datasets

    | Dataset | Samples (positive/negative) | N2 | LSC | Density | DSI | M | Average error rate ($1 - \overline{\text{Acc}}$) |
    |---|---|---|---|---|---|---|---|
    | Banknote | 1372 (762/610) | 0.0883 | 0.9022 | 0.8027 | 0.7649 | 0.1355 | 0.0061 |
    | Wisconsin | 683 (444/239) | 0.2491 | 0.6636 | 0.6203 | 0.3157 | 0.1567 | 0.0378 |
    | WDBC | 569 (357/212) | 0.1380 | 0.9140 | 0.5630 | 0.5833 | 0.2497 | 0.0570 |
    | Fire | 244 (138/106) | 0.2766 | 0.9103 | 0.7150 | 0.7058 | 0.2887 | 0.0676 |
    | Ionosphere | 351 (225/126) | 0.3884 | 0.9118 | 0.8553 | 0.7177 | 0.3807 | 0.1061 |
    | Spambase | 4601 (2788/1813) | 0.2553 | 0.9967 | 0.4780 | 0.8296 | 0.4402 | 0.2531 |
    | Sonar | 208 (111/97) | 0.4079 | 0.9754 | 0.9333 | 0.9457 | 0.4497 | 0.2064 |
    | Risk | 776 (486/290) | 0.2082 | 0.9471 | 0.7140 | 0.7894 | 0.5023 | 0.2028 |
    | Mammographic | 830 (427/403) | 0.2681 | 0.9947 | 0.8045 | 0.8807 | 0.5813 | 0.2410 |
    | Magic | 19020 (12332/6688) |  |  |  |  | 0.6742 | 0.2662 |
    | Hill valley | 606 (305/301) | 0.4857 | 0.9981 | 0.6260 | 0.9727 | 0.7408 | 0.3846 |
    | Blood | 748 (570/178) | 0.4379 | 0.9945 | 0.6235 | 0.9286 | 0.8678 | 0.3067 |
    | ILPD | 583 (416/167) | 0.2988 | 0.9910 | 0.6732 | 0.8016 | 0.8800 | 0.2986 |
    | Haberman | 306 (225/81) | 0.4428 | 0.9886 | 0.8279 | 0.9482 | 0.9102 | 0.2636 |
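The relationship between the proposed measure and recognition difficulty can be checked directly from the M and average-error-rate columns of Table 3. The sketch below computes a Pearson correlation coefficient in plain Python; using Pearson here, rather than whichever coefficient the paper's correlation matrix is based on, is an assumption, and the function name `pearson` is illustrative.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# M column and average error rate column of Table 3, in row order.
m_values = [0.1355, 0.1567, 0.2497, 0.2887, 0.3807, 0.4402, 0.4497,
            0.5023, 0.5813, 0.6742, 0.7408, 0.8678, 0.8800, 0.9102]
error_rates = [0.0061, 0.0378, 0.0570, 0.0676, 0.1061, 0.2531, 0.2064,
               0.2028, 0.2410, 0.2662, 0.3846, 0.3067, 0.2986, 0.2636]

r = pearson(m_values, error_rates)
print(f"Pearson r = {r:.3f}")
```

A coefficient close to 1 would be consistent with the strong correlation between the measure and recognition performance that the abstract reports.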

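    Table 4. SAR image datasets

    | Dataset | Class | Training set | Test set |
    |---|---|---|---|
    | MSTAR | BMP2 | 233 | 195 |
    | MSTAR | BTR70 | 233 | 196 |
    | MSTAR | T72 | 232 | 196 |
    | MSTAR | BTR60 | 256 | 195 |
    | MSTAR | 2S1 | 299 | 274 |
    | MSTAR | BRDM2 | 298 | 274 |
    | MSTAR | D7 | 299 | 274 |
    | MSTAR | T62 | 299 | 273 |
    | MSTAR | ZIL131 | 299 | 274 |
    | MSTAR | ZSU234 | 299 | 274 |
    | FUSAR | Other classes | 200 | 275 |
    | FUSAR | Fishing boat | 200 | 119 |
    | FUSAR | Cargo ship | 200 | 156 |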

    Table 5. Optimal recognition accuracy under different separability coefficients (%)

    | Coefficient | Training-set accuracy | Test-set accuracy |
    |---|---|---|
    | $\alpha = \beta = 0$ (baseline network) | 94.00 | 79.09 |
    | $\beta = 0.01$ | 94.00 | 78.90 |
    | $\beta = 0.10$ | 98.67 | 79.64 |
    | $\beta = 0.25$ | 99.67 | 80.00 |
    | $\beta = 0.50$ | 100.00 | 80.00 |
    | $\alpha = 0.01$ | 93.83 | 79.45 |
    | $\alpha = 0.10$ | 91.50 | 79.63 |
    | $\alpha = 0.25$ | 98.17 | 82.18 |
    | $\alpha = 0.50$ | 99.83 | 82.36 |
Publication history
  • Received: 2023-07-11
  • Revised: 2023-07-27
  • Published online: 2023-08-14
  • Issue date: 2023-08-28
