
  • ISSN 1006-3080
  • CN 31-1691/TQ

Semantic segmentation of test papers based on subspace multi-scale feature fusion

Xia Yuanxiang, Liu Yu, Chu Chengqian, Wan Yongjing, Jiang Cuiling

Citation: Xia Yuanxiang, Liu Yu, Chu Chengqian, Wan Yongjing, Jiang Cuiling. Semantic segmentation of test papers based on subspace multi-scale feature fusion[J]. Journal of East China University of Science and Technology. doi: 10.14135/j.cnki.1006-3080.20220117001


doi: 10.14135/j.cnki.1006-3080.20220117001
Funding: National Natural Science Foundation of China (61872143)
Article information
    About the first author: Xia Yuanxiang (1995—), male, from Qianxi, Guizhou; master's student. Research interests: image segmentation and deep learning. E-mail: yxxia@ecust.edu.cn

    Corresponding author: Wan Yongjing, E-mail: wanyongjing@ecust.edu.cn

  • CLC number: TP391.4

Semantic segmentation of test papers based on subspace multi-scale feature fusion

  • Abstract: Separating printed and handwritten regions is a key step in the semantic segmentation of test papers. To improve segmentation performance, an attention-based improvement of the Mask R-CNN network is proposed. The algorithm embeds a Subspace Multiscale Feature Fusion (SMFF) module into the feature pyramid structure of Mask R-CNN. The SMFF module computes attention features over subspaces of the feature maps, reducing spatial and channel redundancy, and uses multi-scale feature fusion to effectively extract features of text regions of different sizes and to strengthen the correlation between features. Experimental results show that, on the object detection and semantic segmentation tasks of a test-paper image dataset, the Mask R-CNN model with the SMFF module improves average precision by 15.8% and 10.2% respectively over the original Mask R-CNN, and also clearly outperforms Mask R-CNN variants built on commonly used attention modules.
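
    The abstract describes the SMFF module only at a high level: channel subspaces are attended separately to reduce spatial and channel redundancy, and multi-scale convolutions are fused to cover text regions of different sizes. As a rough, hedged illustration of that idea (not the authors' implementation), the PyTorch sketch below combines ULSAM-style per-subspace attention with 3x3/5x5/7x7 depthwise convolutions; the class name SMFFSketch, the number of subspaces, the kernel sizes and the residual connections are all assumptions for illustration.

    ```python
    import torch
    import torch.nn as nn


    class SMFFSketch(nn.Module):
        """Illustrative sketch of subspace attention + multi-scale feature fusion."""

        def __init__(self, channels: int, groups: int = 4):
            super().__init__()
            assert channels % groups == 0, "channels must be divisible by groups"
            self.groups = groups
            sub = channels // groups
            # Per-subspace attention: depthwise 1x1 conv then pointwise conv to a
            # single-channel spatial attention map (loosely ULSAM-style).
            self.attn = nn.ModuleList([
                nn.Sequential(
                    nn.Conv2d(sub, sub, 1, groups=sub, bias=False),
                    nn.Conv2d(sub, 1, 1, bias=False),
                )
                for _ in range(groups)
            ])
            # Multi-scale depthwise convolutions over the attended feature map.
            self.multi_scale = nn.ModuleList([
                nn.Conv2d(channels, channels, k, padding=k // 2,
                          groups=channels, bias=False)
                for k in (3, 5, 7)
            ])
            # Pointwise fusion of the concatenated multi-scale branches.
            self.fuse = nn.Conv2d(3 * channels, channels, 1, bias=False)

        def forward(self, x: torch.Tensor) -> torch.Tensor:
            # Split channels into subspaces and re-weight each with its own map.
            subs = torch.chunk(x, self.groups, dim=1)
            attended = []
            for sub, attn in zip(subs, self.attn):
                a = attn(sub)                                    # (B, 1, H, W)
                a = torch.softmax(a.flatten(2), dim=-1).view_as(a)
                attended.append(sub * a + sub)                   # residual re-weighting
            x_att = torch.cat(attended, dim=1)

            # Fuse multi-scale depthwise responses, keep a residual to the input.
            ms = torch.cat([conv(x_att) for conv in self.multi_scale], dim=1)
            return self.fuse(ms) + x


    if __name__ == "__main__":
        # Example: a 256-channel feature map such as one FPN level.
        block = SMFFSketch(channels=256, groups=4)
        p3 = torch.randn(1, 256, 64, 64)
        print(block(p3).shape)  # torch.Size([1, 256, 64, 64])
    ```

    A block of this kind would sit after each level of the Mask R-CNN feature pyramid, which is where the paper places the SMFF module.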

     

  • Figure 1. Semantic segmentation algorithm for test papers based on SMFF

    Figure 2. Schematic diagram of the SMFF (Subspace Multiscale Feature Fusion) module

    Figure 3. Schematic diagram of the SAM (Subspace Attention Module)

    Figure 4. Schematic diagram of the multi-scale convolution kernels

    Figure 5. Grouping comparison of UM and SMFF

    Figure 6. Comparison of experimental results

    Figure 7. Comparison of algorithm effects

    Figure 8. Presentation of experimental results

    Table 1. Comparison of the effect of attention modules

    | Method | AP(reg)/% | AP50(reg)/% | AP75(reg)/% | APm(reg)/% | APl(reg)/% | AP(seg)/% | AP50(seg)/% | AP75(seg)/% | APm(seg)/% | APl(seg)/% | Time/s |
    | MR     | 57.9 | 89.8 | 66.4 | 40.7 | 55.0 | 49.4 | 90.6 | 41.1 | 44.4 | 46.7 | 6.654 |
    | MR-CC  | 65.6 | 93.9 | 77.7 | 45.9 | 64.4 | 53.3 | 93.8 | 46.9 | 47.3 | 51.4 | 6.765 |
    | MR-SE  | 67.6 | 94.9 | 79.6 | 48.4 | 66.4 | 55.8 | 96.1 | 53.0 | 56.4 | 53.8 | 6.656 |
    | MR-sc  | 68.2 | 95.7 | 82.1 | 47.8 | 67.9 | 55.4 | 96.5 | 49.5 | 49.1 | 53.8 | 6.656 |
    | MR-ECA | 68.8 | 96.4 | 82.6 | 48.1 | 68.9 | 56.2 | 96.7 | 52.0 | 52.0 | 55.1 | 6.655 |
    | MR-UM  | 70.3 | 96.2 | 85.2 | 54.7 | 70.0 | 56.5 | 96.1 | 52.8 | 55.6 | 55.0 | 9.588 |
    | SMFF   | 73.7 | 97.5 | 89.0 | 55.2 | 74.7 | 59.6 | 97.8 | 61.3 | 54.9 | 58.9 | 8.709 |

    Table 2. Comparison of the effects of feature fusion models

    | Method | AP(reg)/% | AP50(reg)/% | AP75(reg)/% | APm(reg)/% | APl(reg)/% | AP(seg)/% | AP50(seg)/% | AP75(seg)/% | APm(seg)/% | APl(seg)/% | Time/s |
    | MR     | 57.9 | 89.8 | 66.4 | 40.7 | 55.0 | 49.4 | 90.6 | 41.1 | 44.4 | 46.7 | 6.654 |
    | MR-DW  | 59.4 | 89.6 | 70.2 | 42.2 | 57.3 | 49.8 | 90.9 | 42.1 | 43.1 | 47.5 | 8.042 |
    | MR-Mix | 65.0 | 93.8 | 78.5 | 44.8 | 63.7 | 52.4 | 94.0 | 44.7 | 46.6 | 49.7 | 7.857 |
    | MR-PS  | 64.6 | 93.1 | 77.0 | 44.2 | 63.3 | 53.5 | 93.3 | 47.3 | 47.1 | 51.0 | 8.118 |
    | SMFF   | 73.7 | 97.5 | 89.0 | 55.2 | 74.7 | 59.6 | 97.8 | 61.3 | 54.9 | 58.9 | 8.709 |

    Table 3. Ablation experiment of the SMFF module

    | Backbone | Subspace | Feature fusion | AP(reg)/% | AP50(reg)/% | AP75(reg)/% | APm(reg)/% | APl(reg)/% | AP(seg)/% | AP50(seg)/% | AP75(seg)/% | APm(seg)/% | APl(seg)/% | Time/s |
    | MR |  |  | 57.9 | 89.8 | 66.4 | 40.7 | 55.0 | 49.4 | 90.6 | 41.1 | 44.4 | 46.7 | 6.654 |
    |    |  |  | 70.0 | 95.8 | 84.5 | 52.8 | 70.9 | 56.4 | 96.5 | 54.3 | 58.1 | 55.7 | 7.619 |
    |    |  |  | 73.7 | 97.5 | 89.0 | 55.2 | 74.7 | 59.6 | 97.8 | 61.3 | 54.9 | 58.9 | 8.709 |

    Table 4. Comparison of processing time between printed and handwritten text classification algorithms

    | Method | Time/s |
    | Literature [5] | 3.258 |
    | Literature [6] | 9.228 |
    | Literature [7] | 9.313 |
    | Literature [8] | 9.156 |
    | Ours | 8.709 |
  • [1] ZHENG Y, LI H, DOERMANN D. Machine printed text and handwriting identification in noisy document images[J]. IEEE transactions on pattern analysis and machine intelligence, 2004, 26(3): 337-353. doi: 10.1109/TPAMI.2004.1262324
    [2] DING H, ZHANG X F. Discrimination of touching handwritten and machine-printed text in images with non-uniform illumination[J]. Computer Engineering and Design, 2012, 33(12): 4634-4638. doi: 10.3969/j.issn.1000-7024.2012.12.042
    [3] KAVALLIERATOU E, STAMATATOS S. Discrimination of machine-printed from handwritten text using simple structural characteristics[C]// Proceedings of the 17th International Conference on Pattern Recognition, 2004. ICPR 2004. IEEE, 2004, 1: 437-440.
    [4] SHIRDHONKAR M S, KOKARE M B. Discrimination between Printed and Handwritten Text in Documents[J]. International Journal of Computer Applications, 2010, RTIPPR(3): 131-134.
    [5] KOYAMA J, HIROSE A, KATO M. Local-spectrum-based distinction between handwritten and machine-printed characters[C]//2008 15th IEEE International Conference on Image Processing. IEEE, 2008: 1021-1024.
    [6] GARLAPATI B M, CHALAMALA S R. A system for handwritten and printed text classification[C]//2017 UKSim-AMSS 19th International Conference on Computer Modelling & Simulation (UKSim). IEEE, 2017: 50-54.
    [7] PENG X, SETLUR S, GOVINDARAJU V, et al. Handwritten text separation from annotated machine printed documents using Markov Random Fields[J]. International Journal on Document Analysis and Recognition (IJDAR), 2013, 16(1): 1-16. doi: 10.1007/s10032-011-0179-z
    [8] LIN Q, XIA J F, TU Z Z, et al. Classification of handwritten and machine-printed text based on frame features and Viterbi decoding[J]. Laser & Optoelectronics Progress, 2019, 56(6): 123-129.
    [9] LONG J, SHELHAMER E, DARRELL T. Fully convolutional networks for semantic segmentation[C]//Proceedings of the IEEE conference on computer vision and pattern recognition. 2015: 3431-3440.
    [10] RONNEBERGER O, FISCHER P, BROX T. U-net: Convolutional networks for biomedical image segmentation[C]//International Conference on Medical image computing and computer-assisted intervention. Springer, Cham, 2015: 234-241.
    [11] CHEN L C, PAPANDREOU G, SCHROFF F, et al. Rethinking atrous convolution for semantic image segmentation[J]. arXiv preprint arXiv:1706.05587, 2017.
    [12] CHEN L C, ZHU Y, PAPANDREOU G, et al. Encoder-decoder with atrous separable convolution for semantic image segmentation[C]//Proceedings of the European conference on computer vision (ECCV). 2018: 801-818.
    [13] NIRKIN Y, WOLF L, HASSNER T. Hyperseg: Patch-wise hypernetwork for real-time semantic segmentation[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2021: 4061-4070.
    [14] HU J, SHEN L, SUN G. Squeeze-and-excitation networks[C]// Proceedings of the IEEE conference on computer vision and pattern recognition. 2018: 7132-7141.
    [15] WANG Q, WU B, ZHU P, et al. ECA-Net: Efficient channel attention for deep convolutional neural networks[C]//2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). IEEE, 2020.
    [16] SAINI R, JHA N K, DAS B, et al. Ulsam: Ultra-lightweight subspace attention module for compact convolutional neural networks[C]//Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision. 2020: 1627-1636.
    [17] ROY A G, NAVAB N, WACHINGER C. Concurrent spatial and channel ‘squeeze & excitation’ in fully convolutional networks[C]// International conference on medical image computing and computer -assisted intervention. Springer, Cham, 2018: 421-429.
    [18] HE K, GKIOXARI G, DOLLÁR P, et al. Mask r-cnn[C]//Proceedings of the IEEE international conference on computer vision. 2017: 2961-2969.
    [19] GIRSHICK R. Fast R-CNN[C]//Proceedings of the IEEE International Conference on Computer Vision (ICCV). 2015: 1440-1448.
    [20] HUANG Z, WANG X, HUANG L, et al. Ccnet: Criss-cross attention for semantic segmentation[C]//Proceedings of the IEEE/CVF International Conference on Computer Vision. 2019: 603-612.
    [21] LI D, YAO A, CHEN Q. Psconv: Squeezing feature pyramid into one compact poly-scale convolutional layer[C]//Computer Vision–ECCV 2020: 16th European Conference, Glasgow, UK, August 23–28, 2020, Proceedings, Part XXI 16. Springer International Publishing, 2020: 615-632.
    [22] CHOLLET F. Xception: Deep learning with depthwise separable convolutions[C]//Proceedings of the IEEE conference on computer vision and pattern recognition. 2017: 1251-1258.
    [23] TAN M, LE Q V. MixConv: Mixed depthwise convolutional kernels[J]. arXiv preprint arXiv:1907.09595, 2019.
Article history
  • Received: 2022-01-17
  • Published online: 2022-04-27
