Semantic segmentation of test papers based on subspace multi-scale feature fusion
-
Abstract: Separating printed and handwritten regions is a key step in the semantic segmentation of test papers. To improve segmentation performance, an improved attention algorithm based on the Mask R-CNN network is proposed. The algorithm embeds a Subspace Multiscale Feature Fusion (SMFF) module into the feature pyramid structure of Mask R-CNN. The SMFF module computes attention features over channel subspaces, which reduces spatial and channel redundancy in the feature maps, while its multi-scale feature fusion effectively extracts features from text regions of different sizes and strengthens the correlations between them. Experimental results on a test-paper image dataset show that Mask R-CNN equipped with the SMFF module improves average precision over the original Mask R-CNN by 15.8% on object detection and by 10.2% on semantic segmentation, and also clearly outperforms Mask R-CNN variants built on commonly used attention modules.
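The mechanism described above lends itself to a short sketch. Below is a minimal PyTorch rendering of the SMFF idea, assuming ULSAM-style subspace attention [16] (channels split into groups, one spatial attention map per group) followed by multi-scale fusion with parallel depthwise convolutions; the group count, kernel sizes, and residual connections are illustrative assumptions, not the authors' exact design.

```python
# Minimal sketch of subspace attention + multi-scale feature fusion (SMFF).
# Hyperparameters (groups=4, kernels 1/3/5, residual adds) are assumptions.
import torch
from torch import nn

class SubspaceAttention(nn.Module):
    """Split channels into subspaces; give each its own spatial attention map."""
    def __init__(self, channels, groups=4):
        super().__init__()
        assert channels % groups == 0
        self.groups = groups
        sub = channels // groups
        # One small conv per subspace predicts a single-channel attention map.
        self.att = nn.ModuleList(
            nn.Conv2d(sub, 1, kernel_size=3, padding=1) for _ in range(groups)
        )

    def forward(self, x):
        out = []
        for sub, conv in zip(torch.chunk(x, self.groups, dim=1), self.att):
            b, c, h, w = sub.shape
            # Softmax over spatial positions yields a normalized attention map.
            a = conv(sub).flatten(2).softmax(dim=-1).view(b, 1, h, w)
            out.append(sub + sub * a)  # residual keeps the original signal
        return torch.cat(out, dim=1)

class SMFF(nn.Module):
    """Subspace attention followed by multi-scale feature fusion."""
    def __init__(self, channels, groups=4, kernels=(1, 3, 5)):
        super().__init__()
        self.subspace = SubspaceAttention(channels, groups)
        # Parallel depthwise convs see text regions at different scales.
        self.scales = nn.ModuleList(
            nn.Conv2d(channels, channels, k, padding=k // 2, groups=channels)
            for k in kernels
        )
        self.fuse = nn.Conv2d(channels * len(kernels), channels, kernel_size=1)

    def forward(self, x):
        x = self.subspace(x)
        return x + self.fuse(torch.cat([s(x) for s in self.scales], dim=1))
```

Because `SMFF(256)` maps a feature map to the same shape (`SMFF(256)(torch.randn(1, 256, 64, 64))` returns a 1×256×64×64 tensor), it can sit between the feature pyramid and the detection heads without changing anything else in the network.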
-
Table 1. Comparison of the effects of attention modules (AP values in %; MR = Mask R-CNN baseline; reg = object detection, seg = instance segmentation)

| Method | reg AP | reg AP50 | reg AP75 | reg APm | reg APl | seg AP | seg AP50 | seg AP75 | seg APm | seg APl | Time (s) |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| MR | 57.9 | 89.8 | 66.4 | 40.7 | 55.0 | 49.4 | 90.6 | 41.1 | 44.4 | 46.7 | 6.654 |
| MR-CC | 65.6 | 93.9 | 77.7 | 45.9 | 64.4 | 53.3 | 93.8 | 46.9 | 47.3 | 51.4 | 6.765 |
| MR-SE | 67.6 | 94.9 | 79.6 | 48.4 | 66.4 | 55.8 | 96.1 | 53.0 | 56.4 | 53.8 | 6.656 |
| MR-sc | 68.2 | 95.7 | 82.1 | 47.8 | 67.9 | 55.4 | 96.5 | 49.5 | 49.1 | 53.8 | 6.656 |
| MR-ECA | 68.8 | 96.4 | 82.6 | 48.1 | 68.9 | 56.2 | 96.7 | 52.0 | 52.0 | 55.1 | 6.655 |
| MR-UM | 70.3 | 96.2 | 85.2 | 54.7 | 70.0 | 56.5 | 96.1 | 52.8 | 55.6 | 55.0 | 9.588 |
| SMFF | 73.7 | 97.5 | 89.0 | 55.2 | 74.7 | 59.6 | 97.8 | 61.3 | 54.9 | 58.9 | 8.709 |
Table 2. Comparison of the effects of feature-fusion modules (same metrics as Table 1)

| Method | reg AP | reg AP50 | reg AP75 | reg APm | reg APl | seg AP | seg AP50 | seg AP75 | seg APm | seg APl | Time (s) |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| MR | 57.9 | 89.8 | 66.4 | 40.7 | 55.0 | 49.4 | 90.6 | 41.1 | 44.4 | 46.7 | 6.654 |
| MR-DW | 59.4 | 89.6 | 70.2 | 42.2 | 57.3 | 49.8 | 90.9 | 42.1 | 43.1 | 47.5 | 8.042 |
| MR-Mix | 65.0 | 93.8 | 78.5 | 44.8 | 63.7 | 52.4 | 94.0 | 44.7 | 46.6 | 49.7 | 7.857 |
| MR-PS | 64.6 | 93.1 | 77.0 | 44.2 | 63.3 | 53.5 | 93.3 | 47.3 | 47.1 | 51.0 | 8.118 |
| SMFF | 73.7 | 97.5 | 89.0 | 55.2 | 74.7 | 59.6 | 97.8 | 61.3 | 54.9 | 58.9 | 8.709 |
Table 3. Ablation experiment on the SMFF module (√ marks the component enabled on the MR backbone)

| Backbone | Subspace | Feature fusion | reg AP | reg AP50 | reg AP75 | reg APm | reg APl | seg AP | seg AP50 | seg AP75 | seg APm | seg APl | Time (s) |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| MR | | | 57.9 | 89.8 | 66.4 | 40.7 | 55.0 | 49.4 | 90.6 | 41.1 | 44.4 | 46.7 | 6.654 |
| MR | √ | | 70.0 | 95.8 | 84.5 | 52.8 | 70.9 | 56.4 | 96.5 | 54.3 | 58.1 | 55.7 | 7.619 |
| MR | √ | √ | 73.7 | 97.5 | 89.0 | 55.2 | 74.7 | 59.6 | 97.8 | 61.3 | 54.9 | 58.9 | 8.709 |
-
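For an ablation like Table 3, the module has to be attached to every pyramid level. The wrapper below is a hedged sketch using torchvision's Mask R-CNN (torchvision ≥ 0.13); the level keys '0'-'3' and 'pool' match torchvision's default FPN, `SMFF` is the class sketched above, and `num_classes=3` (background, printed text, handwritten text) is an assumption about the dataset rather than a detail from the paper.

```python
# Hedged sketch: refine each FPN level with an SMFF block before the
# RPN and ROI heads of torchvision's Mask R-CNN.
from collections import OrderedDict
import torchvision
from torch import nn

class SMFFBackbone(nn.Module):
    def __init__(self, backbone, levels=('0', '1', '2', '3', 'pool')):
        super().__init__()
        self.backbone = backbone
        self.out_channels = backbone.out_channels  # read by the detection heads
        # One SMFF block per pyramid level (keys follow torchvision's FPN).
        self.smff = nn.ModuleDict({k: SMFF(backbone.out_channels) for k in levels})

    def forward(self, x):
        feats = self.backbone(x)  # OrderedDict: level name -> feature map
        return OrderedDict((k, self.smff[k](v)) for k, v in feats.items())

# num_classes=3 is an assumption: background, printed text, handwritten text.
model = torchvision.models.detection.maskrcnn_resnet50_fpn(weights=None, num_classes=3)
model.backbone = SMFFBackbone(model.backbone)
```

Returning directly after `self.subspace` in `SMFF.forward` would correspond to the subspace-only setting in the middle row of Table 3; the full module corresponds to the last row.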
[1] ZHENG Y, LI H, DOERMANN D. Machine printed text and handwriting identification in noisy document images[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2004, 26(3): 337-353. doi: 10.1109/TPAMI.2004.1262324
[2] DING H, ZHANG X F. Discrimination of touching handwritten and printed text in non-uniformly illuminated images[J]. Computer Engineering and Design, 2012, 33(12): 4634-4638. (in Chinese) doi: 10.3969/j.issn.1000-7024.2012.12.042
[3] KAVALLIERATOU E, STAMATATOS S. Discrimination of machine-printed from handwritten text using simple structural characteristics[C]//Proceedings of the 17th International Conference on Pattern Recognition (ICPR 2004). IEEE, 2004, 1: 437-440.
[4] SHIRDHONKAR M S, KOKARE M B. Discrimination between printed and handwritten text in documents[J]. International Journal of Computer Applications, 2010, RTIPPR(3): 131-134.
[5] KOYAMA J, HIROSE A, KATO M. Local-spectrum-based distinction between handwritten and machine-printed characters[C]//2008 15th IEEE International Conference on Image Processing. IEEE, 2008: 1021-1024.
[6] GARLAPATI B M, CHALAMALA S R. A system for handwritten and printed text classification[C]//2017 UKSim-AMSS 19th International Conference on Computer Modelling & Simulation (UKSim). IEEE, 2017: 50-54.
[7] PENG X, SETLUR S, GOVINDARAJU V, et al. Handwritten text separation from annotated machine printed documents using Markov random fields[J]. International Journal on Document Analysis and Recognition (IJDAR), 2013, 16(1): 1-16. doi: 10.1007/s10032-011-0179-z
[8] LIN Q, XIA J F, TU Z Z, et al. Classification of handwritten and printed text based on frame features and Viterbi decoding[J]. Laser & Optoelectronics Progress, 2019, 56(6): 123-129. (in Chinese)
[9] LONG J, SHELHAMER E, DARRELL T. Fully convolutional networks for semantic segmentation[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2015: 3431-3440.
[10] RONNEBERGER O, FISCHER P, BROX T. U-Net: Convolutional networks for biomedical image segmentation[C]//International Conference on Medical Image Computing and Computer-Assisted Intervention. Springer, Cham, 2015: 234-241.
[11] CHEN L C, PAPANDREOU G, SCHROFF F, et al. Rethinking atrous convolution for semantic image segmentation[J]. arXiv preprint arXiv:1706.05587, 2017.
[12] CHEN L C, ZHU Y, PAPANDREOU G, et al. Encoder-decoder with atrous separable convolution for semantic image segmentation[C]//Proceedings of the European Conference on Computer Vision (ECCV). 2018: 801-818.
[13] NIRKIN Y, WOLF L, HASSNER T. HyperSeg: Patch-wise hypernetwork for real-time semantic segmentation[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2021: 4061-4070.
[14] HU J, SHEN L, SUN G. Squeeze-and-excitation networks[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2018: 7132-7141.
[15] WANG Q, WU B, ZHU P, et al. ECA-Net: Efficient channel attention for deep convolutional neural networks[C]//2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). IEEE, 2020.
[16] SAINI R, JHA N K, DAS B, et al. ULSAM: Ultra-lightweight subspace attention module for compact convolutional neural networks[C]//Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision. 2020: 1627-1636.
[17] ROY A G, NAVAB N, WACHINGER C. Concurrent spatial and channel 'squeeze & excitation' in fully convolutional networks[C]//International Conference on Medical Image Computing and Computer-Assisted Intervention. Springer, Cham, 2018: 421-429.
[18] HE K, GKIOXARI G, DOLLÁR P, et al. Mask R-CNN[C]//Proceedings of the IEEE International Conference on Computer Vision. 2017: 2961-2969.
[19] GIRSHICK R. Fast R-CNN[C]//IEEE International Conference on Computer Vision (ICCV). 2015: 1440-1448.
[20] HUANG Z, WANG X, HUANG L, et al. CCNet: Criss-cross attention for semantic segmentation[C]//Proceedings of the IEEE/CVF International Conference on Computer Vision. 2019: 603-612.
[21] LI D, YAO A, CHEN Q. PSConv: Squeezing feature pyramid into one compact poly-scale convolutional layer[C]//Computer Vision - ECCV 2020: 16th European Conference, Glasgow, UK, August 23-28, 2020, Proceedings, Part XXI. Springer International Publishing, 2020: 615-632.
[22] CHOLLET F. Xception: Deep learning with depthwise separable convolutions[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2017: 1251-1258.
[23] TAN M, LE Q V. MixConv: Mixed depthwise convolutional kernels[J]. arXiv preprint arXiv:1907.09595, 2019.
-