
    Citation: QUAN Zhi-nan, LIN Jia-jun. Text-Independent Writer Identification Method Based on Chinese Handwriting of Small Samples[J]. Journal of East China University of Science and Technology, 2018, 44(6): 882-886. DOI: 10.14135/j.cnki.1006-3080.20170915001

    Text-Independent Writer Identification Method Based on Chinese Handwriting of Small Samples

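    • Summary: Existing writer identification methods place strict requirements on handwriting layout and perform poorly when only small samples are available. To address this, an adjacent ring structure feature method is proposed. The handwriting contour image is first randomly sampled, a grid window is then used to extract the adjacent ring structure features of the handwriting, and finally principal component analysis and linear discriminant analysis reduce the feature dimensionality before a deep belief network is used to train on and identify the features. The method is text-independent and simple to implement, and it still effectively characterizes a writer's style on small samples averaging 45 handwritten characters. Tests on the HIT-MW writer identification database show that it achieves identification performance close to that of other writer identification methods that use larger samples.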
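
The random sampling step can be illustrated with a minimal sketch. The abstract does not say how patches are positioned, so centring each patch on a randomly chosen contour pixel, zero-padding at the image border, and the names `sample_patches`, `patch_size`, and `n_patches` are all assumptions made for illustration, not the authors' implementation.

```python
import random


def sample_patches(contour, patch_size, n_patches, seed=0):
    """Return n_patches square patches centred on random contour pixels.

    contour: 2-D list of 0/1 values, where 1 marks a contour pixel.
    patch_size: side length of each patch (odd, so a centre pixel exists).
    Assumption: parts of a patch outside the image are filled with 0.
    """
    rng = random.Random(seed)
    height, width = len(contour), len(contour[0])
    # Collect the coordinates of every contour pixel once.
    contour_pixels = [
        (r, c) for r in range(height) for c in range(width) if contour[r][c]
    ]
    if not contour_pixels:
        return []
    half = patch_size // 2
    patches = []
    for _ in range(n_patches):
        r0, c0 = rng.choice(contour_pixels)
        patch = []
        for r in range(r0 - half, r0 - half + patch_size):
            row = []
            for c in range(c0 - half, c0 - half + patch_size):
                inside = 0 <= r < height and 0 <= c < width
                row.append(contour[r][c] if inside else 0)
            patch.append(row)
        patches.append(patch)
    return patches


# Toy 6x6 "contour image": the outline of a 4x4 square.
image = [
    [0, 0, 0, 0, 0, 0],
    [0, 1, 1, 1, 1, 0],
    [0, 1, 0, 0, 1, 0],
    [0, 1, 0, 0, 1, 0],
    [0, 1, 1, 1, 1, 0],
    [0, 0, 0, 0, 0, 0],
]
patches = sample_patches(image, patch_size=3, n_patches=5)
```

Every patch has the same size regardless of where it is drawn, which is what allows a fixed-length feature to be computed from each one in the later steps.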

       

      Abstract: Most existing writer identification methods require strictly formatted samples or many handwritten characters (more than 150 on average). These conditions cannot always be met in practical applications. When samples contain fewer characters and follow looser layout conditions, existing methods identify writers less accurately. To address this shortcoming, this paper proposes an adjacent ring structure (ARS) feature algorithm. It explains the reasons for using principal component analysis (PCA) and linear discriminant analysis (LDA), describes the working procedure of the deep belief network (DBN), and compares performance from several aspects. In the proposed method, the first step is to preprocess the handwritten Chinese character images by randomly sampling the handwriting contour images to obtain patches of the same size. The ARS algorithm is then applied to the patches to extract features; the features of multiple patches together represent the stylistic information of one writer. Finally, PCA and LDA are used to reduce the feature dimensionality and avoid the curse of dimensionality, and a DBN is used to train the identification models of the different writers and to compute the correct identification rate. The proposed method is text-independent, simple, and easy to implement. On small samples averaging 45 Chinese characters each, it can still effectively represent the stylistic information of different writers. Experiments on the HIT-MW handwriting identification database show that the proposed method achieves performance similar to that of other identification methods that use larger numbers of characters.
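
For reference, the PCA and LDA reduction named in the abstract can be written in its textbook form. The abstract gives no formulas, so the notation below ($x_i$ for a patch feature, $k$ retained components, $C$ writers) is assumed for illustration and may differ from the paper's own formulation.

```latex
% PCA: project centred features onto the k leading eigenvectors of the covariance.
\mu = \frac{1}{n}\sum_{i=1}^{n} x_i, \qquad
S = \frac{1}{n}\sum_{i=1}^{n} (x_i - \mu)(x_i - \mu)^{\top}, \qquad
y_i = W_{\mathrm{PCA}}^{\top}(x_i - \mu)

% LDA on the PCA outputs: m_c is the mean of writer c, m the overall mean,
% n_c the number of features belonging to writer c.
S_W = \sum_{c=1}^{C} \sum_{i \in c} (y_i - m_c)(y_i - m_c)^{\top}, \qquad
S_B = \sum_{c=1}^{C} n_c \,(m_c - m)(m_c - m)^{\top}

W_{\mathrm{LDA}} = \arg\max_{W} \frac{\left| W^{\top} S_B W \right|}{\left| W^{\top} S_W W \right|}, \qquad
z_i = W_{\mathrm{LDA}}^{\top} y_i
```

A common rationale for applying PCA before LDA is that with few samples the within-class scatter $S_W$ can be singular in the original feature space; reducing the dimensionality first makes the LDA criterion well defined. LDA then yields at most $C - 1$ discriminant dimensions.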

       
