
    TCM Syndrome Classification of Chronic Gastritis Based on the Deep Forest Algorithm

    Syndrome Classification of Chronic Gastritis Based on Multi-grained Cascade Forest

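    • Summary: To address the complexity and non-linearity of traditional Chinese medicine (TCM) inquiry, the deep forest algorithm (gcForest) is used to build a syndrome classification model for TCM inquiry in chronic gastritis. gcForest is applied to chronic gastritis inquiry data to build the syndrome classification model, which is then compared with models built with two deep learning algorithms, DBN and DBM, and five multi-label learning algorithms, ML-KNN, BSVM, ECC, RankSVM, and LIFT. The experimental results show that the model outperforms the other algorithms on both multi-label evaluation metrics and the classification accuracy of individual syndromes. It can effectively solve the syndrome classification problem for TCM inquiry in chronic gastritis, the model built with this algorithm classifies well, and it can serve as a reference for research on quantitative syndrome diagnosis of chronic gastritis.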

       

      Abstract: The standardization and objectification of traditional Chinese medicine (TCM) inquiry have become a hot topic in machine learning. However, TCM inquiry data involve complex relations between symptoms and syndromes, as well as among symptoms, so most machine learning algorithms cannot deal effectively with their complexity and non-linearity. In this paper, we propose a syndrome classification model for chronic gastritis (CG) based on multi-grained cascade forest (gcForest). TCM inquiry is a typical multi-label learning problem: a patient may have two or more syndromes at the same time. We first convert the multi-label problem into binary classification problems via a transformation method, and then build a classification model for each syndrome with the gcForest algorithm. gcForest is a novel decision tree ensemble method based on deep learning and is composed of two independent parts, cascade forest and multi-grained scanning. The proposed algorithm is compared with two deep learning algorithms, Deep Belief Nets (DBN) and Deep Boltzmann Machine (DBM), and five multi-label algorithms, ML-KNN, BSVM, ECC, RankSVM, and LIFT. The experimental results show that the proposed model outperforms these algorithms overall on multi-label metrics and on the classification accuracy of each syndrome. The overall accuracy reaches 0.834, and the classification precision for the 6 syndromes is 0.906, 0.818, 0.764, 0.966, 0.840, and 0.912, respectively. We also analyze the effect of hyper-parameters on model performance, and the results verify the model's robustness. gcForest processes data in a hierarchical and abstract manner, which is consistent with TCM syndrome diagnosis. Therefore, gcForest can effectively solve the TCM inquiry syndrome classification problem for CG and provide a reference for research on quantitative diagnosis of CG.
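
The transformation step described in the abstract, one binary problem per syndrome, can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: all function names, the toy data, and the majority-class placeholder learner are assumptions made for this sketch. In the paper the per-syndrome classifier is gcForest. Hamming loss is shown as one common multi-label metric; the abstract does not state which metrics were used.

```python
from typing import Callable, List, Sequence

Vector = Sequence[float]
# A trainer takes (X, y) for ONE label and returns a predict function
# that maps a single sample to 0 or 1. (Hypothetical interface.)
Trainer = Callable[[List[Vector], List[int]], Callable[[Vector], int]]


def split_labels(Y: List[Sequence[int]]) -> List[List[int]]:
    """Turn an n_samples x n_labels 0/1 matrix into one binary target list per label."""
    n_labels = len(Y[0])
    return [[row[j] for row in Y] for j in range(n_labels)]


def fit_binary_relevance(X: List[Vector], Y: List[Sequence[int]], train: Trainer):
    """Train one independent binary classifier per label (syndrome)."""
    return [train(X, y_j) for y_j in split_labels(Y)]


def predict_binary_relevance(models, X: List[Vector]) -> List[List[int]]:
    """Stack the per-label predictions back into a multi-label matrix."""
    return [[model(x) for model in models] for x in X]


def per_label_accuracy(Y_true, Y_pred) -> List[float]:
    """Fraction of samples classified correctly, computed separately for each label."""
    n = len(Y_true)
    return [
        sum(t == p for t, p in zip(col_true, col_pred)) / n
        for col_true, col_pred in zip(split_labels(Y_true), split_labels(Y_pred))
    ]


def hamming_loss(Y_true, Y_pred) -> float:
    """Fraction of all (sample, label) cells that are predicted wrongly."""
    total = sum(len(row) for row in Y_true)
    wrong = sum(
        t != p
        for row_true, row_pred in zip(Y_true, Y_pred)
        for t, p in zip(row_true, row_pred)
    )
    return wrong / total


def majority_trainer(X, y):
    """Placeholder base learner (assumption): always predicts the majority class."""
    majority = int(2 * sum(y) >= len(y))
    return lambda x: majority


# Toy data, invented for illustration: 4 patients, 2 symptom features, 3 syndromes.
X_toy = [[1, 0], [0, 1], [1, 1], [0, 0]]
Y_toy = [[1, 0, 1], [1, 1, 0], [1, 0, 0], [0, 0, 1]]

models = fit_binary_relevance(X_toy, Y_toy, majority_trainer)
Y_hat = predict_binary_relevance(models, X_toy)
print(per_label_accuracy(Y_toy, Y_hat))
print(hamming_loss(Y_toy, Y_hat))
```

Any learner with the same `(X, y) -> predict` shape could be passed in place of `majority_trainer`. Because each label is trained independently, this transformation ignores correlations between syndromes, which is a known trade-off of the approach.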

       
