
  • ISSN 1006-3080
  • CN 31-1691/TQ

Syndrome Classification of Chronic Gastritis Based on Extremely Randomized Forest Algorithm

YAN Jian-jun, HU Zong-jie, LIU Guo-ping, WANG Yi-qin, FU Jing-jing, GUO Rui, QIAN Peng

Citation: YAN Jian-jun, HU Zong-jie, LIU Guo-ping, WANG Yi-qin, FU Jing-jing, GUO Rui, QIAN Peng. Syndrome Classification of Chronic Gastritis Based on Extremely Randomized Forest Algorithm[J]. Journal of East China University of Science and Technology, 2017, (5): 698-703. doi: 10.14135/j.cnki.1006-3080.2017.05.015

doi: 10.14135/j.cnki.1006-3080.2017.05.015
Funding:

National Natural Science Foundation of China (81270050, 81302913, 30901897, 81173199)

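  • Abstract: Most machine learning algorithms achieve good classification performance, but the resulting models cannot be interpreted. Models such as random forests are interpretable, but they cannot handle concurrent syndromes (several syndromes present in one patient) in traditional Chinese medicine (TCM) data. This paper applies the extremely randomized forest algorithm to TCM syndrome classification on chronic gastritis data. The leaf nodes of each decision tree can output multiple labels, and a weighting mechanism combines the components to handle concurrent syndromes. The method is compared with existing multi-label learning algorithms and with decision-tree-based algorithms such as C4.5 and CART. Experimental results show that the extremely randomized forest algorithm performs better both in classification accuracy on the six syndrome types and on multi-label evaluation metrics, and that the rules obtained from the model are largely consistent with TCM theory.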
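The idea in the abstract, trees whose leaves output several labels at once and whose outputs are combined by weights, can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes that each leaf stores the frequency of every label among its training samples, that each split uses a single randomly chosen feature and a random cut point (a simplification of Extra-Trees, which normally compares several random candidates), and that the weighting mechanism is a weighted average of leaf outputs. All function names, the 0.5 cutoff, and the toy data are hypothetical.

```python
import random

def build_tree(X, Y, rng):
    """Grow one randomized tree whose leaves hold one frequency per label."""
    n_labels = len(Y[0])
    # Leaf output: fraction of samples in this node that carry each label.
    leaf = [sum(y[j] for y in Y) / len(Y) for j in range(n_labels)]
    if all(y == Y[0] for y in Y):
        return {"leaf": leaf}  # every sample has the same label set
    # Only features that still vary inside this node can split it.
    feats = [f for f in range(len(X[0])) if len({x[f] for x in X}) > 1]
    if not feats:
        return {"leaf": leaf}
    f = rng.choice(feats)
    values = [x[f] for x in X]
    t = rng.uniform(min(values), max(values))  # random cut, not optimised
    left = [i for i, v in enumerate(values) if v < t]
    right = [i for i, v in enumerate(values) if v >= t]
    if not left or not right:
        return {"leaf": leaf}
    return {
        "feature": f,
        "threshold": t,
        "left": build_tree([X[i] for i in left], [Y[i] for i in left], rng),
        "right": build_tree([X[i] for i in right], [Y[i] for i in right], rng),
    }

def tree_scores(tree, x):
    """Walk from the root to a leaf and return its per-label frequencies."""
    while "leaf" not in tree:
        branch = "left" if x[tree["feature"]] < tree["threshold"] else "right"
        tree = tree[branch]
    return tree["leaf"]

def build_forest(X, Y, n_trees=25, seed=0):
    """Each tree sees the whole training set (no bootstrap sampling)."""
    rng = random.Random(seed)
    return [build_tree(X, Y, rng) for _ in range(n_trees)]

def predict(forest, x, weights=None, cutoff=0.5):
    """Weighted average of leaf outputs; labels at or above cutoff are kept."""
    weights = weights or [1.0] * len(forest)
    total = sum(weights)
    per_tree = [tree_scores(t, x) for t in forest]
    scores = [
        sum(w * s[j] for w, s in zip(weights, per_tree)) / total
        for j in range(len(per_tree[0]))
    ]
    return [1 if s >= cutoff else 0 for s in scores]

# Hypothetical toy data: two binary symptom features, two syndrome labels.
X = [[0, 0], [0, 1], [1, 0], [1, 1]]
Y = [[0, 0], [0, 1], [1, 0], [1, 1]]
forest = build_forest(X, Y)
print(predict(forest, [1, 1]))  # a case where both syndromes are present
```

Because a leaf returns a vector rather than a single class, a patient can receive more than one syndrome label from one model, and each root-to-leaf path can still be read as a rule over symptoms.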

     

Publication history
  • Received:  2016-12-30
  • Published:  2017-10-28
