
    CAO Yaxi, HUANG Haiyan. Imbalanced Data Classification Based on Cost-Sensitive Large Margin Distribution Machine[J]. Journal of East China University of Science and Technology, 2019, 45(4): 606-613. DOI: 10.14135/j.cnki.1006-3080.20180515001

    Imbalanced Data Classification Based on Cost-Sensitive Large Margin Distribution Machine

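    • Summary: When the large margin distribution machine (LDM) is applied to imbalanced data classification, it ignores class imbalance, which leads to a low recognition rate for minority-class samples. To address this shortcoming, an imbalanced cost-sensitive large margin distribution machine (ICS-LDM) is proposed by incorporating the idea of cost sensitivity. First, when computing the margin mean and margin variance, the imbalance factor of the dataset and the sample misclassification cost parameters are combined to adjust the margin-distribution weights of the different classes. Second, the fast-converging cyclic dual coordinate descent method is applied to solve the objective function. Finally, by gradually increasing the margin distribution of the minority class, a margin distribution that is balanced across classes and maximal overall can be achieved. Experimental results on synthetic datasets and public UCI datasets show that ICS-LDM effectively improves the classification accuracy of the minority class and balances the classification performance across classes.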

       

      Abstract: In recent years, it has been shown theoretically that the margin distribution is more critical to generalization performance than a single margin. The large margin distribution machine (LDM) achieves superior classification and stronger generalization performance by simultaneously maximizing the margin mean and minimizing the margin variance. However, because it ignores class imbalance, the classifier may be overwhelmed by the majority class, so that the minority class has a lower detection rate. This clearly contradicts the need for a high detection rate on the minority class in many real applications. To address this problem, this paper proposes an imbalanced cost-sensitive large margin distribution machine (ICS-LDM) to improve the detection rate of the minority class. First, when calculating the margin mean and margin variance, different weights are assigned to the sample margins of different classes. Then, the objective function is optimized efficiently by means of the cyclic dual coordinate descent method (Cyclic-DCD). In this way, a balanced margin distribution and a maximum total margin are obtained by gradually increasing the margin distribution of the minority class. Finally, experimental results on a synthetic dataset and UCI datasets show that the proposed ICS-LDM improves the classification accuracy of the minority class and yields more balanced detection rates across classes.
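The first step of the abstract, reweighting the margin mean and margin variance by class, can be illustrated with a minimal sketch. This is not the formulation from the paper, whose exact weights are not given on this page. It assumes a linear model with margins y_i (w · x_i + b), an inverse-frequency imbalance factor n / (k · n_c) (the same heuristic scikit-learn uses for `class_weight="balanced"`), and a hypothetical per-class `cost` dictionary. The function names `class_weights` and `weighted_margin_stats` are illustrative only.

```python
from collections import Counter


def class_weights(labels, cost=None):
    """Per-class weight = cost[c] * n / (k * count[c]).

    ASSUMPTION: the inverse-frequency imbalance factor and the `cost`
    dictionary are stand-ins, not the weighting defined in the paper.
    """
    counts = Counter(labels)
    n, k = len(labels), len(counts)
    if cost is None:
        cost = {c: 1.0 for c in counts}  # equal misclassification costs
    return {c: cost[c] * n / (k * counts[c]) for c in counts}


def weighted_margin_stats(w, X, y, weights, b=0.0):
    """Class-weighted mean and variance of the margins y_i * (w . x_i + b)."""
    margins = [
        yi * (sum(wj * xj for wj, xj in zip(w, xi)) + b)
        for xi, yi in zip(X, y)
    ]
    s = [weights[yi] for yi in y]  # one weight per sample, taken from its class
    total = sum(s)
    mean = sum(si * m for si, m in zip(s, margins)) / total
    var = sum(si * (m - mean) ** 2 for si, m in zip(s, margins)) / total
    return mean, var


# Toy 1-D data: three majority samples (+1) and one minority sample (-1).
X = [[2.0], [3.0], [4.0], [-1.0]]
y = [1, 1, 1, -1]
weights = class_weights(y)
mean, var = weighted_margin_stats([1.0], X, y, weights)
print(weights, mean, var)
```

On this toy data the single minority sample has the smallest margin, so weighting it more heavily pulls the weighted margin mean below the unweighted mean of 2.5. An objective that maximizes the weighted mean is therefore pushed harder to enlarge minority-class margins, which is the effect the abstract describes.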

       
