Co-Evolutionary Feature Selection Algorithm Based on Variable-Length Particle and Multi-Behavior Interaction

李腾飞; 冯翔; 虞慧群

doi:10.14135/j.cnki.1006-3080.20201207001

Abstract

For feature selection on large-scale datasets, variable-length particle swarm optimization (VLPSO) has shown good performance. However, its fully random particle generation makes the initialization stage somewhat blind, and its single update mechanism and the information isolation among subpopulations also limit classification performance. To address these shortcomings, this paper proposes a co-evolutionary feature selection method based on variable-length particles and multi-behavior interaction (M-CVLPSO). First, to reduce the blindness caused by random initialization, a multidirectional initialization strategy in continuous space is adopted, which shortens the expected distance between the initial solution and the optimal solution. Second, particles are divided by fitness into leaders, followers, and weeders (particles to be eliminated), and multiple update strategies are applied during iteration to dynamically balance the diversity and convergence of the algorithm. In addition, a dimension-reduction index is incorporated into the fitness function, which further improves performance on some datasets. The convergence of the algorithm is proved theoretically, and experiments on 11 large-scale feature selection datasets evaluate classification accuracy, dimension reduction, and computation time. The results show that the proposed algorithm achieves better overall performance than four comparison algorithms.
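The abstract names two mechanisms without giving their formulas: a fitness function that includes a dimension-reduction index, and a fitness-based split of particles into leaders, followers, and weeders. The sketch below is only an illustration of how such components are commonly written. The weighted-sum form, the weight `alpha`, the fractions `leader_frac` and `weeder_frac`, and both function names are assumptions and are not taken from the paper.

```python
from typing import List


def fitness(error_rate: float, n_selected: int, n_total: int,
            alpha: float = 0.9) -> float:
    """Hypothetical fitness (lower is better).

    Weighted sum of the classification error and the fraction of
    features kept. The weight alpha is an assumed value, not one
    reported in the abstract.
    """
    if n_total <= 0:
        raise ValueError("n_total must be positive")
    return alpha * error_rate + (1.0 - alpha) * (n_selected / n_total)


def assign_roles(fitnesses: List[float],
                 leader_frac: float = 0.2,
                 weeder_frac: float = 0.2) -> List[str]:
    """Label each particle by fitness rank (lower fitness is better).

    The best leader_frac of the swarm become leaders, the worst
    weeder_frac become weeders, and the rest are followers. Both
    fractions are assumptions made for this illustration.
    """
    n = len(fitnesses)
    order = sorted(range(n), key=lambda i: fitnesses[i])  # best first
    n_lead = max(1, int(n * leader_frac))
    n_weed = int(n * weeder_frac)
    roles = ["follower"] * n
    for rank, idx in enumerate(order):
        if rank < n_lead:
            roles[idx] = "leader"
        elif rank >= n - n_weed:
            roles[idx] = "weeder"
    return roles


if __name__ == "__main__":
    # Five particles with made-up fitness values.
    print(assign_roles([0.3, 0.1, 0.5, 0.2, 0.4]))
```

In a full algorithm each role would then follow its own update rule. Those rules are not described in the abstract, so they are left out here.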
