2020, Vol. 26, Issue 1: 152-160. DOI: 10.13196/j.cims.2020.01.016


Optimized down sampling SVM classification method based on potential function clustering

  

  • Online: 2020-01-31 Published: 2020-01-31
  • Supported by:
    Project supported by the National Natural Science Foundation, China (No. 61741111), the Natural Science Foundation of Fujian Province, China (No. 2019J01815, 2019J01816), the Natural Science Foundation of Jiangxi Province, China (No. 20181BAB202011), the Educational Research Program for Young and Middle-Aged Teachers in Fujian Province, China (No. JT180486, JAT170504), the Science and Technology Bureau of Putian City, China (No. 2018RP4004, 2018ZP10), and the Introduction of Talents to Start Scientific Research Program in Putian University, China (No. 2018088).


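Wen Hui1, Jia Dongshun2, Yan Tao1, Chen Deli1, Lin Yuanmo1

  1. School of Information Engineering, Putian University
  2. Liaohe Geophysical Exploration Division, BGP Inc. (东方地球物理公司)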

Abstract: To improve the training efficiency and generalization performance of Support Vector Machines (SVM) on large sample sets, a novel algorithm is presented that combines sampling optimization with classifier optimization. A potential function is constructed to measure the density of the original sample space, Gaussian kernels with different parameters are established to cover different regions of the sample space step by step, and the down-sampling set is generated by incremental learning. The down-sampling set is then used to train an initial SVM, and the boundary samples near the classifier in the original training set are selected to further optimize the SVM. The algorithm was applied to an artificial data set and to benchmark data sets, and the results show that it improves training efficiency while preserving the generalization performance of the classifier.

Key words: support vector machines, down sampling set, large sample set, potential function, classification
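
The two stages summarized in the abstract (density-driven down sampling, then selection of samples near the decision boundary) can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes a single fixed Gaussian width `sigma` rather than the region-specific kernel parameters the paper describes, and a simple subtractive update of the potential. The names `potential_downsample`, `boundary_indices`, `sigma`, `ratio`, and `margin` are hypothetical. SVM training itself is omitted: the decision values passed to `boundary_indices` would come from an initial SVM trained with any SVM library on the down-sampled set.

```python
import math


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length sequences."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def potential_downsample(points, sigma=1.0, ratio=0.1):
    """Return indices of representative points chosen by potential value.

    Assumed scheme (not taken from the paper): each point's potential is
    the sum of Gaussian kernels centred on every point. The point with the
    highest potential is selected, its kernel-weighted potential is
    subtracted from all points, and selection repeats until the peak falls
    to ratio * (initial peak) or below.
    """
    if not points:
        return []
    n = len(points)
    gamma = 1.0 / (2.0 * sigma ** 2)
    potential = [
        sum(math.exp(-gamma * sq_dist(p, q)) for q in points) for p in points
    ]
    first_peak = max(potential)
    selected = []
    while len(selected) < n:
        best = max(range(n), key=potential.__getitem__)
        peak = potential[best]
        if peak <= ratio * first_peak:
            break
        selected.append(best)
        centre = points[best]
        # Suppress the region already covered by the selected centre.
        for j in range(n):
            potential[j] -= peak * math.exp(-gamma * sq_dist(points[j], centre))
    return selected


def boundary_indices(decision_values, margin=1.0):
    """Indices of samples whose decision value lies within the margin band.

    decision_values are assumed to be signed outputs f(x) of an initial
    SVM; samples with |f(x)| <= margin are treated as boundary samples.
    """
    return [i for i, v in enumerate(decision_values) if abs(v) <= margin]
```

In this sketch the down-sampled set would be built per class by calling `potential_downsample` on each class's samples. The boundary samples returned by `boundary_indices` would then be added to that set before retraining the SVM. How the paper merges the two sets is not stated in the abstract, so that step is left open here.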

