Network traffic classification method using active learning SVM

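1. College of Computer Science and Technology, Xi'an University of Science and Technology, Xi'an 710054, Shaanxi, China; 2. College of Information Engineering, Northwest A&F University, Yangling 712100, Shaanxi, China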

network traffic classification; active learning; multi-class support vector machine; parameter selection

Network traffic classification method based on active learning support vector machine
LI Yuan-cheng¹, LIU Bin²

(1. College of Computer Science and Engineering, Xi'an University of Science and Technology, Xi'an 710054, China; 2. College of Information Engineering, Northwest A&F University, Yangling 712100, China)

network traffic classification; active learning; multi-class support vector machine; parameter selection

DOI: 10.13800/j.cnki.xakjdxxb.2017.0522

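To address the low accuracy, high overhead, and limited applicability of traditional network traffic classification methods, this paper proposes a network traffic classification method based on an active learning support vector machine. The method uses a multi-class support vector machine built with the OVA approach. First, an improved grid search method is proposed to find the optimal support vector machine parameters. Then, to reduce the number of samples that must be labeled, an improved heuristic sample query criterion for active learning is proposed. Finally, a multi-class support vector machine classifier based on active learning is constructed from these components. The results show that the method clearly improves the accuracy and efficiency of network traffic classification while requiring very few labeled samples: with only 11% of the samples required by the traditional method, it reaches a classification accuracy of 98.7%.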

To address the low accuracy, large overhead, and limited applicability of traditional network traffic classification, this paper presents a novel network traffic classification method based on an active learning support vector machine (ALSVM). The method applies the OVA method to build a multi-class support vector machine. First, we propose an improved grid search method to find the optimal parameters for ALSVM. Then, we propose an improved heuristic active learning sample query criterion to reduce the number of labeled samples required. Finally, an active multi-class support vector machine classifier is constructed for network traffic classification. Experimental results show that this method significantly improves the accuracy and efficiency of network traffic classification with far fewer labeled samples: it achieves 98.7% classification accuracy with 11% of the labeled samples required by the conventional method.
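
The abstract does not spell out the improved query criterion, so the following is only a minimal sketch of a common margin-based heuristic for pool-based active learning with an OVA classifier, not the authors' method. It assumes each unlabeled sample already has one decision value per class; the names `ova_margin`, `select_queries`, and `pool_scores`, and all of the numbers, are hypothetical. Samples whose two highest class scores are closest together are treated as the most informative and are sent for labeling first.

```python
from typing import List, Sequence


def ova_margin(scores: Sequence[float]) -> float:
    """Gap between the two largest one-vs-all decision values of one sample.

    Assumes at least two classes. A small gap means the classifier is
    undecided between its two best candidate classes.
    """
    top, runner_up = sorted(scores, reverse=True)[:2]
    return top - runner_up


def select_queries(decision_values: Sequence[Sequence[float]],
                   batch_size: int) -> List[int]:
    """Return indices of the `batch_size` samples with the smallest margin."""
    order = sorted(range(len(decision_values)),
                   key=lambda i: ova_margin(decision_values[i]))
    return order[:batch_size]


# Hypothetical OVA decision values: four unlabeled flows, three classes.
pool_scores = [
    [2.1, -1.3, -0.8],   # clear winner, large margin
    [0.2, 0.1, -1.5],    # top two classes nearly tied
    [-0.4, 1.7, -0.9],   # clear winner, large margin
    [0.5, -0.2, 0.3],    # top two classes close
]

# Ask the annotator to label the two most ambiguous flows.
to_label = select_queries(pool_scores, batch_size=2)
print(to_label)
```

In a full loop, the selected samples would be labeled, added to the training set, and the classifier retrained before the next query round; this is how the number of labeled samples is kept low.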