[1] LI Yuan-cheng, LIU Bin. Network traffic classification method based on active learning support vector machine[J]. Journal of Xi'an University of Science and Technology, 2017, (05): 742-749. [doi:10.13800/j.cnki.xakjdxxb.2017.0522]

Network traffic classification method based on active learning SVM

Journal of Xi'an University of Science and Technology [ISSN: 1672-9315 / CN: 61-1434/N]

Issue:
2017, No. 05
Pages:
742-749
Publication date:
2017-09-30

Article Info

Title:
Network traffic classification method based on active learning support vector machine
Article ID:
1672-9315(2017)05-0742-08
Author(s):
LI Yuan-cheng 1, LIU Bin 2
1. College of Computer Science and Engineering, Xi'an University of Science and Technology, Xi'an 710054, China; 2. College of Information Engineering, Northwest A&F University, Yangling 712100, China
Keywords:
network traffic classification; active learning; multi-class support vector machine; parameter selection
CLC number:
TP 393.07
DOI:
10.13800/j.cnki.xakjdxxb.2017.0522
Document code:
A
Abstract:
To address the low accuracy, high overhead, and limited applicability of traditional network traffic classification methods, this paper proposes a network traffic classification method based on an active learning support vector machine (ALSVM). The method classifies traffic with a multi-class support vector machine built using the one-versus-all (OVA) approach. First, an improved grid search method is proposed to find the optimal SVM parameters. Then, to reduce the number of samples that must be labeled, an improved heuristic sample query criterion for active learning is proposed. Finally, an active learning multi-class SVM classifier is constructed from these components. Experimental results show that the method significantly improves the accuracy and efficiency of network traffic classification while requiring very few labeled samples: it reaches 98.7% classification accuracy with only 11% of the labeled samples required by the conventional method.
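
The abstract names two components without giving their details on this page: a refined grid search over SVM parameters and a heuristic rule for choosing which unlabeled samples to query. The sketch below is an illustration only and is not the authors' method. It assumes a generic coarse-to-fine grid search over (C, gamma) on a log2 scale and a common margin-based query rule for OVA classifiers (pick the samples whose two highest decision values are closest). The function names, the `score` callback, and all default step sizes are hypothetical.

```python
import math
from typing import Callable, List, Sequence, Tuple


def ova_margin(scores: Sequence[float]) -> float:
    """Gap between the two largest one-versus-all decision values of one sample."""
    top, second = sorted(scores, reverse=True)[:2]
    return top - second


def select_queries(decision_values: List[Sequence[float]], batch_size: int) -> List[int]:
    """Indices of the unlabeled samples with the smallest OVA margin (most ambiguous).

    Assumption: a small top-two gap marks an informative sample; the paper's
    improved criterion may differ.
    """
    order = sorted(range(len(decision_values)),
                   key=lambda i: ova_margin(decision_values[i]))
    return order[:batch_size]


def coarse_to_fine_grid(score: Callable[[float, float], float],
                        exp_range: Tuple[int, int] = (-5, 5),
                        coarse_step: float = 2.0,
                        fine_step: float = 0.5) -> Tuple[float, float]:
    """Search (C, gamma) = (2**a, 2**b): one coarse pass, then a finer pass
    around the best coarse cell. `score` is a hypothetical callback that
    returns, for example, cross-validated accuracy for a given (C, gamma)."""
    def frange(lo: float, hi: float, step: float) -> List[float]:
        n = int(round((hi - lo) / step))
        return [lo + k * step for k in range(n + 1)]

    def best(a_vals: List[float], b_vals: List[float]) -> Tuple[float, float]:
        return max(((a, b) for a in a_vals for b in b_vals),
                   key=lambda ab: score(2.0 ** ab[0], 2.0 ** ab[1]))

    lo, hi = exp_range
    coarse = frange(lo, hi, coarse_step)
    a0, b0 = best(coarse, coarse)
    # The fine pass spans one coarse cell on each side of the coarse optimum,
    # so it may extend slightly beyond exp_range at the borders.
    fine_a = frange(a0 - coarse_step, a0 + coarse_step, fine_step)
    fine_b = frange(b0 - coarse_step, b0 + coarse_step, fine_step)
    a1, b1 = best(fine_a, fine_b)
    return 2.0 ** a1, 2.0 ** b1
```

In an active learning loop, `select_queries` would be called on the OVA decision values of the unlabeled pool after each retraining round, the returned samples would be labeled and added to the training set, and the loop would stop once accuracy or the labeling budget is reached. The two-pass search evaluates far fewer parameter pairs than a single fine grid over the whole range, at the cost of possibly missing an optimum that the coarse pass does not land near.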


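Memo

Funding: Natural Science Foundation of the Shaanxi Provincial Department of Education (2013JK1187); Fundamental Research Funds for the Central Universities (2452015194)
Corresponding author: LI Yuan-cheng (1981-), male, from Qixian, Henan; Ph.D., lecturer. E-mail: yuanch_li@126.com
Last Update: 2017-11-08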