Published in: Peer-to-Peer Networking and Applications 5/2015

01-09-2015

Active learning for P2P traffic identification

Authors: San-Min Liu, Zhi-Xin Sun


Abstract

Many works have proposed machine-learning methods for P2P traffic identification, but these methods require a large and representative labeled sample set. To overcome the sample-labeling problem, we present a new P2P traffic identification approach based on active learning, called P2PTIAL. P2PTIAL is composed of two parts: a support vector machine as the learner and a distance-based uncertainty selection strategy. To improve the effectiveness of P2PTIAL, we add a filtering policy and a balanced policy. First, we use support vector data description (SVDD) theory to filter out unlabeled samples that contribute little to active learning, which saves computation cost and storage space. Second, we use the unlabeled samples' pre-labeled information to develop a balanced policy that keeps learning balanced. Finally, we support our design with extensive simulation experiments, and the results show that P2PTIAL is feasible.
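As a rough illustration of the selection step described in the abstract, the sketch below ranks unlabeled samples by their distance to a linear decision boundary and picks the closest ones while alternating between the two predicted classes. It is not the authors' implementation: the function names `margin_distance` and `select_queries` are hypothetical, the hyperplane `(w, b)` is assumed to come from an already trained linear SVM, reading "pre-labeled information" as the current classifier's predicted class is an assumption, and the SVDD filtering step is omitted.

```python
import math


def margin_distance(w, b, x):
    """Signed distance of sample x from the hyperplane w.x + b = 0.

    Assumption: (w, b) is the separating hyperplane of an already
    trained linear SVM; training itself is outside this sketch.
    """
    norm = math.sqrt(sum(wi * wi for wi in w))
    return (sum(wi * xi for wi, xi in zip(w, x)) + b) / norm


def select_queries(w, b, pool, k):
    """Return up to k pool indices to send for labeling.

    Samples nearest the hyperplane are treated as most uncertain.
    To keep the batch balanced, picks alternate between samples the
    current classifier predicts as positive and as negative
    (an assumed reading of the abstract's balanced policy).
    """
    scored = [(i, margin_distance(w, b, x)) for i, x in enumerate(pool)]
    pos = sorted((s for s in scored if s[1] >= 0), key=lambda s: abs(s[1]))
    neg = sorted((s for s in scored if s[1] < 0), key=lambda s: abs(s[1]))
    chosen = []
    while len(chosen) < k and (pos or neg):
        for side in (pos, neg):
            if side and len(chosen) < k:
                chosen.append(side.pop(0)[0])
    return chosen


if __name__ == "__main__":
    # Toy 2-D pool; the boundary is the vertical line x0 = 0.
    pool = [[0.1, 0.0], [2.0, 0.0], [-0.2, 0.0], [-3.0, 0.0], [0.5, 1.0]]
    print(select_queries([1.0, 0.0], 0.0, pool, 2))
```

In an active learning loop, the returned indices would be labeled by an oracle, moved into the training set, and the SVM retrained before the next round of selection.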


Metadata
Title
Active learning for P2P traffic identification
Authors
San-Min Liu
Zhi-Xin Sun
Publication date
01-09-2015
Publisher
Springer US
Published in
Peer-to-Peer Networking and Applications / Issue 5/2015
Print ISSN: 1936-6442
Electronic ISSN: 1936-6450
DOI
https://doi.org/10.1007/s12083-014-0281-3
