
2017 | OriginalPaper | Chapter

Multi-stage Feature Selection for On-Line Flow Peer-to-Peer Traffic Identification

Authors: Bushra Mohammed Ali Abdalla, Haitham A. Jamil, Mosab Hamdan, Joseph Stephen Bassi, Ismahani Ismail, Muhammad Nadzir Marsono

Published in: Modeling, Design and Simulation of Systems

Publisher: Springer Singapore


Abstract

Classifying bandwidth-heavy Internet traffic is important for network administrators who need to throttle traffic from bandwidth-heavy applications. Statistical methods have previously been proposed as a promising way to identify Internet traffic from statistical features of packets. The selection of those features plays an important role in accurate and timely classification. In this work, we propose an approach that combines feature selection methods with analytic methods (scatter analysis and one-way analysis of variance) to provide optimal features for on-line P2P traffic detection. Feature selection algorithms and machine learning algorithms were implemented using the WEKA tool on available traces from the University of Brescia, the University of Aalborg and the University of Cambridge. Experimental results show that the proposed method achieves up to 99.5% accuracy with just six on-line statistical features. These results improve on existing approaches in terms of both accuracy and the number of features.
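To make the one-way analysis of variance step more concrete, the sketch below scores each flow feature by its ANOVA F statistic (between-class variance over within-class variance) and ranks features by that score. It is only an illustration under stated assumptions: it is not the authors' WEKA pipeline, the helper names `anova_f` and `rank_features` are hypothetical, and the feature names and flow values are invented toy data rather than values from the Brescia, Aalborg or Cambridge traces.

```python
from statistics import mean


def anova_f(groups):
    """One-way ANOVA F statistic for a single feature.

    `groups` holds one list of feature values per traffic class.
    """
    k = len(groups)                       # number of classes
    n = sum(len(g) for g in groups)       # total number of flows
    grand = sum(sum(g) for g in groups) / n
    ss_between = sum(len(g) * (mean(g) - grand) ** 2 for g in groups)
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    if ss_within == 0:
        # No spread inside any class: perfectly separating or constant feature.
        return float("inf") if ss_between > 0 else 0.0
    return (ss_between / (k - 1)) / (ss_within / (n - k))


def rank_features(flows, labels, names):
    """Return feature names sorted by descending F statistic."""
    classes = sorted(set(labels))
    scores = {}
    for j, name in enumerate(names):
        groups = [[f[j] for f, y in zip(flows, labels) if y == c]
                  for c in classes]
        scores[name] = anova_f(groups)
    return sorted(scores, key=scores.get, reverse=True)


# Invented toy flows: (mean packet size in bytes, an uninformative 0/1 flag).
names = ["mean_pkt_size", "noise_flag"]
flows = [(1200, 0), (1300, 1), (1250, 0), (200, 1), (300, 0), (250, 1)]
labels = ["p2p", "p2p", "p2p", "other", "other", "other"]

ranking = rank_features(flows, labels, names)
print(ranking)
```

A feature whose class means differ widely relative to its spread inside each class gets a large F value, so it ranks first. In a multi-stage scheme such a score could be one filter among several, applied before a classifier is trained on the surviving features.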


Metadata
Title
Multi-stage Feature Selection for On-Line Flow Peer-to-Peer Traffic Identification
Authors
Bushra Mohammed Ali Abdalla
Haitham A. Jamil
Mosab Hamdan
Joseph Stephen Bassi
Ismahani Ismail
Muhammad Nadzir Marsono
Copyright Year
2017
Publisher
Springer Singapore
DOI
https://doi.org/10.1007/978-981-10-6502-6_44
