Published in: Journal of Computer Virology and Hacking Techniques 2/2017

27.01.2016 | Original Paper

Clustering for malware classification

Authors: Swathi Pai, Fabio Di Troia, Corrado Aaron Visaggio, Thomas H. Austin, Mark Stamp

Abstract

In this research, we apply clustering techniques to the malware classification problem. We compute clusters using the well-known K-means and Expectation Maximization algorithms, with the underlying scores based on Hidden Markov Models. We compare the results of the two clustering approaches and carefully consider how the dimension (i.e., the number of models used for clustering) and the number of clusters together affect the accuracy of the clustering.
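To make the setup concrete, here is a minimal sketch of K-means applied to score vectors, where each sample is represented by one score per model, so the dimension equals the number of models. It is not the authors' implementation: the score values are invented for illustration, the function name `kmeans` is hypothetical, and seeding the centroids with the first `k` points is a simplification chosen to keep the example deterministic. Expectation Maximization is not shown.

```python
import math


def kmeans(points, k, iters=100):
    """Plain Lloyd-style K-means over tuples of per-model scores."""
    # Assumption for this sketch: seed centroids with the first k points.
    centroids = [tuple(p) for p in points[:k]]
    labels = []
    for _ in range(iters):
        # Assignment step: each point joins its nearest centroid.
        labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        new_centroids = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                mean = tuple(sum(xs) / len(members) for xs in zip(*members))
                new_centroids.append(mean)
            else:
                # Keep an empty cluster's centroid where it was.
                new_centroids.append(centroids[c])
        if new_centroids == centroids:
            break  # converged: centroids no longer move
        centroids = new_centroids
    return labels, centroids


# Hypothetical scores: dimension 2 (two models), six samples.
# The numbers are made up and do not come from the paper.
scores = [
    (-2.1, -9.8), (-9.5, -2.3), (-2.4, -10.1),
    (-9.9, -2.0), (-2.0, -9.5), (-10.2, -2.6),
]
labels, centroids = kmeans(scores, k=2)
print(labels)
```

Changing the length of each tuple varies the dimension and changing `k` varies the number of clusters, which are the two quantities the abstract studies together.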

Metadata
Title
Clustering for malware classification
Authors
Swathi Pai
Fabio Di Troia
Corrado Aaron Visaggio
Thomas H. Austin
Mark Stamp
Publication date
27.01.2016
Publisher
Springer Paris
Published in
Journal of Computer Virology and Hacking Techniques / Issue 2/2017
Electronic ISSN: 2263-8733
DOI
https://doi.org/10.1007/s11416-016-0265-3
