Published in: Journal of Computer Virology and Hacking Techniques, Issue 1/2019

27.08.2018 | Original Paper

Using convolutional neural networks for classification of malware represented as images

Authors: Daniel Gibert, Carles Mateu, Jordi Planes, Ramon Vicens

Abstract

Millions of malicious files are detected every year. One of the main reasons for this volume is that malware authors mutate their code to evade detection: files that belong to the same family and share the same malicious behavior are constantly modified or obfuscated with a variety of techniques so that they look like different files. To analyze and classify such large numbers of files effectively, we need to be able to group them and identify their respective families on the basis of their behavior. In this paper, malicious software is visualized as grayscale images, because this representation captures minor changes while retaining the global structure, which helps to detect variations. Motivated by the visual similarity between malware samples of the same family, we propose a file-agnostic deep learning approach for malware categorization that efficiently groups malicious software into families based on a set of discriminant patterns extracted from their visualization as images. The suitability of our approach is evaluated against two benchmarks: the MalImg dataset and the Microsoft Malware Classification Challenge dataset. Experimental comparison demonstrates its superior performance with respect to state-of-the-art techniques.
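The grayscale visualization mentioned in the abstract can be illustrated with a minimal sketch. It assumes the common convention of treating each byte of a file as one pixel intensity (0 to 255) and wrapping the byte stream into rows of a fixed width. The function name `bytes_to_grayscale`, the default width of 16, and the zero-padding of the last row are assumptions made for illustration; the abstract does not specify the authors' image dimensions or preprocessing.

```python
def bytes_to_grayscale(data: bytes, width: int = 16) -> list[list[int]]:
    """Arrange raw bytes into rows of `width` pixels.

    Each byte becomes one gray level (0 = black, 255 = white).
    The width and the zero-padding are illustrative assumptions,
    not values taken from the paper.
    """
    if width <= 0:
        raise ValueError("width must be positive")
    # Pad the tail with zero bytes so that the last row is complete.
    padding = (-len(data)) % width
    padded = data + bytes(padding)
    # Slice the padded byte stream into consecutive rows.
    return [list(padded[i:i + width]) for i in range(0, len(padded), width)]


# Hypothetical five-byte "file", wrapped into rows of four pixels.
sample = bytes([0, 64, 128, 255, 17])
image = bytes_to_grayscale(sample, width=4)
for row in image:
    print(row)
```

Because the mapping is position-preserving, a small local change in the file alters only a few pixels while the overall layout of the image stays the same, which is the property the abstract relies on.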

Metadata
Title
Using convolutional neural networks for classification of malware represented as images
Authors
Daniel Gibert
Carles Mateu
Jordi Planes
Ramon Vicens
Publication date
27.08.2018
Publisher
Springer Paris
Published in
Journal of Computer Virology and Hacking Techniques, Issue 1/2019
Electronic ISSN: 2263-8733
DOI
https://doi.org/10.1007/s11416-018-0323-0