Published in: Neural Computing and Applications 2/2019

25.07.2017 | Original Article

Malware detection based on deep learning algorithm

Authors: Ding Yuxin, Zhu Siyi


Abstract

In this study we represent malware as opcode sequences and detect it using a deep belief network (DBN). Compared with traditional shallow neural networks, DBNs can use unlabeled data to pretrain a multi-layer generative model, which can better represent the characteristics of data samples. We compare the performance of DBNs with that of three baseline malware detection models, which use support vector machines, decision trees, and the k-nearest neighbor algorithm as classifiers. The experiments demonstrate that the DBN model provides more accurate detection than the baseline models. When additional unlabeled data are used for DBN pretraining, the DBNs perform better than the other detection models. We also use the DBNs as an autoencoder to extract the feature vectors of executables. The experiments indicate that the autoencoder can effectively model the underlying structure of input data and significantly reduce the dimensions of feature vectors.
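The pipeline described in the abstract (opcode sequences as input, unsupervised layer-wise pretraining, and use of the pretrained network as an encoder that reduces feature dimensionality) can be illustrated with a minimal sketch. The abstract does not give the feature encoding, layer sizes, or training procedure, so everything below is an assumption made for illustration: binary opcode bigram presence vectors as features, a stack of restricted Boltzmann machines trained with one-step contrastive divergence (CD-1), toy layer sizes, and the hypothetical helper names `ngram_features`, `RBM`, `pretrain_dbn`, and `encode`. It is not the authors' implementation.

```python
import numpy as np


def ngram_features(opcodes, vocab, n=2):
    """Binary presence vector over opcode n-grams (assumed feature scheme)."""
    grams = {tuple(opcodes[i:i + n]) for i in range(len(opcodes) - n + 1)}
    return np.array([1.0 if g in grams else 0.0 for g in vocab])


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


class RBM:
    """Bernoulli restricted Boltzmann machine trained with CD-1."""

    def __init__(self, n_visible, n_hidden, rng):
        self.W = rng.normal(0.0, 0.01, size=(n_visible, n_hidden))
        self.b_v = np.zeros(n_visible)
        self.b_h = np.zeros(n_hidden)
        self.rng = rng

    def hidden_probs(self, v):
        return sigmoid(v @ self.W + self.b_h)

    def visible_probs(self, h):
        return sigmoid(h @ self.W.T + self.b_v)

    def cd1_step(self, v0, lr=0.1):
        # Positive phase: hidden activations driven by the data.
        h0 = self.hidden_probs(v0)
        h0_sample = (self.rng.random(h0.shape) < h0).astype(float)
        # Negative phase: one Gibbs step back to the visible layer and up again.
        v1 = self.visible_probs(h0_sample)
        h1 = self.hidden_probs(v1)
        batch = v0.shape[0]
        self.W += lr * (v0.T @ h0 - v1.T @ h1) / batch
        self.b_v += lr * (v0 - v1).mean(axis=0)
        self.b_h += lr * (h0 - h1).mean(axis=0)


def pretrain_dbn(X, layer_sizes, epochs=20, seed=0):
    """Greedy layer-wise pretraining; no labels are used at any point."""
    rng = np.random.default_rng(seed)
    rbms, data = [], X
    for n_hidden in layer_sizes:
        rbm = RBM(data.shape[1], n_hidden, rng)
        for _ in range(epochs):
            rbm.cd1_step(data)
        rbms.append(rbm)
        # The hidden activations of this layer become the next layer's input.
        data = rbm.hidden_probs(data)
    return rbms


def encode(rbms, X):
    """Map inputs to the top-layer representation (the reduced feature vector)."""
    for rbm in rbms:
        X = rbm.hidden_probs(X)
    return X


# Toy usage with made-up opcode sequences (not data from the study).
samples = [["push", "mov", "call", "ret"], ["mov", "mov", "add", "ret"]]
vocab = sorted({tuple(s[i:i + 2]) for s in samples for i in range(len(s) - 1)})
X = np.stack([ngram_features(s, vocab) for s in samples])
rbms = pretrain_dbn(X, layer_sizes=[4, 2])
codes = encode(rbms, X)
```

In this sketch the six-dimensional bigram vectors are compressed to two dimensions. In a detection setting, a classifier layer would then be added on top and the whole network fine-tuned on labeled samples; that supervised step is omitted here.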


Metadata
Title
Malware detection based on deep learning algorithm
Authors
Ding Yuxin
Zhu Siyi
Publication date
25.07.2017
Publisher
Springer London
Published in
Neural Computing and Applications / Issue 2/2019
Print ISSN: 0941-0643
Electronic ISSN: 1433-3058
DOI
https://doi.org/10.1007/s00521-017-3077-6
