2020 | OriginalPaper | Chapter

Stronger Targeted Poisoning Attacks Against Malware Detection

Authors: Shintaro Narisada, Shoichiro Sasaki, Seira Hidano, Toshihiro Uchibayashi, Takuo Suganuma, Masahiro Hiji, Shinsaku Kiyomoto

Published in: Cryptology and Network Security

Publisher: Springer International Publishing

Abstract

Attacks on machine learning systems such as malware detectors and recommendation systems are becoming a major threat. Data poisoning attacks are the primary method used; they inject a small number of poisoning points into the training set of a machine learning model, aiming to degrade the overall accuracy of the model. Targeted data poisoning is a variant of data poisoning that injects malicious data into the training set to cause a misclassification of the targeted input data while keeping almost the same overall accuracy as the unpoisoned model. Sasaki et al. first applied targeted data poisoning to malware detection and proposed an algorithm to generate poisoning points that cause targeted malware to be misclassified as goodware. Their algorithm achieved an \(85\%\) attack success rate by adding \(15\%\) poisoning points for a malware dataset with continuous variables while restricting the increase in the test error on nontargeted data to at most \(10\%\). In this paper, we consider common defensive methods against targeted data poisoning, called data sanitization defenses, and propose a defense-aware attack algorithm. Moreover, we propose a stronger targeted poisoning algorithm based on the theoretical analysis of the optimal attack strategy proposed by Steinhardt et al. The computational cost of our algorithm is much less than that of existing targeted poisoning algorithms. As a result, our new algorithm achieves a \(91\%\) attack success rate for a malware dataset with continuous variables by adding the same \(15\%\) poisoning points and is approximately \(10^3\) times faster than the algorithm of Sasaki et al. in terms of the computational time needed to generate poison data.
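
To make the idea of a data sanitization defense and a defense-aware poisoning point more concrete, the following is a minimal sketch on toy 2-D data. It is not the algorithm proposed in the chapter or the one by Sasaki et al.; it only illustrates one common sanitization rule, a per-class sphere (distance-to-centroid) filter, under stated assumptions. The data, the `keep_fraction` and `margin` parameters, and all function names are hypothetical choices made for illustration, and the defender is assumed to fit the filter on clean data.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data (hypothetical): label 0 = goodware, label 1 = malware.
X_good = rng.normal(loc=0.0, scale=1.0, size=(200, 2))
X_mal = rng.normal(loc=10.0, scale=1.0, size=(200, 2))
X_clean = np.vstack([X_good, X_mal])
y_clean = np.array([0] * 200 + [1] * 200)


def fit_sphere_defense(X, y, keep_fraction=0.95):
    """Per class, compute the centroid and the radius containing keep_fraction of points."""
    centroids, radii = {}, {}
    for label in np.unique(y):
        points = X[y == label]
        centroid = points.mean(axis=0)
        distances = np.linalg.norm(points - centroid, axis=1)
        centroids[int(label)] = centroid
        radii[int(label)] = float(np.quantile(distances, keep_fraction))
    return centroids, radii


def is_kept(x, label, centroids, radii):
    """A point survives sanitization if it lies inside its class sphere."""
    return bool(np.linalg.norm(x - centroids[label]) <= radii[label])


def project_into_sphere(x, centroid, radius, margin=0.99):
    """Pull x toward the centroid until it sits just inside the sphere."""
    offset = x - centroid
    norm = np.linalg.norm(offset)
    if norm <= margin * radius:
        return x.copy()
    return centroid + offset * (margin * radius / norm)


# Assumption: the defender fits the filter on clean data only.
centroids, radii = fit_sphere_defense(X_clean, y_clean)

# The attacker wants this malware sample to be classified as goodware.
target = np.array([10.0, 10.0])

# Naive poison: a copy of the target labeled as goodware (label 0).
naive_poison = target.copy()
# Defense-aware poison: same direction, but constrained to the goodware sphere.
aware_poison = project_into_sphere(target, centroids[0], radii[0])

naive_kept = is_kept(naive_poison, 0, centroids, radii)
aware_kept = is_kept(aware_poison, 0, centroids, radii)
print("naive poison kept:", naive_kept)
print("defense-aware poison kept:", aware_kept)
```

In this toy setting the naive point lies far outside the goodware sphere and would be discarded before training, whereas the projected point passes the filter. A real attack would additionally have to choose such constrained points so that they still shift the decision boundary toward the target, which is the harder optimization problem the abstract refers to.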


Literature
1.
Amos, B., Turner, H., White, J.: Applying machine learning classifiers to dynamic android malware detection at scale. In: 2013 9th International Wireless Communications and Mobile Computing Conference (IWCMC), pp. 1666–1671 (2013)
2.
Anderson, H.S., Roth, P.: EMBER: an open dataset for training static PE malware machine learning models. arXiv preprint arXiv:1804.04637 (2018)
3.
Anindya, I.C., Kantarcioglu, M.: Adversarial anomaly detection using centroid-based clustering. In: 2018 IEEE International Conference on Information Reuse and Integration (IRI), pp. 1–8. IEEE (2018)
4.
Baracaldo, N., Chen, B., Ludwig, H., Safavi, J.A.: Mitigating poisoning attacks on machine learning models: A data provenance based approach. In: Proceedings of the 10th ACM Workshop on Artificial Intelligence and Security, pp. 103–110 (2017)
6.
7.
Bleha, S., Slivinsky, C., Hussien, B.: Computer-access security systems using keystroke dynamics. IEEE Trans. Pattern Anal. Mach. Intell. 12(12), 1217–1222 (1990)
8.
Cen, L., Gates, C.S., Si, L., Li, N.: A probabilistic discriminative model for android malware detection with decompiled source code. IEEE Trans. Dependable Secure Comput. 12(4), 400–412 (2014)
9.
Chen, X., Liu, C., Li, B., Lu, K., Song, D.: Targeted backdoor attacks on deep learning systems using data poisoning. arXiv preprint arXiv:1712.05526 (2017)
10.
Cretu, G.F., Stavrou, A., Locasto, M.E., Stolfo, S.J., Keromytis, A.D.: Casting out demons: Sanitizing training data for anomaly sensors. In: 2008 IEEE Symposium on Security and Privacy (SP 2008), pp. 81–95. IEEE (2008)
11.
Dai, J., Chen, C., Li, Y.: A backdoor attack against LSTM-based text classification systems. IEEE Access 7, 138872–138878 (2019)
12.
Diamond, S., Boyd, S.: CVXPY: a python-embedded modeling language for convex optimization. J. Mach. Learn. Res. 17(83), 1–5 (2016)
13.
Du, M., Jia, R., Song, D.: Robust anomaly detection and backdoor attack detection via differential privacy. arXiv preprint arXiv:1911.07116 (2019)
14.
Firdausi, I., Lim, C., Erwin, A., Nugroho, A.S.: Analysis of machine learning techniques used in behavior-based malware detection. In: 2010 Second International Conference on Advances in Computing, Control, and Telecommunication Technologies, pp. 201–203 (2010)
15.
Gavriluţ, D., Cimpoeşu, M., Anton, D., Ciortuz, L.: Malware detection using machine learning. In: 2009 International Multiconference on Computer Science and Information Technology, pp. 735–741 (2009)
16.
Ham, H.S., Choi, M.J.: Analysis of android malware detection performance using machine learning classifiers. In: 2013 International Conference on ICT Convergence (ICTC), pp. 490–495. IEEE (2013)
17.
Hardy, W., Chen, L., Hou, S., Ye, Y., Li, X.: DL4MD: a deep learning framework for intelligent malware detection. In: Proceedings of the International Conference on Data Mining (DMIN), p. 61 (2016)
18.
Hofmeyr, S.A., Forrest, S., Somayaji, A.: Intrusion detection using sequences of system calls. J. Comput. Secur. 6(3), 151–180 (1998)
19.
Jung, W., Kim, S., Choi, S.: Poster: deep learning for zero-day flash malware detection. In: 36th IEEE Symposium on Security and Privacy, vol. 10, pp. 2809695–2817880 (2015)
21.
Koh, P.W., Liang, P.: Understanding black-box predictions via influence functions. In: Proceedings of the 34th International Conference on Machine Learning, vol. 70, pp. 1885–1894. JMLR.org (2017)
22.
Koh, P.W., Steinhardt, J., Liang, P.: Stronger data poisoning attacks break data sanitization defenses. arXiv preprint arXiv:1811.00741 (2018)
23.
Kolosnjaji, B., et al.: Adversarial malware binaries: Evading deep learning for malware detection in executables. In: 2018 26th European Signal Processing Conference (EUSIPCO), pp. 533–537. IEEE (2018)
24.
Konečný, J., McMahan, H.B., Yu, F.X., Richtárik, P., Suresh, A.T., Bacon, D.: Federated learning: Strategies for improving communication efficiency. arXiv preprint arXiv:1610.05492 (2016)
25.
Kumar, B.J., Naveen, H., Kumar, B.P., Sharma, S.S., Villegas, J.: Logistic regression for polymorphic malware detection using ANOVA F-test. In: 2017 International Conference on Innovations in Information, Embedded and Communication Systems (ICIIECS), pp. 1–5. IEEE (2017)
26.
Kwon, J., Lee, H.: Bingraph: Discovering mutant malware using hierarchical semantic signatures. In: 2012 7th International Conference on Malicious and Unwanted Software, pp. 104–111 (2012)
27.
Laskov, P., Schäfer, C., Kotenko, I., Müller, K.R.: Intrusion detection in unlabeled data with quarter-sphere support vector machines. PIK-praxis der Informationsverarbeitung und Kommunikation 27(4), 228–236 (2004)
28.
Li, W.J., Wang, K., Stolfo, S.J., Herzog, B.: Fileprints: identifying file types by n-gram analysis. In: Proceedings from the Sixth Annual IEEE SMC Information Assurance Workshop, pp. 64–71. IEEE (2005)
30.
Liu, Y., et al.: Trojaning attack on neural networks (2017)
31.
McLaughlin, N., et al.: Deep android malware detection. In: Proceedings of the Seventh ACM on Conference on Data and Application Security and Privacy, CODASPY 2017, pp. 301–308. Association for Computing Machinery, New York (2017)
32.
Muñoz-González, L., et al.: Towards poisoning of deep learning algorithms with back-gradient optimization. In: Proceedings of the 10th ACM Workshop on Artificial Intelligence and Security, pp. 27–38 (2017)
35.
Rieck, K., Laskov, P.: Language models for detection of unknown attacks in network traffic. J. Comput. Virol. 2(4), 243–256 (2007)
36.
Ronen, R., Radu, M., Feuerstein, C., Yom-Tov, E., Ahmadi, M.: Microsoft malware classification challenge. arXiv preprint arXiv:1802.10135 (2018)
37.
Santos, I., Devesa, J., Brezo, F., Nieves, J., Bringas, P.G.: OPEM: a static-dynamic approach for machine-learning-based malware detection. In: Herrero, À. et al. (eds.) International Joint Conference CISIS’12-ICEUTE’12-SOCO’12 Special Sessions. Advances in Intelligent Systems and Computing, vol. 189. Springer, Berlin (2013). https://doi.org/10.1007/978-3-642-33018-6_28
38.
Sasaki, S., Hidano, S., Uchibayashi, T., Suganuma, T., Hiji, M., Kiyomoto, S.: On embedding backdoor in malware detectors using machine learning. In: 2019 17th International Conference on Privacy, Security and Trust (PST), pp. 1–5. IEEE (2019)
39.
Sgandurra, D., Muñoz-González, L., Mohsen, R., Lupu, E.C.: Automated dynamic analysis of ransomware: Benefits, limitations and use for detection. arXiv preprint arXiv:1609.03020 (2016)
40.
Shafahi, A., et al.: Poison frogs! targeted clean-label poisoning attacks on neural networks. In: Advances in Neural Information Processing Systems, pp. 6103–6113 (2018)
41.
Siddiqui, M., Wang, M.C., Lee, J.: Data mining methods for malware detection using instruction sequences. In: Artificial Intelligence and Applications, pp. 358–363 (2008)
42.
Steinhardt, J., Koh, P.W.W., Liang, P.S.: Certified defenses for data poisoning attacks. In: Advances in Neural Information Processing Systems, pp. 3517–3529 (2017)
43.
Tran, B., Li, J., Madry, A.: Spectral signatures in backdoor attacks. In: Advances in Neural Information Processing Systems, pp. 8000–8010 (2018)
44.
Wang, B., et al.: Neural cleanse: Identifying and mitigating backdoor attacks in neural networks. In: 2019 IEEE Symposium on Security and Privacy (SP), pp. 707–723. IEEE (2019)
45.
Xiao, H., Xiao, H., Eckert, C.: Adversarial label flips attack on support vector machines. In: ECAI, pp. 870–875 (2012)
46.
Yuan, Z., Lu, Y., Wang, Z., Xue, Y.: Droid-sec: deep learning in android malware detection. In: Proceedings of the 2014 ACM conference on SIGCOMM, pp. 371–372 (2014)
Metadata
Copyright Year
2020
DOI
https://doi.org/10.1007/978-3-030-65411-5_4