Skip to main content

2020 | OriginalPaper | Buchkapitel

Stronger Targeted Poisoning Attacks Against Malware Detection

verfasst von : Shintaro Narisada, Shoichiro Sasaki, Seira Hidano, Toshihiro Uchibayashi, Takuo Suganuma, Masahiro Hiji, Shinsaku Kiyomoto

Erschienen in: Cryptology and Network Security

Verlag: Springer International Publishing

Aktivieren Sie unsere intelligente Suche, um passende Fachinhalte oder Patente zu finden.

search-config
loading …

Abstract

Attacks on machine learning systems such as malware detectors and recommendation systems are becoming a major threat. Data poisoning attacks are the primary method used; they inject a small amount of poisoning points into a training set of the machine learning model, aiming to degrade the overall accuracy of the model. Targeted data poisoning is a variant of data poisoning attacks that injects malicious data into the model to cause a misclassification of the targeted input data while keeping almost the same overall accuracy as the unpoisoned model. Sasaki et al. first applied targeted data poisoning to malware detection and proposed an algorithm to generate poisoning points to misclassify targeted malware as goodware. Their algorithm achieved \(85\%\) an attack success rate by adding \(15\%\) poisoning points for malware dataset with continuous variables while restricting the increase in the test error on nontargeted data to at most \(10\%\). In this paper, we consider common defensive methods called data sanitization defenses, against targeted data poisoning and propose a defense-aware attack algorithm. Moreover, we propose a stronger targeted poisoning algorithm based on the theoretical analysis of the optimal attack strategy proposed by Steinhardt et al. The computational cost of our algorithm is much less than that of existing targeted poisoning algorithms. As a result, our new algorithm achieves a \(91\%\) attack success rate for malware dataset with continuous variables by adding the same \(15\%\) poisoning points and is approximately \(10^3\) times faster in terms of the computational time needed to generate poison data than Sasaki’s algorithm.

Sie haben noch keine Lizenz? Dann Informieren Sie sich jetzt über unsere Produkte:

Springer Professional "Wirtschaft+Technik"

Online-Abonnement

Mit Springer Professional "Wirtschaft+Technik" erhalten Sie Zugriff auf:

  • über 102.000 Bücher
  • über 537 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Maschinenbau + Werkstoffe
  • Versicherung + Risiko

Jetzt Wissensvorsprung sichern!

Springer Professional "Technik"

Online-Abonnement

Mit Springer Professional "Technik" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 390 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Maschinenbau + Werkstoffe




 

Jetzt Wissensvorsprung sichern!

Springer Professional "Wirtschaft"

Online-Abonnement

Mit Springer Professional "Wirtschaft" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 340 Zeitschriften

aus folgenden Fachgebieten:

  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Versicherung + Risiko




Jetzt Wissensvorsprung sichern!

Literatur
1.
Zurück zum Zitat Amos, B., Turner, H., White, J.: Applying machine learning classifiers to dynamic android malware detection at scale. In: 2013 9th International Wireless Communications and Mobile Computing Conference (IWCMC), pp. 1666–1671 (2013) Amos, B., Turner, H., White, J.: Applying machine learning classifiers to dynamic android malware detection at scale. In: 2013 9th International Wireless Communications and Mobile Computing Conference (IWCMC), pp. 1666–1671 (2013)
2.
Zurück zum Zitat Anderson, H.S., Roth, P.: EMBER: an open dataset for training static PE malware machine learning models. arXiv preprint arXiv:1804.04637 (2018) Anderson, H.S., Roth, P.: EMBER: an open dataset for training static PE malware machine learning models. arXiv preprint arXiv:​1804.​04637 (2018)
3.
Zurück zum Zitat Anindya, I.C., Kantarcioglu, M.: Adversarial anomaly detection using centroid-based clustering. In: 2018 IEEE International Conference on Information Reuse and Integration (IRI), pp. 1–8. IEEE (2018) Anindya, I.C., Kantarcioglu, M.: Adversarial anomaly detection using centroid-based clustering. In: 2018 IEEE International Conference on Information Reuse and Integration (IRI), pp. 1–8. IEEE (2018)
4.
Zurück zum Zitat Baracaldo, N., Chen, B., Ludwig, H., Safavi, J.A.: Mitigating poisoning attacks on machine learning models: A data provenance based approach. In: Proceedings of the 10th ACM Workshop on Artificial Intelligence and Security, pp. 103–110 (2017) Baracaldo, N., Chen, B., Ludwig, H., Safavi, J.A.: Mitigating poisoning attacks on machine learning models: A data provenance based approach. In: Proceedings of the 10th ACM Workshop on Artificial Intelligence and Security, pp. 103–110 (2017)
6.
Zurück zum Zitat Biggio, B., Nelson, B., Laskov, P.: Poisoning attacks against support vector machines. arXiv preprint arXiv:1206.6389 (2012) Biggio, B., Nelson, B., Laskov, P.: Poisoning attacks against support vector machines. arXiv preprint arXiv:​1206.​6389 (2012)
7.
Zurück zum Zitat Bleha, S., Slivinsky, C., Hussien, B.: Computer-access security systems using keystroke dynamics. IEEE Trans. Pattern Anal. Mach. Intell. 12(12), 1217–1222 (1990)CrossRef Bleha, S., Slivinsky, C., Hussien, B.: Computer-access security systems using keystroke dynamics. IEEE Trans. Pattern Anal. Mach. Intell. 12(12), 1217–1222 (1990)CrossRef
8.
Zurück zum Zitat Cen, L., Gates, C.S., Si, L., Li, N.: A probabilistic discriminative model for android malware detection with decompiled source code. IEEE Trans. Dependable Secure Comput. 12(4), 400–412 (2014)CrossRef Cen, L., Gates, C.S., Si, L., Li, N.: A probabilistic discriminative model for android malware detection with decompiled source code. IEEE Trans. Dependable Secure Comput. 12(4), 400–412 (2014)CrossRef
9.
Zurück zum Zitat Chen, X., Liu, C., Li, B., Lu, K., Song, D.: Targeted backdoor attacks on deep learning systems using data poisoning. arXiv preprint arXiv:1712.05526 (2017) Chen, X., Liu, C., Li, B., Lu, K., Song, D.: Targeted backdoor attacks on deep learning systems using data poisoning. arXiv preprint arXiv:​1712.​05526 (2017)
10.
Zurück zum Zitat Cretu, G.F., Stavrou, A., Locasto, M.E., Stolfo, S.J., Keromytis, A.D.: Casting out demons: Sanitizing training data for anomaly sensors. In: 2008 IEEE Symposium on Security and Privacy (Sp 2008), pp. 81–95. IEEE (2008) Cretu, G.F., Stavrou, A., Locasto, M.E., Stolfo, S.J., Keromytis, A.D.: Casting out demons: Sanitizing training data for anomaly sensors. In: 2008 IEEE Symposium on Security and Privacy (Sp 2008), pp. 81–95. IEEE (2008)
11.
Zurück zum Zitat Dai, J., Chen, C., Li, Y.: A backdoor attack against LSTM-based text classification systems. IEEE Access 7, 138872–138878 (2019)CrossRef Dai, J., Chen, C., Li, Y.: A backdoor attack against LSTM-based text classification systems. IEEE Access 7, 138872–138878 (2019)CrossRef
12.
Zurück zum Zitat Diamond, S., Boyd, S.: CVXPY: a python-embedded modeling language for convex optimization. J. Mach. Learn. Res. 17(83), 1–5 (2016)MathSciNetMATH Diamond, S., Boyd, S.: CVXPY: a python-embedded modeling language for convex optimization. J. Mach. Learn. Res. 17(83), 1–5 (2016)MathSciNetMATH
13.
Zurück zum Zitat Du, M., Jia, R., Song, D.: Robust anomaly detection and backdoor attack detection via differential privacy. arXiv preprint arXiv:1911.07116 (2019) Du, M., Jia, R., Song, D.: Robust anomaly detection and backdoor attack detection via differential privacy. arXiv preprint arXiv:​1911.​07116 (2019)
14.
Zurück zum Zitat Firdausi, I., lim, C., Erwin, A., Nugroho, A.S.: Analysis of machine learning techniques used in behavior-based malware detection. In: 2010 Second International Conference on Advances in Computing, Control, and Telecommunication Technologies, pp. 201–203 (2010) Firdausi, I., lim, C., Erwin, A., Nugroho, A.S.: Analysis of machine learning techniques used in behavior-based malware detection. In: 2010 Second International Conference on Advances in Computing, Control, and Telecommunication Technologies, pp. 201–203 (2010)
15.
Zurück zum Zitat Gavriluţ, D., Cimpoeşu, M., Anton, D., Ciortuz, L.: Malware detection using machine learning. In: 2009 International Multiconference on Computer Science and Information Technology, pp. 735–741 (2009) Gavriluţ, D., Cimpoeşu, M., Anton, D., Ciortuz, L.: Malware detection using machine learning. In: 2009 International Multiconference on Computer Science and Information Technology, pp. 735–741 (2009)
16.
Zurück zum Zitat Ham, H.S., Choi, M.J.: Analysis of android malware detection performance using machine learning classifiers. In: 2013 international conference on ICT Convergence (ICTC), pp. 490–495. IEEE (2013) Ham, H.S., Choi, M.J.: Analysis of android malware detection performance using machine learning classifiers. In: 2013 international conference on ICT Convergence (ICTC), pp. 490–495. IEEE (2013)
17.
Zurück zum Zitat Hardy, W., Chen, L., Hou, S., Ye, Y., Li, X.: DL4MD: a deep learning framework for intelligent malware detection. In: Proceedings of the International Conference on Data Mining (DMIN), p. 61 (2016) Hardy, W., Chen, L., Hou, S., Ye, Y., Li, X.: DL4MD: a deep learning framework for intelligent malware detection. In: Proceedings of the International Conference on Data Mining (DMIN), p. 61 (2016)
18.
Zurück zum Zitat Hofmeyr, S.A., Forrest, S., Somayaji, A.: Intrusion detection using sequences of system calls. J. Comput. Secur. 6(3), 151–180 (1998)CrossRef Hofmeyr, S.A., Forrest, S., Somayaji, A.: Intrusion detection using sequences of system calls. J. Comput. Secur. 6(3), 151–180 (1998)CrossRef
19.
Zurück zum Zitat Jung, W., Kim, S., Choi, S.: Poster: deep learning for zero-day flash malware detection. In: 36th IEEE Symposium on Security and Privacy, vol. 10, pp. 2809695–2817880 (2015) Jung, W., Kim, S., Choi, S.: Poster: deep learning for zero-day flash malware detection. In: 36th IEEE Symposium on Security and Privacy, vol. 10, pp. 2809695–2817880 (2015)
21.
Zurück zum Zitat Koh, P.W., Liang, P.: Understanding black-box predictions via influence functions. In: Proceedings of the 34th International Conference on Machine Learning, vol. 70. pp. 1885–1894. (2017) JMLR. org Koh, P.W., Liang, P.: Understanding black-box predictions via influence functions. In: Proceedings of the 34th International Conference on Machine Learning, vol. 70. pp. 1885–1894. (2017) JMLR. org
22.
Zurück zum Zitat Koh, P.W., Steinhardt, J., Liang, P.: Stronger data poisoning attacks break data sanitization defenses. arXiv preprint arXiv:1811.00741 (2018) Koh, P.W., Steinhardt, J., Liang, P.: Stronger data poisoning attacks break data sanitization defenses. arXiv preprint arXiv:​1811.​00741 (2018)
23.
Zurück zum Zitat Kolosnjaji, B., et al.: Adversarial malware binaries: Evading deep learning for malware detection in executables. In: 2018 26th European Signal Processing Conference (EUSIPCO), pp. 533–537. IEEE (2018) Kolosnjaji, B., et al.: Adversarial malware binaries: Evading deep learning for malware detection in executables. In: 2018 26th European Signal Processing Conference (EUSIPCO), pp. 533–537. IEEE (2018)
24.
Zurück zum Zitat Konečnỳ, J., McMahan, H.B., Yu, F.X., Richtárik, P., Suresh, A.T., Bacon, D.: Federated learning: Strategies for improving communication efficiency. arXiv preprint arXiv:1610.05492 (2016) Konečnỳ, J., McMahan, H.B., Yu, F.X., Richtárik, P., Suresh, A.T., Bacon, D.: Federated learning: Strategies for improving communication efficiency. arXiv preprint arXiv:​1610.​05492 (2016)
25.
Zurück zum Zitat Kumar, B.J., Naveen, H., Kumar, B.P., Sharma, S.S., Villegas, J.: Logistic regression for polymorphic malware detection using anova f-test. In: 2017 International Conference on Innovations in Information, Embedded and Communication Systems (ICIIECS), pp. 1–5. IEEE (2017) Kumar, B.J., Naveen, H., Kumar, B.P., Sharma, S.S., Villegas, J.: Logistic regression for polymorphic malware detection using anova f-test. In: 2017 International Conference on Innovations in Information, Embedded and Communication Systems (ICIIECS), pp. 1–5. IEEE (2017)
26.
Zurück zum Zitat Kwon, J., Lee, H.: Bingraph: Discovering mutant malware using hierarchical semantic signatures. In: 2012 7th International Conference on Malicious and Unwanted Software, pp. 104–111 (2012) Kwon, J., Lee, H.: Bingraph: Discovering mutant malware using hierarchical semantic signatures. In: 2012 7th International Conference on Malicious and Unwanted Software, pp. 104–111 (2012)
27.
Zurück zum Zitat Laskov, P., Schäfer, C., Kotenko, I., Müller, K.R.: Intrusion detection in unlabeled data with quarter-sphere support vector machines. PIK-praxis der Informationsverarbeitung und Kommunikation 27(4), 228–236 (2004)CrossRef Laskov, P., Schäfer, C., Kotenko, I., Müller, K.R.: Intrusion detection in unlabeled data with quarter-sphere support vector machines. PIK-praxis der Informationsverarbeitung und Kommunikation 27(4), 228–236 (2004)CrossRef
28.
Zurück zum Zitat Li, W.J., Wang, K., Stolfo, S.J., Herzog, B.: Fileprints: identifying file types by n-gram analysis. In: Proceedings from the Sixth Annual IEEE SMC Information Assurance Workshop, pp. 64–71. IEEE (2005) Li, W.J., Wang, K., Stolfo, S.J., Herzog, B.: Fileprints: identifying file types by n-gram analysis. In: Proceedings from the Sixth Annual IEEE SMC Information Assurance Workshop, pp. 64–71. IEEE (2005)
30.
Zurück zum Zitat Liu, Y., et al.: Trojaning attack on neural networks (2017) Liu, Y., et al.: Trojaning attack on neural networks (2017)
31.
Zurück zum Zitat McLaughlin, N. et al.: Deep android malware detection. In: Proceedings of the Seventh ACM on Conference on Data and Application Security and Privacy, CODASPY 2017, p. 301-308. Association for Computing Machinery, New York (2017) McLaughlin, N. et al.: Deep android malware detection. In: Proceedings of the Seventh ACM on Conference on Data and Application Security and Privacy, CODASPY 2017, p. 301-308. Association for Computing Machinery, New York (2017)
32.
Zurück zum Zitat Muñoz-González, L., et al.: Towards poisoning of deep learning algorithms with back-gradient optimization. In: Proceedings of the 10th ACM Workshop on Artificial Intelligence and Security, pp. 27–38 (2017) Muñoz-González, L., et al.: Towards poisoning of deep learning algorithms with back-gradient optimization. In: Proceedings of the 10th ACM Workshop on Artificial Intelligence and Security, pp. 27–38 (2017)
35.
Zurück zum Zitat Rieck, K., Laskov, P.: Language models for detection of unknown attacks in network traffic. J. Comput. Virol. 2(4), 243–256 (2007)CrossRef Rieck, K., Laskov, P.: Language models for detection of unknown attacks in network traffic. J. Comput. Virol. 2(4), 243–256 (2007)CrossRef
36.
Zurück zum Zitat Ronen, R., Radu, M., Feuerstein, C., Yom-Tov, E., Ahmadi, M.: Microsoft malware classification challenge. arXiv preprint arXiv:1802.10135 (2018) Ronen, R., Radu, M., Feuerstein, C., Yom-Tov, E., Ahmadi, M.: Microsoft malware classification challenge. arXiv preprint arXiv:​1802.​10135 (2018)
37.
Zurück zum Zitat Santos, I., Devesa, J., Brezo, F., Nieves, J., Bringas, P.G.: OPEM: a static-dynamic approach for machine-learning-based malware detection. In: Herrero, À. et al. (eds.) International Joint Conference CISIS’ 12-ICEUTE’ 12-SOCO’ 12 Special Sessions. Advances in Intelligent Systems and Computing, vol. 189. Springer, Berlin (2013) https://doi.org/10.1007/978-3-642-33018-6_28 Santos, I., Devesa, J., Brezo, F., Nieves, J., Bringas, P.G.: OPEM: a static-dynamic approach for machine-learning-based malware detection. In: Herrero, À. et al. (eds.) International Joint Conference CISIS’ 12-ICEUTE’ 12-SOCO’ 12 Special Sessions. Advances in Intelligent Systems and Computing, vol. 189. Springer, Berlin (2013) https://​doi.​org/​10.​1007/​978-3-642-33018-6_​28
38.
Zurück zum Zitat Sasaki, S., Hidano, S., Uchibayashi, T., Suganuma, T., Hiji, M., Kiyomoto, S.: On embedding backdoor in malware detectors using machine learning. In: 2019 17th International Conference on Privacy, Security and Trust (PST), pp. 1–5. IEEE (2019) Sasaki, S., Hidano, S., Uchibayashi, T., Suganuma, T., Hiji, M., Kiyomoto, S.: On embedding backdoor in malware detectors using machine learning. In: 2019 17th International Conference on Privacy, Security and Trust (PST), pp. 1–5. IEEE (2019)
39.
Zurück zum Zitat Sgandurra, D., Muñoz-González, L., Mohsen, R., Lupu, E.C.: Automated dynamic analysis of ransomware: Benefits, limitations and use for detection. arXiv preprint arXiv:1609.03020 (2016) Sgandurra, D., Muñoz-González, L., Mohsen, R., Lupu, E.C.: Automated dynamic analysis of ransomware: Benefits, limitations and use for detection. arXiv preprint arXiv:​1609.​03020 (2016)
40.
Zurück zum Zitat Shafahi, A., et al.: Poison frogs! targeted clean-label poisoning attacks on neural networks. In: Advances in Neural Information Processing Systems, pp. 6103–6113 (2018) Shafahi, A., et al.: Poison frogs! targeted clean-label poisoning attacks on neural networks. In: Advances in Neural Information Processing Systems, pp. 6103–6113 (2018)
41.
Zurück zum Zitat Siddiqui, M., Wang, M.C., Lee, J.: Data mining methods for malware detection using instruction sequences. In: Artificial Intelligence and Applications, pp. 358–363 (2008) Siddiqui, M., Wang, M.C., Lee, J.: Data mining methods for malware detection using instruction sequences. In: Artificial Intelligence and Applications, pp. 358–363 (2008)
42.
Zurück zum Zitat Steinhardt, J., Koh, P.W.W., Liang, P.S.: Certified defenses for data poisoning attacks. In: Advances in Neural Information Processing Systems, pp. 3517–3529 (2017) Steinhardt, J., Koh, P.W.W., Liang, P.S.: Certified defenses for data poisoning attacks. In: Advances in Neural Information Processing Systems, pp. 3517–3529 (2017)
43.
Zurück zum Zitat Tran, B., Li, J., Madry, A.: Spectral signatures in backdoor attacks. In: Advances in Neural Information Processing Systems, pp. 8000–8010 (2018) Tran, B., Li, J., Madry, A.: Spectral signatures in backdoor attacks. In: Advances in Neural Information Processing Systems, pp. 8000–8010 (2018)
44.
Zurück zum Zitat Wang, B., et al.: Neural cleanse: Identifying and mitigating backdoor attacks in neural networks. In: 2019 IEEE Symposium on Security and Privacy (SP), pp. 707–723. IEEE (2019) Wang, B., et al.: Neural cleanse: Identifying and mitigating backdoor attacks in neural networks. In: 2019 IEEE Symposium on Security and Privacy (SP), pp. 707–723. IEEE (2019)
45.
Zurück zum Zitat Xiao, H., Xiao, H., Eckert, C.: Adversarial label flips attack on support vector machines. In: ECAI, pp. 870–875 (2012) Xiao, H., Xiao, H., Eckert, C.: Adversarial label flips attack on support vector machines. In: ECAI, pp. 870–875 (2012)
46.
Zurück zum Zitat Yuan, Z., Lu, Y., Wang, Z., Xue, Y.: Droid-sec: deep learning in android malware detection. In: Proceedings of the 2014 ACM conference on SIGCOMM, pp. 371–372 (2014) Yuan, Z., Lu, Y., Wang, Z., Xue, Y.: Droid-sec: deep learning in android malware detection. In: Proceedings of the 2014 ACM conference on SIGCOMM, pp. 371–372 (2014)
Metadaten
Titel
Stronger Targeted Poisoning Attacks Against Malware Detection
verfasst von
Shintaro Narisada
Shoichiro Sasaki
Seira Hidano
Toshihiro Uchibayashi
Takuo Suganuma
Masahiro Hiji
Shinsaku Kiyomoto
Copyright-Jahr
2020
DOI
https://doi.org/10.1007/978-3-030-65411-5_4