DOI: 10.1145/1128817.1128824
Article

Can machine learning be secure?

Published: 21 March 2006

ABSTRACT

Machine learning systems offer unparalleled flexibility in dealing with evolving input in a variety of applications, such as intrusion detection systems and spam e-mail filtering. However, machine learning algorithms themselves can be a target of attack by a malicious adversary. This paper provides a framework for answering the question, "Can machine learning be secure?" Novel contributions of this paper include a taxonomy of different types of attacks on machine learning techniques and systems, a variety of defenses against those attacks, a discussion of ideas that are important to security for machine learning, an analytical model giving a lower bound on the attacker's work function, and a list of open problems.


Published in

ASIACCS '06: Proceedings of the 2006 ACM Symposium on Information, Computer and Communications Security
March 2006
384 pages
ISBN: 1595932720
DOI: 10.1145/1128817

Copyright © 2006 ACM

Publisher: Association for Computing Machinery, New York, NY, United States

Overall Acceptance Rate: 418 of 2,322 submissions, 18%
