ABSTRACT
Machine learning systems offer unparalleled flexibility in dealing with evolving input in a variety of applications, such as intrusion detection systems and spam e-mail filtering. However, machine learning algorithms themselves can be a target of attack by a malicious adversary. This paper provides a framework for answering the question, "Can machine learning be secure?" Novel contributions of this paper include a taxonomy of different types of attacks on machine learning techniques and systems, a variety of defenses against those attacks, a discussion of ideas that are important to security for machine learning, an analytical model giving a lower bound on the attacker's work function, and a list of open problems.
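The abstract's claim that learning algorithms themselves can be attacked can be illustrated with a toy causative (training-time) attack, one of the attack classes in the paper's taxonomy. The sketch below is not from the paper: the filter, the messages, and the poison rate are all fabricated for illustration. An adversary who can inject labeled training data sends spam built from the victim's everyday vocabulary, shifting a naive Bayes spam filter until a legitimate message is misclassified.

```python
# Hypothetical sketch: a causative attack on a word-count naive Bayes
# spam filter. All data below is fabricated for illustration.
import math
from collections import Counter

def train(messages):
    """messages: list of (words, label) pairs, label 'spam' or 'ham'."""
    counts = {"spam": Counter(), "ham": Counter()}
    totals = {"spam": 0, "ham": 0}
    for words, label in messages:
        counts[label].update(words)
        totals[label] += 1
    return counts, totals

def classify(counts, totals, words):
    """Naive Bayes with add-one smoothing; returns the likelier label."""
    vocab = set(counts["spam"]) | set(counts["ham"])
    scores = {}
    for label in ("spam", "ham"):
        n = sum(counts[label].values())
        score = math.log(totals[label] / sum(totals.values()))
        for w in words:
            score += math.log((counts[label][w] + 1) / (n + len(vocab)))
        scores[label] = score
    return max(scores, key=scores.get)

clean = [
    (["cheap", "pills", "now"], "spam"),
    (["meeting", "agenda", "today"], "ham"),
    (["free", "pills", "cheap"], "spam"),
    (["project", "meeting", "notes"], "ham"),
]
target = ["meeting", "agenda"]  # a legitimate message

print(classify(*train(clean), target))  # -> ham

# Causative attack: the adversary injects spam that reuses the
# victim's everyday vocabulary, poisoning the training set.
poison = [(["meeting", "agenda"], "spam")] * 5
counts, totals = train(clean + poison)
print(classify(counts, totals, target))  # -> spam
```

The injected examples raise the spam-class likelihood of "meeting" and "agenda" enough to flip the target message, showing why the paper's lower bound on the attacker's work function (how much poisoned data the attack requires) matters for assessing a learner's security.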
Index Terms
- Can machine learning be secure?