ABSTRACT
Machine-learning (ML) algorithms are increasingly used in privacy-sensitive applications such as predicting lifestyle choices, making medical diagnoses, and facial recognition. In a model inversion (MI) attack, recently introduced in a case study of linear classifiers in personalized medicine by Fredrikson et al., adversarial access to an ML model is abused to learn sensitive genomic information about individuals. Whether model inversion attacks apply to settings beyond theirs, however, is unknown. We develop a new class of model inversion attack that exploits confidence values revealed along with predictions. Our new attacks are applicable in a variety of settings, and we explore two in depth: decision trees for lifestyle surveys as used on machine-learning-as-a-service systems, and neural networks for facial recognition. In both cases confidence values are revealed to those with the ability to make prediction queries to models. We experimentally show attacks that are able to estimate whether a respondent in a lifestyle survey admitted to cheating on their significant other and, in the other context, show how to recover recognizable images of people's faces given only their name and access to the ML model. We also initiate experimental exploration of natural countermeasures, investigating a privacy-aware decision tree training algorithm that is a simple variant of CART learning, as well as revealing only rounded confidence values. The lesson that emerges is that one can avoid these kinds of MI attacks with negligible degradation to utility.
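The core attack idea in the abstract can be made concrete with a small sketch: given query access to a model's confidence values (and, here, their numerical gradients), an attacker performs gradient ascent on a candidate input to maximize the target class's confidence. Everything below is an illustrative toy, not the paper's implementation; the paper's facial-recognition attack applies the same principle to a neural network's class probabilities over images.

```python
import numpy as np

def invert_model(confidence_fn, grad_fn, input_dim, target_class,
                 steps=500, lr=0.1):
    """Gradient-ascent model inversion sketch: starting from an
    uninformative input, repeatedly nudge the candidate toward
    higher target-class confidence."""
    x = np.zeros(input_dim)
    for _ in range(steps):
        x = x + lr * grad_fn(x, target_class)  # climb the confidence surface
        x = np.clip(x, 0.0, 1.0)               # keep features in a valid range
    return x, confidence_fn(x, target_class)

# Toy stand-in for the attacked model: a 2-feature softmax classifier
# (weights are arbitrary) whose class-0 confidence peaks near x = (1, 0).
W = np.array([[4.0, -4.0],
              [-4.0, 4.0]])

def confidence_fn(x, c):
    z = W @ x
    p = np.exp(z - z.max())
    p /= p.sum()
    return p[c]

def grad_fn(x, c, eps=1e-4):
    """Central-difference gradient of the target-class confidence,
    simulating an attacker who can only query confidence values."""
    g = np.zeros_like(x)
    for i in range(len(x)):
        d = np.zeros_like(x)
        d[i] = eps
        g[i] = (confidence_fn(x + d, c) - confidence_fn(x - d, c)) / (2 * eps)
    return g

x_rec, conf = invert_model(confidence_fn, grad_fn, input_dim=2, target_class=0)
# x_rec lands near (1, 0), the input the model is most confident about.

def rounded_confidence(p, decimals=1):
    """Countermeasure sketch from the abstract: reveal only a coarsened
    confidence value, degrading the gradient signal the attack relies on."""
    return round(p, decimals)
```

The `rounded_confidence` helper illustrates why coarsening works as a countermeasure: once nearby inputs map to the same revealed value, the central-difference gradient above collapses to zero almost everywhere.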
REFERENCES
- Y. Taigman, M. Yang, M. Ranzato, and L. Wolf. DeepFace: Closing the gap to human-level performance in face verification. In Conference on Computer Vision and Pattern Recognition (CVPR), 2014.
- AT&T Laboratories Cambridge. The ORL database of faces. http://www.cl.cam.ac.uk/research/dtg/attarchive/facedatabase.html.
- M. Barreno, B. Nelson, R. Sears, A. D. Joseph, and J. D. Tygar. Can machine learning be secure? In Proceedings of the 2006 ACM Symposium on Information, Computer and Communications Security, pages 16--25. ACM, 2006.
- BigML. https://www.bigml.com/.
- G. Bradski. The OpenCV library. Dr. Dobb's Journal of Software Tools, Jan. 2000.
- C.-L. Chi, W. Nick Street, J. G. Robinson, and M. A. Crawford. Individualized patient-centered lifestyle recommendations: An expert system for communicating patient specific cardiovascular risk information and prioritizing lifestyle options. J. of Biomedical Informatics, 45(6):1164--1174, Dec. 2012.
- G. Cormode. Personal privacy vs population privacy: learning to attack anonymization. In KDD, 2011.
- M. Dabbah, W. Woo, and S. Dlay. Secure authentication for face recognition. In IEEE Symposium on Computational Intelligence in Image and Signal Processing, pages 121--126, April 2007.
- C. Dillow. Augmented identity app helps you identify strangers on the street. Popular Science, Feb. 23, 2010.
- I. Dinur and K. Nissim. Revealing information while preserving privacy. In PODS, 2003.
- C. Dwork. Differential privacy. In ICALP. Springer, 2006.
- C. Dwork, F. McSherry, and K. Talwar. The price of privacy and the limits of LP decoding. In STOC, 2007.
- M. Fredrikson, E. Lantz, S. Jha, S. Lin, D. Page, and T. Ristenpart. Privacy in pharmacogenetics: An end-to-end case study of personalized warfarin dosing. In USENIX Security Symposium, pages 17--32, 2014.
- M. Fredrikson, S. Jha, and T. Ristenpart. Model inversion attacks and basic countermeasures. Technical report, 2015.
- I. J. Goodfellow, D. Warde-Farley, P. Lamblin, V. Dumoulin, M. Mirza, R. Pascanu, J. Bergstra, F. Bastien, and Y. Bengio. Pylearn2: a machine learning research library. arXiv preprint arXiv:1308.4214, 2013.
- Google. Prediction API. https://cloud.google.com/prediction/.
- W. Hickey. FiveThirtyEight.com DataLab: How Americans like their steak. http://fivethirtyeight.com/datalab/how-americans-like-their-steak/, May 2014.
- N. Homer, S. Szelinger, M. Redman, D. Duggan, W. Tembe, J. Muehling, J. V. Pearson, D. A. Stephan, S. F. Nelson, and D. W. Craig. Resolving individuals contributing trace amounts of DNA to highly complex mixtures using high-density SNP genotyping microarrays. PLOS Genetics, 2008.
- G. Huang, H. Lee, and E. Learned-Miller. Learning hierarchical representations for face verification with convolutional deep belief networks. In Computer Vision and Pattern Recognition (CVPR), June 2012.
- International Warfarin Pharmacogenetic Consortium. Estimation of the warfarin dose with clinical and pharmacogenetic data. New England Journal of Medicine, 360(8):753--764, 2009.
- Kairos AR, Inc. Facial recognition API. https://developer.kairos.com/docs.
- S. P. Kasiviswanathan, M. Rudelson, and A. Smith. The power of linear reconstruction attacks. In SODA, 2013.
- S. P. Kasiviswanathan, M. Rudelson, A. Smith, and J. Ullman. The price of privately releasing contingency tables and the spectra of random matrices with correlated rows. In STOC, 2010.
- J. Klontz, B. Klare, S. Klum, A. Jain, and M. Burge. Open source biometric recognition. In IEEE International Conference on Biometrics: Theory, Applications and Systems, pages 1--8, Sept. 2013.
- T. Komarova, D. Nekipelov, and E. Yakovlev. Estimation of treatment effects from combined data: Identification versus data security. In Economics of Digitization: An Agenda. NBER, to appear.
- Lambda Labs. Facial recognition API. https://lambdal.com/face-recognition-api.
- H. Lee, R. Grosse, R. Ranganath, and A. Y. Ng. Convolutional deep belief networks for scalable unsupervised learning of hierarchical representations. In Proceedings of the 26th Annual International Conference on Machine Learning, ICML '09, pages 609--616, New York, NY, USA, 2009. ACM.
- N. Li, W. Qardaji, D. Su, Y. Wu, and W. Yang. Membership privacy: A unifying framework for privacy definitions. In Proceedings of ACM CCS, 2013.
- G. Loukides, J. C. Denny, and B. Malin. The disclosure of diagnosis codes can breach research participants' privacy. Journal of the American Medical Informatics Association, 17(3):322--327, 2010.
- D. Lowd and C. Meek. Adversarial learning. In Proceedings of the Eleventh ACM SIGKDD International Conference on Knowledge Discovery in Data Mining, pages 641--647. ACM, 2005.
- Microsoft. Microsoft Azure Machine Learning.
- A. Narayanan and V. Shmatikov. Robust de-anonymization of large sparse datasets. In IEEE Symposium on Security and Privacy, pages 111--125, 2008.
- J. Prince. Social science research on pornography. http://byuresearch.org/ssrp/downloads/GSShappiness.pdf.
- S. Sankararaman, G. Obozinski, M. I. Jordan, and E. Halperin. Genomic privacy and limits of individual detection in a pool. Nature Genetics, 41(9):965--967, 2009.
- C. Savage. Facial scanning is making gains in surveillance. The New York Times, Aug. 21, 2013.
- SkyBiometry. Facial recognition API. https://www.skybiometry.com/Documentation#faces/recognize.
- T. W. Smith, P. Marsden, M. Hout, and J. Kim. General social surveys, 1972--2012. National Opinion Research Center [producer]; The Roper Center for Public Opinion Research, University of Connecticut [distributor], 2013.
- L. Sweeney. Simple demographics often identify people uniquely. 2000.
- R. Wang, Y. F. Li, X. Wang, H. Tang, and X. Zhou. Learning your identity and disease from research papers: information leaks in genome wide association studies. In CCS, 2009.
- Wise.io. http://www.wise.io/.