ABSTRACT
Machine learning is widely used to produce models for a range of applications and is increasingly offered as a service by major technology companies. However, the required massive data collection raises privacy concerns during both training and prediction stages. In this paper, we design and implement a general framework for privacy-preserving machine learning and use it to obtain new solutions for training linear regression, logistic regression and neural network models. Our protocols are in a three-server model wherein data owners secret share their data among three servers who train and evaluate models on the joint data using three-party computation (3PC). Our main contribution is a new and complete framework ($\textABY ^3$) for efficiently switching back and forth between arithmetic, binary, and Yao 3PC which is of independent interest. Many of the conversions are based on new techniques that are designed and optimized for the first time in this paper. We also propose new techniques for fixed-point multiplication of shared decimal values that extends beyond the three-party case, and customized protocols for evaluating piecewise polynomial functions. We design variants of each building block that is secure against \em malicious adversaries who deviate arbitrarily. We implement our system in C++. Our protocols are up to \em four orders of magnitude faster than the best prior work, hence significantly reducing the gap between privacy-preserving and plaintext training.
Supplemental Material
- Azure machine learning studio. https://azure.microsoft.com/en-us/services/machine-learning-studio/.Google Scholar
- Eigen library. http://eigen.tuxfamily.org/.Google Scholar
- Google cloud ai. https://cloud.google.com/products/machine-learning/.Google Scholar
- Machine learning on aws. https://aws.amazon.com/machine-learning/.Google Scholar
- MNIST database. http://yann.lecun.com/exdb/mnist/. Accessed: 2016-07--14.Google Scholar
- Watson machine learning. https://www.ibm.com/cloud/machine-learning.Google Scholar
- M. Abadi, A. Chu, I. Goodfellow, H. B. McMahan, I. Mironov, K. Talwar, and L. Zhang. Deep learning with differential privacy. In Proceedings of the 2016 ACM SIGSAC Conference on Computer and Communications Security, pages 308--318. ACM, 2016. Google ScholarDigital Library
- Y. Aono, T. Hayashi, L. Trieu Phong, and L. Wang. Scalable and secure logistic regression via homomorphic encryption. In Proceedings of the Sixth ACM Conference on Data and Application Security and Privacy, pages 142--144. ACM, 2016. Google ScholarDigital Library
- T. Araki, J. Furukawa, Y. Lindell, A. Nof, and K. Ohara. High-throughput semi-honest secure three-party computation with an honest majority. In E. R. Weippl, S. Katzenbeisser, C. Kruegel, A. C. Myers, and S. Halevi, editors, Proceedings of the 2016 ACM SIGSAC Conference on Computer and Communications Security, Vienna, Austria, October 24--28, 2016, pages 805--817. ACM, 2016. Google ScholarDigital Library
- J. BARZILAI and J. J. Borwein. Two-point step size gradient methods. 8:141--148, 01 1988.Google Scholar
- A. Ben-David, N. Nisan, and B. Pinkas. FairplayMP: a system for secure multi-party computation. pages 257--266. Google ScholarDigital Library
- D. Bogdanov, S. Laur, and J. Willemson. Sharemind: A framework for fast privacy-preserving computations. pages 192--206. Google ScholarDigital Library
- D. Bogdanov, R. Talviste, and J. Willemson. Deploying secure multi-party computation for financial data analysis. In International Conference on Financial Cryptography and Data Security, pages 57--64. Springer, 2012.Google ScholarCross Ref
- F. Bourse, M. Minelli, M. Minihold, and P. Paillier. Fast homomorphic evaluation of deep discretized neural networks. Cryptology ePrint Archive, Report 2017/1114, 2017. https://eprint.iacr.org/2017/1114.Google Scholar
- P. Bunn and R. Ostrovsky. Secure two-party k-means clustering. In Proceedings of the 14th ACM conference on Computer and communications security, pages 486--497. ACM, 2007. Google ScholarDigital Library
- N. Bü scher, A. Holzer, A. Weber, and S. Katzenbeisser. Compiling low depth circuits for practical secure computation. In I. G. Askoxylakis, S. Ioannidis, S. K. Katsikas, and C. A. Meadows, editors, Computer Security - ESORICS 2016 - 21st European Symposium on Research in Computer Security, Heraklion, Greece, September 26--30, 2016, Proceedings, Part II, volume 9879 of Lecture Notes in Computer Science, pages 80--98. Springer, 2016.Google Scholar
- R. Canetti. Security and composition of multiparty cryptographic protocols. 13(1):143--202, 2000. Google ScholarDigital Library
- H. Chabanne, A. de Wargny, J. Milgram, C. Morel, and E. Prouff. Privacy-preserving classification on deep neural network. IACR Cryptology ePrint Archive, 2017:35, 2017.Google Scholar
- N. Chandran, J. A. Garay, P. Mohassel, and S. Vusirikala. Efficient, constant-round and actively secure MPC: beyond the three-party case. In B. M. Thuraisingham, D. Evans, T. Malkin, and D. Xu, editors, Proceedings of the 2017 ACM SIGSAC Conference on Computer and Communications Security, CCS 2017, Dallas, TX, USA, October 30 - November 03, 2017, pages 277--294. ACM, 2017. Google ScholarDigital Library
- N. Chandran, D. Gupta, A. Rastogi, R. Sharma, and S. Tripathi. Ezpc: Programmable, efficient, and scalable secure two-party computation. Cryptology ePrint Archive, Report 2017/1109, 2017. https://eprint.iacr.org/2017/1109.Google Scholar
- M. Chase, R. Gilad-Bachrach, K. Laine, K. Lauter, and P. Rindal. Private collaborative neural network learning.Google Scholar
- K. Chida, G. Morohashi, H. Fuji, F. Magata, A. Fujimura, K. Hamada, D. Ikarashi, and R. Yamamoto. Implementation and evaluation of an efficient secure computation system using ?r'for healthcare statistics. Journal of the American Medical Informatics Association, 21(e2):e326--e331, 2014.Google Scholar
- M. Chiesa, D. Demmler, M. Canini, M. Schapira, and T. Schneider. Towards securing internet exchange points against curious onlookers. In L. Eggert and C. Perkins, editors, Proceedings of the 2016 Applied Networking Research Workshop, ANRW 2016, Berlin, Germany, July 16, 2016, pages 32--34. ACM, 2016. Google ScholarDigital Library
- D. Demmler, T. Schneider, and M. Zohner. ABY - A framework for efficient mixed-protocol secure two-party computation.Google Scholar
- W. Du and M. J. Atallah. Privacy-preserving cooperative scientific computations. In csfw, volume 1, page 273. Citeseer, 2001. Google ScholarDigital Library
- W. Du, Y. S. Han, and S. Chen. Privacy-preserving multivariate statistical analysis: Linear regression and classification. In SDM, volume 4, pages 222--233. SIAM, 2004.Google ScholarCross Ref
- M. K. Franklin, M. Gondree, and P. Mohassel. Multi-party indirect indexing and applications. pages 283--297. Google ScholarDigital Library
- J. Furukawa, Y. Lindell, A. Nof, and O. Weinstein. High-throughput secure three-party computation for malicious adversaries and an honest majority. In J. Coron and J. B. Nielsen, editors, Advances in Cryptology - EUROCRYPT 2017 - 36th Annual International Conference on the Theory and Applications of Cryptographic Techniques, Paris, France, April 30 - May 4, 2017, Proceedings, Part II, volume 10211 of Lecture Notes in Computer Science, pages 225--255, 2017.Google ScholarCross Ref
- A. Gascon, P. Schoppmann, B. Balle, M. Raykova, J. Doerner, S. Zahur, and D. Evans. Secure linear regression on vertically partitioned datasets.Google Scholar
- I. Giacomelli, S. Jha, M. Joye, C. D. Page, and K. Yoon. Privacy-preserving ridge regression over distributed data from lhe. Cryptology ePrint Archive, Report 2017/979, 2017. https://eprint.iacr.org/2017/979.Google Scholar
- R. Gilad-Bachrach, N. Dowlin, K. Laine, K. Lauter, M. Naehrig, and J. Wernsing. Cryptonets: Applying neural networks to encrypted data with high throughput and accuracy. In International Conference on Machine Learning, pages 201--210, 2016. Google ScholarDigital Library
- R. Gilad-Bachrach, K. Laine, K. Lauter, P. Rindal, and M. Rosulek. Secure data exchange: A marketplace in the cloud. Cryptology ePrint Archive, Report 2016/620, 2016. http://eprint.iacr.org/2016/620.Google Scholar
- D. Harris. A taxonomy of parallel prefix networks, 12 2003.Google Scholar
- E. Hesamifard, H. Takabi, and M. Ghasemi. Cryptodl: Deep neural networks over encrypted data. arXiv preprint arXiv:1711.05189, 2017.Google Scholar
- G. Jagannathan and R. N. Wright. Privacy-preserving distributed k-means clustering over arbitrarily partitioned data. In Proceedings of the eleventh ACM SIGKDD international conference on Knowledge discovery in data mining, pages 593--599. ACM, 2005. Google ScholarDigital Library
- V. Kolesnikov and T. Schneider. Improved garbled circuit: Free XOR gates and applications. pages 486--498. Google ScholarDigital Library
- R. Kumaresan, S. Raghuraman, and A. Sealfon. Network oblivious transfer. In M. Robshaw and J. Katz, editors, Advances in Cryptology - CRYPTO 2016 - 36th Annual International Cryptology Conference, Santa Barbara, CA, USA, August 14--18, 2016, Proceedings, Part II, volume 9815 of Lecture Notes in Computer Science, pages 366--396. Springer, 2016. Google ScholarDigital Library
- J. Launchbury, D. Archer, T. DuBuisson, and E. Mertens. Application-scale secure multiparty computation. In European Symposium on Programming Languages and Systems, pages 8--26. Springer, 2014. Google ScholarDigital Library
- Y. Lindell and B. Pinkas. Privacy preserving data mining. In Annual International Cryptology Conference, pages 36--54. Springer, 2000. Google ScholarDigital Library
- J. Liu, M. Juuti, Y. Lu, and N. Asokan. Oblivious neural network predictions via minionn transformations. In B. M. Thuraisingham, D. Evans, T. Malkin, and D. Xu, editors, Proceedings of the 2017 ACM SIGSAC Conference on Computer and Communications Security, CCS 2017, Dallas, TX, USA, October 30 - November 03, 2017, pages 619--631. ACM, 2017. Google ScholarDigital Library
- H. B. McMahan, D. Ramage, K. Talwar, and L. Zhang. Learning differentially private language models without losing accuracy. arXiv preprint arXiv:1710.06963, 2017.Google Scholar
- P. Mohassel, M. Rosulek, and Y. Zhang. Fast and secure three-party computation: The garbled circuit approach. pages 591--602. Google ScholarDigital Library
- P. Mohassel and Y. Zhang. Secureml: A system for scalable privacy-preserving machine learning. In 2017 IEEE Symposium on Security and Privacy, SP 2017, San Jose, CA, USA, May 22--26, 2017, pages 19--38. IEEE Computer Society, 2017.Google ScholarCross Ref
- M. Naor, B. Pinkas, and R. Sumner. Privacy preserving auctions and mechanism design. In EC, pages 129--139, 1999. Google ScholarDigital Library
- V. Nikolaenko, U. Weinsberg, S. Ioannidis, M. Joye, D. Boneh, and N. Taft. Privacy-preserving ridge regression on hundreds of millions of records. In Security and Privacy (SP), 2013 IEEE Symposium on, pages 334--348. IEEE, 2013. Google ScholarDigital Library
- M. S. Riazi, C. Weinert, O. Tkachenko, E. M. Songhori, T. Schneider, and F. Koushanfar. Chameleon: A hybrid secure computation framework for machine learning applications. Google ScholarDigital Library
- P. Rindal. libOTe: an efficient, portable, and easy to use Oblivious Transfer Library. https://github.com/osu-crypto/libOTe.Google Scholar
- B. D. Rouhani, M. S. Riazi, and F. Koushanfar. Deepsecure: Scalable provably-secure deep learning. arXiv preprint arXiv:1705.08963, 2017.Google Scholar
- A. P. Sanil, A. F. Karr, X. Lin, and J. P. Reiter. Privacy preserving regression modelling via distributed computation. In Proceedings of the tenth ACM SIGKDD international conference on Knowledge discovery and data mining, pages 677--682. ACM, 2004. Google ScholarDigital Library
- R. Shokri and V. Shmatikov. Privacy-preserving deep learning. In Proceedings of the 22nd ACM SIGSAC Conference on Computer and Communications Security, pages 1310--1321. ACM, 2015. Google ScholarDigital Library
- R. Shokri, M. Stronati, C. Song, and V. Shmatikov. Membership inference attacks against machine learning models. In Security and Privacy (SP), 2017 IEEE Symposium on, pages 3--18. IEEE, 2017.Google ScholarCross Ref
- A. B. Slavkovic, Y. Nardi, and M. M. Tibbits. " secure" logistic regression of horizontally and vertically partitioned distributed databases. In Seventh IEEE International Conference on Data Mining Workshops (ICDMW 2007), pages 723--728. IEEE, 2007. Google ScholarDigital Library
- C. Song, T. Ristenpart, and V. Shmatikov. Machine learning models that remember too much. In Proceedings of the 2017 ACM SIGSAC Conference on Computer and Communications Security, pages 587--601. ACM, 2017. Google ScholarDigital Library
- E. M. Songhori, S. U. Hussain, A. Sadeghi, T. Schneider, and F. Koushanfar. Tinygarble: Highly compressed and scalable sequential garbled circuits. In 2015 IEEE Symposium on Security and Privacy, pages 411--428, May 2015. Google ScholarDigital Library
- F. Tramèr, F. Zhang, A. Juels, M. K. Reiter, and T. Ristenpart. Stealing machine learning models via prediction apis. In USENIX Security Symposium, pages 601--618, 2016. Google ScholarDigital Library
- J. Vaidya, H. Yu, and X. Jiang. Privacy-preserving svm classification. Knowledge and Information Systems, 14(2):161--178, 2008. Google ScholarDigital Library
- S. Wu, T. Teruya, J. Kawamoto, J. Sakuma, and H. Kikuchi. Privacy-preservation for stochastic gradient descent application to secure logistic regression. The 27th Annual Conference of the Japanese Society for Artificial Intelligence, 27:1--4, 2013.Google Scholar
- H. Yu, J. Vaidya, and X. Jiang. Privacy-preserving svm classification on vertically partitioned data. In Pacific-Asia Conference on Knowledge Discovery and Data Mining, pages 647--656. Springer, 2006. Google ScholarDigital Library
- S. Zahur, M. Rosulek, and D. Evans. Two halves make a whole - reducing data transfer in garbled circuits using half gates. pages 220--250.Google Scholar
Index Terms
- ABY3: A Mixed Protocol Framework for Machine Learning
Recommendations
An efficient fair UC-secure protocol for two-party computation
With the development of modern Internet and mobile networks, there is an increasing need for collaborative privacy-preserving applications. Secure multi-party computation SMPC gives a general solution to these applications and has become a hot topic. ...
Online Efficient Secure Logistic Regression based on Function Secret Sharing
CIKM '23: Proceedings of the 32nd ACM International Conference on Information and Knowledge ManagementLogistic regression is an algorithm widely used for binary classification in various real-world applications such as fraud detection, medical diagnosis, and recommendation systems. However, training a logistic regression model with data from different ...
Secure Multi-Party Computation without Agreement
It has recently been shown that authenticated Byzantine agreement, in which more than a third of the parties are corrupted, cannot be securely realized under concurrent or parallel (stateless) composition. This result puts into question any usage of ...
Comments