ABSTRACT
Consider an analyst who wants to release aggregate statistics about a data set containing sensitive information. Using differentially private algorithms guarantees that the released statistics reveal very little about any particular record in the data set. In this paper we study the asymptotic properties of differentially private algorithms for statistical inference.
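To make the privacy guarantee concrete, the following is a minimal sketch of the classical Laplace mechanism for releasing a mean: noise is calibrated to the sensitivity of the statistic, i.e., the most any single record can change it. The function name and the clipping range [lo, hi] are illustrative assumptions, not part of the paper.

```python
import numpy as np

def laplace_mean(data, lo, hi, eps):
    """Release the mean of `data` (values clipped to [lo, hi]) with
    eps-differential privacy via the Laplace mechanism.

    Changing one record moves the clipped mean by at most (hi - lo)/n,
    so Laplace noise with scale (hi - lo)/(n * eps) suffices."""
    x = np.clip(np.asarray(data, dtype=float), lo, hi)
    sensitivity = (hi - lo) / len(x)
    return float(x.mean() + np.random.laplace(scale=sensitivity / eps))
```

For a fixed privacy budget eps, the noise scale shrinks like 1/n, which is why aggregate statistics over large samples can be released accurately while still protecting individual records.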
We show that for a large class of statistical estimators T and input distributions P, there is a differentially private estimator A_T with the same asymptotic distribution as T. That is, the random variables A_T(X) and T(X) converge in distribution when X consists of an i.i.d. sample from P of increasing size. This implies that A_T(X) is essentially as good as the original statistic T(X) for statistical inference, for sufficiently large samples. Our technique applies to (almost) any pair T, P such that T is asymptotically normal on i.i.d. samples from P; in particular, to parametric maximum likelihood estimators and estimators for logistic and linear regression under standard regularity conditions.
A consequence of our techniques is the existence of low-space streaming algorithms whose output converges to the same asymptotic distribution as a given estimator T (for the same class of estimators and input distributions as above).
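One standard route to private estimators of this kind is the subsample-and-aggregate paradigm: run T on disjoint blocks of the sample and privately aggregate the block estimates. The sketch below uses a clipped average with Laplace noise as the aggregator; this is an illustrative simplification under the assumption that T's output lies in a known bounded range [lo, hi], not the paper's exact aggregation step.

```python
import numpy as np

def subsample_and_aggregate(data, T, k, lo, hi, eps, rng=None):
    """eps-DP estimate of T(data): split the sample into k disjoint
    blocks, apply T to each block, clip the k estimates to [lo, hi],
    average them, and add Laplace noise.

    A single record affects only one block, so the clipped average has
    sensitivity (hi - lo)/k. Choosing k to grow with n (but with
    k = o(n)) makes the added noise vanish faster than T's sampling
    error, so the private output can track T's asymptotic distribution."""
    rng = rng or np.random.default_rng()
    blocks = np.array_split(np.asarray(data, dtype=float), k)
    estimates = np.clip([T(b) for b in blocks], lo, hi)
    return float(np.mean(estimates) + rng.laplace(scale=(hi - lo) / (k * eps)))
```

Because each block is processed independently and only the k block estimates are retained, the same template also suggests why low-space streaming analogues exist: the blocks can be consumed one at a time.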
Index Terms
- Privacy-preserving statistical estimation with optimal convergence rates