ABSTRACT
We demonstrate that, ignoring computational constraints, it is possible to release privacy-preserving databases that are useful for all queries over a discretized domain from any given concept class with polynomial VC-dimension. We show a new lower bound for releasing databases that are useful for halfspace queries over a continuous domain. Despite this, we give a privacy-preserving polynomial time algorithm that releases information useful for all halfspace queries, for a slightly relaxed definition of usefulness. Inspired by learning theory, we introduce a new notion of data privacy, which we call distributional privacy, and show that it is strictly stronger than the prevailing privacy notion, differential privacy.
- M. Anthony and P. Bartlett. Neural Network Learning: Theoretical Foundations. Cambridge University Press, 1999. Google ScholarDigital Library
- M.F. Balcan, A. Blum, and S. Vempala. Kernels as features: On kernels, margins, and low-dimensional mappings. Machine Learning, 65(1):79--94, 2006. Google ScholarDigital Library
- B. Barak, K. Chaudhuri, C. Dwork, S. Kale, F. McSherry, and K. Talwar. Privacy, accuracy, and consistency too: a holistic solution to contingency table release. Proceedings of the twenty-sixth ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems, pages 273--282, 2007. Google ScholarDigital Library
- A. Blum, C. Dwork, F. McSherry, and K. Nissim. Practical privacy: the SuLQ framework. Proceedings of the twenty-fourth ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems, pages 128--138, 2005. Google ScholarDigital Library
- S. Dasgupta and A. Gupta. An elementary proof of the Johnson-Lindenstrauss Lemma. International Computer Science Institute, Technical Report, pages 99--006, 1999.Google Scholar
- I. Dinur and K. Nissim. Revealing information while preserving privacy. Proceedings of the twenty-second ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems, pages 202--210, 2003. Google ScholarDigital Library
- C. Dwork. Differential privacy. Proc. ICALP, 2006. Google ScholarDigital Library
- C. Dwork, K. Kenthapadi, F. McSherry, I. Mironov, and M. Naor. Our Data, Ourselves: Privacy via Distributed Noise Generation. Proceedings of Advances in CryptologyEurocrypt 2006, pages 486--503, 2006. Google ScholarDigital Library
- C. Dwork, F. McSherry, K. Nissim, and A. Smith. Calibrating noise to sensitivity in private data analysis. Proceedings of the 3rd Theory of Cryptography Conference, pages 265--284, 2006. Google ScholarDigital Library
- C. Dwork, F. McSherry, and K. Talwar. The price of privacy and the limits of LP decoding. Proceedings of the thirty-ninth annual ACM symposium on Theory of computing, pages 85--94, 2007. Google ScholarDigital Library
- C. Dwork and K. Nissim. Privacy-preserving datamining on vertically partitioned databases. Proc. CRYPTO, pages 528--544, 2004.Google ScholarCross Ref
- Alexandre Evfimievski, Johannes Gehrke, and Ramakrishnan Srikant. Limiting privacy breaches in privacy preserving data mining. In PODS '03: Proceedings of the twenty-second ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems, pages 211--222, New York, NY, USA, 2003. ACM. Google ScholarDigital Library
- P. Indyk and R. Motwani. Approximate nearest neighbors: towards removing the curse of dimensionality. Proceedings of the thirtieth annual ACM symposium on Theory of computing, pages 604--613, 1998. Google ScholarDigital Library
- Shiva Kasiviswanathan, Homin K. Lee, Kobbi Nissim, Sofya Raskhodnikova, and Adam Smith. What can we learn privately? http://arxiv.org/abs/0803.0924v1.Google Scholar
- F. McSherry and K. Talwar. Mechanism Design via Differential Privacy. Proceedings of the 48th Annual Symposium on Foundations of Computer Science, 2007. Google ScholarDigital Library
- K. Nissim, S. Raskhodnikova, and A. Smith. Smooth sensitivity and sampling in private data analysis. Proceedings of the thirty-ninth annual ACM symposium on Theory of computing, pages 75--84, 2007. Google ScholarDigital Library
- V. Rastogi, D. Suciu, and S. Hong. The Boundary Between Privacy and Utility in Data Publishing. VLDB, 2007. Google ScholarDigital Library
- A. J. Smola and B. Scholkopf. Learning with Kernels. MIT Press, 2002.Google Scholar
- V. N. Vapnik. Statistical Learning Theory. John Wiley and Sons Inc., 1998. Google ScholarDigital Library
Index Terms
- A learning theory approach to non-interactive database privacy
Recommendations
A learning theory approach to noninteractive database privacy
In this article, we demonstrate that, ignoring computational constraints, it is possible to release synthetic databases that are useful for accurately answering large classes of queries while preserving differential privacy. Specifically, we give a ...
Database privacy: balancing confidentiality, integrity and availability
The emphasis in database privacy should fall on a balance between confidentiality, integrity and availability of personal data, rather than on confidentiality alone. This balance should not necessarily be a trade-off, but should take into account the ...
Privacy-preserving deletion to generalization-based anonymous database
CUBE '12: Proceedings of the CUBE International Information Technology ConferenceWhile creating an anonymous database it is assumed that all data is available at the time of creation. Once record is added to database, it is not deleted or if a user wants to delete person's record from database, it will be removed from it in its next ...
Comments