skip to main content
10.1145/1065167.1065184acmconferencesArticle/Chapter ViewAbstractPublication PagespodsConference Proceedingsconference-collections
Article

Practical privacy: the SuLQ framework

Published:13 June 2005Publication History

ABSTRACT

We consider a statistical database in which a trusted administrator introduces noise to the query responses with the goal of maintaining privacy of individual database entries. In such a database, a query consists of a pair (S, f) where S is a set of rows in the database and f is a function mapping database rows to {0, 1}. The true answer is ΣiεS f(di), and a noisy version is released as the response to the query. Results of Dinur, Dwork, and Nissim show that a strong form of privacy can be maintained using a surprisingly small amount of noise -- much less than the sampling error -- provided the total number of queries is sublinear in the number of database rows. We call this query and (slightly) noisy reply the SuLQ (Sub-Linear Queries) primitive. The assumption of sublinearity becomes reasonable as databases grow increasingly large.We extend this work in two ways. First, we modify the privacy analysis to real-valued functions f and arbitrary row types, as a consequence greatly improving the bounds on noise required for privacy. Second, we examine the computational power of the SuLQ primitive. We show that it is very powerful indeed, in that slightly noisy versions of the following computations can be carried out with very few invocations of the primitive: principal component analysis, k means clustering, the Perceptron Algorithm, the ID3 algorithm, and (apparently!) all algorithms that operate in the in the statistical query learning model [11].

References

  1. D. Agrawal and C. C. Aggarwal. On the design and quantification of privacy preserving data mining algorithms. In Proc. Twentieth ACM SIGACT-SIGMOD-SIGART Symposium on Principles of Database Systems, 2001.]] Google ScholarGoogle ScholarDigital LibraryDigital Library
  2. R. Agrawal and R. Srikant. Privacy-preserving data mining. In Proc. ACM SIGMOD Conference on Management of Data, pages 439--450. ACM Press, 2000.]] Google ScholarGoogle ScholarDigital LibraryDigital Library
  3. N. R. Adam and J. C. Wortmann, Security-Control Methods for Statistical Databases: A Comparative Study, ACM Computing Surveys 21(4), pp. 515--556, 1989.]] Google ScholarGoogle ScholarDigital LibraryDigital Library
  4. S. Chawla, C. Dwork, F. McSherry, A. Smith and H. Wee, Toward Privacy in Public Databases, TCC 2005.]] Google ScholarGoogle ScholarDigital LibraryDigital Library
  5. D.E. Denning. Secure statistical database with random sample queries. ACM Trans. Database Syst. 5, 3(Sept.), 291--315, 1980.]] Google ScholarGoogle ScholarDigital LibraryDigital Library
  6. I. Dinur and K. Nissim, Revealing information while preserving privacy, Proceedings of the Twenty-Second ACM SIGACT-SIGMOD-SIGART Symposium on Principles of Database Systems, pp. 202--210, 2003.]] Google ScholarGoogle ScholarDigital LibraryDigital Library
  7. C. Dwork and N. Nissim, Privacy-Preserving Datamining on Vertically Partitioned Databases, Proceedings of CRYPTO 2004, pp. 528--544, 2004.]]Google ScholarGoogle ScholarCross RefCross Ref
  8. A. Evfimievsky, J. Gehrke, and R. Srikant, Limiting privacy breaches in privacy preserving data mining. In Proc. Twenty-Second ACM SIGACT-SIGMOD-SIGART Symposium on Principles of Database Systems, pages 211--222, 2003.]] Google ScholarGoogle ScholarDigital LibraryDigital Library
  9. J. Feigenbaum, Y. Ishai, T. Malkin, K. Nissim, M. Strauss, and R. N. Wright. Secure multiparty computation of approximations. In Proc. 28th International Colloquium on Automata, Languages and Programming, pages 927--938. Springer-Verlag, 2001.]] Google ScholarGoogle ScholarDigital LibraryDigital Library
  10. S. Halevi, E. Kushilevitz, R. Krauthgamer, and K. Nissim. Private approximations of np-hard functions. In Proc. 33th Annual ACM Symposium on the Theory of Computing, pages 550--559, 2001.]] Google ScholarGoogle ScholarDigital LibraryDigital Library
  11. M. Kearns, Efficient Noise-Tolerant Learning from Statistical Queries, JACM 45(6), pp. 983--1006, 1998. See also Proc. 25th ACM STOC, pp. 392--401, 1993]] Google ScholarGoogle ScholarDigital LibraryDigital Library
  12. P. Kocher, Timing Attacks on Implementations of Diffie-Hellman, RSA, DSS, and Other Systems, CRYPTO 1996, Lecture Notes in Computer Science, Volume 1109, Jan 1996, Page 104.]] Google ScholarGoogle ScholarDigital LibraryDigital Library
  13. Y. Lindell and B. Pinkas. Privacy preserving data mining. J. Cryptology, 15(3):177--206, 2002. An earlier version appeared in Proc. Crypto 2000.]] Google ScholarGoogle ScholarDigital LibraryDigital Library
  14. E. Leiss. Randomizing a practical method for protecting statistical databases against compromise. In Proceedings of the 8th Conference on Very Large Databases, pp. 189--196.]] Google ScholarGoogle ScholarDigital LibraryDigital Library
  15. M. J. O'Connel, Search Program for Significant Variables, Comp. Phys. Comm. 8, 1974.]]Google ScholarGoogle Scholar
  16. D. MacKay Information Theory, Inference, and Learning Algorithms Cambridge University Press, 2003.]] Google ScholarGoogle ScholarDigital LibraryDigital Library
  17. J. R. Quinlan, Induction of Decision Trees, Machine learning 1(1), 1986, pp. 81--106.]] Google ScholarGoogle ScholarDigital LibraryDigital Library

Recommendations

Comments

Login options

Check if you have access through your login credentials or your institution to get full access on this article.

Sign in
  • Published in

    cover image ACM Conferences
    PODS '05: Proceedings of the twenty-fourth ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems
    June 2005
    388 pages
    ISBN:1595930620
    DOI:10.1145/1065167
    • General Chair:
    • Georg Gottlob,
    • Program Chair:
    • Foto Afrati

    Copyright © 2005 ACM

    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Publication History

    • Published: 13 June 2005

    Permissions

    Request permissions about this article.

    Request Permissions

    Check for updates

    Qualifiers

    • Article

    Acceptance Rates

    Overall Acceptance Rate642of2,707submissions,24%

PDF Format

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader