skip to main content
10.1145/1557019.1557090acmconferencesArticle/Chapter ViewAbstractPublication PageskddConference Proceedingsconference-collections
research-article

Differentially private recommender systems: Building privacy into the Netflix Prize contenders

Published:28 June 2009Publication History

ABSTRACT

We consider the problem of producing recommendations from collective user behavior while simultaneously providing guarantees of privacy for these users. Specifically, we consider the Netflix Prize data set, and its leading algorithms, adapted to the framework of differential privacy.

Unlike prior privacy work concerned with cryptographically securing the computation of recommendations, differential privacy constrains a computation in a way that precludes any inference about the underlying records from its output. Such algorithms necessarily introduce uncertainty--i.e., noise--to computations, trading accuracy for privacy.

We find that several of the leading approaches in the Netflix Prize competition can be adapted to provide differential privacy, without significantly degrading their accuracy. To adapt these algorithms, we explicitly factor them into two parts, an aggregation/learning phase that can be performed with differential privacy guarantees, and an individual recommendation phase that uses the learned correlations and an individual's data to provide personalized recommendations. The adaptations are non-trivial, and involve both careful analysis of the per-record sensitivity of the algorithms to calibrate noise, as well as new post-processing steps to mitigate the impact of this noise.

We measure the empirical trade-off between accuracy and privacy in these adaptations, and find that we can provide non-trivial formal privacy guarantees while still outperforming the Cinematch baseline Netflix provides.

Skip Supplemental Material Section

Supplemental Material

p627-mcsherry.mp4

mp4

91.7 MB

References

  1. C. C. Aggarwal. On k-anonymity and the curse of dimensionality. In K. Bohm, C. S. Jensen, L. M. Haas, M. L. Kersten, P.-A. Larson, and B. C. Ooi, editors, VLDB, pages 901--909. ACM, 2005. Google ScholarGoogle ScholarDigital LibraryDigital Library
  2. E. Aimeur, G. Brassard, J. M. Fernandez, and F. S. M. Onana. Alambic: a privacy-preserving recommender system for electronic commerce. Int. J. Information Security, 7(5):307--334, 2008. Google ScholarGoogle ScholarDigital LibraryDigital Library
  3. E. Aimeur, G. Brassard, J. M. Fernandez, F. S. M. Onana, and Z. Rakowski. Experimental demonstration of a hybrid privacy-preserving recommender system. In ARES, pages 161--170. IEEE Computer Society, 2008. Google ScholarGoogle ScholarDigital LibraryDigital Library
  4. R. M. Bell and Y. Koren. Scalable collaborative filtering with jointly derived neighborhood interpolation weights. In ICDM, pages 43--52. IEEE Computer Society, 2007. Google ScholarGoogle ScholarDigital LibraryDigital Library
  5. R. M. Bell, Y. Koren, and C. Volinsky. The BellKor solution to the Netflix Prize. Available from http://www.netflixprize.com, 2007.Google ScholarGoogle Scholar
  6. R. M. Bell, Y. Koren, and C. Volinsky. The BellKor 2008 solution to the Netflix Prize. Available from http://www.netflixprize.com, 2008.Google ScholarGoogle Scholar
  7. A. Blum, C. Dwork, F. McSherry, and K. Nissim. Practical privacy: the SuLQ framework. In C. Li, editor, PODS, pages 128--138. ACM, 2005. Google ScholarGoogle ScholarDigital LibraryDigital Library
  8. J. Brickell and V. Shmatikov. The cost of privacy: destruction of data-mining utility in anonymized data publishing. In Li et al. {20}, pages 70--78. Google ScholarGoogle ScholarDigital LibraryDigital Library
  9. J. Calandrino, A. Narayanan, E. Felten, and V. Shmatikov. Don't review that book: Privacy risks of collaborative filtering. Manuscript, 2009.Google ScholarGoogle Scholar
  10. J. F. Canny. Collaborative filtering with privacy. In IEEE Symposium on Security and Privacy, pages 45--57, 2002. Google ScholarGoogle ScholarDigital LibraryDigital Library
  11. J. F. Canny. Collaborative filtering with privacy via factor analysis. In SIGIR, pages 238--245. ACM, 2002. Google ScholarGoogle ScholarDigital LibraryDigital Library
  12. A. Dasgupta, J. E. Hopcroft, and F. McSherry. Spectral analysis of random graphs with skewed degree distributions. In FOCS, pages 602--610. IEEE Computer Society, 2004. Google ScholarGoogle ScholarDigital LibraryDigital Library
  13. C. Dwork. Differential privacy. Invited talk. In Automata, Languages and Programming--ICALP (2), volume 4052 of Lecture Notes in Computer Science, pages 1--12. Springer, 2006. Google ScholarGoogle ScholarDigital LibraryDigital Library
  14. C. Dwork. An ad omnia approach to defining and achieving private data analysis. In F. Bonchi, E. Ferrari, B. Malin, and Y. Saygin, editors, PinKDD, volume 4890 of Lecture Notes in Computer Science, pages 1--13. Springer, 2007. Google ScholarGoogle ScholarDigital LibraryDigital Library
  15. C. Dwork. Differential privacy: A survey of results. In M. Agrawal, D.-Z. Du, Z. Duan, and A. Li, editors, Theory and Applications of Models of Computation, 5th International Conference, TAMC 2008, Xi'an, China, April 25-29, 2008. Proceedings, volume 4978 of Lecture Notes in Computer Science, pages 1--19. Springer, 2008. Google ScholarGoogle ScholarDigital LibraryDigital Library
  16. C. Dwork, K. Kenthapadi, F. McSherry, I. Mironov, and M. Naor. Our data, ourselves: Privacy via distributed noise generation. In S. Vaudenay, editor, Advances in Cryptology--EUROCRYPT 2006, volume 4004 of Lecture Notes in Computer Science, pages 486--503. Springer, May 2006. Google ScholarGoogle ScholarDigital LibraryDigital Library
  17. C. Dwork, F. McSherry, K. Nissim, and A. Smith. Calibrating noise to sensitivity in private data analysis. In S. Halevi and T. Rabin, editors, Theory of Cryptography Conference--TCC 2006, volume 3876 of Lecture Notes in Computer Science, pages 265--284. Springer, 2006. Google ScholarGoogle ScholarDigital LibraryDigital Library
  18. S. R. Ganta, S. P. Kasiviswanathan, and A. Smith. Composition attacks and auxiliary information in data privacy. In Li et al. {20}, pages 265--273. Google ScholarGoogle ScholarDigital LibraryDigital Library
  19. Y. Koren. Factorization meets the neighborhood: a multifaceted collaborative filtering model. In Li et al. {20}, pages 426--434. Google ScholarGoogle ScholarDigital LibraryDigital Library
  20. Y. Li, B. Liu, and S. Sarawagi, editors. Proceedings of the 14th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, Las Vegas, Nevada, USA, August 24-27, 2008. ACM, 2008.Google ScholarGoogle ScholarDigital LibraryDigital Library
  21. A. Machanavajjhala, D. Kifer, J. M. Abowd, J. Gehrke, and L. Vilhuber. Privacy: Theory meets practice on the map. In ICDE, pages 277--286. IEEE, 2008. Google ScholarGoogle ScholarDigital LibraryDigital Library
  22. A. Narayanan and V. Shmatikov. Robust de-anonymization of large sparse datasets. In IEEE Symposium on Security and Privacy, pages 111--125. IEEE Computer Society, 2008. Google ScholarGoogle ScholarDigital LibraryDigital Library
  23. L. Sweeney. k-anonymity: A model for protecting privacy. International Journal of Uncertainty, Fuzziness and Knowledge-Based Systems, 10(5):557--570, 2002. Google ScholarGoogle ScholarDigital LibraryDigital Library

Index Terms

  1. Differentially private recommender systems: Building privacy into the Netflix Prize contenders

        Recommendations

        Comments

        Login options

        Check if you have access through your login credentials or your institution to get full access on this article.

        Sign in
        • Published in

          cover image ACM Conferences
          KDD '09: Proceedings of the 15th ACM SIGKDD international conference on Knowledge discovery and data mining
          June 2009
          1426 pages
          ISBN:9781605584959
          DOI:10.1145/1557019

          Copyright © 2009 ACM

          Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

          Publisher

          Association for Computing Machinery

          New York, NY, United States

          Publication History

          • Published: 28 June 2009

          Permissions

          Request permissions about this article.

          Request Permissions

          Check for updates

          Qualifiers

          • research-article

          Acceptance Rates

          Overall Acceptance Rate1,133of8,635submissions,13%

          Upcoming Conference

          KDD '24

        PDF Format

        View or Download as a PDF file.

        PDF

        eReader

        View online with eReader.

        eReader