ABSTRACT
We consider the problem of producing recommendations from collective user behavior while simultaneously providing guarantees of privacy for these users. Specifically, we consider the Netflix Prize data set, and its leading algorithms, adapted to the framework of differential privacy.
Unlike prior privacy work concerned with cryptographically securing the computation of recommendations, differential privacy constrains a computation in a way that precludes any inference about the underlying records from its output. Such algorithms necessarily introduce uncertainty--i.e., noise--to computations, trading accuracy for privacy.
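The standard way such calibrated noise is introduced is the Laplace mechanism of Dwork, McSherry, Nissim, and Smith: perturb a numeric query result with Laplace noise whose scale is the query's per-record sensitivity divided by the privacy parameter epsilon. The sketch below is a minimal illustration of that mechanism, not code from the paper; the function name and interface are our own.

```python
import numpy as np

def laplace_mechanism(true_value, sensitivity, epsilon, rng=None):
    """Release true_value with epsilon-differential privacy.

    Adds Laplace noise with scale sensitivity/epsilon, where `sensitivity`
    bounds how much adding or removing one user's record can change
    true_value. Smaller epsilon means stronger privacy and more noise.
    """
    rng = np.random.default_rng() if rng is None else rng
    return true_value + rng.laplace(loc=0.0, scale=sensitivity / epsilon)
```

Because the noise has mean zero, aggregate statistics over many records remain accurate in expectation, while any single record's influence is masked.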
We find that several of the leading approaches in the Netflix Prize competition can be adapted to provide differential privacy, without significantly degrading their accuracy. To adapt these algorithms, we explicitly factor them into two parts: an aggregation/learning phase that can be performed with differential privacy guarantees, and an individual recommendation phase that uses the learned correlations and an individual's data to provide personalized recommendations. The adaptations are non-trivial, involving both careful analysis of the per-record sensitivity of the algorithms to calibrate noise and new post-processing steps to mitigate the impact of this noise.
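As a concrete example of the aggregation phase, consider releasing per-movie average ratings. The sketch below is a hypothetical illustration under assumed conventions (ratings clamped to [1, 5], the privacy budget split evenly between a noisy sum and a noisy count); it is not the paper's implementation. Clamping the ratings bounds the per-record sensitivity of the sum, and clamping the noisy average back into the valid range is an example of a post-processing step that reduces the impact of the noise without affecting the privacy guarantee.

```python
import numpy as np

def private_item_averages(ratings_by_item, epsilon, lo=1.0, hi=5.0, rng=None):
    """Per-item average ratings with epsilon-differential privacy (sketch).

    ratings_by_item: dict mapping item id -> list of that item's ratings.
    Each rating is clamped to [lo, hi], so one record changes an item's
    sum by at most `hi` and its count by at most 1; Laplace noise is
    calibrated to those sensitivities, spending epsilon/2 on each query.
    """
    rng = np.random.default_rng() if rng is None else rng
    out = {}
    for item, rs in ratings_by_item.items():
        rs = np.clip(np.asarray(rs, dtype=float), lo, hi)
        noisy_sum = rs.sum() + rng.laplace(scale=hi / (epsilon / 2))
        noisy_cnt = len(rs) + rng.laplace(scale=1.0 / (epsilon / 2))
        # Post-processing: guard against a near-zero noisy count and clamp
        # the noisy average back into the valid rating range.
        out[item] = float(np.clip(noisy_sum / max(noisy_cnt, 1.0), lo, hi))
    return out
```

For items with many ratings the noise is negligible relative to the true sum and count, so the released averages stay close to the exact ones; for rarely rated items the noise dominates, which is precisely the accuracy-for-privacy trade-off measured in the paper.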
We measure the empirical trade-off between accuracy and privacy in these adaptations, and find that we can provide non-trivial formal privacy guarantees while still outperforming the Cinematch baseline Netflix provides.
Index Terms
- Differentially private recommender systems: Building privacy into the Netflix Prize contenders