ABSTRACT
Recommender systems are among the most pervasive applications of machine learning in industry, with many services using them to match users to products or information. As such, it is important to ask: what are the possible fairness risks, how can we quantify them, and how should we address them? In this paper, we offer a set of novel metrics for evaluating algorithmic fairness concerns in recommender systems. In particular, we show how measuring fairness based on pairwise comparisons from randomized experiments provides a tractable means to reason about fairness in rankings from recommender systems. Building on this metric, we offer a new regularizer that encourages improvement of the metric during model training and thus improves fairness in the resulting rankings. We apply this pairwise regularization to a large-scale production recommender system and show that we are able to significantly improve the system's pairwise fairness.
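To make the idea of a pairwise comparison metric concrete, the following is a minimal sketch and not the paper's exact definition. It assumes each logged pair from a randomized experiment records the group of the item the user preferred and the model scores of both items. The function names, the tuple layout, and the toy data are all hypothetical. Per-group pairwise accuracy is the fraction of pairs in which the model scored the preferred item higher, and the gap between groups is one simple way to summarize disparity.

```python
from collections import defaultdict


def pairwise_accuracy_by_group(pairs):
    """Fraction of pairs, per group, where the preferred item got the higher score.

    `pairs` is an iterable of (group, score_preferred, score_other) tuples.
    This layout is an assumption made for illustration only.
    """
    correct = defaultdict(int)
    total = defaultdict(int)
    for group, score_preferred, score_other in pairs:
        total[group] += 1
        if score_preferred > score_other:
            correct[group] += 1
    return {group: correct[group] / total[group] for group in total}


def pairwise_fairness_gap(pairs):
    """Largest difference in pairwise accuracy between any two groups."""
    accuracy = pairwise_accuracy_by_group(pairs)
    return max(accuracy.values()) - min(accuracy.values())


# Toy data (hypothetical): group of the preferred item, then the two model scores.
pairs = [
    ("A", 0.9, 0.2),
    ("A", 0.7, 0.4),
    ("A", 0.3, 0.6),
    ("A", 0.8, 0.1),
    ("B", 0.4, 0.5),
    ("B", 0.6, 0.3),
    ("B", 0.2, 0.7),
    ("B", 0.1, 0.9),
]

print(pairwise_accuracy_by_group(pairs))
print(pairwise_fairness_gap(pairs))
```

In this toy data the model orders three of four pairs correctly for group A but only one of four for group B, so the gap is large. A training-time regularizer of the kind the abstract describes would aim to shrink such a gap, though its exact form is not given here.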
Index Terms
- Fairness in Recommendation Ranking through Pairwise Comparisons