ABSTRACT
As machine learning becomes increasingly incorporated within high-impact decision ecosystems, there is a growing need to understand the long-term behaviors of deployed ML-based decision systems and their potential consequences. Most approaches to understanding or improving the fairness of these systems have focused on static settings without considering long-term dynamics. This is understandable; long-term dynamics are hard to assess, particularly because they do not align with the traditional supervised ML research framework that uses fixed data sets. To address this structural difficulty in the field, we advocate for the use of simulation as a key tool in studying the fairness of algorithms. We explore three toy examples of dynamical systems that have been previously studied in the context of fair decision making for bank loans, college admissions, and allocation of attention. By analyzing how learning agents interact with these systems in simulation, we are able to extend previous work, showing that static or single-step analyses do not give a complete picture of the long-term consequences of an ML-based decision system. We provide an extensible open-source software framework for implementing fairness-focused simulation studies and furthering reproducible research, available at https://github.com/google/ml-fairness-gym.
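The kind of feedback loop the abstract describes can be sketched in a few lines of Python. The example below is a hypothetical toy, not the ml-fairness-gym API: the `LendingEnv` class, its score-update rule, and the fixed-threshold agent are all assumptions made for illustration. It shows how a threshold policy that looks defensible in a single step can permanently freeze one group's outcomes once the environment's dynamics are simulated forward.

```python
import random


class LendingEnv:
    """Toy two-group lending environment (hypothetical; not the ml-fairness-gym API).

    Each group is summarized by a credit score in [0, 1]. Approved applicants
    repay with probability equal to their group's score; repayment nudges the
    score up, default pushes it down -- a simple feedback dynamic.
    """

    def __init__(self, scores=(0.6, 0.4), seed=0):
        self.scores = list(scores)
        self.rng = random.Random(seed)

    def step(self, approvals):
        """Apply one round of per-group approval decisions."""
        for g, approve in enumerate(approvals):
            if not approve:
                continue  # denied applicants leave the score unchanged
            repaid = self.rng.random() < self.scores[g]
            delta = 0.01 if repaid else -0.02
            self.scores[g] = min(1.0, max(0.0, self.scores[g] + delta))


def simulate(env, threshold=0.5, steps=200):
    """Run a fixed-threshold lending agent and record score trajectories."""
    history = [list(env.scores)]
    for _ in range(steps):
        approvals = [s >= threshold for s in env.scores]
        env.step(approvals)
        history.append(list(env.scores))
    return history


history = simulate(LendingEnv())
print("initial scores:", history[0])
print("final scores:  ", history[-1])
```

In a single-step analysis the threshold of 0.5 simply sorts applicants by repayment probability; in simulation, the group starting below the threshold is never approved, so its score never moves and the gap persists indefinitely. The trajectory in `history` makes that long-run behavior visible in a way no one-shot metric does.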
Fairness is not static: deeper understanding of long term fairness via simulation studies