ABSTRACT
Classification models are often used to make decisions that affect humans: whether to approve a loan application, extend a job offer, or provide insurance. In such applications, individuals should have the ability to change the decision of the model. When a person is denied a loan by a credit scoring model, for example, they should be able to change the input variables of the model in a way that will guarantee approval. Otherwise, this person will be denied the loan so long as the model is deployed and, more importantly, will lack agency over a decision that affects their livelihood.
In this paper, we propose to evaluate a linear classification model in terms of recourse, which we define as the ability of a person to change the decision of the model through actionable input variables (e.g., income, as opposed to age or marital status). We present an integer programming toolkit to: (i) measure the feasibility and difficulty of recourse in a target population; and (ii) generate a list of actionable changes for a person to obtain a desired outcome. We discuss how our tools can inform different stakeholders by using them to audit recourse for credit scoring models built with real-world datasets. Our results illustrate how recourse can be significantly affected by common modeling practices, and motivate the need to evaluate recourse in algorithmic decision-making.
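To make the notion of recourse concrete, the sketch below searches for the cheapest change to a person's actionable features that flips a linear classifier's decision. It is a brute-force enumeration over a small discrete grid, not the integer program described in the paper. The function name `find_recourse`, the feature layout (income, open debts, age), the weights, and the total-absolute-change cost are all illustrative assumptions.

```python
from itertools import product


def find_recourse(weights, intercept, x, action_grid):
    """Return the cheapest action (tuple of per-feature changes) that makes
    intercept + w . (x + a) >= 0, or None if no action in the grid works.

    action_grid[j] lists the allowed changes to feature j; an immutable
    feature gets [0]. The cost (sum of absolute changes) is an assumption
    made for illustration only.
    """
    best_action, best_cost = None, float("inf")
    for action in product(*action_grid):
        score = intercept + sum(
            w * (xi + ai) for w, xi, ai in zip(weights, x, action)
        )
        cost = sum(abs(ai) for ai in action)
        if score >= 0 and cost < best_cost:
            best_action, best_cost = action, cost
    return best_action


# Hypothetical model and applicant: features are (income, open debts, age).
weights = [2, -3, 1]
intercept = -100
applicant = [30, 2, 40]  # score = 60 - 6 + 40 - 100 = -6, so denied

# Income may rise by up to 3 units, debts may drop by up to 2, age is fixed.
action_grid = [[0, 1, 2, 3], [0, -1, -2], [0]]

action = find_recourse(weights, intercept, applicant, action_grid)
```

Here the cheapest feasible action closes two debts and leaves income unchanged. If every feature is immutable (each grid entry is `[0]`), the function returns `None`, which corresponds to a person with no recourse under this model.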