DOI: 10.1145/3278721.3278742

Fair Forests: Regularized Tree Induction to Minimize Model Bias

Published: 27 December 2018

ABSTRACT

The potential lack of fairness in the outputs of machine learning algorithms has recently gained attention both within the research community and in society more broadly. Surprisingly, there is no prior work developing tree-induction algorithms for building fair decision trees or fair random forests. These methods are widely popular, as they are among the few that are simultaneously interpretable, non-linear, and easy to use. In this paper we develop, to our knowledge, the first technique for the induction of fair decision trees. We show that our "Fair Forest" retains the benefits of the tree-based approach while providing both greater accuracy and fairness than other alternatives, for both "group fairness" and "individual fairness." We also introduce new measures of fairness that can handle multinomial and continuous attributes as well as regression problems, as opposed to only binary attributes and labels. Finally, we demonstrate a new, more robust evaluation procedure for algorithms that considers the dataset in its entirety rather than only a specific protected attribute.
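The abstract describes regularizing tree induction so that splits which reveal a protected attribute are penalized. As a rough illustration of that idea only (not the paper's exact criterion), the sketch below scores a candidate split by its information gain on the label minus its information gain on the protected attribute; the function names (`fair_gain`, `info_gain`) and the simple subtractive combination are assumptions made for this example.

```python
# Illustrative sketch only: one plausible fairness-regularized split criterion.
# It rewards splits that are informative about the label y and penalizes
# splits that are informative about a protected attribute s. The subtractive
# combination and all names here are assumptions, not the paper's definitive
# formulation.
import numpy as np

def entropy(values):
    """Shannon entropy (in bits) of a discrete array."""
    _, counts = np.unique(values, return_counts=True)
    p = counts / counts.sum()
    return float(-np.sum(p * np.log2(p)))

def info_gain(target, mask):
    """Information gain about `target` from a boolean split `mask`."""
    n = len(target)
    left, right = target[mask], target[~mask]
    if len(left) == 0 or len(right) == 0:
        return 0.0  # degenerate split carries no information
    child = (len(left) / n) * entropy(left) + (len(right) / n) * entropy(right)
    return entropy(target) - child

def fair_gain(y, s, mask):
    """Score a split: label gain minus protected-attribute gain, so splits
    that separate protected groups are discouraged during induction."""
    return info_gain(y, mask) - info_gain(s, mask)

# Toy usage: pick the threshold on a feature x with the best fair gain.
y = np.array([0, 0, 1, 1, 1, 0])   # labels
s = np.array([0, 1, 0, 1, 0, 1])   # protected attribute
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
best = max(((t, fair_gain(y, s, x > t)) for t in x[:-1]), key=lambda kv: kv[1])
print("best threshold, fair gain:", best)
```

Under this reading, a split that perfectly separates the classes but also perfectly separates the protected groups scores zero, while a split equally informative about the label and uninformative about the protected attribute keeps its full gain.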


Published in

          AIES '18: Proceedings of the 2018 AAAI/ACM Conference on AI, Ethics, and Society
          December 2018
          406 pages
ISBN: 9781450360128
DOI: 10.1145/3278721

          Copyright © 2018 ACM

          Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

          Publisher

          Association for Computing Machinery

          New York, NY, United States



          Qualifiers

          • research-article

          Acceptance Rates

AIES '18 Paper Acceptance Rate: 61 of 162 submissions, 38%
