ABSTRACT
We introduce and illustrate a new approach to measuring and mitigating unintended bias in machine learning models. Our definition of unintended bias is parameterized by a test set and a subset of input features. We illustrate how this can be used to evaluate text classifiers using a synthetic test set and a public corpus of comments annotated for toxicity from Wikipedia Talk pages. We also demonstrate how imbalances in training data can lead to unintended bias in the resulting models, and therefore potentially unfair applications. We use a set of common demographic identity terms as the subset of input features on which we measure bias. This technique permits analysis in the common scenario where demographic information on authors and readers is unavailable, so that bias mitigation must focus on the content of the text itself. The mitigation method we introduce is an unsupervised approach based on balancing the training dataset. We demonstrate that this approach reduces the unintended bias without compromising overall model quality.
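To make the measurement concrete, here is a minimal sketch of the kind of per-identity-term evaluation the abstract describes: score a trained classifier on a synthetic, template-generated test set and compare a threshold-agnostic metric (ROC AUC) across identity terms. The `score_fn` interface, the identity terms, and the templates below are illustrative assumptions, not the paper's actual test set or API.

```python
# Sketch of a per-identity-term bias measurement on a synthetic test set.
# The terms, templates, and `score_fn` interface are illustrative assumptions.
from sklearn.metrics import roc_auc_score

IDENTITY_TERMS = ["gay", "straight", "muslim", "christian"]  # illustrative subset

# (template, toxicity label) pairs; each template is instantiated per term.
TEMPLATES = [
    ("I am a {term} person, ask me anything.", 0),
    ("Being {term} is wonderful.", 0),
    ("All {term} people are disgusting.", 1),
    ("I hate every {term} person I meet.", 1),
]

def per_term_auc(score_fn):
    """Return {identity term: ROC AUC over that term's synthetic examples}.

    `score_fn` maps a text to a toxicity score in [0, 1]. A term whose AUC
    falls well below the others indicates the model associates the term
    itself with toxicity, i.e. unintended bias on that input feature.
    """
    results = {}
    for term in IDENTITY_TERMS:
        labels = [label for _, label in TEMPLATES]
        scores = [score_fn(tpl.format(term=term)) for tpl, _ in TEMPLATES]
        results[term] = roc_auc_score(labels, scores)
    return results

# Toy usage: a keyword scorer that has spuriously memorized "gay" as toxic
# scores every "gay" example 0.9, so its AUC collapses to 0.5 (all ties),
# while the unaffected terms stay at 1.0.
biased = lambda t: 0.9 if ("disgusting" in t or "hate" in t or "gay" in t) else 0.1
print(per_term_auc(biased))
```

The mitigation the abstract describes, balancing the training dataset, would then in this spirit amount to adding enough non-toxic training examples containing the low-AUC terms to remove the spurious term-toxicity correlation before retraining.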