
2023 | Original Paper | Book Chapter

An Investigation of Structures Responsible for Gender Bias in BERT and DistilBERT

Authors: Thibaud Leteno, Antoine Gourru, Charlotte Laclau, Christophe Gravier

Published in: Advances in Intelligent Data Analysis XXI

Publisher: Springer Nature Switzerland


Abstract

In recent years, large Transformer-based Pre-trained Language Models (PLMs) have reshaped the Natural Language Processing (NLP) landscape by pushing the state-of-the-art performance boundaries on a wide variety of tasks. However, this performance gain comes with an increase in complexity, and the size of such models (up to billions of parameters) constrains their deployment on embedded devices or in tasks requiring short inference times. To cope with this situation, compressed models such as DistilBERT have emerged, democratizing their usage in a growing number of applications that impact our daily lives. A crucial issue is the fairness of the predictions made by both PLMs and their distilled counterparts. In this paper, we propose an empirical exploration of this problem by formalizing two questions: (1) Can we identify the neural mechanism(s) responsible for gender bias in BERT (and, by extension, DistilBERT)? (2) Does distillation tend to accentuate or mitigate gender bias (e.g., is DistilBERT more prone to gender bias than its uncompressed counterpart, BERT)? Our findings are the following: (I) no single layer can be identified as producing the bias; (II) every attention head encodes bias uniformly, except in the context of underrepresented classes with a high imbalance of the sensitive attribute; (III) this subset of heads changes as the network is fine-tuned again; (IV) bias is produced more homogeneously by the heads in the distilled model.
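As a rough illustration of the kind of per-head analysis the abstract alludes to, the sketch below compares, head by head, the attention a profession token pays to a gendered pronoun across a minimally contrasting sentence pair. The attention tensors here are random stand-ins; in a real probe they would come from a HuggingFace BERT or DistilBERT model called with `output_attentions=True`. The token indices, the difference metric, and the tensor shapes are all illustrative assumptions, not the authors' actual method.

```python
import numpy as np

# Shapes mimic BERT-base: 12 layers x 12 heads, with a short toy sequence.
rng = np.random.default_rng(0)
n_layers, n_heads, seq_len = 12, 12, 8

# Stand-in attention tensors for a paired probe such as
# "he is a nurse" vs. "she is a nurse" (random here, model outputs in practice).
attn_he = rng.random((n_layers, n_heads, seq_len, seq_len))
attn_she = rng.random((n_layers, n_heads, seq_len, seq_len))

def head_bias(attn_a, attn_b, src=4, tgt=1):
    """Per-head absolute difference of attention from token `src`
    (e.g. the profession word) to token `tgt` (the pronoun).
    `src` and `tgt` are hypothetical positions in the toy sequence."""
    return np.abs(attn_a[..., src, tgt] - attn_b[..., src, tgt])

scores = head_bias(attn_he, attn_she)  # shape (n_layers, n_heads)
layer, head = np.unravel_index(scores.argmax(), scores.shape)
print(scores.shape, layer, head)
```

A flat, near-uniform `scores` matrix would be consistent with the paper's finding (II) that no small subset of heads concentrates the bias; a sharp peak at one `(layer, head)` pair would indicate a localized mechanism.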


Footnotes
1
The model is also trained on a next-sentence-prediction task, but this is irrelevant to our work and therefore not presented here.
 
Metadata
Title
An Investigation of Structures Responsible for Gender Bias in BERT and DistilBERT
Authors
Thibaud Leteno
Antoine Gourru
Charlotte Laclau
Christophe Gravier
Copyright year
2023
DOI
https://doi.org/10.1007/978-3-031-30047-9_20
