
2023 | Original Paper | Book Chapter

An Investigation of Structures Responsible for Gender Bias in BERT and DistilBERT

Authors: Thibaud Leteno, Antoine Gourru, Charlotte Laclau, Christophe Gravier

Published in: Advances in Intelligent Data Analysis XXI

Publisher: Springer Nature Switzerland


Abstract

In recent years, large Transformer-based Pre-trained Language Models (PLMs) have reshaped the Natural Language Processing (NLP) landscape by pushing the state-of-the-art performance boundaries on a wide variety of tasks. However, this performance gain comes with an increase in complexity, and the size of such models (up to billions of parameters) constrains their deployment on embedded devices or in tasks requiring short inference times. To cope with this situation, compressed models such as DistilBERT have emerged, democratizing their usage in a growing number of applications that impact our daily lives. A crucial issue is the fairness of the predictions made by both PLMs and their distilled counterparts. In this paper, we propose an empirical exploration of this problem by formalizing two questions: (1) Can we identify the neural mechanism(s) responsible for gender bias in BERT (and, by extension, DistilBERT)? (2) Does distillation tend to accentuate or mitigate gender bias (e.g., is DistilBERT more prone to gender bias than its uncompressed counterpart, BERT)? Our findings are the following: (I) no single layer can be identified as producing the bias; (II) every attention head encodes bias uniformly, except in the context of underrepresented classes with a high imbalance of the sensitive attribute; (III) this subset of heads changes as the network is fine-tuned again; (IV) bias is produced more homogeneously by the heads in the distilled model.
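As a rough illustration of the kind of per-head analysis the abstract alludes to, the sketch below compares, head by head, the attention a profession token pays to a gendered pronoun across a minimally contrasting sentence pair. The attention tensors here are random stand-ins; in a real probe they would come from a HuggingFace BERT or DistilBERT model called with `output_attentions=True`. The token indices, the difference metric, and the tensor shapes are all illustrative assumptions, not the authors' actual method.

```python
import numpy as np

# Shapes mimic BERT-base: 12 layers x 12 heads, with a short toy sequence.
rng = np.random.default_rng(0)
n_layers, n_heads, seq_len = 12, 12, 8

# Stand-in attention tensors for a paired probe such as
# "he is a nurse" vs. "she is a nurse" (random here, model outputs in practice).
attn_he = rng.random((n_layers, n_heads, seq_len, seq_len))
attn_she = rng.random((n_layers, n_heads, seq_len, seq_len))

def head_bias(attn_a, attn_b, src=4, tgt=1):
    """Per-head absolute difference of attention from token `src`
    (e.g. the profession word) to token `tgt` (the pronoun).
    `src` and `tgt` are hypothetical positions in the toy sequence."""
    return np.abs(attn_a[..., src, tgt] - attn_b[..., src, tgt])

scores = head_bias(attn_he, attn_she)  # shape (n_layers, n_heads)
layer, head = np.unravel_index(scores.argmax(), scores.shape)
print(scores.shape, layer, head)
```

A flat, near-uniform `scores` matrix would be consistent with the paper's finding (II) that no small subset of heads concentrates the bias; a sharp peak at one `(layer, head)` pair would indicate a localized mechanism.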


Footnotes
1
The model is also trained on a next-sentence-prediction task, but this is irrelevant to our work and therefore not presented here.
 
Metadata
Title
An Investigation of Structures Responsible for Gender Bias in BERT and DistilBERT
Authors
Thibaud Leteno
Antoine Gourru
Charlotte Laclau
Christophe Gravier
Copyright year
2023
DOI
https://doi.org/10.1007/978-3-031-30047-9_20
