Published in: Neural Computing and Applications 8/2021

24.07.2020 | Original Article

Extensive study on the underlying gender bias in contextualized word embeddings

Authors: Christine Basta, Marta R. Costa-jussà, Noe Casas


Abstract

Gender bias affects many natural language processing applications. While we are still far from debiasing methods that fully solve the problem, progress is being made in analyzing the impact of this bias on current algorithms. This paper provides an extensive study of the underlying gender bias in popular contextualized word embeddings. We analyze several evaluation measures applied across English data domains and across the layers of the contextualized word embeddings, and we adapt and extend the study to Spanish. Our study points out the advantages and limitations of the various evaluation measures used and aims to standardize the evaluation of gender bias in contextualized word embeddings.
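To make the kind of analysis described above concrete, here is a minimal, illustrative sketch of one common family of bias measures: projecting word representations onto a gender direction derived from a pronoun pair. The toy vectors and the `gender_bias` helper are hypothetical stand-ins (in practice the vectors would be contextualized embeddings extracted per occurrence from a model such as BERT, layer by layer); this is not the paper's exact evaluation procedure.

```python
import numpy as np

def cosine(u, v):
    """Cosine similarity between two vectors."""
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

# Toy stand-ins for contextualized embeddings. In a real study, each
# word occurrence in a sentence yields its own vector, extracted from
# a chosen layer of the model.
emb = {
    "he":       np.array([1.0, 0.1, 0.0]),
    "she":      np.array([-1.0, 0.1, 0.0]),
    "engineer": np.array([0.6, 0.8, 0.1]),
    "nurse":    np.array([-0.5, 0.9, 0.1]),
}

# A simple gender direction: the normalized difference of a
# definitionally gendered pronoun pair.
gender_dir = emb["he"] - emb["she"]
gender_dir /= np.linalg.norm(gender_dir)

def gender_bias(word):
    """Cosine of a word's embedding with the gender direction.
    Positive values lean toward 'he', negative toward 'she'."""
    return cosine(emb[word], gender_dir)

for w in ("engineer", "nurse"):
    print(w, round(gender_bias(w), 3))
```

With contextualized models, this projection can be computed separately at each layer, which is one way to study how bias varies with depth, as the paper's layer-wise analysis does.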


Metadata
Title
Extensive study on the underlying gender bias in contextualized word embeddings
Authors
Christine Basta
Marta R. Costa-jussà
Noe Casas
Publication date
24.07.2020
Publisher
Springer London
Published in
Neural Computing and Applications / Issue 8/2021
Print ISSN: 0941-0643
Electronic ISSN: 1433-3058
DOI
https://doi.org/10.1007/s00521-020-05211-z
