
2024 | Original Paper | Book Chapter

Measuring Bias in Generated Text Using Language Models—GPT-2 and BERT

Authors: Fozilatunnesa Masuma, Partha Chakraborty, Al-Amin-Ul Islam, Prince Chandra Talukder, Proshanta Roy, Mohammad Abu Yousuf

Published in: Proceedings of Third International Conference on Computing and Communication Networks

Publisher: Springer Nature Singapore


Abstract

In Natural Language Processing (NLP), a language model is a probabilistic model that estimates the likelihood of a particular sequence of words appearing in a sentence, conditioned on the words that came before it. In our experiment, text was generated from bias-evaluation prompts using the BERT and GPT-2 language models. To measure bias from multiple perspectives, we evaluated the generated text along three dimensions: sentiment, toxicity, and gender polarity. In fine-tuning the BERT model, we achieved 91.48% accuracy on multilabel toxic comment classification; this fine-tuned pretrained model was then used to generate text from prompts in the BOLD dataset. Our results show that a greater percentage of the texts produced by GPT-2 than of those produced by BERT were labeled as toxic. As in the religious ideology domain, BERT's communism prompt also resulted in toxic text. Compared to BERT, GPT-2 produced texts that were more polarized in terms of sentiment, toxicity, and regard.
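The opening definition, that a language model assigns a probability to a word sequence by conditioning each word on the words preceding it, can be illustrated with a minimal bigram sketch. The toy corpus and counts below are purely illustrative and not from the paper; neural models such as GPT-2 and BERT learn far richer conditional distributions from large corpora, but the chain-rule idea is the same:

```python
from collections import Counter, defaultdict

# Illustrative toy corpus (not from the paper).
corpus = [
    "the model generates text",
    "the model predicts the next word",
    "the next word depends on the words before it",
]

# Estimate P(w_i | w_{i-1}) ~= count(w_{i-1}, w_i) / count(w_{i-1}).
unigrams = Counter()
bigrams = defaultdict(Counter)
for sentence in corpus:
    tokens = ["<s>"] + sentence.split()  # <s> marks the sentence start
    for prev, cur in zip(tokens, tokens[1:]):
        unigrams[prev] += 1
        bigrams[prev][cur] += 1

def sequence_probability(sentence: str) -> float:
    """Chain rule: P(w_1..w_n) = product of P(w_i | w_{i-1})."""
    tokens = ["<s>"] + sentence.split()
    prob = 1.0
    for prev, cur in zip(tokens, tokens[1:]):
        if unigrams[prev] == 0:
            return 0.0  # unseen context: zero under this unsmoothed model
        prob *= bigrams[prev][cur] / unigrams[prev]
    return prob

print(sequence_probability("the model generates text"))
```

An unsmoothed bigram model like this assigns zero probability to any unseen context, which is one reason practical models use neural parameterizations rather than raw counts.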


Metadata
DOI
https://doi.org/10.1007/978-981-97-0892-5_39