Skip to main content
Top

Are Large Language Models Really Bias-Free? Jailbreak Prompts for Assessing Adversarial Robustness to Bias Elicitation

  • 2025
  • OriginalPaper
  • Chapter
Published in:

Activate our intelligent search to find suitable subject content or patents.

search-config
loading …

Abstract

The chapter delves into the biases inherent in Large Language Models (LLMs) and their implications for various applications. It introduces a two-step methodology to evaluate the resilience of LLMs against biases using jailbreak prompts, which are specifically designed to elicit biased responses. The study benchmarks widely-used LLMs, ranging from small to large models, and assesses their safety across different demographic biases. Notably, the chapter presents a detailed analysis of model behavior under bias elicitation and the effectiveness of various jailbreak attacks, providing insights into the vulnerabilities of LLMs. The findings underscore the need for layered defense mechanisms to ensure the secure deployment of LLMs in real-world applications.
This paper includes text used to test the robustness of LLMs against vulnerabilities. The texts include offensive phrasing and ideas. To avoid any misinterpretation by the reader, these phrases do not represent the views of the authors, organizers or publisher.

Dont have a licence yet? Then find out more about our products and how to get one now:

Springer Professional "Business + Economics & Engineering + Technology"

Online-Abonnement

Springer Professional "Business + Economics & Engineering + Technology" gives you access to:

  • more than 130.000 books
  • more than 540 journals

from the following subject areas:

  • Automotive
  • Construction + Real Estate
  • Business IT + Informatics
  • Electrical Engineering + Electronics
  • Energy + Sustainability
  • Finance + Banking
  • Management + Leadership
  • Marketing + Sales
  • Mechanical Engineering + Materials
  • Surfaces + Materials Technology
  • Insurance + Risk


Secure your knowledge advantage now!

Springer Professional "Engineering + Technology"

Online-Abonnement

Springer Professional "Engineering + Technology" gives you access to:

  • more than 75.000 books
  • more than 390 journals

from the following specialised fileds:

  • Automotive
  • Business IT + Informatics
  • Construction + Real Estate
  • Electrical Engineering + Electronics
  • Energy + Sustainability
  • Mechanical Engineering + Materials
  • Surfaces + Materials Technology





 

Secure your knowledge advantage now!

Springer Professional "Business + Economics"

Online-Abonnement

Springer Professional "Business + Economics" gives you access to:

  • more than 100.000 books
  • more than 340 journals

from the following specialised fileds:

  • Construction + Real Estate
  • Business IT + Informatics
  • Finance + Banking
  • Management + Leadership
  • Marketing + Sales
  • Insurance + Risk



Secure your knowledge advantage now!

Title
Are Large Language Models Really Bias-Free? Jailbreak Prompts for Assessing Adversarial Robustness to Bias Elicitation
Authors
Riccardo Cantini
Giada Cosenza
Alessio Orsino
Domenico Talia
Copyright Year
2025
DOI
https://doi.org/10.1007/978-3-031-78977-9_4
This content is only visible if you are logged in and have the appropriate permissions.
This content is only visible if you are logged in and have the appropriate permissions.

Premium Partner

    Image Credits
    Neuer Inhalt/© ITandMEDIA, Nagarro GmbH/© Nagarro GmbH, AvePoint Deutschland GmbH/© AvePoint Deutschland GmbH, AFB Gemeinnützige GmbH/© AFB Gemeinnützige GmbH, USU GmbH/© USU GmbH, Ferrari electronic AG/© Ferrari electronic AG