From Regex to Transformers: A Hybrid Framework for Cyber Threat Indicator Extraction from Unstructured Text

  • 2026
  • OriginalPaper
  • Chapter
Abstract

This chapter addresses the challenge of extracting Indicators of Compromise (IOCs) from unstructured cyber threat intelligence (CTI) reports, a task that is currently manual and time-consuming. The authors present a hybrid framework that combines traditional rule-based methods with a fine-tuned transformer model, DistilBERT, to automate and improve the extraction of eight IOC categories. The methodology comprises dataset acquisition and preprocessing, baseline IOC extraction using regex and spaCy NER, and model training and evaluation. The results show a significant improvement in recall and accuracy over the baseline approaches, with the transformer model achieving over 98% precision and recall. The chapter also describes how the model is deployed through a RESTful API and an interactive Streamlit interface, making it practical for real-world use. The key takeaway is that NLP and transformer models can reduce manual analysis time and improve the accuracy of IOC extraction in threat intelligence workflows.
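
To make the regex baseline mentioned in the abstract more concrete, here is a minimal Python sketch of rule-based IOC extraction. It is not the authors' implementation. The abstract does not name the eight IOC categories, so the four categories below, the patterns themselves, and the names `IOC_PATTERNS` and `extract_iocs` are assumptions chosen for illustration.

```python
import re

# Illustrative patterns only. The chapter's eight IOC categories are not
# listed in the abstract, so these four categories are assumptions.
_OCTET = r"(?:25[0-5]|2[0-4]\d|1?\d?\d)"
IOC_PATTERNS = {
    "ipv4": re.compile(rf"\b(?:{_OCTET}\.){{3}}{_OCTET}\b"),
    "md5": re.compile(r"\b[a-fA-F0-9]{32}\b"),
    "sha256": re.compile(r"\b[a-fA-F0-9]{64}\b"),
    "cve": re.compile(r"\bCVE-\d{4}-\d{4,}\b", re.IGNORECASE),
}


def extract_iocs(text):
    """Return a dict mapping each matched category to its sorted unique matches."""
    found = {}
    for name, pattern in IOC_PATTERNS.items():
        # Patterns use only non-capturing groups, so findall returns full matches.
        matches = sorted(set(pattern.findall(text)))
        if matches:
            found[name] = matches
    return found


if __name__ == "__main__":
    report = (
        "The sample 44d88612fea8a8f36de82e1278abb02f beaconed to "
        "192.168.10.5 and exploited CVE-2021-44228."
    )
    for category, values in extract_iocs(report).items():
        print(category, values)
```

Fixed patterns like these are easy to audit but only catch indicators written in canonical form. Defanged or oddly formatted indicators would slip through, which is the kind of recall gap the abstract attributes to baseline approaches and that a fine-tuned model is meant to close.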

Title
From Regex to Transformers: A Hybrid Framework for Cyber Threat Indicator Extraction from Unstructured Text
Authors
Paul Jideani
Aurona Gerber
Copyright Year
2026
DOI
https://doi.org/10.1007/978-3-032-13075-4_3