2024 | OriginalPaper | Chapter

Loghi: An End-to-End Framework for Making Historical Documents Machine-Readable

Authors: Rutger van Koert, Stefan Klut, Tim Koornstra, Martijn Maas, Luke Peters

Published in: Document Analysis and Recognition – ICDAR 2024 Workshops

Publisher: Springer Nature Switzerland

Abstract

Loghi is a novel framework and suite of tools for the layout analysis and text recognition of historical documents. Scans are processed in a modular pipeline, with the option to use alternative tools in most stages. Layout analysis and text recognition can be trained on example images with PageXML ground truth. The framework is intended to convert scanned documents to machine-readable PageXML. Additional tooling is provided for the creation of synthetic ground truth. A visualiser for troubleshooting the text recognition training is also made available. The result is a framework for end-to-end text recognition, which works from initial layout analysis on the scanned documents, and includes text line detection, text recognition, reading order detection and language detection.
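As a rough illustration of what "machine-readable PageXML" means downstream, the sketch below reads the line-level transcriptions out of a PAGE XML document using only the Python standard library. The sample document, its ids and its text are invented for this example, and the function name `read_text_lines` is hypothetical; it is not part of Loghi. The sketch assumes the usual PAGE layout in which each `TextLine` carries its transcription in a `TextEquiv/Unicode` child.

```python
import xml.etree.ElementTree as ET
from typing import List, Optional, Tuple

# Hypothetical miniature PAGE XML document, invented for illustration only.
SAMPLE_PAGEXML = """<PcGts xmlns="http://schema.primaresearch.org/PAGE/gts/pagecontent/2019-07-15">
  <Page imageFilename="scan_0001.jpg" imageWidth="2000" imageHeight="3000">
    <TextRegion id="r1">
      <Coords points="100,100 900,100 900,260 100,260"/>
      <TextLine id="r1l1">
        <Coords points="100,100 900,100 900,160 100,160"/>
        <TextEquiv><Unicode>first line</Unicode></TextEquiv>
      </TextLine>
      <TextLine id="r1l2">
        <Coords points="100,200 900,200 900,260 100,260"/>
        <TextEquiv><Unicode>second line</Unicode></TextEquiv>
      </TextLine>
    </TextRegion>
  </Page>
</PcGts>"""


def read_text_lines(xml_text: str) -> List[Tuple[Optional[str], str]]:
    """Return (line id, transcription) pairs in document order.

    The "{*}" wildcard matches any namespace (Python 3.8+), so the sketch
    does not depend on one particular PAGE schema version.
    """
    root = ET.fromstring(xml_text)
    lines = []
    for line in root.iterfind(".//{*}TextLine"):
        # Only the line-level TextEquiv (a direct child) is read here.
        unicode_el = line.find("{*}TextEquiv/{*}Unicode")
        text = unicode_el.text if unicode_el is not None and unicode_el.text else ""
        lines.append((line.get("id"), text))
    return lines


if __name__ == "__main__":
    for line_id, text in read_text_lines(SAMPLE_PAGEXML):
        print(line_id, text)
```

Note that the pairs come back in file order, which need not match the reading order that a pipeline stage may record separately.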
The Loghi pipeline has been used successfully in several projects. We achieve good results on the layout analysis and text recognition of both the handwritten and printed archives of the Dutch States General, covering resolutions that span the 17th and 18th centuries. The character error rate (CER) on handwritten 17th-century material is below 3%. Loghi is open source and free to use.
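For readers unfamiliar with the metric, CER (character error rate) is commonly defined as the character-level edit distance between the recognised text and the ground truth, divided by the length of the ground truth. The sketch below implements that common definition. It is not taken from the Loghi code base, the function names are hypothetical, and the abstract does not state which normalisation (for example of whitespace or case) was applied before scoring.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of character insertions, deletions and substitutions
    needed to turn string a into string b."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # delete char_a
                current[j - 1] + 1,      # insert char_b
                previous[j - 1] + cost,  # substitute, or keep on a match
            ))
        previous = current
    return previous[-1]


def character_error_rate(reference: str, hypothesis: str) -> float:
    """Edit distance divided by the number of reference characters."""
    if not reference:
        raise ValueError("reference must not be empty")
    return levenshtein(reference, hypothesis) / len(reference)


if __name__ == "__main__":
    # Invented strings: one wrong character in a 4-character reference.
    print(character_error_rate("abcd", "abed"))  # 0.25
```

Under this definition, a CER below 3% corresponds to fewer than three character edits per hundred ground-truth characters on average.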

Metadata
Title: Loghi: An End-to-End Framework for Making Historical Documents Machine-Readable
Authors: Rutger van Koert, Stefan Klut, Tim Koornstra, Martijn Maas, Luke Peters
Copyright Year: 2024
DOI: https://doi.org/10.1007/978-3-031-70645-5_6
