2022 | Original Paper | Book Chapter

A Comprehensive Comparison of Open-Source Libraries for Handwritten Text Recognition in Norwegian

Authors: Martin Maarand, Yngvil Beyer, Andre Kåsen, Knut T. Fosseide, Christopher Kermorvant

Published in: Document Analysis Systems

Publisher: Springer International Publishing

Abstract

In this paper, we introduce an open database of historical handwritten documents in Norwegian, fully annotated and the first of its kind, which enables the development of handwritten text recognition (HTR) models for Norwegian. To evaluate state-of-the-art HTR models on this new database, we conducted a systematic survey of open-source HTR libraries published between 2019 and 2021, identified ten libraries, and selected four of them to train HTR models. We trained twelve models in different configurations and compared their performance under both random and scripter-based (writer-based) data splitting. The best recognition results were obtained with the PyLaia and Kaldi libraries, which have different and complementary characteristics, suggesting that combining them could further improve the results.
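The difference between random and scripter-based splitting can be illustrated with a minimal sketch. This is not the authors' code: the sample layout (dictionaries with hypothetical `writer` and `line_id` fields), the function names, and the 20% test fraction are assumptions chosen only for illustration. A random split mixes lines from all writers across both sets, whereas a writer-based split holds out entire writers, so the test set contains only handwriting that was never seen in training.

```python
import random
from collections import defaultdict


def random_split(samples, test_fraction=0.2, seed=0):
    """Shuffle all lines together, ignoring who wrote them."""
    rng = random.Random(seed)
    shuffled = list(samples)
    rng.shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


def writer_split(samples, test_fraction=0.2, seed=0):
    """Hold out whole writers so no test writer appears in training."""
    by_writer = defaultdict(list)
    for sample in samples:
        by_writer[sample["writer"]].append(sample)
    writers = sorted(by_writer)
    rng = random.Random(seed)
    rng.shuffle(writers)
    # Always hold out at least one writer.
    n_test = max(1, int(len(writers) * test_fraction))
    test_writers = set(writers[:n_test])
    train = [s for s in samples if s["writer"] not in test_writers]
    test = [s for s in samples if s["writer"] in test_writers]
    return train, test


# Toy data (hypothetical): 50 text lines written by 5 writers.
samples = [{"writer": f"w{i % 5}", "line_id": i} for i in range(50)]

train_r, test_r = random_split(samples)
train_w, test_w = writer_split(samples)

# Writers shared between train and test under each strategy.
shared_random = {s["writer"] for s in train_r} & {s["writer"] for s in test_r}
shared_writer = {s["writer"] for s in train_w} & {s["writer"] for s in test_w}
```

In this toy setup `shared_writer` is always empty, while `shared_random` will usually contain writers, which is why results on a random split tend to be more optimistic than results on unseen writers.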

Metadata
Title
A Comprehensive Comparison of Open-Source Libraries for Handwritten Text Recognition in Norwegian
Authors
Martin Maarand
Yngvil Beyer
Andre Kåsen
Knut T. Fosseide
Christopher Kermorvant
Copyright Year
2022
DOI
https://doi.org/10.1007/978-3-031-06555-2_27
