research-article
DOI: 10.1145/3476887.3476905

Including Keyword Position in Image-based Models for Act Segmentation of Historical Registers

Published: 31 October 2021

ABSTRACT

The segmentation of complex images into semantic regions has seen growing interest in recent years with the advent of deep learning. Until recently, most existing methods for historical document analysis focused on the visual appearance of documents, ignoring the rich information that textual content can offer. However, segmenting complex documents into semantic regions is sometimes impossible using visual features alone, and recent models embed both visual and textual information. In this paper, we focus on the use of both visual and textual information for segmenting historical registers into structured and meaningful units such as acts. An act is a text recording containing valuable knowledge such as demographic information (baptism, marriage or death) or royal decisions (donation or pardon). We propose a simple pipeline to enrich document images with the position of text lines containing key-phrases, and show that running a standard image-based layout analysis system on these enriched images can lead to significant gains. Our experiments show that act detection improves from 38% to 74% mAP when textual information is added, in real use-case conditions where text line positions and content are extracted with an automatic recognition system.
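The enrichment step the abstract describes can be sketched as follows. This is an illustrative reconstruction, not the authors' code: the key-phrase list, the coordinates, and the binary-mask encoding are assumptions; the paper only states that positions of key-phrase text lines are painted into the document image before a standard image-based layout analysis system is applied.

```python
# Sketch of the pipeline: given text lines produced by an automatic
# recognition system, mark the bounding box of every line whose
# transcription contains a key-phrase in a mask, which can then be
# combined with the page image before image-based layout analysis.

KEY_PHRASES = ["baptism", "marriage", "burial", "donation", "pardon"]  # illustrative

def contains_key_phrase(transcription: str) -> bool:
    """Case-insensitive substring match against the key-phrase list."""
    text = transcription.lower()
    return any(phrase in text for phrase in KEY_PHRASES)

def keyword_mask(width, height, lines):
    """Build a binary mask (row-major list of lists) marking key-phrase lines.

    `lines` is a list of ((x0, y0, x1, y1), transcription) pairs, e.g. the
    output of a text-line detector followed by a handwriting recognizer.
    """
    mask = [[0] * width for _ in range(height)]
    for (x0, y0, x1, y1), text in lines:
        if contains_key_phrase(text):
            for y in range(y0, y1):
                for x in range(x0, x1):
                    mask[y][x] = 1
    return mask

# Toy page with two recognized lines: only the second contains a keyword.
lines = [((10, 20, 190, 40), "Anno domini millesimo ..."),
         ((10, 60, 190, 80), "Baptism of Jean, son of ...")]
mask = keyword_mask(200, 300, lines)
print(mask[70][100], mask[30][100])  # 1 0 -> only the baptism line is marked
```

In practice the mask would be merged with the page image (for instance as an extra input channel, or as a colored overlay drawn onto the image) so that a standard image-based segmentation network can exploit the textual cue without any architectural change.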


Published in

HIP '21: Proceedings of the 6th International Workshop on Historical Document Imaging and Processing
September 2021, 72 pages
ISBN: 9781450386906
DOI: 10.1145/3476887
Copyright © 2021 ACM
Publisher: Association for Computing Machinery, New York, NY, United States

Acceptance Rates

Overall Acceptance Rate: 52 of 90 submissions, 58%
