
2021 | OriginalPaper | Chapter

HRRegionNet: Chinese Character Segmentation in Historical Documents with Regional Awareness

Authors: Chia-Wei Tang, Chao-Lin Liu, Po-Sen Chiu

Published in: Document Analysis and Recognition – ICDAR 2021

Publisher: Springer International Publishing


Abstract

Historical documents play an indispensable role in transmitting knowledge, a process central to the development of human civilization. A large volume of historical documents has accumulated over the last centuries, and the knowledge they contain should not be underestimated. However, these documents are difficult to preserve for a variety of reasons. In the past, digitization was mostly performed manually, and its cost made the process slow and challenging, so automating digitization has been the focus of much prior research. The digitization of Chinese historical documents can be divided into two main stages: Chinese character segmentation and Chinese character recognition. This study focuses only on Chinese character segmentation in historical documents, because high recognition accuracy depends on accurate segmentation results. In this research, we improve on our previously proposed Chinese character detection model, HRCenterNet, by adding a transposed convolution module that restores the output to a higher resolution and by using multi-resolution aggregation to combine features at different resolutions. In addition, we propose a new objective function so that the model can more comprehensively consider the features needed to segment Chinese characters during learning. On the MTHv2 dataset, our model achieves an IoU score of 0.862, reaching state-of-the-art performance. Our source code is available at https://github.com/Tverous/HRRegionNet.
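
For readers unfamiliar with the IoU (intersection over union) score quoted in the abstract, the following is a minimal sketch of the standard definition for two axis-aligned bounding boxes. The abstract does not describe the evaluation protocol, so the box format `(x_min, y_min, x_max, y_max)` and the function name `box_iou` are assumptions made for illustration, not the authors' code.

```python
from typing import Tuple

# Assumed box format for this sketch: (x_min, y_min, x_max, y_max).
Box = Tuple[float, float, float, float]


def box_iou(a: Box, b: Box) -> float:
    """Return the intersection over union of two axis-aligned boxes."""
    # Corners of the overlapping rectangle, if there is one.
    ix_min = max(a[0], b[0])
    iy_min = max(a[1], b[1])
    ix_max = min(a[2], b[2])
    iy_max = min(a[3], b[3])

    # Clamp to zero so that disjoint boxes give an empty intersection.
    inter_w = max(0.0, ix_max - ix_min)
    inter_h = max(0.0, iy_max - iy_min)
    inter = inter_w * inter_h

    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter

    # Guard against degenerate zero-area inputs.
    return inter / union if union > 0 else 0.0


if __name__ == "__main__":
    predicted = (0.0, 0.0, 2.0, 2.0)
    ground_truth = (1.0, 1.0, 3.0, 3.0)
    print(box_iou(predicted, ground_truth))
```

A score of 1.0 means the predicted and ground-truth boxes coincide exactly, and 0.0 means they do not overlap at all. How per-character scores are averaged into a dataset-level figure is not stated in the abstract and is left out of this sketch.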


Metadata
Copyright Year
2021
DOI
https://doi.org/10.1007/978-3-030-86337-1_1
