2017 | OriginalPaper | Book chapter

Historical Document Binarization Combining Semantic Labeling and Graph Cuts

Written by: Kalyan Ram Ayyalasomayajula, Anders Brun

Published in: Image Analysis

Publisher: Springer International Publishing

Abstract

Most data mining applications on collections of historical documents require binarization of the digitized images as a pre-processing step. Historical documents are often subject to degradations such as parchment aging, smudges, and bleed-through from the other side of the page. The text is sometimes printed, but more often handwritten. Mathematical modeling of the appearance of the text, the background, and all kinds of degradation is challenging. In the current work we tackle binarization as a pixel classification problem. We first apply semantic segmentation, using fully convolutional neural networks. To improve the sharpness of the result, we then apply a graph cut algorithm. The labels from the semantic segmentation are used as approximate estimates of the text and background, and the probability map of the background is used for pruning the edges in the graph cut. The results obtained show significant improvement over the state-of-the-art approach.
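The second stage described above can be illustrated with a small sketch. It is not the authors' implementation: the abstract does not give the energy terms or the exact pruning rule, so everything below is an assumption made for illustration. The function name `binarize`, the parameters `smooth` and `prune_gap`, the negative log-probability data costs, and the rule of dropping a neighbor link when the background probabilities of its two pixels differ by at least `prune_gap` are all hypothetical. The input `p_bg` stands in for the background probability map that a segmentation network would produce, and the minimum cut is found with a plain Edmonds-Karp max-flow, which is only practical for tiny images.

```python
import math
from collections import deque


def binarize(p_bg, smooth=1.0, prune_gap=0.5, eps=1e-9):
    """Label pixels 1 (text) or 0 (background) with an s-t minimum cut.

    p_bg is a 2D list of background probabilities (assumed network output).
    The costs and the pruning rule are illustrative assumptions.
    """
    h, w = len(p_bg), len(p_bg[0])
    src, snk = h * w, h * w + 1               # source = text, sink = background
    cap = [dict() for _ in range(h * w + 2)]  # residual capacities

    def add_edge(u, v, c_uv, c_vu):
        cap[u][v] = cap[u].get(v, 0.0) + c_uv
        cap[v][u] = cap[v].get(u, 0.0) + c_vu

    for y in range(h):
        for x in range(w):
            i = y * w + x
            p = min(max(p_bg[y][x], 1e-6), 1.0 - 1e-6)  # keep log() finite
            add_edge(src, i, -math.log(p), 0.0)        # paid if i becomes background
            add_edge(i, snk, -math.log(1.0 - p), 0.0)  # paid if i becomes text
            for ny, nx in ((y, x + 1), (y + 1, x)):
                if ny < h and nx < w:
                    # Assumed pruning rule: no smoothness link across a
                    # large jump in background probability.
                    if abs(p_bg[y][x] - p_bg[ny][nx]) < prune_gap:
                        add_edge(i, ny * w + nx, smooth, smooth)

    def residual_bfs():
        """Return BFS parents of all nodes reachable from the source."""
        parent = {src: None}
        queue = deque([src])
        while queue:
            u = queue.popleft()
            for v, c in cap[u].items():
                if c > eps and v not in parent:
                    parent[v] = u
                    queue.append(v)
        return parent

    parent = residual_bfs()
    while snk in parent:                      # augment along shortest paths
        path, v = [], snk
        while parent[v] is not None:
            path.append((parent[v], v))
            v = parent[v]
        push = min(cap[u][v] for u, v in path)
        for u, v in path:
            cap[u][v] -= push
            cap[v][u] += push
        parent = residual_bfs()

    # Pixels still reachable from the source lie on the text side of the cut.
    return [[1 if y * w + x in parent else 0 for x in range(w)]
            for y in range(h)]
```

For the one-row map `[[0.1, 0.55, 0.1]]`, calling `binarize` with `prune_gap=0.6` keeps both neighbor links and the uncertain middle pixel is pulled to text, while `prune_gap=0.3` removes the links and the middle pixel falls back to its own slightly more likely label, background. A real pipeline would use a dedicated max-flow solver instead of this pure-Python loop.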

Metadata
Title
Historical Document Binarization Combining Semantic Labeling and Graph Cuts
Written by
Kalyan Ram Ayyalasomayajula
Anders Brun
Copyright year
2017
DOI
https://doi.org/10.1007/978-3-319-59126-1_32