2018 | OriginalPaper | Chapter

Document Image Segmentation Using Deep Features

Authors: K. V. Jobin, C. V. Jawahar

Published in: Computer Vision, Pattern Recognition, Image Processing, and Graphics

Publisher: Springer Singapore

Abstract

This paper explores the effectiveness of deep features for document image segmentation. The problem is modelled as a pixel-labeling task in which each pixel of the document image is classified into one of a set of predefined labels, such as text, comments, decorations, and background. Our method first extracts deep features from superpixels of the document image. We then train an SVM classifier on these features and use it to segment the document image. Two kinds of features are used in our study: Fisher vector encoded convolutional layer features (FV-CNN) and fully connected layer features (FC-CNN). Experiments show that our method is effective and yields better results than popular approaches on benchmark handwritten datasets.
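
The pipeline outlined in the abstract (regions, then per-region features, then a classifier, then a pixel-level label map) can be illustrated with a small, self-contained sketch. It is not the authors' implementation, and every name in it is hypothetical. It rests on several simplifying assumptions: square grid cells stand in for superpixels, the mean and standard deviation of intensity stand in for the FV-CNN and FC-CNN features, the task is reduced to two labels (text vs. background), and the linear SVM is trained with plain subgradient descent on synthetic pages.

```python
import numpy as np

def grid_superpixels(height, width, cell):
    """Assign every pixel to a square cell (a stand-in for real superpixels)."""
    rows = np.arange(height) // cell
    cols = np.arange(width) // cell
    n_cols = (width + cell - 1) // cell
    return rows[:, None] * n_cols + cols[None, :]

def region_features(image, segments):
    """Placeholder descriptor per region: mean and std of intensity.
    Assumption: the paper uses CNN-derived features at this step instead."""
    n = segments.max() + 1
    feats = np.zeros((n, 2))
    for s in range(n):
        vals = image[segments == s]
        feats[s] = (vals.mean(), vals.std())
    return feats

def region_labels(mask, segments):
    """Label a region +1 (text) if at least half of its pixels are text, else -1."""
    n = segments.max() + 1
    frac = np.array([mask[segments == s].mean() for s in range(n)])
    return np.where(frac >= 0.5, 1, -1)

def train_linear_svm(X, y, lam=0.01, epochs=200, lr=0.1):
    """Binary linear SVM: full-batch subgradient descent on the
    L2-regularised hinge loss. Labels y must be in {-1, +1}."""
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(epochs):
        active = y * (X @ w + b) < 1  # samples that violate the margin
        grad_w = lam * w - (y[active, None] * X[active]).sum(axis=0) / len(y)
        grad_b = -y[active].sum() / len(y)
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b

def make_page(text_mask, rng):
    """Synthetic page: bright background, dark/bright stripes where text_mask is True."""
    h, w = text_mask.shape
    stripes = np.where(np.arange(h)[:, None] % 2 == 0, 0.1, 0.9) * np.ones((1, w))
    page = np.where(text_mask, stripes, 0.9)
    return page + rng.normal(0.0, 0.02, size=(h, w))

rng = np.random.default_rng(0)
segments = grid_superpixels(32, 32, cell=8)

# Training page has text on the right half; test page has text on the top half.
train_mask = np.zeros((32, 32), dtype=bool)
train_mask[:, 16:] = True
test_mask = np.zeros((32, 32), dtype=bool)
test_mask[:16, :] = True
train_page = make_page(train_mask, rng)
test_page = make_page(test_mask, rng)

# Fit the classifier on standardised region features of the training page.
X_train = region_features(train_page, segments)
mu, sigma = X_train.mean(axis=0), X_train.std(axis=0)
w, b = train_linear_svm((X_train - mu) / sigma, region_labels(train_mask, segments))

# Classify each region of the test page, then paint labels back onto pixels.
X_test = (region_features(test_page, segments) - mu) / sigma
region_pred = np.where(X_test @ w + b >= 0, 1, -1)
pixel_pred = region_pred[segments]
accuracy = (pixel_pred == np.where(test_mask, 1, -1)).mean()
print(f"pixel accuracy on the synthetic test page: {accuracy:.2f}")
```

Classifying regions rather than individual pixels keeps the number of classifier evaluations small, and indexing the region predictions with the segment map (`region_pred[segments]`) turns them back into a per-pixel labeling. Extending the sketch to the four labels named in the abstract would need a multi-class scheme such as one-vs-rest, which is left out here.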

Metadata
Title
Document Image Segmentation Using Deep Features
Authors
K. V. Jobin
C. V. Jawahar
Copyright Year
2018
Publisher
Springer Singapore
DOI
https://doi.org/10.1007/978-981-13-0020-2_33
