2017 | OriginalPaper | Chapter

Feature Learning for Footnote-Based Document Image Classification

Authors: Sherif Abuelwafa, Mohamed Mhiri, Rachid Hedjam, Sara Zhalehpour, Andrew Piper, Chad Wellmon, Mohamed Cheriet

Published in: Image Analysis and Recognition

Publisher: Springer International Publishing

Abstract

Classifying document images is a challenging problem that faces two main obstacles: the reliance on hand-designed features and the scarcity of labeled data. This paper presents a new approach for classifying document images according to the presence of footnotes in them. The approach is built on a Deep Belief Network (DBN) trained in two phases: unsupervised pre-training and supervised fine-tuning. Its main advantage is that it automatically learns which features to extract from a raw document image, producing an efficient representation without manual feature engineering. This feature learning approach exploits the large amount of available unlabeled data together with the limited amount of labeled data. The results show that the proposed approach provides an effective and reliable framework for document image classification.
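
The two-phase idea described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: it assumes binary (or [0, 1]-scaled) flattened page images, Bernoulli restricted Boltzmann machines trained with one-step contrastive divergence (CD-1), and illustrative layer sizes and learning rates. All names (`RBM`, `pretrain`, `encode`, `train_output_layer`) are hypothetical. For brevity, the supervised phase here trains only a logistic output layer on the frozen pre-trained features; full fine-tuning would also backpropagate the label error through the pre-trained layers.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


class RBM:
    """Bernoulli RBM trained with one-step contrastive divergence (CD-1)."""

    def __init__(self, n_visible, n_hidden, rng):
        self.rng = rng
        self.W = rng.normal(0.0, 0.01, size=(n_visible, n_hidden))
        self.b_v = np.zeros(n_visible)
        self.b_h = np.zeros(n_hidden)

    def hidden_probs(self, v):
        return sigmoid(v @ self.W + self.b_h)

    def visible_probs(self, h):
        return sigmoid(h @ self.W.T + self.b_v)

    def cd1_step(self, v0, lr=0.1):
        # Positive phase: hidden activations driven by the data.
        h0 = self.hidden_probs(v0)
        h0_sample = (self.rng.random(h0.shape) < h0).astype(float)
        # Negative phase: one Gibbs step back to the visible layer and up again.
        v1 = self.visible_probs(h0_sample)
        h1 = self.hidden_probs(v1)
        n = v0.shape[0]
        self.W += lr * (v0.T @ h0 - v1.T @ h1) / n
        self.b_v += lr * (v0 - v1).mean(axis=0)
        self.b_h += lr * (h0 - h1).mean(axis=0)


def pretrain(X_unlabeled, layer_sizes, epochs=10, seed=0):
    """Greedy layer-wise pre-training: each RBM learns from the layer below."""
    rng = np.random.default_rng(seed)
    rbms, data = [], X_unlabeled
    for n_hidden in layer_sizes:
        rbm = RBM(data.shape[1], n_hidden, rng)
        for _ in range(epochs):
            rbm.cd1_step(data)  # full-batch update, for simplicity
        rbms.append(rbm)
        data = rbm.hidden_probs(data)  # features fed to the next layer
    return rbms


def encode(rbms, X):
    """Propagate inputs through the stacked RBMs to get learned features."""
    for rbm in rbms:
        X = rbm.hidden_probs(X)
    return X


def train_output_layer(features, y, epochs=200, lr=0.5):
    """Supervised phase (simplified): logistic regression on frozen features."""
    w = np.zeros(features.shape[1])
    b = 0.0
    for _ in range(epochs):
        grad = sigmoid(features @ w + b) - y  # gradient of the cross-entropy loss
        w -= lr * features.T @ grad / len(y)
        b -= lr * grad.mean()
    return w, b
```

In such a setup, `pretrain` would be called on the large pool of unlabeled page images, and `train_output_layer` on `encode(rbms, X_labeled)` with binary labels (footnote present or absent) for the smaller labeled set.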

Metadata
Copyright Year
2017
DOI
https://doi.org/10.1007/978-3-319-59876-5_71