Texture sparseness for pixel classification of business document images

Cote, Melissa; Branzan Albu, Alexandra

doi:10.1007/s10032-014-0217-8

Original Paper
Published: 12 February 2014

Volume 17, pages 257–273 (2014)

International Journal on Document Analysis and Recognition (IJDAR)

Melissa Cote¹ &
Alexandra Branzan Albu¹

Abstract

Contemporary business documents contain diverse, multi-layered mixtures of textual, graphical, and pictorial elements. Existing methods for document segmentation and classification do not cope well with the complexity and variety of contents, geometric layouts, and elemental shapes. This paper proposes a novel document image classification approach that assigns individual pixels to one of four fundamental classes (text, image, graphics, and background) using support vector machines. The approach relies on a novel low-dimensional feature descriptor based on textural properties. The feature vector is constructed from the sparseness of the document image's responses to a filter bank, computed on a multi-resolution and contextual basis. Qualitative and quantitative evaluations on business document images show the benefits of adopting a contextual and multi-resolution approach. The proposed approach achieves excellent results: it handles varied contents and complex document layouts without imposing any constraint or making assumptions about the shape and spatial arrangement of document elements.
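To make the idea of "sparseness of filter responses on a multi-resolution and contextual basis" more concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the authors' implementation: the abstract does not say which sparseness measure, filter bank, or window sizes the paper uses. The sketch assumes the sparseness measure of Hoyer (2004), which is 0 when all magnitudes are equal and 1 when only one value is nonzero, and it assumes a hypothetical set of square windows centred on the pixel. The function names and the `half_widths` parameter are invented for illustration.

```python
import numpy as np


def hoyer_sparseness(values):
    """Hoyer (2004) sparseness: 0 for equal magnitudes, 1 for a single nonzero value."""
    x = np.abs(np.asarray(values, dtype=float).ravel())
    n = x.size
    if n < 2:
        raise ValueError("sparseness needs at least two values")
    l2 = np.sqrt(np.sum(x ** 2))
    if l2 == 0.0:
        # Assumption made for this sketch: an all-zero window counts as not sparse.
        return 0.0
    l1 = np.sum(x)
    return float((np.sqrt(n) - l1 / l2) / (np.sqrt(n) - 1.0))


def contextual_sparseness(response, row, col, half_widths=(4, 8, 16)):
    """Sparseness of one filter response in windows of growing size around a pixel.

    `response` is a 2-D array holding the output of a single filter.
    `half_widths` is a hypothetical choice of context sizes, one feature per size.
    Windows are clipped at the image border.
    """
    features = []
    for h in half_widths:
        r0, r1 = max(row - h, 0), min(row + h + 1, response.shape[0])
        c0, c1 = max(col - h, 0), min(col + h + 1, response.shape[1])
        features.append(hoyer_sparseness(response[r0:r1, c0:c1]))
    return np.array(features)
```

Under these assumptions, concatenating the output of `contextual_sparseness` over every filter in a bank would give a short per-pixel feature vector that could be passed to a classifier such as a support vector machine.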

Author information

Authors and Affiliations

Electrical and Computer Engineering, University of Victoria, P.O. Box 3055, STN CSC, Victoria, BC, V8W 3P6, Canada
Melissa Cote & Alexandra Branzan Albu

Corresponding author

Correspondence to Alexandra Branzan Albu.

Additional information

This work was supported by the Natural Sciences and Engineering Research Council of Canada and SAP Canada through the Collaborative Research and Development Grants Program. Special thanks to Prof. Nicholas Journet for his help on implementing the comparison method of Sect. 5.4.

About this article

Cite this article

Cote, M., Branzan Albu, A. Texture sparseness for pixel classification of business document images. IJDAR 17, 257–273 (2014). https://doi.org/10.1007/s10032-014-0217-8

Received: 18 June 2013
Revised: 17 January 2014
Accepted: 22 January 2014
Published: 12 February 2014
Issue Date: September 2014
DOI: https://doi.org/10.1007/s10032-014-0217-8
