Texture sparseness for pixel classification of business document images

  • Original Paper
International Journal on Document Analysis and Recognition (IJDAR)

Abstract

Contemporary business documents contain diverse, multi-layered mixtures of textual, graphical, and pictorial elements. Existing methods for document segmentation and classification do not cope well with the complexity and variety of contents, geometric layouts, and elemental shapes. This paper proposes a novel document image classification approach that assigns individual pixels to one of four fundamental classes (text, image, graphics, and background) using support vector machines. The approach relies on a novel low-dimensional feature descriptor based on textural properties. The feature vector is constructed from the sparseness of the document image's responses to a filter bank, computed on a multi-resolution and contextual basis. Qualitative and quantitative evaluations on business document images show the benefits of adopting a contextual and multi-resolution approach. The proposed approach achieves excellent results: it handles varied contents and complex document layouts without imposing constraints or making assumptions about the shape and spatial arrangement of document elements.
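The idea of a sparseness-based texture descriptor can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the filter bank, the sparseness measure, or the multi-resolution scheme. The sketch assumes Hoyer's sparseness measure, two toy difference filters in place of a real filter bank, and nested square windows as a stand-in for the multi-resolution, contextual analysis. All function names and the `radii` parameter are hypothetical.

```python
import math


def hoyer_sparseness(values):
    """Hoyer sparseness: 0 for a uniform vector, 1 for a single non-zero entry.

    Assumption: an all-zero (or length-1) vector is mapped to 0.0 by convention.
    """
    n = len(values)
    l1 = sum(abs(v) for v in values)
    l2 = math.sqrt(sum(v * v for v in values))
    if n < 2 or l2 == 0.0:
        return 0.0
    return (math.sqrt(n) - l1 / l2) / (math.sqrt(n) - 1.0)


def filter_responses(image):
    """Toy filter bank (assumption): horizontal and vertical forward differences."""
    h, w = len(image), len(image[0])
    dx = [[image[r][min(c + 1, w - 1)] - image[r][c] for c in range(w)]
          for r in range(h)]
    dy = [[image[min(r + 1, h - 1)][c] - image[r][c] for c in range(w)]
          for r in range(h)]
    return [dx, dy]


def window(response, row, col, radius):
    """Flatten the square neighbourhood around (row, col), clipped at the borders."""
    h, w = len(response), len(response[0])
    return [response[r][c]
            for r in range(max(0, row - radius), min(h, row + radius + 1))
            for c in range(max(0, col - radius), min(w, col + radius + 1))]


def pixel_descriptor(image, row, col, radii=(1, 2, 4)):
    """One sparseness value per (filter, window size) pair for a single pixel."""
    features = []
    for response in filter_responses(image):
        for radius in radii:
            features.append(hoyer_sparseness(window(response, row, col, radius)))
    return features


# Usage: an 8x8 image with a vertical edge between columns 3 and 4.
edge_image = [[0.0] * 4 + [1.0] * 4 for _ in range(8)]
descriptor = pixel_descriptor(edge_image, row=4, col=3)
```

In a full pipeline, one such low-dimensional vector per pixel would be passed to a multi-class support vector machine trained on labelled pixels of the four classes; that step is omitted here.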

Notes

  1. http://web.uvic.ca/~mcote/BDID/.

Author information

Corresponding author

Correspondence to Alexandra Branzan Albu.

Additional information

This work was supported by the Natural Sciences and Engineering Research Council of Canada and SAP Canada through the Collaborative Research and Development Grants Program. Special thanks to Prof. Nicholas Journet for his help on implementing the comparison method of Sect. 5.4.

About this article

Cite this article

Cote, M., Branzan Albu, A. Texture sparseness for pixel classification of business document images. IJDAR 17, 257–273 (2014). https://doi.org/10.1007/s10032-014-0217-8

  • DOI: https://doi.org/10.1007/s10032-014-0217-8
