Allograph modeling for online handwritten characters in Devanagari using constrained stroke clustering

Published: 03 October 2014

Abstract

Writer-specific variations in how characters are written, such as differences in stroke order and stroke number, are a major source of input variability when handwriting is captured “online” via a stylus, and they pose a challenge for robust online recognition of handwritten characters and words. Several studies have shown that explicitly modeling character allographs is important for achieving high recognition accuracy in a writer-independent recognition system. Previous approaches have relied on unsupervised clustering at the character or stroke level to find the allographs of a character. In this article we propose constrained clustering, using automatically derived domain constraints, to find a minimal set of stroke clusters. The identified allographs have been applied to Devanagari character recognition using Hidden Markov Models and Nearest Neighbor classifiers. The results indicate substantial improvement in recognition accuracy and/or reduction in memory and computation time compared to alternative modeling techniques.
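
The abstract does not spell out the clustering procedure, so the following is only a minimal sketch of the general idea of constrained clustering, not the article's algorithm. It assumes strokes are already reduced to fixed-length feature vectors, that the domain constraints take the form of cannot-link pairs (pairs of strokes that must not share a cluster), and that clusters are merged greedily by average linkage until a distance threshold is reached. The function name `constrained_agglomerative` and the parameters `cannot_link` and `max_dist` are hypothetical.

```python
from itertools import combinations
from math import dist


def constrained_agglomerative(points, cannot_link, max_dist):
    """Greedy average-linkage clustering under cannot-link constraints.

    points      -- list of equal-length feature tuples (assumed stroke features)
    cannot_link -- iterable of (i, j) index pairs that must stay apart
    max_dist    -- stop merging once the closest allowed pair exceeds this
    Returns a list of sets of point indices.
    """
    clusters = [{i} for i in range(len(points))]
    forbidden = {frozenset(pair) for pair in cannot_link}

    def linkage(a, b):
        # Average pairwise Euclidean distance between two clusters.
        total = sum(dist(points[i], points[j]) for i in a for j in b)
        return total / (len(a) * len(b))

    def allowed(a, b):
        # A merge is allowed only if no cannot-link pair spans the two clusters.
        return not any(frozenset((i, j)) in forbidden for i in a for j in b)

    while True:
        candidates = [
            (linkage(a, b), ia, ib)
            for (ia, a), (ib, b) in combinations(enumerate(clusters), 2)
            if allowed(a, b)
        ]
        if not candidates:
            break
        d, ia, ib = min(candidates)
        if d > max_dist:
            break
        # ia < ib always holds here, so deleting ib leaves index ia valid.
        clusters[ia] |= clusters[ib]
        del clusters[ib]
    return clusters


# Toy data: points 0, 1, 4 are close together, but 0 and 4 may not be merged.
toy_points = [(0.0, 0.0), (0.1, 0.0), (5.0, 5.0), (5.1, 5.0), (0.05, 0.05)]
toy_result = constrained_agglomerative(toy_points, [(0, 4)], max_dist=1.0)
print(sorted(sorted(c) for c in toy_result))  # prints [[0], [1, 4], [2, 3]]
```

Without the cannot-link pair, points 0, 1, and 4 would collapse into one cluster. The constraint forces an extra cluster, which illustrates how constraints can keep distinct writing variants apart while the merging step still keeps the number of clusters small.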


Published in

ACM Transactions on Asian Language Information Processing, Volume 13, Issue 3
September 2014
83 pages
ISSN: 1530-0226
EISSN: 1558-3430
DOI: 10.1145/2676410

Copyright © 2014 ACM


Publisher

Association for Computing Machinery, New York, NY, United States

Publication History

• Published: 3 October 2014
• Revised: 1 April 2014
• Accepted: 1 April 2014
• Received: 1 August 2013