research-article

Allograph modeling for online handwritten characters in Devanagari using constrained stroke clustering

Authors:
Bharath A., Hewlett-Packard Labs, Bangalore, India
Sriganesh Madhvanath, Hewlett-Packard Labs, Bangalore, India

ACM Transactions on Asian Language Information Processing, Volume 13, Issue 3, Article No. 12, pp. 1–21. https://doi.org/10.1145/2629622

Published: 03 October 2014

Abstract

Writer-specific variations in how characters are written, such as differences in stroke order and stroke number, are an important source of variability when handwriting is captured “online” via a stylus, and they pose a challenge for robust online recognition of handwritten characters and words. Several studies have shown that explicit modeling of character allographs is important for achieving high recognition accuracy in a writer-independent recognition system. While previous approaches have relied on unsupervised clustering at the character or stroke level to find the allographs of a character, in this article we propose constrained clustering, using automatically derived domain constraints, to find a minimal set of stroke clusters. The allographs identified have been applied to Devanagari character recognition using Hidden Markov Models and Nearest Neighbor classifiers, and the results indicate substantial improvement in recognition accuracy and/or reduction in memory and computation time when compared to alternative modeling techniques.
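
As a rough illustration of what constrained clustering means in general, the sketch below runs a small k-means variant with pairwise must-link and cannot-link constraints, in the spirit of COP-KMeans (Wagstaff et al. 2001). It is not the method of this article: the abstract does not say which constraints are derived or how, so the function names `violates` and `cop_kmeans`, the toy 2-D points standing in for stroke feature vectors, and the constraint pairs are all assumptions made for illustration.

```python
import math


def violates(i, cluster, assign, must_link, cannot_link):
    """Return True if putting point i into `cluster` breaks a constraint."""
    for a, b in must_link:
        other = b if a == i else a if b == i else None
        # A must-link partner that is already assigned must share the cluster.
        if other is not None and assign.get(other) not in (None, cluster):
            return True
    for a, b in cannot_link:
        other = b if a == i else a if b == i else None
        # A cannot-link partner must not already be in this cluster.
        if other is not None and assign.get(other) == cluster:
            return True
    return False


def cop_kmeans(points, centers, must_link=(), cannot_link=(), iters=10):
    """Constrained k-means sketch: nearest feasible center, then mean update."""
    centers = [tuple(c) for c in centers]
    previous = None
    for _ in range(iters):
        assign = {}
        for i, p in enumerate(points):
            # Try centers from nearest to farthest; take the first feasible one.
            order = sorted(range(len(centers)),
                           key=lambda c: math.dist(p, centers[c]))
            for c in order:
                if not violates(i, c, assign, must_link, cannot_link):
                    assign[i] = c
                    break
            else:
                raise ValueError(f"no feasible cluster for point {i}")
        # Move each non-empty cluster's center to the mean of its members.
        for c in range(len(centers)):
            members = [points[i] for i, a in assign.items() if a == c]
            if members:
                centers[c] = tuple(sum(x) / len(members) for x in zip(*members))
        if assign == previous:
            break
        previous = assign
    return [previous[i] for i in range(len(points))], centers


# Toy data (hypothetical): four 2-D vectors standing in for stroke features.
points = [(0.0, 0.0), (0.2, 0.0), (1.0, 0.0), (1.2, 0.0)]
init = [(0.0, 0.0), (1.2, 0.0)]

plain, _ = cop_kmeans(points, init)
separated, _ = cop_kmeans(points, init, cannot_link=[(0, 1)])
joined, _ = cop_kmeans(points, init, must_link=[(1, 2)])
```

With no constraints the two left-hand points share a cluster; the assumed cannot-link pair forces them apart, and the assumed must-link pair pulls the third point into the first cluster. The greedy assignment depends on point order and can fail on constraint sets that are in fact satisfiable, which is a known limitation of this style of algorithm.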


Index Terms

1. Computing methodologies
  1. Machine learning


Published in
ACM Transactions on Asian Language Information Processing Volume 13, Issue 3
September 2014
83 pages
ISSN: 1530-0226
EISSN: 1558-3430
DOI: 10.1145/2676410
Editor: Richard Sproat, Google, Inc., USA
Copyright © 2014 ACM
Publisher
Association for Computing Machinery
New York, NY, United States
Publication History
- Published: 3 October 2014
- Revised: 1 April 2014
- Accepted: 1 April 2014
- Received: 1 August 2013
Author Tags
Devanagari character recognition
allograph modeling
constrained stroke clustering
online handwriting recognition
Qualifiers
- research-article
- Research
- Refereed