Abstract
Writer-specific variations in how characters are written, such as differences in stroke order and stroke number, are an important source of input variability when handwriting is captured “online” via a stylus, and they pose a challenge for robust online recognition of handwritten characters and words. Several studies have shown that explicit modeling of character allographs is important for achieving high recognition accuracy in a writer-independent recognition system. While previous approaches have relied on unsupervised clustering at the character or stroke level to find the allographs of a character, in this article we propose the use of constrained clustering, with automatically derived domain constraints, to find a minimal set of stroke clusters. The allographs identified have been applied to Devanagari character recognition using Hidden Markov Models and Nearest Neighbor classifiers, and the results indicate substantial improvement in recognition accuracy and/or reduction in memory and computation time when compared to alternative modeling techniques.
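To make the idea of constrained clustering concrete, the following is a minimal sketch and not the article's algorithm. It assumes strokes are already reduced to fixed-length feature vectors, that the domain constraints take the form of cannot-link pairs (for example, two strokes that should never share a cluster), and that clusters are merged greedily by centroid distance up to a threshold. The function name `constrained_agglomerative`, the `threshold` parameter, and the toy data are all hypothetical.

```python
from itertools import combinations
from math import dist


def constrained_agglomerative(points, cannot_link, threshold):
    """Greedy agglomerative clustering that respects cannot-link pairs.

    points      -- list of equal-length feature tuples (assumed stroke features)
    cannot_link -- iterable of (i, j) index pairs that must not share a cluster
    threshold   -- maximum centroid distance at which two clusters may merge
    Returns a list of clusters, each a list of point indices.
    """
    clusters = [[i] for i in range(len(points))]
    forbidden = {frozenset(pair) for pair in cannot_link}
    dims = len(points[0]) if points else 0

    def centroid(cluster):
        # Mean of the member points, one coordinate at a time.
        return tuple(
            sum(points[i][d] for i in cluster) / len(cluster) for d in range(dims)
        )

    def violates(a, b):
        # A merge is illegal if any cross pair is a cannot-link pair.
        return any(frozenset((i, j)) in forbidden for i in a for j in b)

    while True:
        best = None
        for a, b in combinations(range(len(clusters)), 2):
            if violates(clusters[a], clusters[b]):
                continue
            d = dist(centroid(clusters[a]), centroid(clusters[b]))
            if d <= threshold and (best is None or d < best[0]):
                best = (d, a, b)
        if best is None:
            # No legal merge remains under the threshold.
            return clusters
        _, a, b = best
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]  # b > a, so index a is unaffected


# Toy data: three nearby "strokes", two distant ones (hypothetical features).
toy_points = [(0.0, 0.0), (0.1, 0.0), (5.0, 5.0), (5.1, 5.0), (0.05, 0.05)]
unconstrained = constrained_agglomerative(toy_points, [], threshold=1.0)
constrained = constrained_agglomerative(toy_points, [(0, 4)], threshold=1.0)
```

On this toy data the cannot-link pair keeps points 0 and 4 apart, so the constrained run ends with one more cluster than the unconstrained run. The constraints limit which merges are allowed, and the merging otherwise drives the number of clusters down as far as the threshold permits.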
Index Terms
- Allograph modeling for online handwritten characters in Devanagari using constrained stroke clustering