Abstract
Offline handwritten Chinese character recognition is a very hard pattern-recognition problem of considerable practical importance. Two popular approaches are to extract features holistically from the character image and to decompose characters structurally into component parts, usually strokes. Here we take a novel approach: decomposing characters into radicals directly from image information (i.e., without first decomposing into strokes). During training, 60 examples of each radical were represented by "landmark" points, labeled semiautomatically, with radicals in different characteristic positions treated as distinct radicals. Kernel principal-component analysis then captured the main (nonlinear) variations around the mean radical. During recognition, the dynamic tunneling algorithm was used to search for the shape parameters that minimize the chamfer distance. Treating character composition as a Markov process in which up to four radicals are combined in some assumed sequential order, we recognize complete, hierarchically composed characters by using the Viterbi algorithm. This gave a writer-independent character recognition rate of 93.5% on a test set of 430,800 characters from 2,154 character classes composed of 200 radical categories, which is comparable to the best reported results in the literature. Although the initial semiautomatic landmark labeling is time consuming, the decomposition approach is theoretically well motivated and allows the different sources of variability in Chinese handwriting to be handled separately and by the most appropriate means, either learned from example data or incorporated as prior knowledge. Hence, high generalizability is obtained from small amounts of training data, and only simple prior knowledge needs to be incorporated, thus promising robust recognition performance. As such, there is considerable potential for further development and improvement in the direction of larger character sets and less constrained writing conditions.
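To make the matching criterion concrete, the following is a minimal sketch of a chamfer distance between a radical model's landmark points and the edge pixels of a character image. It is an illustration under stated assumptions, not the authors' implementation: the function name `chamfer_distance`, the brute-force nearest-point search, and the sample coordinates are all hypothetical. Practical chamfer matching usually precomputes a distance transform of the edge image rather than searching every edge point.

```python
import math


def chamfer_distance(model_points, edge_points):
    """Mean distance from each model landmark to its nearest edge point.

    Both arguments are sequences of (x, y) pairs. This brute-force form is
    an illustrative assumption; a distance transform gives the same values
    far more cheaply when many candidate shapes are evaluated.
    """
    if not model_points or not edge_points:
        raise ValueError("both point sets must be non-empty")
    total = 0.0
    for mx, my in model_points:
        # Distance from this landmark to the closest edge pixel.
        total += min(math.hypot(mx - ex, my - ey) for ex, ey in edge_points)
    return total / len(model_points)


# Hypothetical data: three landmarks of a radical model and four edge pixels.
model = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
edges = [(0.0, 1.0), (1.0, 0.0), (2.0, 2.0), (5.0, 5.0)]
score = chamfer_distance(model, edges)
print(f"chamfer distance: {score:.4f}")
```

A lower score means the deformed model lies closer to the image edges, so a search over shape parameters (the abstract uses the dynamic tunneling algorithm for this) would aim to minimize this value.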
Index Terms
- Offline handwritten Chinese character recognition by radical decomposition