Training of an on-line handwritten Japanese character recognizer by artificial patterns

doi:10.1016/j.patrec.2012.07.012

Pattern Recognition Letters

Volume 35, 1 January 2014, Pages 178-185

https://doi.org/10.1016/j.patrec.2012.07.012

Abstract

This paper presents the effects of a large number of artificially generated training patterns on training an on-line handwritten Japanese character recognizer based on the Markov random field model. In general, the more training patterns, the higher the recognition accuracy. In reality, however, existing pattern samples are insufficient, especially for languages with large character sets, for which a larger number of parameters must be adjusted. We use six types of linear distortion models, and combine them among themselves and with a non-linear distortion model, to generate a large number of artificial patterns. These models are based on several geometric transformation models, which are considered to simulate distortions in real handwriting. We apply these models to the TUAT Nakayosi database and expand its volume by up to 300 times, and evaluate on the TUAT Kuchibue database the notable effect on recognition accuracy. The effect is analyzed for subgroups of the character set, and a significant effect is observed for Kanji, ideographic characters of Chinese origin. This paper also considers the order of linear and non-linear distortion models and the strategy for selecting patterns in the original database: from patterns close to character class models to those away from them, or vice versa. For this consideration, we merge the Nakayosi and Kuchibue databases. We take 100 patterns from the merged database to form the testing set, and the remaining samples form the training set. For the order, linear then non-linear distortions produce higher recognition accuracy. For the strategy, selecting patterns from those away from character class models to those close to them produces higher accuracy.

Highlights

• We enhance the accuracy of an on-line handwriting recognizer by using artificial patterns.
• Linear distortion models, a non-linear distortion model, and combined models are tested.
• Two combination sequences (LDMs then NLDM, or vice versa) and two original pattern selection strategies are tested.
• For the order, linear then non-linear distortions produce higher recognition accuracy.
• For the strategy, selecting patterns away from prototypes obtains higher accuracy.

Introduction

Research on on-line handwritten Japanese character recognition has pursued recognition accuracy high enough to be accepted by users of real applications (Plamondon and Srihari, 2000, Liu et al., 2004). There are four main methods for dealing with the problem that character patterns are often distorted. The first is to decrease distortion by non-linear normalization (Yamada et al., 1984, Tsukumo and Tanaka, 1988, Liu et al., 2003) or to remove distortion by reverse distortion in a normalization step (Wakahara and Odaka, 1996, Satoh et al., 1999). The second is to improve discriminant functions such as MQDF for off-line recognition (Kimura et al., 1987), and HMM (Jaeger et al., 2001) or MRF (Zhu and Nakagawa, 2011) for on-line recognition. The third is to select or extract stable features (Liu and Zhou, 2006). The fourth is to train classifiers with an increased amount of training patterns (Smith et al., 1994). Collecting training patterns is very costly, however, so artificial pattern generation has been used (Ha and Bunke, 1997, Mori et al., 2000, Leung et al., 1985, Leung and Leung, 2009).

This paper focuses on artificial pattern generation. In general, the more training patterns are employed, the higher the recognition accuracy achieved. In reality, however, existing pattern samples are insufficient, especially for languages with large character sets, for which a larger number of parameters need to be adjusted. Thus, we consider artificial pattern generation. Several methods have been proposed that transform character patterns in accordance with some model to produce artificial patterns. Ha and Bunke (1997) used the concept of perturbation due to writing habits and instruments for off-line handwritten numeral recognition, where they proposed six types of linear distortion models to reverse an input image back to its standard form and so address the problem of pattern variation. Mori et al. (2000) proposed a character pattern generation method based on point correspondence between patterns. Leung et al. (1985) and Leung and Leung (2009) artificially generated a huge number of training samples in accordance with a non-linear distortion model for off-line handwritten Chinese character recognition, and demonstrated that distorted sample generation is effective, in addition to regularization of class covariance matrices and feature dimension reduction, when the dimension of the feature vector is high and the number of training samples is insufficient. Velek et al. (2002) proposed a method to generate brush-written off-line patterns from on-line patterns. Postal address recognition had problems reading characters written with a traditional brush on New Year cards, since the amount of training patterns for such characters was limited.

In this paper, we consider on-line pattern generation for on-line handwritten Japanese character recognition. We adopt the six types of linear distortion models (LDMs) proposed by Ha and Bunke (1997) and use them to generate a large number of artificial patterns, with which we train a handwritten Japanese character recognizer. Then, we combine the LDMs with the non-linear distortion model (NLDM) proposed by Leung et al. (1985) and Leung and Leung (2009) to obtain combined distortion models (CDMs), and again generate artificial patterns to train the above recognizer.
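As a rough illustration of how a linear distortion can be applied to an on-line pattern, the sketch below treats a distortion as a 2x2 linear map applied to every pen coordinate. The function names, the parameter values, and the representation of a pattern as a list of strokes are assumptions made for illustration; they are not the exact models or parameters of Ha and Bunke (1997) or of this paper.

```python
import math

# Assumption: a pattern is a list of strokes, and a stroke is a list of (x, y) points.

def apply_linear(pattern, a, b, c, d):
    """Map every point (x, y) to (a*x + b*y, c*x + d*y)."""
    return [[(a * x + b * y, c * x + d * y) for (x, y) in stroke]
            for stroke in pattern]

def shear_x(pattern, k):
    """Shear along the X-direction: x' = x + k*y, y' = y."""
    return apply_linear(pattern, 1.0, k, 0.0, 1.0)

def shear_y(pattern, k):
    """Shear along the Y-direction: x' = x, y' = y + k*x."""
    return apply_linear(pattern, 1.0, 0.0, k, 1.0)

def rotate(pattern, degrees):
    """Rotate counter-clockwise about the origin (a generic linear map)."""
    t = math.radians(degrees)
    return apply_linear(pattern, math.cos(t), -math.sin(t),
                        math.sin(t), math.cos(t))

# A hypothetical two-stroke pattern (a cross shape), not taken from any database.
original = [[(0.0, 0.0), (1.0, 0.0)], [(0.5, -0.5), (0.5, 0.5)]]
sheared = shear_x(original, 0.2)   # illustrative shear factor
rotated = rotate(original, 90.0)   # illustrative angle
```

Because each distortion returns a new pattern with the same stroke structure, several of them can be chained to build combined variants without touching the recognizer itself.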

Here it is worth noting that the basic LDMs proposed by Ha and Bunke (1997) were applied in preprocessing to reverse an input image back to its standard form; they were applied only to numeral patterns; and they were employed in the recognition stage, so that additional recognition time was incurred. In contrast, we employ them for pattern generation, so that the recognition time is not affected. Moreover, Leung et al. (1985) and Leung and Leung (2009) proposed the non-linear distortion model for off-line Chinese character recognition, while we combine it with LDMs for on-line recognition.

This paper is an extension of the conference papers (Chen et al., 2010, Chen et al., 2011), which reported the increase in recognition rate obtained by employing the proposed method to generate artificial patterns for training. This paper presents them in more detail and considers the effects of the combination sequence for CDMs and of the original pattern selection strategy. There are two combination sequences: LDMs then NLDM, and NLDM then LDMs. There are also two original pattern selection strategies: selecting patterns in the original database from those close to character class models to those away from them, and vice versa. The two combination sequences and the two selection strategies are combined pairwise. For this consideration, we merge the Nakayosi and Kuchibue databases and take 100 patterns from the merged database to form the testing set, while the remaining samples form the training set. Moreover, we attempt to find a generation method that employs relatively few real patterns while increasing recognition accuracy efficiently. Detailed performance evaluations and discussions are presented that show the effectiveness of the proposed method.
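The two original pattern selection strategies can be pictured as two sort orders over the same pool of patterns. The sketch below is a minimal illustration under stated assumptions: each pattern is assumed to come with a precomputed distance to its character class model, and the identifiers and distance values are hypothetical; the paper's actual distance measure is not given in this excerpt.

```python
def order_by_distance(scored_patterns, close_first=True):
    """Sort (pattern_id, distance) pairs by distance to the class model.

    close_first=True  -> from patterns close to the model to those away from it
    close_first=False -> from patterns away from the model to those close to it
    """
    return sorted(scored_patterns, key=lambda item: item[1],
                  reverse=not close_first)

def select_originals(scored_patterns, count, close_first=True):
    """Pick the first `count` pattern ids under the chosen ordering."""
    ordered = order_by_distance(scored_patterns, close_first)
    return [pattern_id for pattern_id, _ in ordered[:count]]

# Hypothetical distances of four patterns to their class model.
scored = [("p1", 0.9), ("p2", 0.1), ("p3", 0.5), ("p4", 0.3)]

close_to_away = select_originals(scored, 2, close_first=True)
away_to_close = select_originals(scored, 2, close_first=False)
```

With this framing, each strategy differs only in which originals are fed to the distortion models first, so the same generation code can serve both.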

The rest of this paper is organized as follows: Section 2 describes the basic ideas of our proposed method. Section 3 briefly describes the databases and pattern transformation. Section 4 introduces the recognizer that we used. Section 5 details the 12 LDMs, the NLDM, and the CDMs, together with experimental results on how they increase recognition accuracy. Section 6 presents experiments on the two combination sequences and two original pattern selection strategies. Section 7 describes the results and analysis. Section 8 draws our concluding remarks.

Section snippets

Basic ideas

Our approach to generating artificial patterns is based on the observation of how people write and deform character patterns. First, people try to write characters beautifully in accordance with the rules of calligraphy. As far as calligraphy is concerned, characters should be written by following several types of distortion, which differ from printed type. Fig. 1(a) shows calligraphy styles corresponding to printed types. Samples of shear along the X-direction and Y-direction and shrink toward

Databases

An on-line handwritten character pattern is composed of a sequence of strokes, and each stroke is composed of a time sequence of coordinates sampled from a tablet or touch-sensitive device. The TUAT HANDS Nakayosi and Kuchibue databases of on-line handwritten Japanese character patterns (Nakagawa and Matsumoto, 2004) are used in this experiment. The Kuchibue database contains the patterns of 120 writers: 11,962 patterns per writer covering 3356 categories. Excluding the JIS level-2 Kanji
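The stroke-based structure described above can be sketched as a small data model, together with the database size implied by the stated figures. The class and field names are assumptions chosen for illustration and do not reflect the actual file format of the TUAT databases; the sample character is hypothetical.

```python
from dataclasses import dataclass
from typing import List, Tuple

Point = Tuple[float, float]

@dataclass
class Stroke:
    # Time-ordered pen coordinates from pen-down to pen-up.
    points: List[Point]

@dataclass
class OnlinePattern:
    label: str             # character category
    strokes: List[Stroke]  # strokes in writing order

    def point_count(self) -> int:
        """Total number of sampled points over all strokes."""
        return sum(len(stroke.points) for stroke in self.strokes)

# A hypothetical two-stroke sample, not taken from the databases.
pattern = OnlinePattern(
    label="十",
    strokes=[
        Stroke([(0.0, 0.5), (1.0, 0.5)]),
        Stroke([(0.5, 0.0), (0.5, 0.5), (0.5, 1.0)]),
    ],
)

# Size implied by the figures stated for Kuchibue.
KUCHIBUE_WRITERS = 120
PATTERNS_PER_WRITER = 11_962
total_kuchibue_patterns = KUCHIBUE_WRITERS * PATTERNS_PER_WRITER
```

Under these figures the Kuchibue database holds 120 × 11,962 = 1,435,440 patterns in total.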

On-line handwritten character recognition system employed

We adopt a linear-chain Markov random field (MRF) model with weighting parameters optimized by CRFs to recognize character patterns (Zhu and Nakagawa, 2011). Here we summarize the system. It extracts feature points along the pen-tip trace from pen-down to pen-up and sets each feature point from an input pattern as a site and each state from a character class as a label. It uses the coordinates of feature points as unary features and the differences in coordinates between the neighboring feature
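The two kinds of features mentioned above can be sketched as follows. This is only an illustration under assumptions: the feature points are taken as given (their extraction is not shown), the function names are hypothetical, and treating the neighbor differences as the pairwise features of the linear chain is an assumption because the source sentence is cut off.

```python
def unary_features(feature_points):
    """Unary feature of each site: the (x, y) coordinates of the feature point."""
    return [(x, y) for (x, y) in feature_points]

def neighbor_differences(feature_points):
    """Coordinate differences between each pair of neighboring feature points."""
    return [(x1 - x0, y1 - y0)
            for (x0, y0), (x1, y1) in zip(feature_points, feature_points[1:])]

# Hypothetical feature points along a pen-tip trace.
points = [(0, 0), (3, 4), (3, 10)]
unary = unary_features(points)
pairwise = neighbor_differences(points)
```

A chain of n feature points yields n unary features and n - 1 neighbor differences, which matches the site-and-edge layout of a linear-chain model.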

Distortion models and their evaluation

We use distortion models to generate a large number of artificial patterns from on-line handwritten samples to train the character recognizer. The Japanese character set consists of different types of characters: symbols, numerals, upper case Roman letters, lower case Roman letters, upper case Greek, lower case Greek, hiragana, katakana, and Kanji characters of Chinese origin. With the above nine character subgroups, we apply these models to the TUAT Nakayosi database and obtain the overall

Combination sequence and original pattern quality to artificial pattern

In this section, we investigate the influence of the order of combining LDMs and NLDM, i.e., LDMs then NLDM and vice versa, denoted as L-NL and NL-L, respectively. The CDM without a shear part, DM(10, 300, 1), belongs to L-NL. Similarly, we reverse the combining order of LDMs and NLDM and obtain DM(11, 300, 1), which belongs to NL-L.
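Why the combining order matters can be seen from a toy example: a linear map and a non-linear map generally do not commute, so L-NL and NL-L produce different artificial patterns from the same original. In the sketch below, both transforms are hypothetical stand-ins (an X-direction shear and a coordinate-squaring warp) chosen only to make the arithmetic easy to follow; they are not the paper's LDMs or NLDM.

```python
def compose(*transforms):
    """Return a function that applies the given transforms from left to right."""
    def pipeline(pattern):
        for transform in transforms:
            pattern = transform(pattern)
        return pattern
    return pipeline

def linear_shear(pattern, k=0.5):
    """Stand-in linear distortion: x' = x + k*y, y' = y."""
    return [[(x + k * y, y) for (x, y) in stroke] for stroke in pattern]

def nonlinear_warp(pattern):
    """Stand-in non-linear distortion: square each coordinate."""
    return [[(x * x, y * y) for (x, y) in stroke] for stroke in pattern]

l_nl = compose(linear_shear, nonlinear_warp)  # LDM then NLDM
nl_l = compose(nonlinear_warp, linear_shear)  # NLDM then LDM

sample = [[(0.5, 0.5)]]
result_l_nl = l_nl(sample)
result_nl_l = nl_l(sample)
```

For the single point (0.5, 0.5), the L-NL order gives (0.5625, 0.25) while the NL-L order gives (0.375, 0.25), which shows that the two sequences are genuinely different generators.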


Conclusion

We have presented an effective approach to enhancing the accuracy of on-line handwritten Japanese character recognition by using a large number of artificial patterns generated by 12 linear distortion models and their combination with a non-linear distortion model. With experiments on nine character subgroups of the Kuchibue database, the recognition accuracies are improved for most of the subgroups, which demonstrates the effectiveness of our approach. Our 12 LDMs improved the recognition accuracy of all the

References (22)

Ramer, U., 1972. An iterative procedure for the polygonal approximation of plane closed curves. Comput. Graphics Image Process.
Chen, B., Zhu, B., Nakagawa, M., 2010. Effects of a large amount of artificial patterns for on-line handwritten...
Chen, B., Zhu, B., Nakagawa, M., 2011. Effects of generating a large amount of artificial patterns for on-line...
Ha, T.M., et al., 1997. Off-line, handwritten numeral recognition by perturbation method. IEEE Trans. Pattern Anal. Machine Intell.
Jaeger, S., Nakagawa, M., 2001. Two on-line Japanese character databases in unipen format. In: Proc. 6th International...
Jaeger, S., et al., 2001. On-line handwriting recognition: the NPen++ recognizer. Int. J. Document Anal. Recognit.
Kimura, F., 1987. Modified quadratic discriminant functions and the application to Chinese character recognition. IEEE Trans. Pattern Anal. Machine Intell.
Leung, K.C., Leung, C.H., 2009. Recognition of handwritten Chinese characters by combining regularization, Fisher’s...
Leung, C.H., Cheung, Y.S., Chan, K.P., 1985. A distortion model for Chinese character generation. In: Proceeding of the...
Liu, C.L., Zhou, X.D., 2006. Online Japanese character recognition using trajectory-based normalization and direction...
Liu, C.L., Sako, H., Fujisawa, H., 2003. Handwritten Chinese Character Recognition Alternatives to Nonlinear...

