Abstract
A post-processing system for OCR of Gurmukhi script has been developed. The post-processor combines statistical information on Punjabi syllable combinations, corpus look-up, and heuristics based on Punjabi grammar rules. On clean images, these post-processing techniques improve the recognition rate by about 3 percentage points, from 94.35% to 97.34%.
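The reported gain can be read either as an absolute change in recognition rate or as a relative reduction in errors. The short sketch below works out both from the two figures given in the abstract; the function and variable names are illustrative assumptions and do not come from the paper.

```python
# Worked arithmetic for the recognition rates quoted in the abstract.
# Names here are illustrative only; they do not come from the paper.

def error_rate(recognition_rate_pct: float) -> float:
    """Error rate in percent, given a recognition rate in percent."""
    return 100.0 - recognition_rate_pct

before = 94.35  # recognition rate without post-processing (from the abstract)
after = 97.34   # recognition rate with post-processing (from the abstract)

# Absolute improvement, in percentage points.
absolute_gain = after - before

# Relative reduction in the error rate, in percent.
relative_error_reduction = (
    (error_rate(before) - error_rate(after)) / error_rate(before) * 100.0
)

print(f"absolute gain: {absolute_gain:.2f} percentage points")
print(f"relative error reduction: {relative_error_reduction:.1f}%")
```

Under this reading, the gain is 2.99 percentage points, which corresponds to removing roughly half (about 52.9%) of the recognition errors.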
Cite this article
Lehal, G.S., Singh, C. A post-processor for Gurmukhi OCR. Sadhana 27, 99–111 (2002). https://doi.org/10.1007/BF02703315
DOI: https://doi.org/10.1007/BF02703315