Published in: International Journal on Document Analysis and Recognition (IJDAR) 4/2015

01-12-2015 | Original Paper

Handwritten word recognition using Web resources and recurrent neural networks

Authors: Cristina Oprean, Laurence Likforman-Sulem, Adrian Popescu, Chafic Mokbel


Abstract

Handwriting recognition systems usually rely on static dictionaries and language models. These dictionaries generally cannot achieve full coverage of unrestricted document corpora because of Out-Of-Vocabulary (OOV) words. We propose an approach that uses the World Wide Web as a corpus to improve dictionary coverage. We exploit the very large and freely available Wikipedia corpus to build dynamic dictionaries on the fly. We rely on recurrent neural network (RNN) recognizers, with and without linguistic resources, to detect words that are not reliably recognized within a word sequence. Such words are labeled non-anchor words (NAWs) and include OOVs as well as In-Vocabulary words recognized with low confidence. To recognize a NAW, a dynamic dictionary is built by selecting words from the Web resource based on their string similarity with the NAW image and their linguistic relevance in the NAW context. Similarity is evaluated by computing the edit distance between the Wikipedia words and the sequence of characters generated by the RNN recognizer, which is used as a filler model. Linguistic relevance is based on an N-gram language model estimated from the Wikipedia corpus. Experiments conducted on a word-segmented version of the publicly available RIMES database show that the proposed approach can improve recognition accuracy compared to systems based on static dictionaries only. The improvement in both accuracy and dictionary coverage grows as the proportion of OOVs increases.
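
The dictionary-building step described in the abstract can be illustrated with a small sketch. It is not the authors' implementation: the function names, the edit-distance threshold `max_dist`, the add-one smoothed bigram model, the toy corpus, and the rule of sorting by edit distance first and bigram probability second are all assumptions made for illustration. The abstract only states that candidates are selected by edit distance to the RNN character output and by N-gram relevance in context.

```python
from collections import Counter


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance (insert, delete, substitute) with a two-row DP."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def dynamic_dictionary(char_seq, vocabulary, max_dist=2):
    """Keep vocabulary words within max_dist edits of the recognizer output.

    max_dist is a hypothetical threshold, not a value from the paper.
    """
    return [w for w in sorted(vocabulary)
            if edit_distance(char_seq, w) <= max_dist]


def rank_candidates(char_seq, prev_word, candidates, bigrams, unigrams):
    """Order candidates by edit distance, then by bigram probability.

    The add-one smoothed bigram model and this tie-breaking rule are
    assumptions; the paper's actual combination may differ.
    """
    vocab_size = len(unigrams)

    def bigram_prob(w):
        return (bigrams[(prev_word, w)] + 1) / (unigrams[prev_word] + vocab_size)

    return sorted(candidates,
                  key=lambda w: (edit_distance(char_seq, w), -bigram_prob(w)))


# Toy stand-in for the Web corpus (hypothetical data, not Wikipedia).
tokens = ("je vous prie de bien vouloir envoyer mon contrat "
          "je vous prie de bien vouloir annuler mon contrat "
          "mon contact").split()
unigrams = Counter(tokens)
bigrams = Counter(zip(tokens, tokens[1:]))

# "contrt" plays the role of the raw character sequence from the filler model.
candidates = dynamic_dictionary("contrt", unigrams.keys())
ranked = rank_candidates("contrt", "mon", candidates, bigrams, unigrams)
print(ranked)
```

In this toy setting the dynamic dictionary for the noisy string contains only the words close to it, so the final recognizer would choose among a handful of candidates rather than the full Web vocabulary.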


Metadata
Title
Handwritten word recognition using Web resources and recurrent neural networks
Authors
Cristina Oprean
Laurence Likforman-Sulem
Adrian Popescu
Chafic Mokbel
Publication date
01-12-2015
Publisher
Springer Berlin Heidelberg
Published in
International Journal on Document Analysis and Recognition (IJDAR) / Issue 4/2015
Print ISSN: 1433-2833
Electronic ISSN: 1433-2825
DOI
https://doi.org/10.1007/s10032-015-0251-1
