Published in: Pattern Analysis and Applications 4/2015

1 November 2015 | Industrial and Commercial Application

A deep HMM model for multiple keywords spotting in handwritten documents

Authors: Simon Thomas, Clément Chatelain, Laurent Heutte, Thierry Paquet, Yousri Kessentini

Abstract

In this paper, we propose a query-by-string word spotting system that extracts arbitrary keywords from handwritten documents, making both segmentation and recognition decisions at the line level. The system combines an HMM line model, built from keyword and non-keyword (filler) models, with a deep neural network that estimates the state-dependent observation probabilities. Experiments are carried out on the RIMES database, an unconstrained handwritten document database used for benchmarking handwriting recognition tasks. The results show that the proposed framework outperforms both the classical GMM–HMM and the standard hybrid HMM architectures.
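
The combination described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes the usual hybrid NN/HMM convention in which network posteriors p(s|x) are divided by state priors p(s) to obtain scaled likelihoods, followed by Viterbi decoding over a toy line model with one filler state and a two-state keyword. The state layout, the transition values, the posteriors and all function names are hypothetical and chosen only for illustration.

```python
import math

def safe_log(x):
    """log(x), with log(0) mapped to -inf so forbidden transitions stay forbidden."""
    return math.log(x) if x > 0 else float("-inf")

def scaled_log_likelihoods(posteriors, priors):
    """Turn per-frame state posteriors p(s|x) into log p(s|x) - log p(s)."""
    return [[safe_log(p) - safe_log(q) for p, q in zip(frame, priors)]
            for frame in posteriors]

def viterbi(log_obs, log_trans, log_init):
    """Most likely state sequence for a sequence of per-frame log scores."""
    n = len(log_init)
    score = [log_init[s] + log_obs[0][s] for s in range(n)]
    back = []
    for frame in log_obs[1:]:
        prev, score, pointers = score, [], []
        for s in range(n):
            best = max(range(n), key=lambda r: prev[r] + log_trans[r][s])
            pointers.append(best)
            score.append(prev[best] + log_trans[best][s] + frame[s])
        back.append(pointers)
    state = max(range(n), key=lambda s: score[s])
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    return path[::-1]

def keyword_spans(path, filler=0):
    """Inclusive (start, end) frame ranges where the path leaves the filler state."""
    spans, start = [], None
    for t, state in enumerate(path):
        if state != filler and start is None:
            start = t
        elif state == filler and start is not None:
            spans.append((start, t - 1))
            start = None
    if start is not None:
        spans.append((start, len(path) - 1))
    return spans

# Hypothetical toy line model: state 0 = filler, states 1 and 2 = keyword states.
trans = [[0.8, 0.2, 0.0],   # filler loops or enters the keyword
         [0.0, 0.5, 0.5],   # first keyword state loops or advances
         [0.5, 0.0, 0.5]]   # last keyword state loops or returns to filler
init = [0.9, 0.1, 0.0]
priors = [0.5, 0.25, 0.25]  # assumed state priors
# Made-up network outputs for six frames (each row sums to 1).
posteriors = [[0.8, 0.1, 0.1], [0.7, 0.2, 0.1], [0.1, 0.8, 0.1],
              [0.1, 0.1, 0.8], [0.1, 0.1, 0.8], [0.8, 0.1, 0.1]]

log_trans = [[safe_log(a) for a in row] for row in trans]
log_init = [safe_log(p) for p in init]
path = viterbi(scaled_log_likelihoods(posteriors, priors), log_trans, log_init)
print(path, keyword_spans(path))  # prints [0, 0, 1, 2, 2, 0] [(2, 4)]
```

In this sketch a single decoding pass yields both the segmentation (the frame span) and the recognition decision (keyword versus filler), which is the line-level behaviour the abstract describes.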

Footnotes
1
An extension of this approach has been proposed in [6], where the HMM is replaced by a combination of an LSTM neural network with a CTC algorithm.
 
2
Note that a Restricted Boltzmann Machine (RBM) can also be used; in that case the networks are called Deep Belief Networks [33].
 
3
A vertical scaling procedure is applied to normalize the height of the text line image to \(h=54\).
 
4
An alternative to this strategy is to use a small \(\varepsilon\) together with a maximum number of iterations, as is the case in [45].
 
References
1. Cao H, Govindaraju V (2007) Template-free word spotting in low-quality manuscripts. In: ICDAR, pp 392–396
2. Choisy C (2007) Dynamic handwritten keyword spotting based on the NSHP-HMM. In: Proceedings of the ninth ICDAR, vol 1, pp 242–246
3. Rodríguez-Serrano JA, Perronnin F, Lladós J (2009) A similarity measure between vector sequences with application to handwritten word image retrieval. In: CVPR, pp 1722–1729
4. Rodríguez-Serrano JA, Perronnin F (2009) Handwritten word-spotting using hidden Markov models and universal vocabularies. Pattern Recognit pp 2106–2116
5. Chatelain C, Heutte L, Paquet T (2008) Recognition-based vs syntax-directed models for numerical field extraction in handwritten documents. In: ICFHR, Montreal, p 6
6. Frinken V, Fischer A, Manmatha R, Bunke H (2012) A novel word spotting method based on recurrent neural networks. IEEE Trans Pattern Anal Mach Intell 34(2):211–224
7. Fischer A, Keller A, Frinken V, Bunke H (2012) Lexicon-free handwritten word spotting using character HMMs. Pattern Recognit Lett 33(7):934–942
8. Paquet T, Heutte L, Koch G, Chatelain C (2012) A categorization system for handwritten documents. Int J Doc Anal Recognit 15(4):315–330
9. Vinciarelli A, Bengio S, Bunke H (2004) Offline recognition of unconstrained handwritten texts using HMMs and statistical language models. IEEE Trans Pattern Anal Mach Intell 26(6):709–720
10. Rath T, Manmatha R (2003) Features for word spotting in historical manuscripts. In: ICDAR, pp 218–222
11. Adamek T, Connor N, Smeaton A (2007) Word matching using single closed contours for indexing handwritten historical documents. Int J Doc Anal Recognit 9(2):153–165
12. Rusiñol M, Aldavert D, Toledo R, Lladós J (2011) Browsing heterogeneous document collections by a segmentation-free word spotting method. In: ICDAR, pp 63–67
13. Thomas S, Chatelain C, Heutte L, Paquet T (2010) An information extraction model for unconstrained handwritten documents. In: ICPR, Istanbul, p 4
14. Fischer A, Keller A, Frinken V, Bunke H (2010) HMM-based word spotting in handwritten documents using subword models. In: ICPR, pp 3416–3419
15. Woodland P, Povey D (2002) Large scale discriminative training of hidden Markov models for speech recognition. Comput Speech Lang 16(1):25–47
16. Do TMT, Artières T (2009) Maximum margin training of Gaussian HMMs for handwriting recognition. In: ICDAR, pp 976–980
17. Keshet J, Grangier D, Bengio S (2009) Discriminative keyword spotting. Speech Commun 51:317–329
18. Ganapathiraju A, Hamaker J, Picone J (2000) Hybrid SVM/HMM architectures for speech recognition. In: Interspeech, pp 504–507
19. Huang BQ, Du CJ, Zhang YB, Kechadi MT (2006) A hybrid HMM-SVM method for online handwriting symbol recognition. In: IEEE international conference on intelligent systems design and applications, pp 887–891
20. Boquera S, Bleda M, Gorbe-Moya J, Zamora-Martínez F (2011) Improving offline handwritten text recognition with hybrid HMM/ANN models. IEEE Trans Pattern Anal Mach Intell 33(4):767–779
21. Graves A, Liwicki M, Fernandez S, Bertolami R, Bunke H, Schmidhuber J (2009) A novel connectionist system for unconstrained handwriting recognition. IEEE Trans Pattern Anal Mach Intell 31(5):855–868
22. Marukatat S, Artières T, Gallinari P, Dorizzi B (2001) Sentence recognition through hybrid neuro-markovian modeling. In: ICDAR, pp 731–735
23. Grosicki E, El-Abed H (2009) ICDAR 2009 handwriting recognition competition. In: ICDAR, pp 1398–1402
24. El-Yacoubi MA, Gilloux M, Bertille J-M (2002) A statistical approach for phrase location and recognition within a text line: an application to street name recognition. IEEE Trans Pattern Anal Mach Intell 24(2):172–188
25. Rabiner LR (1990) A tutorial on hidden Markov models and selected applications in speech recognition. In: Readings in speech recognition, pp 267–296
26. Kessentini Y, Paquet T, Benhamadou A (2010) Off-line handwritten word recognition using multi-stream hidden Markov models. Pattern Recognit Lett 31:60–70
27. Bengio Y, LeCun Y, Nohl C, Burges C (1995) LeRec: a NN/HMM hybrid for on-line handwriting recognition. Neural Comput 7:1289–1303
28. Knerr S, Augustin E (1998) A neural network-hidden Markov model hybrid for cursive word recognition. ICPR 2:1518–1520
29. Ganapathiraju A, Hamaker J, Picone J (2000) Hybrid SVM/HMM architectures for speech recognition. In: Speech transcription, workshop, pp 504–507
30. Mohamed A, Dahl G, Hinton G (2012) Acoustic modeling using deep belief networks. IEEE Trans Audio Speech Lang Process 20(1):14–22
31. Dahl GE, Yu D, Deng L, Acero A (2012) Context-dependent pre-trained deep neural networks for large vocabulary speech recognition. IEEE Trans Audio Speech Lang Process 20(1):30–42
32. Ortmanns S, Ney H, Aubert X (1997) A word graph algorithm for large vocabulary continuous speech recognition. Comput Speech Lang 11(1):43–72
33.
34. Ciresan DC, Meier U, Gambardella LM, Schmidhuber J (2011) Convolutional neural network committees for handwritten character classification. In: ICDAR, pp 1135–1139
35. Niu X, Suen C (2012) A novel hybrid CNN SVM classifier for recognizing handwritten digits. Pattern Recognit 45(4):1318–1325
36. Lee H, Pham P, Largman Y, Ng A (2009) Unsupervised feature learning for audio classification using convolutional deep belief networks. In: NIPS, pp 1096–1104
37. Le Q, Zou W, Yeung S, Ng A (2011) Learning hierarchical invariant spatio-temporal features for action recognition with independent subspace analysis. In: CVPR, pp 3361–3368
38. Bengio Y, Lamblin P, Popovici D, Larochelle H (2007) Greedy layer-wise training of deep networks. In: NIPS, pp 153–160
39. Ranzato M, Boureau Y-L, LeCun Y (2007) Sparse feature learning for deep belief networks. In: NIPS
40. Schenk J, Rigoll G (2006) Novel hybrid NN/HMM modelling techniques for on-line handwriting recognition. In: IWFHR
41. Dreuw P, Doetsch P, Plahl C, Ney H (2011) Hierarchical hybrid MLP/HMM or rather MLP features for a discriminatively trained Gaussian HMM: a comparison for offline handwriting recognition. In: IEEE international conference on image processing
42. Kimura F, Tsuruoka S, Miyake Y, Shridhar M (1994) A lexicon directed algorithm for recognition of unconstrained handwritten words. IEICE Trans Inf Syst E77-D(7):785–793
43. Al-Hajj R, Mokbel C, Likforman-Sulem L (2007) Combination of HMM-based classifiers for the recognition of Arabic handwritten words. In: ICDAR, pp 959–963
44. Bergstra J, Breuleux O, Bastien F, Lamblin P, Pascanu R, Desjardins G, Turian J, Warde-Farley D, Bengio Y (2010) Theano: a CPU and GPU math expression compiler
45. Bengio Y (2009) Learning deep architectures for AI. Found Trends Mach Learn 2:1–127
46. Larochelle H, Bengio Y, Louradour J, Lamblin P (2009) Exploring strategies for training deep neural networks. J Mach Learn Res 10:1–40
Metadata
Title
A deep HMM model for multiple keywords spotting in handwritten documents
Authors
Simon Thomas
Clément Chatelain
Laurent Heutte
Thierry Paquet
Yousri Kessentini
Publication date
1 November 2015
Publisher
Springer London
Published in
Pattern Analysis and Applications / Issue 4/2015
Print ISSN: 1433-7541
Electronic ISSN: 1433-755X
DOI
https://doi.org/10.1007/s10044-014-0433-3
