Published in: International Journal of Multimedia Information Retrieval 1/2017

15 October 2016 | Trends and Surveys

A Survey on handwritten documents word spotting

Authors: Rashad Ahmed, Wasfi G. Al-Khatib, Sabri Mahmoud

Abstract

With the rapid growth in the number of handwritten documents that are preserved, processed and accessed in digital form, word spotting in handwritten document images has attracted researchers from several communities, including pattern recognition, computer vision and information retrieval. Handwritten document word spotting has been an active research area, and significant progress has been made in the last few years. Despite this progress, word spotting still struggles to achieve acceptable performance on real-world handwritten document images, which vary widely in writing style and quality. This paper gives an overview of published research on handwritten document image word spotting and of the technologies used in the field. We first describe a general model for document word spotting and then discuss the current challenges in handwritten document word spotting. Next, we discuss the databases used for handwritten document word spotting and other handwritten text tasks. We then present research work on handwritten document word spotting. Finally, we provide several summary tables of published research, covering the handwritten document databases used and the performance results reported for handwritten document word spotting. These tables summarize different aspects of each technique and its reported accuracy.

80.
Zurück zum Zitat Frinken V, Fischer A, Bunke H (2010) A novel word spotting algorithm using bidirectional long short-term memory neural networks. In: Schwenker F, El Gayar N (eds) Artificial neural networks in pattern recognition, 5998 of lecture notes in computer science. Springer, Berlin, pp 185–196 Frinken V, Fischer A, Bunke H (2010) A novel word spotting algorithm using bidirectional long short-term memory neural networks. In: Schwenker F, El Gayar N (eds) Artificial neural networks in pattern recognition, 5998 of lecture notes in computer science. Springer, Berlin, pp 185–196
81.
Zurück zum Zitat Frinken V, Fischer A, Manmatha R, Bunke H (2012) A novel word spotting method based on recurrent neural networks. IEEE Trans Pattern Anal Mach Intell 34:211–224CrossRef Frinken V, Fischer A, Manmatha R, Bunke H (2012) A novel word spotting method based on recurrent neural networks. IEEE Trans Pattern Anal Mach Intell 34:211–224CrossRef
82.
Zurück zum Zitat Retsinas G, Louloudis G, Stamatopoulos, Gatos B (2016) Keyword spotting in handwritten documents using projections of oriented gradients. In: 12th IAPR workshop on document analysis systems (DAS), 411–416 Retsinas G, Louloudis G, Stamatopoulos, Gatos B (2016) Keyword spotting in handwritten documents using projections of oriented gradients. In: 12th IAPR workshop on document analysis systems (DAS), 411–416
83.
Zurück zum Zitat Fischer A, Frinken V, Bunke H, Suen C (2013) Improving HMM-based keyword spotting with character language models. In: 12th international conference on document analysis and recognition (ICDAR), 2013, 506–510 Fischer A, Frinken V, Bunke H, Suen C (2013) Improving HMM-based keyword spotting with character language models. In: 12th international conference on document analysis and recognition (ICDAR), 2013, 506–510
84.
Zurück zum Zitat Marti U-V, Bunke H (2001) Using a statistical language model to improve the performance of an HMM-based cursive handwriting recognition system. Int J Pattern Recognit Artif Intell 15(01):65–90CrossRef Marti U-V, Bunke H (2001) Using a statistical language model to improve the performance of an HMM-based cursive handwriting recognition system. Int J Pattern Recognit Artif Intell 15(01):65–90CrossRef
85.
Zurück zum Zitat Bunke H, Bengio S, Vinciarelli A (2004) Offline recognition of unconstrained handwritten texts using HMMs and statistical language models. IEEE Trans Pattern Anal Mach Intell 26(6):709–720CrossRef Bunke H, Bengio S, Vinciarelli A (2004) Offline recognition of unconstrained handwritten texts using HMMs and statistical language models. IEEE Trans Pattern Anal Mach Intell 26(6):709–720CrossRef
86.
Zurück zum Zitat Liu C-L, Yin F, Wang D-H, Wang Q-F (2011) Casia online and offline chinese handwriting databases. Int Conf Doc Anal Recognit 2011:37–41 Liu C-L, Yin F, Wang D-H, Wang Q-F (2011) Casia online and offline chinese handwriting databases. Int Conf Doc Anal Recognit 2011:37–41
87.
Zurück zum Zitat Perronnin F, Rodriguez-Serrano JA (2009) Fisher kernels for handwritten word spotting. In: Proceedings of the 2009 10th international conference on document analysis and recognition, ICDAR ’09, (Washington, DC, USA), 106–110, IEEE computer society Perronnin F, Rodriguez-Serrano JA (2009) Fisher kernels for handwritten word spotting. In: Proceedings of the 2009 10th international conference on document analysis and recognition, ICDAR ’09, (Washington, DC, USA), 106–110, IEEE computer society
88.
Zurück zum Zitat Zhang H, Wang D-H, Liu C-L (2010) Keyword spotting from online chinese handwritten documents using one-vs-all trained character classifier. Int Conf Front Handwrit Recognit 2010:271–276 Zhang H, Wang D-H, Liu C-L (2010) Keyword spotting from online chinese handwritten documents using one-vs-all trained character classifier. Int Conf Front Handwrit Recognit 2010:271–276
89.
Zurück zum Zitat Zhang H, Zhou X-D, Liu C-L (2013) Keyword spotting in online chinese handwritten documents with candidate scoring based on semi-CRF model. In: Document analysis and recognition (ICDAR), 2013 12th international conference on, 567–571 Zhang H, Zhou X-D, Liu C-L (2013) Keyword spotting in online chinese handwritten documents with candidate scoring based on semi-CRF model. In: Document analysis and recognition (ICDAR), 2013 12th international conference on, 567–571
90.
Zurück zum Zitat Terasawa K, Nagasaki T, Kawashima T (2005) Eig-enspace method for text retrieval in historical document images. Proc Eighth Int Conf Doc Anal Recognit 1:437–441CrossRef Terasawa K, Nagasaki T, Kawashima T (2005) Eig-enspace method for text retrieval in historical document images. Proc Eighth Int Conf Doc Anal Recognit 1:437–441CrossRef
91.
Zurück zum Zitat Terasawa K, Nagasaki T, Kawashima T (2006) Automatic keyword extraction from historical document images. Document analysis systems VII, 3872 of lecture notes in computer science. Springer, Berlin, pp 413–424 Terasawa K, Nagasaki T, Kawashima T (2006) Automatic keyword extraction from historical document images. Document analysis systems VII, 3872 of lecture notes in computer science. Springer, Berlin, pp 413–424
92.
Zurück zum Zitat Nabil Aouadi AK (2011) Word spotting for arabic handwritten historical document retrieval using generalized hough transform. Third Int Conf Pervasive Patterns Appl 2011:67–71 Nabil Aouadi AK (2011) Word spotting for arabic handwritten historical document retrieval using generalized hough transform. Third Int Conf Pervasive Patterns Appl 2011:67–71
93.
Zurück zum Zitat Sousa J, Gil J, Pinto J (2007) Word indexing of ancient documents using fuzzy classification. Fuzzy Syst IEEE Trans 15:852–862CrossRef Sousa J, Gil J, Pinto J (2007) Word indexing of ancient documents using fuzzy classification. Fuzzy Syst IEEE Trans 15:852–862CrossRef
94.
Zurück zum Zitat Fernández D, Lladós J, Fornés A (2011) Handwritten word spotting in old manuscript images using a pseudo-structural descriptor organized in a hash structure. Pattern recognition and image analysis, 6669 of lecture notes in computer science. Springer, Berlin, pp 628–635 Fernández D, Lladós J, Fornés A (2011) Handwritten word spotting in old manuscript images using a pseudo-structural descriptor organized in a hash structure. Pattern recognition and image analysis, 6669 of lecture notes in computer science. Springer, Berlin, pp 628–635
95.
Zurück zum Zitat Bilane P, Bres S, Challita K, Emptoz H (2009) Indexation of syriac manuscripts using directional features. In: 16th IEEE international conference on image processing (ICIP), 2009, 1841–1844 Bilane P, Bres S, Challita K, Emptoz H (2009) Indexation of syriac manuscripts using directional features. In: 16th IEEE international conference on image processing (ICIP), 2009, 1841–1844
96.
Zurück zum Zitat Bilane P, Bres S, Emptoz H (2008) Robust directional features for wordspotting in degraded syriac manuscripts. In: International workshop on content-based multimedia indexing, 526–533 Bilane P, Bres S, Emptoz H (2008) Robust directional features for wordspotting in degraded syriac manuscripts. In: International workshop on content-based multimedia indexing, 526–533
97.
Zurück zum Zitat Zant T, Schomaker L, Haak K (2008) Handwritten-word spotting using biologically inspired features. IEEE Trans Pattern Anal Mach Intell 30:1945–1957CrossRef Zant T, Schomaker L, Haak K (2008) Handwritten-word spotting using biologically inspired features. IEEE Trans Pattern Anal Mach Intell 30:1945–1957CrossRef
98.
Zurück zum Zitat Leydier Y, Lebourgeois F, Emptoz H (2007) Text search for medieval manuscript images. Pattern Recognit 40(12):3552–3567MATHCrossRef Leydier Y, Lebourgeois F, Emptoz H (2007) Text search for medieval manuscript images. Pattern Recognit 40(12):3552–3567MATHCrossRef
99.
Zurück zum Zitat Leydier Y, Ouji A, LeBourgeois F, Emptoz H (2009) Towards an omnilingual word retrieval system for ancient manuscripts. Pattern Recognit 42(9):2089–2105MATHCrossRef Leydier Y, Ouji A, LeBourgeois F, Emptoz H (2009) Towards an omnilingual word retrieval system for ancient manuscripts. Pattern Recognit 42(9):2089–2105MATHCrossRef
100.
Zurück zum Zitat Ghosh SK, Valveny E (2015) Query by string word spotting based on character bi-gram indexing. In: 13th international conference on document analysis and recognition (ICDAR), 2015, 881–885 Ghosh SK, Valveny E (2015) Query by string word spotting based on character bi-gram indexing. In: 13th international conference on document analysis and recognition (ICDAR), 2015, 881–885
101.
Zurück zum Zitat Bui QA, Visani M, Mullot R (2015) Unsupervised word spotting using a graph representation based on invariants. In: 13th international conference on document analysis and recognition (ICDAR), 2015, 616–620 Bui QA, Visani M, Mullot R (2015) Unsupervised word spotting using a graph representation based on invariants. In: 13th international conference on document analysis and recognition (ICDAR), 2015, 616–620
102.
Zurück zum Zitat Riba P, Lladãs J, Fornés A (2015) Handwritten word spotting by inexact matching of grapheme graphs. In: 13th international conference on document analysis and recognition (ICDAR), 2015, 781–785 Riba P, Lladãs J, Fornés A (2015) Handwritten word spotting by inexact matching of grapheme graphs. In: 13th international conference on document analysis and recognition (ICDAR), 2015, 781–785
103.
Zurück zum Zitat Sharma A, Sankar KP (2015) Adapting off-the-shelf cnns for word spotting and recognition. In: 13th international conference on document analysis and recognition (ICDAR), 986–990, x Sharma A, Sankar KP (2015) Adapting off-the-shelf cnns for word spotting and recognition. In: 13th international conference on document analysis and recognition (ICDAR), 986–990, x
105.
Zurück zum Zitat Zhou X-D, Wang D-H, Tian F, Liu C-L, Nakagawa M (2013) Handwritten Chinese/Japanese text recognition using semi-Markov conditional random fields. IEEE Trans Pattern Anal Mach Intell 35:2413–2426CrossRef Zhou X-D, Wang D-H, Tian F, Liu C-L, Nakagawa M (2013) Handwritten Chinese/Japanese text recognition using semi-Markov conditional random fields. IEEE Trans Pattern Anal Mach Intell 35:2413–2426CrossRef
106.
Zurück zum Zitat Glucksman H (1967) Classification of mixed-font alphabets by characteristic loci. In: Proceedings of conference IEEE comput, pp 138–141 Glucksman H (1967) Classification of mixed-font alphabets by characteristic loci. In: Proceedings of conference IEEE comput, pp 138–141
107.
Zurück zum Zitat Serre T, Wolf L, Bileschi S, Riesenhuber M, Poggio T (2007) Robust object recognition with cortex-like mechanisms. IEEE Trans Pattern Anal Mach Intell 29:411–426CrossRef Serre T, Wolf L, Bileschi S, Riesenhuber M, Poggio T (2007) Robust object recognition with cortex-like mechanisms. IEEE Trans Pattern Anal Mach Intell 29:411–426CrossRef
108.
Zurück zum Zitat Serre T, Wolf L, Poggio T (2005) Object recognition with features inspired by visual cortex. In: IEEE computer society conference on computer vision and pattern recognition, 2005. CVPR 2005, vol 2:2, pp 994–1000 Serre T, Wolf L, Poggio T (2005) Object recognition with features inspired by visual cortex. In: IEEE computer society conference on computer vision and pattern recognition, 2005. CVPR 2005, vol 2:2, pp 994–1000
109.
Zurück zum Zitat Powell MJD (1987) Radial basis functions for multivariable interpolation: a review. In: Mason JC, Cox MG (eds) Algorithms for approximation. Clarendon Press, New York, pp 143–167 Powell MJD (1987) Radial basis functions for multivariable interpolation: a review. In: Mason JC, Cox MG (eds) Algorithms for approximation. Clarendon Press, New York, pp 143–167
110.
Zurück zum Zitat NL-HaNA (1903) Archief van het Kabinet der Koningin, Den Haag (Netherlands) NL-HaNA (1903) Archief van het Kabinet der Koningin, Den Haag (Netherlands)
111.
Zurück zum Zitat van Oosten J-P, Schomaker L (2014) Separability versus prototypicality in handwritten word-image retrieval. Pattern Recognit 47(3):1031–1038CrossRef van Oosten J-P, Schomaker L (2014) Separability versus prototypicality in handwritten word-image retrieval. Pattern Recognit 47(3):1031–1038CrossRef
Metadata
Title
A Survey on handwritten documents word spotting
Authors
Rashad Ahmed
Wasfi G. Al-Khatib
Sabri Mahmoud
Publication date
15 October 2016
Publisher
Springer London
Published in
International Journal of Multimedia Information Retrieval / Issue 1/2017
Print ISSN: 2192-6611
Electronic ISSN: 2192-662X
DOI
https://doi.org/10.1007/s13735-016-0110-y
