Published in: International Journal on Document Analysis and Recognition (IJDAR) 4/2019

1 August 2019 | Original Paper

Evaluation of word spotting under improper segmentation scenario

Authors: Sounak Dey, Anguelos Nicolaou, Josep Lladós, Umapada Pal


Abstract

Word spotting is an important recognition task in large-scale retrieval of document collections. In most cases, methods are developed and evaluated assuming perfect word segmentation. In this paper, we propose an experimental framework to quantify the effect that word segmentation quality has on the performance achieved by word spotting methods under identical, unbiased conditions. The framework consists of generating systematic distortions of the segmentation and retrieving the original queries from the distorted dataset. We have tested our framework on several established and state-of-the-art methods using the George Washington and Barcelona Marriage datasets. The experiments allow for an estimate of the end-to-end performance of word spotting methods.
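
As a rough illustration of what a systematic segmentation distortion might look like, the sketch below moves the edges of a word bounding box by a fraction of its size and measures the overlap with the original box. The abstract does not specify the distortion model, so `Box`, `distort`, `iou`, and the edge-fraction parameters are hypothetical names chosen for illustration, not the authors' implementation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    """Axis-aligned word bounding box in pixels (hypothetical helper type)."""
    left: float
    top: float
    right: float
    bottom: float

    @property
    def width(self) -> float:
        return self.right - self.left

    @property
    def height(self) -> float:
        return self.bottom - self.top

    @property
    def area(self) -> float:
        # Clamp to zero so an empty intersection has zero area.
        return max(self.width, 0.0) * max(self.height, 0.0)


def distort(box: Box, dl: float = 0.0, dt: float = 0.0,
            dr: float = 0.0, db: float = 0.0) -> Box:
    """Move each edge outward by a fraction of the box size.

    Assumed distortion model: positive fractions enlarge the box,
    negative fractions crop it.
    """
    return Box(
        left=box.left - dl * box.width,
        top=box.top - dt * box.height,
        right=box.right + dr * box.width,
        bottom=box.bottom + db * box.height,
    )


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two boxes."""
    inter = Box(max(a.left, b.left), max(a.top, b.top),
                min(a.right, b.right), min(a.bottom, b.bottom))
    union = a.area + b.area - inter.area
    return inter.area / union if union > 0 else 0.0


if __name__ == "__main__":
    word = Box(100, 50, 200, 80)
    # Sweep one distortion parameter in fixed steps (systematic, not random).
    for frac in (-0.2, -0.1, 0.0, 0.1, 0.2):
        distorted = distort(word, dr=frac)
        print(f"right edge moved by {frac:+.1f} of width: IoU = {iou(word, distorted):.3f}")
```

In an evaluation along the lines the abstract describes, each distorted box would replace the original word image in the retrieval set, and the word spotting method under test would be queried with the undistorted word; the retrieval score could then be reported per distortion level.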


Metadata
Title: Evaluation of word spotting under improper segmentation scenario
Authors: Sounak Dey, Anguelos Nicolaou, Josep Lladós, Umapada Pal
Publication date: 1 August 2019
Publisher: Springer Berlin Heidelberg
Published in: International Journal on Document Analysis and Recognition (IJDAR), Issue 4/2019
Print ISSN: 1433-2833
Electronic ISSN: 1433-2825
DOI: https://doi.org/10.1007/s10032-019-00338-9
