Published in: International Journal on Document Analysis and Recognition (IJDAR) 1-2/2018

27-04-2018 | Original Paper

Text box proposals for handwritten word spotting from documents

Authors: Suman Ghosh, Ernest Valveny

Abstract

In this article, we propose a new approach to segmentation-free word spotting that combines three contributions. Firstly, inspired by the success of bounding box proposal algorithms in object recognition, we propose a scheme to generate a set of word-independent text box proposals. To do so, we generate a set of atomic bounding boxes by simple connected component analysis and combine them under a set of spatial constraints to obtain the final set of text box proposals. Secondly, an attribute representation based on the Pyramidal Histogram of Characters (PHOC) is encoded in an integral image and used to efficiently evaluate text box proposals for retrieval. Thirdly, we propose an indexing scheme for fast retrieval based on character n-grams. The index is generated from a similar attribute space based on a Pyramidal Histogram of Character N-grams (PHON). All attribute models are learned using linear SVMs over the Fisher Vector representation of the word images, along with the PHOC or PHON labels of the corresponding words. We evaluate the proposed approach on both the query-by-string and query-by-example tasks on standard single- and multi-writer data sets, reporting state-of-the-art results.
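To make the PHOC attribute representation concrete, the following is a minimal Python sketch of how a query string could be turned into a binary PHOC vector. The abstract does not give the construction details, so everything here is an assumption for illustration: the function name `phoc`, the lowercase-plus-digits alphabet, the pyramid levels (2, 3, 4, 5), and the rule that a character is assigned to a region when at least half of its span falls inside it. The rule follows the usual formulation in the embedded-attributes literature and is not necessarily the exact configuration used in this article.

```python
import string

# Assumed alphabet: lowercase letters and digits (hypothetical choice).
ALPHABET = string.ascii_lowercase + string.digits


def phoc(word, levels=(2, 3, 4, 5), alphabet=ALPHABET):
    """Return a binary pyramidal histogram of characters for `word`.

    At each pyramid level L the word is split into L equal regions, and
    each region gets one bit per alphabet symbol. A character sets its
    bit in a region when at least half of the character's span lies
    inside that region (assumed assignment rule).
    """
    word = word.lower()
    n = len(word)
    vec = []
    for level in levels:
        for region in range(level):
            bits = [0] * len(alphabet)
            # Work on an integer grid of size n * level to avoid
            # floating-point ties: character i spans
            # [i * level, (i + 1) * level], region r spans
            # [r * n, (r + 1) * n].
            r_start, r_end = region * n, (region + 1) * n
            for i, ch in enumerate(word):
                if ch not in alphabet:
                    continue  # symbols outside the alphabet are ignored
                c_start, c_end = i * level, (i + 1) * level
                overlap = min(c_end, r_end) - max(c_start, r_start)
                if 2 * overlap >= level:  # overlap >= 50% of the character
                    bits[alphabet.index(ch)] = 1
            vec.extend(bits)
    return vec


if __name__ == "__main__":
    v = phoc("beyond")
    print(len(v), sum(v))
```

With these assumed settings the vector has (2 + 3 + 4 + 5) × 36 = 504 dimensions. For query-by-string, a vector like this would be compared against the attribute scores predicted for each text box proposal; for query-by-example, both sides would come from the learned attribute models.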

Footnotes
1
By segmentation-free we mean the ability of the method to search for a word in a whole, non-segmented page, as opposed to a segmentation-based scenario in which retrieval is performed on segmented word images.
 
Metadata
Title
Text box proposals for handwritten word spotting from documents
Authors
Suman Ghosh
Ernest Valveny
Publication date
27-04-2018
Publisher
Springer Berlin Heidelberg
Published in
International Journal on Document Analysis and Recognition (IJDAR) / Issue 1-2/2018
Print ISSN: 1433-2833
Electronic ISSN: 1433-2825
DOI
https://doi.org/10.1007/s10032-018-0300-7
