Pattern Analysis and Applications

1 November 2013 | Theoretical Advances

Text detection in street level images

Authors: Jonathan Fabrizio, Beatriz Marcotegui, Matthieu Cord

Published in: Pattern Analysis and Applications | Issue 4/2013

Abstract

Text detection in natural images is a very challenging task in computer vision. Image acquisition introduces distortions such as perspective, blur, and illumination changes, and characters may vary widely in shape, size, and color. In this article we introduce a complete text detection scheme. Our architecture is based on a new process that combines a hypothesis generation step, which produces candidate text boxes, with a hypothesis validation step, which filters out false detections. The hypothesis generation process relies on a new, efficient segmentation method based on a morphological operator. Regions are then filtered and classified using shape descriptors based on Fourier, pseudo-Zernike moments, and an original polar descriptor, which is invariant to rotation. The classification process relies on three SVM classifiers combined in a late fusion scheme. Detected characters are finally grouped to generate our text box hypotheses. The validation step is based on a global SVM classification of the box content using dedicated descriptors adapted from the HOG approach. Results on the well-known ICDAR database are reported and show that our method is competitive. The evaluation protocol and metrics are discussed in depth, and results on a very challenging street-level database are also presented.
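As a rough illustration of what "late fusion" means here, the sketch below combines the decision values of three classifiers (one per shape descriptor) after each has scored a region independently. The abstract does not say how the three outputs are combined, so the weighted average, the zero threshold, and the names `late_fusion` and `is_character` are assumptions made for this example, not the authors' method.

```python
def late_fusion(scores, weights=None):
    """Fuse per-classifier decision values into a single score.

    `scores` holds one signed decision value per classifier (for example
    the signed distance to an SVM margin). A plain or weighted average is
    used here as a stand-in combiner; this choice is an assumption.
    """
    if weights is None:
        return sum(scores) / len(scores)
    return sum(w * s for w, s in zip(weights, scores)) / sum(weights)


def is_character(scores, threshold=0.0):
    """Accept a region as a character if the fused score exceeds threshold."""
    return late_fusion(scores) > threshold


# Hypothetical decision values for two candidate regions, in the order
# (Fourier-based, pseudo-Zernike-based, polar-descriptor-based).
region_a = (0.8, -0.2, 0.5)   # two classifiers vote "character"
region_b = (-0.9, 0.1, -0.4)  # two classifiers vote "not a character"

print(is_character(region_a))  # True
print(is_character(region_b))  # False
```

The point of fusing late is that each descriptor keeps its own classifier and feature space; only the final scores are combined, so a weak response from one descriptor can be outvoted by the other two.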

f is anti-extensive ⇔ \(f(X) \subseteq X\) for every set X.

f is extensive ⇔ \(X \subseteq f(X)\) for every set X.
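These two properties can be checked on a small binary example. The sketch below is illustrative only: it represents a binary image as a set of pixel coordinates and uses a 3×3 square structuring element, both of which are assumptions made for this example rather than details from the article. With a structuring element that contains the origin, erosion and opening are anti-extensive, while dilation and closing are extensive.

```python
from itertools import product

# 3x3 square structuring element centred on the origin (an assumption for
# this illustration; it is symmetric and contains (0, 0)).
SQUARE = frozenset(product((-1, 0, 1), repeat=2))


def dilate(X, B=SQUARE):
    """Binary dilation: every pixel reachable from X by an offset in B."""
    return {(r + dr, c + dc) for (r, c) in X for (dr, dc) in B}


def erode(X, B=SQUARE):
    """Binary erosion: pixels whose whole B-neighbourhood lies in X.

    Restricting the search to pixels of X is valid because B contains
    the origin.
    """
    return {(r, c) for (r, c) in X
            if all((r + dr, c + dc) in X for (dr, dc) in B)}


def opening(X, B=SQUARE):
    """Erosion followed by dilation (B is symmetric)."""
    return dilate(erode(X, B), B)


def closing(X, B=SQUARE):
    """Dilation followed by erosion (B is symmetric)."""
    return erode(dilate(X, B), B)


# A 4x4 block plus one isolated pixel.
X = {(r, c) for r in range(4) for c in range(4)} | {(6, 6)}

print(opening(X) <= X)        # True: opening is anti-extensive
print(X <= closing(X))        # True: closing is extensive
print((6, 6) in opening(X))   # False: the isolated pixel is removed
```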

The smaller p is, the more likely pixels are to be assigned to the high-value areas (Fig. 5).

The French National Geographic Institute (IGN) [20].

Image-based Town On-line Web Navigation and Search Engine [22], a project funded by the ANR (French National Research Agency [1]) and the French consortium Cap Digital. The first goal is to allow users to navigate freely within the image flow of a city; the second is to automatically enhance cartographic databases by extracting features from this image flow.

A solution would be to annotate with three classes instead of two: one for text, one for non-text, and one for unreadable text. The system would then not be penalized whether it detects unreadable text or not.
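One way to read this proposal is as a "don't care" rule in the scoring. The sketch below is a simplified illustration under stated assumptions: boxes are `(x0, y0, x1, y1)` tuples, a detection matches a ground-truth box when their intersection over union reaches 0.5, and matching is greedy and one-to-one. None of this is the ICDAR protocol or the authors' evaluation code, and the function names are hypothetical.

```python
def iou(a, b):
    """Intersection over union of two boxes (x0, y0, x1, y1)."""
    iw = min(a[2], b[2]) - max(a[0], b[0])
    ih = min(a[3], b[3]) - max(a[1], b[1])
    if iw <= 0 or ih <= 0:
        return 0.0
    inter = iw * ih
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)


def precision_recall(detections, text_boxes, unreadable_boxes, thr=0.5):
    """Score detections while ignoring ground truth marked 'unreadable'.

    A detection that matches a readable text box is a true positive.
    A detection that only matches an unreadable box is skipped: it is
    neither rewarded nor counted as a false positive. Unreadable boxes
    never count as misses because recall uses readable boxes only.
    """
    unmatched = list(text_boxes)
    tp = fp = 0
    for det in detections:
        hit = next((g for g in unmatched if iou(det, g) >= thr), None)
        if hit is not None:
            unmatched.remove(hit)
            tp += 1
        elif any(iou(det, u) >= thr for u in unreadable_boxes):
            continue  # "don't care" region
        else:
            fp += 1
    precision = tp / (tp + fp) if tp + fp else 1.0
    recall = tp / len(text_boxes) if text_boxes else 1.0
    return precision, recall


text = [(0, 0, 10, 10), (20, 0, 30, 10)]
unreadable = [(40, 0, 50, 10)]
dets = [(0, 0, 10, 10), (40, 0, 50, 10), (60, 0, 70, 10)]

print(precision_recall(dets, text, unreadable))  # (0.5, 0.5)
```

With a two-class annotation the same unreadable box would be treated as non-text, so the second detection would become a false positive and precision would drop from 1/2 to 1/3 even though the detector's behaviour has not changed.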

Following the protocol is important: using the ICDAR database but changing the protocol can significantly affect the performance evaluation.

Our run produces results on both the original and the sub-sampled images in order to capture text of all sizes.

1. The French National Research Agency (ANR). http://www.agence-nationale-recherche.fr/Intl

2. Arth C, Limberger F, Bischof H (2007) Real-time license plate recognition on an embedded DSP-platform. IEEE International Conference on Computer Vision and Pattern Recognition (CVPR ’07), pp 1–8

3. Beucher S (2007) Numerical residues. Image Vis Comput 25(4):405–415. doi:10.1016/j.imavis.2006.07.020

4. Breen EJ, Jones R (1996) Attribute openings, thinnings, and granulometries. Comput Vis Image Underst 64(3):377–389

5. Chehdi K, Coquin D (1991) Binarisation d’images par seuillage local optimal maximisant un critère d’homogénéité. GRETSI

6. Chen D, Odobez J, Thiran J (2004) A localization/verification scheme for finding text in images and video frames based on contrast independent features and machine learning method. Image Commun 19(3):205–217

7. Chen X, Yuille AL (2004) Detecting and reading text in natural scenes. IEEE Computer Society Conference on Computer Vision and Pattern Recognition 2:366–373. doi:10.1109/CVPR.2004.77

8. Cortes C, Vapnik V (1995) Support-vector networks. Mach Learn 20(3):273–297

9. Dalal N, Triggs B (2005) Histograms of oriented gradients for human detection. In: IEEE CVPR. IEEE Computer Society, pp 886–893

10. Ezaki N, Bulacu M, Schomaker L (2004) Text detection from natural scene images: towards a system for visually impaired persons. In: 17th International Conference on Pattern Recognition, vol 2, pp 683–686

11. Fabrizio J, Cord M, Marcotegui B (2009) Text extraction from street level images. ISPRS Workshop CMRT

12. Fabrizio J, Marcotegui B (2009) Fast implementation of the ultimate opening. International Symposium on Mathematical Morphology, pp 272–281

13. Fabrizio J, Marcotegui B, Cord M (2009) Text segmentation in natural scenes using toggle-mapping. 2009 IEEE International Conference on Image Processing

14. Garcia WC, Apostolidis X (2000) Text detection and segmentation in complex color images. In: IEEE International Conference on Acoustic Speech, Signal Processing, pp 2326–2329. IEEE Computer Society

15. Gatos B, Ntirogiannis K, Pratikakis I (2009) ICDAR 2009 document image binarization contest (DIBCO 2009). International Conference on Document Analysis and Recognition

16. Gatos B, Ntirogiannis K, Pratikakis I (2010) DIBCO 2009: document image binarization contest. Int J Doc Anal Recognit. doi:10.1007/s10032-010-0115-7

17. Gosselin P, Cord M (2008) Active learning methods for interactive image retrieval. IEEE Trans Image Process 17(7):1200–1211

18. Hanif SM, Prevost L (2007) Texture based text detection in natural scene images: a help to blind and visually impaired persons. In: Conference on Assistive Technologies for People with Vision and Hearing Impairments

19. ICDAR: Robust reading and locating database (2003). http://algoval.essex.ac.uk/icdar/TextLocating.html

20. Institut géographique national (IGN). http://www.ign.fr

21. Imageval (2006). http://www.imageval.org

22. ANR iTowns project. http://www.itowns.fr

23. Joachims: SVMlight. http://svmlight.joachims.org/

24. Joachims T (1999) Making large-scale SVM learning practical. Advances in kernel methods: support vector learning, pp 169–184

25. Jung C, Liu Q, Kim J (2009) A stroke filter and its application to text localization. Pattern Recogn Lett 30(2):114–122. doi:10.1016/j.patrec.2008.05.014

26. Jung K, Kim K, Jain A (2004) Text information extraction in images and video: a survey. Pattern Recogn 37(5):977–997

27. Kavallieratou E, Balcan D, Popa M, Fakotakis N (2001) Handwritten text localization in skewed documents. In: International Conference on Image Processing, pp I:1102–1105

28. Kuncheva L (2004) Combining pattern classifiers: methods and algorithms. Wiley, Hoboken

29. Liang J, Doermann D, Li H (2005) Camera-based analysis of text and documents: a survey. Int J Doc Anal Recogn 7(2–3):83–104

30. Lienhart R, Effelsberg W (2000) Automatic text segmentation and text recognition for video indexing. Multim Syst 8(1):69–81. doi:10.1007/s005300050006

31. Liu Q, Jung C, Kim S, Moon Y, Kim JY (2006) Stroke filter for text localization in video images. IEEE International Conference on Image Processing

32. Liu X, Samarabandu J (2006) Multiscale edge based text extraction from complex images. In: International Conference on Multimedia Expo, pp 1721–1724

33. Lucas S (2005) ICDAR 2005 text locating competition results. Eighth International Conference on Document Analysis and Recognition

34. Mancas-Thillou C (2006) Natural scene text understanding. Ph.D. thesis, TCTS Lab of the Faculté Polytechnique de Mons, Belgium

35. Niblack W (1986) An introduction to image processing. Prentice-Hall, Englewood Cliffs

36. Otsu N (1979) A threshold selection method from gray level histogram. IEEE Trans Syst Man Cybern 9:62–66

37. Palumbo PW, Srihari SN, Soh J, Sridhar R, Demjanenko V (1992) Postal address block location in real time. Computer 25(7):34–42. doi:10.1109/2.144438

38. Pan W, Bui TD, Suen CY (2009) Text detection from natural scene images using topographic maps and sparse representations. In: IEEE ICIP. IEEE Computer Society

39. Pazio M, Niedźwiecki M, Kowalik R, Lebiedź J (2007) Text detection system for the blind. 15th European Signal Processing Conference EUSIPCO, pp 272–276

40. Retornaz T (2007) Détection de textes enfouis dans des bases d’images généralistes. Un descripteur sémantique pour l’indexation. Ph.D. thesis, Ecole Nationale Supérieure des Mines de Paris, C.M.M., Fontainebleau

41. Retornaz T, Marcotegui B (2007) Scene text localization based on the ultimate opening. Int Symp Math Morphol 1:177–188

42. Sauvola J, Pietikäinen M (2000) Adaptive document image binarization. Pattern Recogn 33:225–236

43. Sauvola JJ, Seppänen T, Haapakoski S, Pietikäinen M (1997) Adaptive document binarization. In: ICDAR ’97: Proceedings of the 4th International Conference on Document Analysis and Recognition, pp 147–152. IEEE Computer Society, Washington, DC

44. Seeger M, Dance C (2001) Binarising camera images for OCR. Proceedings of the Sixth International Conference on Document Analysis and Recognition (ICDAR)

45. Serra J (1989) Toggle mappings. From pixels to features. In: Simon JC (ed), Elsevier, North-Holland, pp 61–72

46. Sezgin M, Sankur B (2004) Survey over image thresholding techniques and quantitative performance evaluation. J Electron Imaging 13(1):146–165

47. Shafait F, Keysers D, Breuel TM (2008) Efficient implementation of local adaptive thresholding techniques using integral images. In: Document Recognition and Retrieval XV. San Jose

48. Szumilas L (2008) Scale and rotation invariant shape matching. Ph.D. thesis, Technische Universität Wien, Fakultät für Informatik

49. Trier OD, Jain AK, Taxt T (1996) Feature extraction methods for character recognition: a survey. Pattern Recogn 29(4):641–662. doi:10.1016/0031-3203(95)00118-2

50. Viola P, Jones M (2001) Robust real-time object detection. Int J Comput Vis

51. Wahl F, Wong K, Casey R (1982) Block segmentation and text extraction in mixed text/image documents. Comput Graph Image Process 20(4):375–390

52. Wolf C, Jolion JM, Chassaing F (2002) Text localization, enhancement and binarization in multimedia documents. In: Proceedings of the International Conference on Pattern Recognition (ICPR) 2002, pp 1037–1040

53. Wolf C, Jolion JM (2006) Object count/area graphs for the evaluation of object detection and segmentation algorithms. IJDAR 8(4):280–296

54. Xiao Y, Yan H (2003) Text region extraction in a document image based on the Delaunay tessellation. Pattern Recogn 36(3):799–809

55. Zhao XK, Lin YF, Hu Y Liu YTH (2011) Text from corners: a novel approach to detect text and caption in videos. IEEE Trans Image Process 20(3):790–799

56. Zhu KF Qi, RJ, Xu L, Kimachi M, Wu Y, Aziwa T (2005) Using AdaBoost to detect and segment characters from natural scenes. In: Proceedings of CBDAR, ICDAR Workshop

Title: Text detection in street level images
Authors: Jonathan Fabrizio, Beatriz Marcotegui, Matthieu Cord
Publication date: 1 November 2013
Publisher: Springer London
Published in: Pattern Analysis and Applications, Issue 4/2013
Print ISSN: 1433-7541
Electronic ISSN: 1433-755X
DOI: https://doi.org/10.1007/s10044-013-0329-7