Published in: Artificial Intelligence Review 6/2021

16 April 2021

Text detection and localization in scene images: a broad review

Authors: Shilpa Mahajan, Rajneesh Rani

Abstract

Text detection and localization have gained considerable popularity in text analysis systems because they pave the way for a number of real-time applications, such as mobile transliteration technologies and assistive methods for visually impaired persons. Text detection and localization techniques are used to find the position of text areas in an image. This paper presents a broad review of the field in five parts: (1) a comparison of document images with scene images and applications of natural scene images, (2) significant and up-to-date traditional machine learning and deep learning-based approaches to text detection and localization for different languages, (3) various publicly available benchmark datasets, (4) a comparative analysis for other benchmarked datasets, and (5) related challenges and future scope of the field. The paper summarises some of the potential directions in this field, which can serve as a useful reference for researchers exploring the area.

Metadata
Title
Text detection and localization in scene images: a broad review
Authors
Shilpa Mahajan
Rajneesh Rani
Publication date
16 April 2021
Publisher
Springer Netherlands
Published in
Artificial Intelligence Review / Issue 6/2021
Print ISSN: 0269-2821
Electronic ISSN: 1573-7462
DOI
https://doi.org/10.1007/s10462-021-10000-8
