Published in: Artificial Intelligence Review 5/2021

01.01.2021

Deep learning approaches to scene text detection: a comprehensive review

Authors: Tauseef Khan, Ram Sarkar, Ayatullah Faruk Mollah

Abstract

In recent years, text detection in the wild has improved significantly, owing to the success of deep learning models. Computer vision applications have emerged and been reshaped in the deep learning era. Over the last decade, the research community has witnessed drastic changes in text detection from natural scene images in terms of approach, coverage, and performance, driven by advances in deep neural network based models. In this paper, we present (1) a comprehensive review of deep learning approaches to scene text detection, (2) suitable deep frameworks for this task, followed by a critical analysis, (3) a categorical study of publicly available scene image datasets and the applicable standard evaluation protocols, with their pros and cons, and (4) comparative results and analysis of the reported methods. Based on this review and analysis, we also identify possible future directions and thrust areas for deep learning approaches to text detection from natural scene images, on which upcoming researchers may focus.


Zurück zum Zitat Liu L, Fieguth P, Guo Y, Wang X, Pietikäinen M (2017) Local binary features for texture classification: taxonomy and experimental study. Pattern Recognit 62:135–160CrossRef Liu L, Fieguth P, Guo Y, Wang X, Pietikäinen M (2017) Local binary features for texture classification: taxonomy and experimental study. Pattern Recognit 62:135–160CrossRef
Zurück zum Zitat Liu Z, Lin G, Yang S, Feng J, Lin W, Goh WL (2018a) Learning markov clustering networks for scene text detection. In: IEEE international conference of computer vision and pattern recognition, pp 6936–6944 Liu Z, Lin G, Yang S, Feng J, Lin W, Goh WL (2018a) Learning markov clustering networks for scene text detection. In: IEEE international conference of computer vision and pattern recognition, pp 6936–6944
Zurück zum Zitat Liu S, Qi L, Qin H, Shi J, Jia J (2018b) Path aggregation network for instance segmentation. In Proceedings of the IEEE conference on computer vision and pattern recognition, pp 8759–8768 Liu S, Qi L, Qin H, Shi J, Jia J (2018b) Path aggregation network for instance segmentation. In Proceedings of the IEEE conference on computer vision and pattern recognition, pp 8759–8768
Zurück zum Zitat Liu X, Liang D, Yan S, Chen D, Qiao Y, Yan J (2018c) FOTS: fast oriented text spotting with a unified network. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 5676–5685 Liu X, Liang D, Yan S, Chen D, Qiao Y, Yan J (2018c) FOTS: fast oriented text spotting with a unified network. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 5676–5685
Zurück zum Zitat Liu Y, Jin L, Zhang S, Luo C, Zhang S (2019a) Curved scene text detection via transverse and longitudinal sequence connection. Pattern Recognit 90:337–345CrossRef Liu Y, Jin L, Zhang S, Luo C, Zhang S (2019a) Curved scene text detection via transverse and longitudinal sequence connection. Pattern Recognit 90:337–345CrossRef
Zurück zum Zitat Liu Y, Jin L, Xie Z, Luo C, Zhang S, Xie L (2019b) Tightness-aware evaluation protocol for scene text detection. In: IEEE Conference on computer vision and pattern recognition, pp 9612–9620 Liu Y, Jin L, Xie Z, Luo C, Zhang S, Xie L (2019b) Tightness-aware evaluation protocol for scene text detection. In: IEEE Conference on computer vision and pattern recognition, pp 9612–9620
Zurück zum Zitat Liu F, Chen C, Gu D, Zheng J (2019c) FTPN: scene text detection with feature pyramid based text proposal network. IEEE Access 7:44219–44228CrossRef Liu F, Chen C, Gu D, Zheng J (2019c) FTPN: scene text detection with feature pyramid based text proposal network. IEEE Access 7:44219–44228CrossRef
Zurück zum Zitat Liu X, Meng G, Pan C (2019d) Scene text detection and recognition with advances in deep learning: a survey. Int J Doc Anal Recognit 22(2):143–162CrossRef Liu X, Meng G, Pan C (2019d) Scene text detection and recognition with advances in deep learning: a survey. Int J Doc Anal Recognit 22(2):143–162CrossRef
Zurück zum Zitat Liu Z, Lin G, Yang S, Liu F, Lin W, Goh WL (2019e) Towards robust curve text detection with conditional spatial expansion. In Proceedings of the IEEE conference on computer vision and pattern recognition, pp 7269–7278 Liu Z, Lin G, Yang S, Liu F, Lin W, Goh WL (2019e) Towards robust curve text detection with conditional spatial expansion. In Proceedings of the IEEE conference on computer vision and pattern recognition, pp 7269–7278
Zurück zum Zitat Liu Y, Zhang S, Jin L, Xie L, Wu Y, Wang Z (2019f) Omnidirectional scene text detection with sequential-free box discretization. In: arXiv:1906.02371 Liu Y, Zhang S, Jin L, Xie L, Wu Y, Wang Z (2019f) Omnidirectional scene text detection with sequential-free box discretization. In: arXiv:​1906.​02371
Zurück zum Zitat Liu X, Zhang R, Zhou Y, Jiang Q, Song Q, Li N, Zhou K, Wang L, Wang D, Liao M, Yang M (2019g) ICDAR 2019 robust reading challenge on reading chinese text on signboard. In: arXiv:1912.09641 Liu X, Zhang R, Zhou Y, Jiang Q, Song Q, Li N, Zhou K, Wang L, Wang D, Liao M, Yang M (2019g) ICDAR 2019 robust reading challenge on reading chinese text on signboard. In: arXiv:​1912.​09641
Zurück zum Zitat Liu H, Guo A, Jiang D, Hu Y, Ren B (2020a) PuzzleNet: scene text detection by segment context graph learning. In: arXiv:2002.11371 Liu H, Guo A, Jiang D, Hu Y, Ren B (2020a) PuzzleNet: scene text detection by segment context graph learning. In: arXiv:​2002.​11371
Zurück zum Zitat Liu Y, Chen H, Shen C, He T, Jin L, Wang L (2020b) ABCNet: real-time scene text spotting with adaptive bezier-curve network. In: arXiv:2002.10200 Liu Y, Chen H, Shen C, He T, Jin L, Wang L (2020b) ABCNet: real-time scene text spotting with adaptive bezier-curve network. In: arXiv:​2002.​10200
Zurück zum Zitat Long J, Shelhamer E, Darrell T (2015) Fully convolutional networks for semantic segmentation. In: IEEE international conference on computer vision and pattern recognition, pp 3431–3440 Long J, Shelhamer E, Darrell T (2015) Fully convolutional networks for semantic segmentation. In: IEEE international conference on computer vision and pattern recognition, pp 3431–3440
Zurück zum Zitat Long S, Ruan J, Zhang W, He X, Wu W, Yao C (2018a) TextSnake: a flexible representation for detecting text of arbitrary shapes. In: European conference on computer vision, pp 20–36 Long S, Ruan J, Zhang W, He X, Wu W, Yao C (2018a) TextSnake: a flexible representation for detecting text of arbitrary shapes. In: European conference on computer vision, pp 20–36
Zurück zum Zitat Lu S, Chen T, Tian S, Lim JH, Tan CL (2015) Scene text extraction based on edges and support vector regression. Int J Doc Anal Recognit 18(2):125–135CrossRef Lu S, Chen T, Tian S, Lim JH, Tan CL (2015) Scene text extraction based on edges and support vector regression. Int J Doc Anal Recognit 18(2):125–135CrossRef
Zurück zum Zitat Lucas SM (2005) ICDAR 2005 text locating competition results. In: 8th international conference on document analysis and recognition, pp 80–84 Lucas SM (2005) ICDAR 2005 text locating competition results. In: 8th international conference on document analysis and recognition, pp 80–84
Zurück zum Zitat Lucas SM, Panaretos A, Sosa L, Tang A, Wong S, Young R (2003) ICDAR 2003 robust reading competitions. In: 7th international conference on document analysis and recognition, pp 682–687 Lucas SM, Panaretos A, Sosa L, Tang A, Wong S, Young R (2003) ICDAR 2003 robust reading competitions. In: 7th international conference on document analysis and recognition, pp 682–687
Zurück zum Zitat Lyu P, Yao C, Wu W, Yan S, Bai X (2018a) Multi-oriented scene text detection via corner localization and region segmentation. In: IEEE conference on computer vision and pattern recognition, pp 7553–7563 Lyu P, Yao C, Wu W, Yan S, Bai X (2018a) Multi-oriented scene text detection via corner localization and region segmentation. In: IEEE conference on computer vision and pattern recognition, pp 7553–7563
Zurück zum Zitat Lyu P, Liao M, Yao C, Wu W, Bai X (2018b) Mask textspotter: an end-to-end trainable neural network for spotting text with arbitrary shapes. In: Proceedings of the European conference on computer vision, pp 67–83 Lyu P, Liao M, Yao C, Wu W, Bai X (2018b) Mask textspotter: an end-to-end trainable neural network for spotting text with arbitrary shapes. In: Proceedings of the European conference on computer vision, pp 67–83
Zurück zum Zitat Ma J, Shao W, Ye H, Wang L, Wang H, Zheng Y, Xue X (2018) Arbitrary-oriented scene text detection via rotation proposals. IEEE Trans Multimed 20(11):3111–3122CrossRef Ma J, Shao W, Ye H, Wang L, Wang H, Zheng Y, Xue X (2018) Arbitrary-oriented scene text detection via rotation proposals. IEEE Trans Multimed 20(11):3111–3122CrossRef
Zurück zum Zitat Ma C, Sun L, Zhong Z, Huo Q (2020) ReLaText: exploiting visual relationships for arbitrary-shaped scene text detection with graph convolutional networks. In: arXiv:2003.06999 Ma C, Sun L, Zhong Z, Huo Q (2020) ReLaText: exploiting visual relationships for arbitrary-shaped scene text detection with graph convolutional networks. In: arXiv:​2003.​06999
Zurück zum Zitat Maitra DS, Bhattacharya U, Parui SK (2015) CNN based common approach to handwritten character recognition of multiple scripts. In: 13th international conference on document analysis and recognition, pp 1021–1025 Maitra DS, Bhattacharya U, Parui SK (2015) CNN based common approach to handwritten character recognition of multiple scripts. In: 13th international conference on document analysis and recognition, pp 1021–1025
Zurück zum Zitat Majhi B, Pujari P (2018) On development and performance evaluation of novel odia handwritten digit recognition methods. Arab J Sci Eng 43(8):3887–3901CrossRef Majhi B, Pujari P (2018) On development and performance evaluation of novel odia handwritten digit recognition methods. Arab J Sci Eng 43(8):3887–3901CrossRef
Zurück zum Zitat Mallat SG (1989) A theory for multiresolution signal decomposition: the wavelet representation. IEEE Trans Pattern Anal Mach Intell 7:674–693MATHCrossRef Mallat SG (1989) A theory for multiresolution signal decomposition: the wavelet representation. IEEE Trans Pattern Anal Mach Intell 7:674–693MATHCrossRef
Zurück zum Zitat Manjusha K, Kumar MA, Soman KP (2018) Reduced scattering representation for Malayalam character recognition. Arab J Sci Eng 43(8):4315–4326CrossRef Manjusha K, Kumar MA, Soman KP (2018) Reduced scattering representation for Malayalam character recognition. Arab J Sci Eng 43(8):4315–4326CrossRef
Zurück zum Zitat Mishra A, Alahari K, Jawahar CV (2012) Scene text recognition using higher order language priors. In: HAL Mishra A, Alahari K, Jawahar CV (2012) Scene text recognition using higher order language priors. In: HAL
Zurück zum Zitat Mitchell T (1999) The 20 newsgroups text dataset Mitchell T (1999) The 20 newsgroups text dataset
Zurück zum Zitat Mollah AF, Basu S, Nasipuri M (2012) Text detection from camera captured images using a novel fuzzy-based technique. In: 3rd international conference on emerging applications of information technology, pp 291–294 Mollah AF, Basu S, Nasipuri M (2012) Text detection from camera captured images using a novel fuzzy-based technique. In: 3rd international conference on emerging applications of information technology, pp 291–294
Zurück zum Zitat Mosleh A, Bouguila N, Hamza AB (2012) Image text detection using a bandlet-based edge detector and stroke width transform. In: British machine vision conference, pp 1–12 Mosleh A, Bouguila N, Hamza AB (2012) Image text detection using a bandlet-based edge detector and stroke width transform. In: British machine vision conference, pp 1–12
Zurück zum Zitat Nayef N, Yin F, Bizid I, Choi H, Feng Y, Karatzas D, Luo Z, Pal U, Rigaud C, Chazalon J, Khlif W (2017) ICDAR 2017 robust reading challenge on multi-lingual scene text detection and script identification-rrc-mlt. In: 14th IAPR international conference on document analysis and recognition, pp 1454–1459 Nayef N, Yin F, Bizid I, Choi H, Feng Y, Karatzas D, Luo Z, Pal U, Rigaud C, Chazalon J, Khlif W (2017) ICDAR 2017 robust reading challenge on multi-lingual scene text detection and script identification-rrc-mlt. In: 14th IAPR international conference on document analysis and recognition, pp 1454–1459
Zurück zum Zitat Nayef N, Patel Y, Busta M, Chowdhury PN, Karatzas D, Khlif W, Matas J, Pal U, Burie JC, Liu CL, Ogier JM (2019) ICDAR 2019 robust reading challenge on multi-lingual scene text detection and recognition–RRC-MLT-2019. In: IAPR international conference of document analysis and recognition Nayef N, Patel Y, Busta M, Chowdhury PN, Karatzas D, Khlif W, Matas J, Pal U, Burie JC, Liu CL, Ogier JM (2019) ICDAR 2019 robust reading challenge on multi-lingual scene text detection and recognition–RRC-MLT-2019. In: IAPR international conference of document analysis and recognition
Zurück zum Zitat Neumann L, Matas J (2010) A method for text localization and recognition in real-world images. In: Asian conference on computer vision, pp 770–783 Neumann L, Matas J (2010) A method for text localization and recognition in real-world images. In: Asian conference on computer vision, pp 770–783
Zurück zum Zitat Neumann L, Matas J (2012) Real-time scene text localization and recognition. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 3538–3545 Neumann L, Matas J (2012) Real-time scene text localization and recognition. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 3538–3545
Zurück zum Zitat Neycharan JG, Ahmadyfard A (2018) Edge color transform: a new operator for natural scene text localization. Multimed Tools Appl 77(6):7615–7636CrossRef Neycharan JG, Ahmadyfard A (2018) Edge color transform: a new operator for natural scene text localization. Multimed Tools Appl 77(6):7615–7636CrossRef
Zurück zum Zitat Noh H, Hong S, Han B (2015) Learning deconvolution network for semantic segmentation. In Proceedings of the IEEE international conference on computer vision, pp 1520–1528 Noh H, Hong S, Han B (2015) Learning deconvolution network for semantic segmentation. In Proceedings of the IEEE international conference on computer vision, pp 1520–1528
Zurück zum Zitat Ojala T, Pietikäinen M, Harwood D (1996) A comparative study of texture measures with classification based on featured distributions. Pattern Recognit 29(1):51–59CrossRef Ojala T, Pietikäinen M, Harwood D (1996) A comparative study of texture measures with classification based on featured distributions. Pattern Recognit 29(1):51–59CrossRef
Zurück zum Zitat Pan YF, Hou X, Liu CL (2010) A hybrid approach to detect and localize texts in natural scene images. IEEE Trans Image Process 20(3):800–813MathSciNetMATH Pan YF, Hou X, Liu CL (2010) A hybrid approach to detect and localize texts in natural scene images. IEEE Trans Image Process 20(3):800–813MathSciNetMATH
Zurück zum Zitat Paul S, Saha S, Basu S, Saha PK, Nasipuri M (2019) Text localization in camera captured images using fuzzy distance transform based adaptive stroke filter. Multimed Tools Appl 78(13):18017–18036CrossRef Paul S, Saha S, Basu S, Saha PK, Nasipuri M (2019) Text localization in camera captured images using fuzzy distance transform based adaptive stroke filter. Multimed Tools Appl 78(13):18017–18036CrossRef
Zurück zum Zitat Qiao L, Tang S, Cheng Z, Xu Y, Niu Y, Pu S, Wu F (2020) Text perceptron: towards end-to-end arbitrary-shaped text spotting. In: arXiv:2002.06820 Qiao L, Tang S, Cheng Z, Xu Y, Niu Y, Pu S, Wu F (2020) Text perceptron: towards end-to-end arbitrary-shaped text spotting. In: arXiv:​2002.​06820
Zurück zum Zitat Qin S, Manduchi R (2017) Cascaded segmentation-detection networks for word-level text spotting. In: 14th international conference on document analysis and recognition, pp 1275–1282 Qin S, Manduchi R (2017) Cascaded segmentation-detection networks for word-level text spotting. In: 14th international conference on document analysis and recognition, pp 1275–1282
Zurück zum Zitat Qin H, Zhang H, Wang H, Yan Y, Zhang M, Zhao W (2019a) An algorithm for scene text detection using multibox and semantic segmentation. Appl Sci 9(6):1054CrossRef Qin H, Zhang H, Wang H, Yan Y, Zhang M, Zhao W (2019a) An algorithm for scene text detection using multibox and semantic segmentation. Appl Sci 9(6):1054CrossRef
Zurück zum Zitat Qin S, Bissacco A, Raptis M, Fujii Y, Xiao Y (2019b) Towards unconstrained end-to-end text spotting. In: Proceedings of the IEEE international conference on computer vision, pp 4704–4714 Qin S, Bissacco A, Raptis M, Fujii Y, Xiao Y (2019b) Towards unconstrained end-to-end text spotting. In: Proceedings of the IEEE international conference on computer vision, pp 4704–4714
Zurück zum Zitat Redmon J, Divvala S, Girshick R, Farhadi A (2016) You only look once: Unified, real-time object detection. In: IEEE conference on computer vision and pattern recognition, pp 779–788 Redmon J, Divvala S, Girshick R, Farhadi A (2016) You only look once: Unified, real-time object detection. In: IEEE conference on computer vision and pattern recognition, pp 779–788
Zurück zum Zitat Ren S, He K, Girshick R, Sun J (2015) Faster R-CNN: towards real-time object detection with region proposal networks. In: Advances in neural information processing systems, pp 91–99 Ren S, He K, Girshick R, Sun J (2015) Faster R-CNN: towards real-time object detection with region proposal networks. In: Advances in neural information processing systems, pp 91–99
Zurück zum Zitat Richardson E, Azar Y, Avioz O, Geron N, Ronen T, Avraham Z, Shapiro S (2019) It’s all about the scale–efficient text detection using adaptive scaling. In: arXiv:1907.12122 Richardson E, Azar Y, Avioz O, Geron N, Ronen T, Avraham Z, Shapiro S (2019) It’s all about the scale–efficient text detection using adaptive scaling. In: arXiv:​1907.​12122
Zurück zum Zitat Risnumawan A, Shivakumara P, Chan CS, Tan CL (2014) A robust arbitrary text detection system for natural scene images. Expert Syst Appl 41(18):8027–8048CrossRef Risnumawan A, Shivakumara P, Chan CS, Tan CL (2014) A robust arbitrary text detection system for natural scene images. Expert Syst Appl 41(18):8027–8048CrossRef
Zurück zum Zitat Saha S, Chakraborty N, Kundu S, Paul S, Mollah AF, Basu S, Sarkar R (2020) Multi-lingual scene text detection and language identification. Pattern Recognit Lett 138:16–22CrossRef Saha S, Chakraborty N, Kundu S, Paul S, Mollah AF, Basu S, Sarkar R (2020) Multi-lingual scene text detection and language identification. Pattern Recognit Lett 138:16–22CrossRef
Zurück zum Zitat Sain A, Bhunia AK, Roy PP, Pal U (2018) Multi-oriented text detection and verification in video frames and scene images. Neurocomputing 275:1531–1549CrossRef Sain A, Bhunia AK, Roy PP, Pal U (2018) Multi-oriented text detection and verification in video frames and scene images. Neurocomputing 275:1531–1549CrossRef
Zurück zum Zitat Sherstinsky A (2018) Fundamentals of recurrent neural network (RNN) and long short-term memory (LSTM) network. In: arXiv:1808.03314 Sherstinsky A (2018) Fundamentals of recurrent neural network (RNN) and long short-term memory (LSTM) network. In: arXiv:​1808.​03314
Zurück zum Zitat Shi C, Wang C, Xiao B, Zhang Y, Gao S (2013) Scene text detection using graph model built upon maximally stable extremal regions. Pattern Recognit Lett 34(2):107–116CrossRef Shi C, Wang C, Xiao B, Zhang Y, Gao S (2013) Scene text detection using graph model built upon maximally stable extremal regions. Pattern Recognit Lett 34(2):107–116CrossRef
Zurück zum Zitat Shi B, Bai X, Belongie S (2017a) Detecting oriented text in natural images by linking segments. In: IEEE international conference on computer vision and pattern recognition, pp 2550–2558 Shi B, Bai X, Belongie S (2017a) Detecting oriented text in natural images by linking segments. In: IEEE international conference on computer vision and pattern recognition, pp 2550–2558
Zurück zum Zitat Shi B, Yao C, Liao M, Yang M, Xu P, Cui L, Belongie S, Lu S, Bai X (2017b) ICDAR 2017 competition on reading chinese text in the wild (rctw-17). In: 14th IAPR international conference on document analysis and recognition, pp 1429–1434 Shi B, Yao C, Liao M, Yang M, Xu P, Cui L, Belongie S, Lu S, Bai X (2017b) ICDAR 2017 competition on reading chinese text in the wild (rctw-17). In: 14th IAPR international conference on document analysis and recognition, pp 1429–1434
Zurück zum Zitat Shivakumara P, Phan TQ, Tan CL (2010) A Laplacian approach to multi-oriented text detection in video. IEEE Trans Pattern Anal Mach Intell 33(2):412–419CrossRef Shivakumara P, Phan TQ, Tan CL (2010) A Laplacian approach to multi-oriented text detection in video. IEEE Trans Pattern Anal Mach Intell 33(2):412–419CrossRef
Zurück zum Zitat Shivakumara P, Roy S, Jalab HA, Ibrahim RW, Pal U, Lu T, Khare V, Wahab AWBA (2019) Fractional means based method for multi-oriented keyword spotting in video/scene/license plate images. Expert Syst Appl 118:1–19CrossRef Shivakumara P, Roy S, Jalab HA, Ibrahim RW, Pal U, Lu T, Khare V, Wahab AWBA (2019) Fractional means based method for multi-oriented keyword spotting in video/scene/license plate images. Expert Syst Appl 118:1–19CrossRef
Zurück zum Zitat Song X, Wu Y, Wang W, Lu T (2020) TK-text: multi-shaped scene text detection via instance segmentation. In: Proceedings of the international conference on multimedia modeling, pp 201–213 Song X, Wu Y, Wang W, Lu T (2020) TK-text: multi-shaped scene text detection via instance segmentation. In: Proceedings of the international conference on multimedia modeling, pp 201–213
Zurück zum Zitat Sun Y, Zhang C, Huang Z, Liu J, Han J, Ding E (2018) Textnet: irregular text reading from images with an end-to-end trainable network. In: Proceedings of the Asian conference on computer vision, pp 83–99 Sun Y, Zhang C, Huang Z, Liu J, Han J, Ding E (2018) Textnet: irregular text reading from images with an end-to-end trainable network. In: Proceedings of the Asian conference on computer vision, pp 83–99
Zurück zum Zitat Sun Y, Liu J, Liu W, Han J, Ding E, Liu J (2019) Chinese street view text: large-scale Chinese text reading with partially supervised learning. In: Proceedings of the IEEE international conference on computer vision, pp 9086–9095 Sun Y, Liu J, Liu W, Han J, Ding E, Liu J (2019) Chinese street view text: large-scale Chinese text reading with partially supervised learning. In: Proceedings of the IEEE international conference on computer vision, pp 9086–9095
Zurück zum Zitat Szegedy C, Liu W, Jia Y, Sermanet P, Reed S, Anguelov D, Erhan D, Vanhoucke V, Rabinovich A (2015) Going deeper with convolutions. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 1–9 Szegedy C, Liu W, Jia Y, Sermanet P, Reed S, Anguelov D, Erhan D, Vanhoucke V, Rabinovich A (2015) Going deeper with convolutions. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 1–9
Zurück zum Zitat Tang Y, Wu X (2017) Scene text detection and segmentation based on cascaded convolution neural networks. IEEE Trans Image Process 26(3):1509–1520CrossRef Tang Y, Wu X (2017) Scene text detection and segmentation based on cascaded convolution neural networks. IEEE Trans Image Process 26(3):1509–1520CrossRef
Zurück zum Zitat Tang Y, Wu X (2018) Scene text detection using superpixel-based stroke feature transform and deep learning based region classification. IEEE Trans Multimed 20(9):2276–2288CrossRef Tang Y, Wu X (2018) Scene text detection using superpixel-based stroke feature transform and deep learning based region classification. IEEE Trans Multimed 20(9):2276–2288CrossRef
Zurück zum Zitat Tang J, Yang Z, Wang Y, Zheng Q, Xu Y, Bai X (2019) SegLink++: detecting dense and arbitrary-shaped scene text by instance-aware component grouping. In: Pattern recognition, vol 96, pp 106954 Tang J, Yang Z, Wang Y, Zheng Q, Xu Y, Bai X (2019) SegLink++: detecting dense and arbitrary-shaped scene text by instance-aware component grouping. In: Pattern recognition, vol 96, pp 106954
Zurück zum Zitat Tian Z, Huang W, He T, He P, Qiao Y (2016a) Detecting text in natural image with connectionist text proposal network. In: European conference on computer vision, pp 56–72 Tian Z, Huang W, He T, He P, Qiao Y (2016a) Detecting text in natural image with connectionist text proposal network. In: European conference on computer vision, pp 56–72
Zurück zum Zitat Tian S, Bhattacharya U, Lu S, Su B, Wang Q, Wei X, Lu Y, Tan CL (2016b) Multilingual scene character recognition with co-occurrence of histogram of oriented gradients. Pattern Recognit 51:125–134CrossRef Tian S, Bhattacharya U, Lu S, Su B, Wang Q, Wei X, Lu Y, Tan CL (2016b) Multilingual scene character recognition with co-occurrence of histogram of oriented gradients. Pattern Recognit 51:125–134CrossRef
Zurück zum Zitat Tian Z, Shu M, Lyu P, Li R, Zhou C, Shen X, Jia J (2019) Learning shape-aware embedding for scene text detection. In Proceedings of the IEEE conference on computer vision and pattern recognition, pp 4234–4243 Tian Z, Shu M, Lyu P, Li R, Zhou C, Shen X, Jia J (2019) Learning shape-aware embedding for scene text detection. In Proceedings of the IEEE conference on computer vision and pattern recognition, pp 4234–4243
Zurück zum Zitat Tychsen-Smith L, Petersson L (2017) Denet: scalable real-time object detection with directed sparse sampling. In: IEEE international conference of computer vision, pp 428–436 Tychsen-Smith L, Petersson L (2017) Denet: scalable real-time object detection with directed sparse sampling. In: IEEE international conference of computer vision, pp 428–436
Zurück zum Zitat Van Dongen SM (2000) Graph clustering by flow simulation (Doctoral dissertation) Van Dongen SM (2000) Graph clustering by flow simulation (Doctoral dissertation)
Zurück zum Zitat Veit A, Matera T, Neumann L, Matas J, Belongie S (2016) Coco-text: Dataset and benchmark for text detection and recognition in natural images. In: arXiv:1601.07140 Veit A, Matera T, Neumann L, Matas J, Belongie S (2016) Coco-text: Dataset and benchmark for text detection and recognition in natural images. In: arXiv:​1601.​07140
Zurück zum Zitat Wang K, Belongie S (2010) Word spotting in the wild. In: European conference on computer vision, pp 591–604 Wang K, Belongie S (2010) Word spotting in the wild. In: European conference on computer vision, pp 591–604
Zurück zum Zitat Wang K, Babenko B, Belongie S (2011) End-to-end scene text recognition. In: IEEE international conference on computer vision, pp 1457–1464 Wang K, Babenko B, Belongie S (2011) End-to-end scene text recognition. In: IEEE international conference on computer vision, pp 1457–1464
Zurück zum Zitat Wang T, Wu DJ, Coates A, Ng AY (2012) End-to-end text recognition with convolutional neural networks. In: 21st international conference on pattern recognition, pp 3304–3308 Wang T, Wu DJ, Coates A, Ng AY (2012) End-to-end text recognition with convolutional neural networks. In: 21st international conference on pattern recognition, pp 3304–3308
Zurück zum Zitat Wang K, Li G, Liu X, Yan J, Li S, Huang H (2018) Natural scene text detection based on MSER. In: 3rd international conference on communications, information management and network security Wang K, Li G, Liu X, Yan J, Li S, Huang H (2018) Natural scene text detection based on MSER. In: 3rd international conference on communications, information management and network security
Zurück zum Zitat Wang X, Feng X, Xia Z (2019a) Scene video text tracking based on hybrid deep text detection and layout constraint. Neurocomputing 363:223–235CrossRef Wang X, Feng X, Xia Z (2019a) Scene video text tracking based on hybrid deep text detection and layout constraint. Neurocomputing 363:223–235CrossRef
Zurück zum Zitat Wang W, Xie E, Song X, Zang Y, Wang W, Lu T, Yu G, Shen C (2019b) Efficient and accurate arbitrary-shaped text detection with pixel aggregation network. In: Proceedings of the IEEE international conference on computer vision, pp 8440–8449 Wang W, Xie E, Song X, Zang Y, Wang W, Lu T, Yu G, Shen C (2019b) Efficient and accurate arbitrary-shaped text detection with pixel aggregation network. In: Proceedings of the IEEE international conference on computer vision, pp 8440–8449
Zurück zum Zitat Wang P, Zhang C, Qi F, Huang Z, En M, Han J, Liu J, Ding E, Shi G (2019c) A single-shot arbitrarily-shaped text detector based on context attended multi-task learning. In: Proceedings of the 27th ACM international conference on multimedia, pp 1277–1285 Wang P, Zhang C, Qi F, Huang Z, En M, Han J, Liu J, Ding E, Shi G (2019c) A single-shot arbitrarily-shaped text detector based on context attended multi-task learning. In: Proceedings of the 27th ACM international conference on multimedia, pp 1277–1285
Zurück zum Zitat Wang X, Jiang Y, Luo Z, Liu CL, Choi H, Kim S (2019d) Arbitrary shape scene text detection with adaptive text region representation. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 6449–6458 Wang X, Jiang Y, Luo Z, Liu CL, Choi H, Kim S (2019d) Arbitrary shape scene text detection with adaptive text region representation. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 6449–6458
Zurück zum Zitat Wang Y, Xie H, Fu Z, Zhang Y (2019e) DSRN: a deep scale relationship network for scene text detection. In: Proceedings of the 28th international joint conference on artificial intelligence. AAAI Press, pp 947–953 Wang Y, Xie H, Fu Z, Zhang Y (2019e) DSRN: a deep scale relationship network for scene text detection. In: Proceedings of the 28th international joint conference on artificial intelligence. AAAI Press, pp 947–953
Zurück zum Zitat Wang H, Lu P, Zhang H, Yang M, Bai X, Xu Y, He M, Wang Y, Liu W (2019f) All you need is boundary: toward arbitrary-shaped text spotting. In: arXiv:1911.09550 Wang H, Lu P, Zhang H, Yang M, Bai X, Xu Y, He M, Wang Y, Liu W (2019f) All you need is boundary: toward arbitrary-shaped text spotting. In: arXiv:​1911.​09550
Zurück zum Zitat Wang S, Liu Y, He Z, Wang Y, Tang Z (2020a) A quadrilateral scene text detector with two-stage network architecture. Pattern Recognit 102:107230CrossRef Wang S, Liu Y, He Z, Wang Y, Tang Z (2020a) A quadrilateral scene text detector with two-stage network architecture. Pattern Recognit 102:107230CrossRef
Zurück zum Zitat Wang Y, Xie H, Zha Z, Xing M, Fu Z, Zhang Y (2020b) ContourNet: taking a further step toward accurate arbitrary-shaped scene text detection. In: arXiv:2004.04940 Wang Y, Xie H, Zha Z, Xing M, Fu Z, Zhang Y (2020b) ContourNet: taking a further step toward accurate arbitrary-shaped scene text detection. In: arXiv:​2004.​04940
Zurück zum Zitat Wolf C, Jolion JM (2006) Object count/area graphs for the evaluation of object detection and segmentation algorithms. Int J Doc Anal Recognit 8(4):280–296CrossRef Wolf C, Jolion JM (2006) Object count/area graphs for the evaluation of object detection and segmentation algorithms. Int J Doc Anal Recognit 8(4):280–296CrossRef
Wu Y, Natarajan P (2017) Self-organized text detection with minimal post-processing via border learning. In: IEEE international conference of computer vision, pp 5000–5009
Xie E, Zang Y, Shao S, Yu G, Yao C, Li G (2019) Scene text detection with supervised pyramid context network. In: Proceedings of the AAAI conference on artificial intelligence, pp 9038–9045
Xu Y, Wang Y, Zhou W, Wang Y, Yang Z, Bai X (2019a) TextField: learning a deep direction field for irregular scene text detection. IEEE Trans Image Process 28(11):5566–5579
Xu Y, Duan J, Kuang Z, Yue X, Sun H, Guan Y, Zhang W (2019b) Geometry normalization networks for accurate scene text detection. In: arXiv:1909.00794
Yang Q, Cheng M, Zhou W, Chen Y, Qiu M, Lin W, Chu W (2018) Inceptext: a new inception-text module with deformable psroi pooling for multi-oriented scene text detection. In: arXiv:1805.01167
Yang P, Zhang F, Yang G (2019) A fast scene text detector using knowledge distillation. IEEE Access 7:22588–22598
Yang P, Yang G, Gong X, Wu P, Han X, Wu J, Chen C (2020) Instance segmentation network with self-distillation for scene text detection. IEEE Access 8:45825–45836
Yao C, Bai X, Liu W, Ma Y, Tu Z (2012) Detecting texts of arbitrary orientations in natural images. In: IEEE conference on computer vision and pattern recognition, pp 1083–1090
Yao C, Bai X, Sang N, Zhou X, Zhou S, Cao Z (2016) Scene text detection via holistic, multi-channel prediction. In: arXiv:1606.09002
Yi C, Tian Y (2011) Text string detection from natural scenes by structure-based partition and grouping. IEEE Trans Image Process 20(9):2594–2605
Yi C, Tian Y (2012) Localizing text in scene images by boundary clustering, stroke segmentation, and string fragment classification. IEEE Trans Image Process 21(9):4256–4268
Zamberletti A, Noce L, Gallo I (2014) Text localization based on fast feature pyramids and multi-resolution maximally stable extremal regions. In: Asian conference on computer vision, pp 91–105
Zeiler MD, Taylor GW, Fergus R (2011) Adaptive deconvolutional networks for mid and high level feature learning. In: 2011 International conference on computer vision, pp 2018–2025
Zhan F, Lu S, Xue C (2018) Verisimilar image synthesis for accurate detection and recognition of texts in scenes. In: Proceedings of the European conference on computer vision, pp 249–266
Zhang Z, Zhang C, Shen W, Yao C, Liu W, Bai X (2016) Multi-oriented text detection with fully convolutional networks. In: IEEE international conference on computer vision and pattern recognition, pp 4159–4167
Zhang S, Wen L, Bian X, Lei Z, Li SZ (2018) Single-shot refinement neural network for object detection. In: IEEE conference on computer vision and pattern recognition, pp 4203–4212
Zhang C, Liang B, Huang Z, En M, Han J, Ding E, Ding X (2019) Look more than once: an accurate detector for text of arbitrary shapes. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 10552–10561
Zhong Z, Jin L, Zhang S, Feng Z (2016) Deeptext: a unified framework for text proposal generation and text detection in natural images. arXiv:1605.07314
Zhong Z, Sun L, Huo Q (2019a) An anchor-free region proposal network for Faster R-CNN based text detection approaches. Int J Doc Anal Recognit 22(3):315–327
Zhong Z, Sun L, Huo Q (2019b) Improved localization accuracy by LocNet for faster R-CNN based text detection in natural scene images. In: Pattern recognition, p 106986
Zhou X, Yao C, Wen H, Wang Y, Zhou S, He W, Liang J (2017) EAST: an efficient and accurate scene text detector. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 5551–5560
Zhu Y, Yao C, Bai X (2016) Scene text detection and recognition: recent advances and future trends. Front Comput Sci 10(1):19–36
Zhu Y, Ma C, Du J (2019) Rotated cascade R-CNN: a shape robust detector with coordinate regression. In: Pattern recognition, vol 96
Metadata
Title
Deep learning approaches to scene text detection: a comprehensive review
Authors
Tauseef Khan
Ram Sarkar
Ayatullah Faruk Mollah
Publication date
01.01.2021
Publisher
Springer Netherlands
Published in
Artificial Intelligence Review / Issue 5/2021
Print ISSN: 0269-2821
Electronic ISSN: 1573-7462
DOI
https://doi.org/10.1007/s10462-020-09930-6
