
2021 | OriginalPaper | Book chapter

58. Text Localization in Scene Images Using Faster R-CNN with Double Region Proposal Networks

Authors: Pragya Hari, Rajib Ghosh

Published in: Proceedings of the International Conference on Paradigms of Computing, Communication and Data Sciences

Publisher: Springer Singapore


Abstract

Text extraction is an active area of research in computer vision. In recent years, applications on smart hand-held devices, such as real-time translation of text from one language to another, computerized aids for the visually impaired, user navigation and track monitoring, and driving assistance systems, have renewed research interest in this domain. Although various Convolutional Neural Network (CNN) based methods have been explored for text localization in scene images, a method using Faster R-CNN with a double region proposal network (RPN) has not yet been explored. The conventional Faster R-CNN produces regions of interest (ROIs) through a single RPN that uses the feature matrix of the last convolutional layer. In contrast, the present investigation proposes an end-to-end method of scene text localization in which ROIs are generated by two RPNs using the feature matrices of thirteen different convolutional layers and four pooling layers. Both RPNs are then merged, which enables the system to locate the text regions in scene images. The performance of the system was assessed on the ICDAR 2013/2015 RRC test dataset, where it outperformed all existing studies on scene text detection.
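The abstract does not say how the two RPNs are merged. One plausible reading, offered here only as an illustrative sketch and not as the authors' method, is that the scored proposals from both RPNs are pooled and near-duplicates are removed with greedy non-maximum suppression (NMS). The names `iou` and `merge_proposals`, the IoU threshold of 0.7, and the `top_k` cap are all assumptions introduced for this example.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]   # (x1, y1, x2, y2)
Proposal = Tuple[Box, float]              # (box, objectness score)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def merge_proposals(
    rpn_a: List[Proposal],
    rpn_b: List[Proposal],
    iou_threshold: float = 0.7,   # assumed value, not taken from the paper
    top_k: int = 300,             # assumed cap on the number of kept ROIs
) -> List[Proposal]:
    """Pool proposals from two RPNs, then apply greedy NMS (assumed merge rule)."""
    candidates = sorted(rpn_a + rpn_b, key=lambda p: p[1], reverse=True)
    kept: List[Proposal] = []
    for box, score in candidates:
        # Keep a box only if it does not heavily overlap a higher-scoring one.
        if all(iou(box, kept_box) <= iou_threshold for kept_box, _ in kept):
            kept.append((box, score))
            if len(kept) == top_k:
                break
    return kept


if __name__ == "__main__":
    # Hypothetical proposals: the first box of each RPN covers the same word.
    first_rpn = [((10, 10, 110, 40), 0.9), ((200, 50, 260, 80), 0.6)]
    second_rpn = [((12, 11, 112, 42), 0.8), ((300, 300, 340, 320), 0.7)]
    for box, score in merge_proposals(first_rpn, second_rpn):
        print(box, score)
```

In this sketch the two heavily overlapping boxes collapse into the higher-scoring one, while the boxes found by only one RPN survive, which is the practical benefit a second proposal network would be expected to bring.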

Metadata
Title
Text Localization in Scene Images Using Faster R-CNN with Double Region Proposal Networks
Authors
Pragya Hari
Rajib Ghosh
Copyright year
2021
Publisher
Springer Singapore
DOI
https://doi.org/10.1007/978-981-15-7533-4_58
