Published in: Neural Processing Letters 4/2023

17-11-2022

Real-Time Accurate Text Detection with Adaptive Double Pyramid Network

Authors: Weina Zhou, Wanyu Song

Abstract

Segmentation-based methods have recently been widely adopted in scene text detection because they predict the shape of scene text at the pixel level more accurately than other methods. However, the complicated feature aggregation and label assignment algorithms used in current segmentation-based methods significantly reduce detection speed as accuracy improves. In this paper, we present an Adaptive Double Pyramid Network (ADPNet) for real-time detection of arbitrary-shaped text. It builds a Double Feature Enhancement Pyramid from Packet Downsampling Units (PDUnits) to enhance feature maps with minimal processing. ADPNet is validated on three benchmark datasets, and the results show that it obtains state-of-the-art performance in both speed and accuracy. Specifically, the proposed network achieves an F-measure of 85.7% while running at 40.5 fps on the ICDAR2015 dataset.
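
For readers unfamiliar with the two reported metrics, the short sketch below shows how an F-measure is conventionally computed from precision and recall, and what per-image latency a throughput of 40.5 fps implies. The function names are hypothetical, and the precision and recall values are placeholders chosen only for illustration; they are not figures from the paper, which reports only the resulting F-measure here.

```python
def f_measure(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (the F1 score)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def ms_per_frame(fps: float) -> float:
    """Average per-image latency in milliseconds implied by a throughput in fps."""
    return 1000.0 / fps


# Hypothetical precision and recall, for illustration only (not from the paper).
example_f = f_measure(0.90, 0.80)

# 40.5 fps is the speed the abstract reports on ICDAR2015.
latency = ms_per_frame(40.5)

print(f"F-measure: {example_f:.3f}")
print(f"Latency:   {latency:.1f} ms per image")
```

Because the F-measure is a harmonic mean, it is pulled toward the lower of the two inputs, so a detector cannot reach a high score by maximizing precision or recall alone.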

Metadata
Title
Real-Time Accurate Text Detection with Adaptive Double Pyramid Network
Authors
Weina Zhou
Wanyu Song
Publication date
17-11-2022
Publisher
Springer US
Published in
Neural Processing Letters / Issue 4/2023
Print ISSN: 1370-4621
Electronic ISSN: 1573-773X
DOI
https://doi.org/10.1007/s11063-022-11080-5
