Published in: Neural Processing Letters 4/2023

17 November 2022

Real-Time Accurate Text Detection with Adaptive Double Pyramid Network

Authors: Weina Zhou, Wanyu Song

Abstract

Segmentation-based methods have recently been widely adopted in scene text detection because they predict the shape of varied scene text at the pixel level more accurately than other methods. However, the complicated feature aggregation or label assignment algorithms used in current segmentation-based methods significantly reduce detection speed as accuracy improves. In this paper, we present an Adaptive Double Pyramid Network (ADPNet) for real-time detection of arbitrary-shaped text. It builds a Double Feature Enhancement Pyramid from Packet Downsampling Units (PDUnits) to enhance feature maps with a minimal amount of processing. ADPNet is validated on three benchmark datasets, where it obtains state-of-the-art performance in both speed and accuracy. Specifically, the proposed network achieves an F-measure of 85.7% while running at 40.5 fps on the ICDAR2015 dataset.
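
As a rough illustration of how the two headline figures can be read, the sketch below computes an F-measure as the harmonic mean of precision and recall and converts a frame rate into an average per-image time. It assumes the standard harmonic-mean definition of the F-measure. The function names and the precision/recall pair are hypothetical and chosen only for illustration: the abstract reports neither precision nor recall, and only the 40.5 fps value is taken from it.

```python
def f_measure(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (the F1 score)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def latency_ms(fps: float) -> float:
    """Average per-image time in milliseconds implied by a frame rate."""
    return 1000.0 / fps


# Hypothetical precision/recall pair, for illustration only (not from the paper).
example_f = f_measure(0.90, 0.80)

# 40.5 fps is the ICDAR2015 speed quoted in the abstract.
per_image = latency_ms(40.5)

print(f"F-measure for the hypothetical pair: {example_f:.3f}")
print(f"Average time per image at 40.5 fps: {per_image:.1f} ms")
```

Because the harmonic mean is pulled toward the smaller of its two inputs, a high F-measure requires precision and recall to both be high. At 40.5 fps the average budget is a little under 25 ms per image.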

Metadata
Title
Real-Time Accurate Text Detection with Adaptive Double Pyramid Network
Authors
Weina Zhou
Wanyu Song
Publication date
17 November 2022
Publisher
Springer US
Published in
Neural Processing Letters / Issue 4/2023
Print ISSN: 1370-4621
Electronic ISSN: 1573-773X
DOI
https://doi.org/10.1007/s11063-022-11080-5
