Published in: International Journal on Document Analysis and Recognition (IJDAR) 3/2022

29-03-2022 | Original Paper

Scene text detection via decoupled feature pyramid networks

Authors: Min Liang, Jie-Bo Hou, Xiaobin Zhu, Chun Yang, Jingyan Qin


Abstract

Detecting arbitrary-shape scene text is challenging, mainly because text instances vary widely in aspect ratio, curvature, and scale. In this paper, we propose a novel arbitrary-shape scene text detection method based on Decoupled Feature Pyramid Networks (DFPN) and regression-based linking (RegLink). DFPN decouples the width and height of the feature maps generated by FPN to enhance the discriminability of features for varied aspect ratios. Because quadrilateral regression results cannot directly represent curved text, we propose a simple yet effective RegLink that links pixels into text instances, exploiting the fact that pixels in the same curved text share an identical target quadrilateral. RegLink thus extends rotated-rectangle text detectors to curved text. In addition, we propose a Feature Scale Module to enhance the robustness of features to varied scales. In this way, our method can effectively detect scene text of arbitrary shape. Experimental results on three challenging, publicly available datasets demonstrate the effectiveness of our method. The code and model are available at https://github.com/lmplayer/DFPN-master.
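The linking idea in the abstract can be illustrated with a small sketch. It is not the authors' implementation (see the repository above for that). It only assumes that each text pixel carries a regressed quadrilateral, and that neighbouring text pixels belong to the same instance when their quadrilaterals agree. The function names `quads_agree` and `link_pixels`, the 4-neighbour connectivity, and the tolerance `tol` are all hypothetical choices made for illustration.

```python
from typing import Dict, List, Optional, Sequence, Tuple

# Assumed encoding: 8 numbers (x1, y1, ..., x4, y4) in image coordinates.
Quad = Tuple[float, ...]
Pixel = Tuple[int, int]


def quads_agree(a: Quad, b: Quad, tol: float) -> bool:
    """True if every coordinate of the two quadrilaterals differs by at most tol."""
    return all(abs(p - q) <= tol for p, q in zip(a, b))


def link_pixels(
    preds: Sequence[Sequence[Optional[Quad]]], tol: float = 2.0
) -> List[List[int]]:
    """Group text pixels into instances; 0 is background, 1..K are instance ids.

    preds[y][x] is the quadrilateral regressed at pixel (y, x),
    or None for a non-text pixel.
    """
    h = len(preds)
    w = len(preds[0]) if h else 0
    # Union-find over text pixels only.
    parent: Dict[Pixel, Pixel] = {
        (y, x): (y, x) for y in range(h) for x in range(w) if preds[y][x] is not None
    }

    def find(p: Pixel) -> Pixel:
        while parent[p] != p:
            parent[p] = parent[parent[p]]  # path halving
            p = parent[p]
        return p

    # Link each text pixel to its right and lower neighbour when the
    # two regressed quadrilaterals agree (hypothetical linking rule).
    for (y, x) in list(parent):
        for q in ((y, x + 1), (y + 1, x)):
            if q in parent and quads_agree(preds[y][x], preds[q[0]][q[1]], tol):
                root_p, root_q = find((y, x)), find(q)
                if root_p != root_q:
                    parent[root_q] = root_p

    # Assign consecutive instance ids in raster order.
    labels = [[0] * w for _ in range(h)]
    ids: Dict[Pixel, int] = {}
    for y in range(h):
        for x in range(w):
            if (y, x) in parent:
                labels[y][x] = ids.setdefault(find((y, x)), len(ids) + 1)
    return labels


# Two made-up quadrilaterals; pixels (1, 1) and (1, 2) touch but regress
# to different quadrilaterals, so they are not linked.
A = (0.0, 0.0, 10.0, 0.0, 10.0, 4.0, 0.0, 4.0)
B = (20.0, 0.0, 30.0, 0.0, 30.0, 4.0, 20.0, 4.0)
grid = [
    [A, A, None, B],
    [None, A, B, B],
]
labels = link_pixels(grid)
print(labels)  # [[1, 1, 0, 2], [0, 1, 2, 2]]
```

In this toy grid the two text regions touch, yet they are separated because their regressed quadrilaterals disagree. That is the property that lets a linking step follow a curved instance pixel by pixel instead of fitting one box to it.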


Metadata
Title: Scene text detection via decoupled feature pyramid networks
Authors: Min Liang, Jie-Bo Hou, Xiaobin Zhu, Chun Yang, Jingyan Qin
Publication date: 29-03-2022
Publisher: Springer Berlin Heidelberg
Published in: International Journal on Document Analysis and Recognition (IJDAR), Issue 3/2022
Print ISSN: 1433-2833
Electronic ISSN: 1433-2825
DOI: https://doi.org/10.1007/s10032-022-00397-5
