Published in: International Journal on Document Analysis and Recognition (IJDAR), Issue 3/2022

29.03.2022 | Original Paper

Scene text detection via decoupled feature pyramid networks

Authors: Min Liang, Jie-Bo Hou, Xiaobin Zhu, Chun Yang, Jingyan Qin

Abstract

Detecting scene text of arbitrary shape is challenging mainly because of varied aspect ratios, curvature, and scales. In this paper, we propose a novel arbitrary-shape scene text detection method based on Decoupled Feature Pyramid Networks (DFPN) and regression-based linking (RegLink). DFPN decouples the width and height of the feature maps generated by FPN to make features more discriminative across varied aspect ratios. Because quadrilateral regression results cannot directly represent curved text, we propose a simple yet effective RegLink that links pixels into text instances, exploiting the fact that pixels in the same curved text share an identical target quadrilateral. RegLink thus extends a rotated-rectangle text detector to curved text. In addition, we propose a Feature Scale Module to make features more robust to varied scales. In this way, our method can effectively detect scene text of arbitrary shape. Experimental results on three publicly available challenging datasets demonstrate the effectiveness of our method. The code and model are available at https://github.com/lmplayer/DFPN-master.
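
The abstract states only the principle behind RegLink: pixels belonging to the same text instance regress to an identical target quadrilateral. The following is a minimal sketch of how such linking could work, not the authors' implementation. It assumes a per-pixel text mask and per-pixel quadrilateral predictions are already available, links 4-connected neighbours whose predictions agree within a hypothetical tolerance `tol`, and groups them with union-find. The function name `reglink`, the array layouts, and the agreement measure are all illustrative assumptions.

```python
import numpy as np


def reglink(text_mask, quads, tol=2.0):
    """Group text pixels whose regressed quadrilaterals agree.

    text_mask: (H, W) bool array, True where a pixel is classified as text.
    quads:     (H, W, 8) float array holding the quadrilateral (four x, y
               corners in image coordinates) regressed at each pixel.
    tol:       hypothetical threshold on the mean absolute corner difference.
    Returns an (H, W) int array of instance labels (0 = background).
    """
    h, w = text_mask.shape
    parent = list(range(h * w))  # union-find forest over flattened pixels

    def find(i):
        # Follow parent pointers to the root, halving the path on the way.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    def union(a, b):
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[rb] = ra

    # Link each text pixel to its right and bottom neighbours when both
    # predict (nearly) the same quadrilateral. This criterion is an assumption.
    for y in range(h):
        for x in range(w):
            if not text_mask[y, x]:
                continue
            for dy, dx in ((0, 1), (1, 0)):
                ny, nx = y + dy, x + dx
                if ny < h and nx < w and text_mask[ny, nx]:
                    if np.abs(quads[y, x] - quads[ny, nx]).mean() <= tol:
                        union(y * w + x, ny * w + nx)

    # Assign consecutive labels to the resulting connected groups.
    labels = np.zeros((h, w), dtype=np.int32)
    roots = {}
    for y in range(h):
        for x in range(w):
            if text_mask[y, x]:
                root = find(y * w + x)
                labels[y, x] = roots.setdefault(root, len(roots) + 1)
    return labels
```

Because grouping depends on agreement between regressed quadrilaterals rather than on pixel adjacency alone, two touching text regions with different targets stay separate, while a curved region whose pixels share one target is merged into a single instance of arbitrary outline.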

Metadata
Title
Scene text detection via decoupled feature pyramid networks
Authors
Min Liang
Jie-Bo Hou
Xiaobin Zhu
Chun Yang
Jingyan Qin
Publication date
29.03.2022
Publisher
Springer Berlin Heidelberg
Published in
International Journal on Document Analysis and Recognition (IJDAR) / Issue 3/2022
Print ISSN: 1433-2833
Electronic ISSN: 1433-2825
DOI
https://doi.org/10.1007/s10032-022-00397-5
