2020 | OriginalPaper | Book chapter

A Deep Convolutional Deblurring and Detection Neural Network for Localizing Text in Videos

Authors: Yang Wang, Ye Qian, Jiahao Shi, Feng Su

Published in: MultiMedia Modeling

Publisher: Springer International Publishing

Abstract

Scene text in video is often degraded by blur, such as that caused by camera or text motion, which makes it harder to extract the text reliably for content-based video applications. In this paper, we propose a novel fully convolutional deep neural network for deblurring and detecting text in video. To cope with blurred video text, we propose a deblurring subnetwork composed of multi-level convolutional blocks with both cross-block (long) and within-block (short) skip connections, which progressively learn the residual details of the deblurred image, together with a spatial attention mechanism that focuses on blurred regions. The subnetwork generates a sharper image for the current frame by fusing multiple adjacent frames. To then localize text in the frames, we enhance the EAST text detection model with deformable convolution layers and deconvolution layers, which better capture the widely varying appearance of video text. Experiments on a public scene text video dataset demonstrate state-of-the-art performance of the proposed video text deblurring and detection model.
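The data flow of the deblurring subnetwork described in the abstract can be illustrated with a toy NumPy sketch. Everything below is an assumption made for illustration, not the authors' architecture: the function names (`conv3x3`, `init_params`, `deblur`), the layer widths, the two-convolution block layout, the single-channel sigmoid attention map, and the random untrained weights are all hypothetical. The sketch only shows how adjacent frames are stacked, how short and long skip connections wrap the convolutional blocks, and how a spatial attention map gates a residual that is added to the current frame.

```python
import numpy as np


def conv3x3(x, w):
    """Same-padded 3x3 convolution; x is (C_in, H, W), w is (C_out, C_in, 3, 3)."""
    _, h, wd = x.shape
    xp = np.pad(x, ((0, 0), (1, 1), (1, 1)))
    out = np.zeros((w.shape[0], h, wd))
    for dy in range(3):
        for dx in range(3):
            # Mix input channels at one kernel offset and accumulate.
            patch = xp[:, dy:dy + h, dx:dx + wd]
            out += np.einsum("oc,chw->ohw", w[:, :, dy, dx], patch)
    return out


def relu(x):
    return np.maximum(x, 0.0)


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def init_params(rng, t, c, width, n_blocks):
    """Random (untrained) weights; all sizes are hypothetical."""
    def w(c_out, c_in):
        return rng.normal(0.0, 0.1, size=(c_out, c_in, 3, 3))
    return {
        "head": w(width, t * c),    # fuses the T stacked frames
        "blocks": [(w(width, width), w(width, width)) for _ in range(n_blocks)],
        "attn": w(1, width),        # one-channel spatial attention map
        "tail": w(c, width),        # projects features to an image residual
    }


def deblur(frames, params):
    """frames: (T, C, H, W) adjacent frames; returns the sharpened center frame."""
    t, c, h, w = frames.shape
    center = frames[t // 2]
    feat = relu(conv3x3(frames.reshape(t * c, h, w), params["head"]))
    long_skip = feat
    for w1, w2 in params["blocks"]:
        # Within-block (short) skip connection around two convolutions.
        feat = feat + conv3x3(relu(conv3x3(feat, w1)), w2)
    # Cross-block (long) skip connection around the whole block stack.
    feat = feat + long_skip
    residual = conv3x3(feat, params["tail"])          # (C, H, W)
    attn = sigmoid(conv3x3(feat, params["attn"]))     # (1, H, W), values in (0, 1)
    # Attention scales the residual per pixel before it is added to the frame.
    return center + attn * residual


rng = np.random.default_rng(0)
frames = rng.random((5, 3, 8, 8))                     # 5 RGB frames of 8x8 pixels
params = init_params(rng, t=5, c=3, width=4, n_blocks=2)
sharp = deblur(frames, params)
print(sharp.shape)
```

Because the network predicts a residual rather than the full image, a zero residual leaves the current frame unchanged, which is one common motivation for residual formulations in image restoration. A trained model would learn the weights from pairs of blurred and sharp frames.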

Title: A Deep Convolutional Deblurring and Detection Neural Network for Localizing Text in Videos
Authors: Yang Wang, Ye Qian, Jiahao Shi, Feng Su
Publisher: Springer International Publishing
Book: MultiMedia Modeling
Print ISBN: 978-3-030-37733-5
Electronic ISBN: 978-3-030-37734-2
Copyright year: 2020
DOI: https://doi.org/10.1007/978-3-030-37734-2_10
