nach oben

Erschienen in:

2020 | OriginalPaper | Buchkapitel

Unsupervised Feature Propagation for Fast Video Object Detection Using Generative Adversarial Networks

verfasst von : Xuan Zhang, Guangxing Han, Wenduo He

Erschienen in: MultiMedia Modeling

Verlag: Springer International Publishing

Einloggen

Aktivieren Sie unsere intelligente Suche, um passende Fachinhalte oder Patente zu finden.

search-config

KI-gestützte Suche

Aus

Abstract

We propose unsupervised Feature Propagation Generative Adversarial Network (denoted as FPGAN) for fast video object detection in this paper. In our video object detector, we detect objects on spare key frames using pre-trained state-of-the-art object detector R-FCN, and propagate CNN features to adjacent frames for fast detection via a light-weight transformation network. To learn the feature propagation network, we make full use of unlabeled video data and employ generative adversarial networks in model training. Specifically, in FPGAN, the generator is the feature propagation network, and the discriminator employs second-order temporal coherence and 3D ConvNets to distinguish between predicted and “ground truth” CNN features. In addition, Euclidean distance loss provided by the pre-trained image object detector is also adopted to jointly supervise the learning. Our method doesn’t need any human labelling in videos. Experiments on the large-scale ImageNet VID dataset demonstrate the effectiveness of our method.

Sie haben noch keine Lizenz? Dann Informieren Sie sich jetzt über unsere Produkte:

Springer Professional "Wirtschaft+Technik"

Online-Abonnement

Mit Springer Professional "Wirtschaft+Technik" erhalten Sie Zugriff auf:

über 102.000 Bücher
über 537 Zeitschriften

aus folgenden Fachgebieten:

Automobil + Motoren
Bauwesen + Immobilien
Business IT + Informatik
Elektrotechnik + Elektronik
Energie + Nachhaltigkeit
Finance + Banking
Management + Führung
Marketing + Vertrieb
Maschinenbau + Werkstoffe
Versicherung + Risiko

Jetzt Wissensvorsprung sichern!

Jetzt informieren

Springer Professional "Technik"

Online-Abonnement

Mit Springer Professional "Technik" erhalten Sie Zugriff auf:

über 67.000 Bücher
über 390 Zeitschriften

aus folgenden Fachgebieten:

Automobil + Motoren
Bauwesen + Immobilien
Business IT + Informatik
Elektrotechnik + Elektronik
Energie + Nachhaltigkeit
Maschinenbau + Werkstoffe

Jetzt Wissensvorsprung sichern!

Jetzt informieren

Springer Professional "Wirtschaft"

Online-Abonnement

Mit Springer Professional "Wirtschaft" erhalten Sie Zugriff auf:

über 67.000 Bücher
über 340 Zeitschriften

aus folgenden Fachgebieten:

Bauwesen + Immobilien
Business IT + Informatik
Finance + Banking
Management + Führung
Marketing + Vertrieb
Versicherung + Risiko

Jetzt Wissensvorsprung sichern!

Jetzt informieren

Vorheriges Kapitel HRTF Representation with Convolutional Auto-encoder

Nächstes Kapitel OmniEyes: Analysis and Synthesis of Artistically Painted Eyes

Girshick, R., Donahue, J., Darrell, T., Malik, J.: Region-based convolutional networks for accurate object detection and segmentation. In: CVPR, pp. 580–587 (2014)

Girshick, R.: Fast r-cnn. In: ICCV, pp. 1440–1448 (2015)

Ren, S., He, K., Girshick, R., Sun, J.: Faster r-cnn: towards real-time object detection with region proposal networks. In: NIPS, pp. 91–99 (2015)

Dai, J., Li, Y., He, K., Sun, J.: R-FCN: object detection via region-based fully convolutional networks. In: NIPS, pp. 379–387 (2016)

Han, G., Zhang, X., Li, C.: Revisiting faster R-CNN: a deeper look at region proposal network. In: ICONIP, pp. 14–24 (2017)

Redmon, J., Farhadi, A.: YOLO9000: better, faster, stronger. In: CVPR, pp. 7263–7271 (2017)

Liu, W., et al.: SSD: single shot MultiBox detector. In: Leibe, B., Matas, J., Sebe, N., Welling, M. (eds.) ECCV 2016. LNCS, vol. 9905, pp. 21–37. Springer, Cham (2016). https://doi.org/10.1007/978-3-319-46448-0_2CrossRef

Han, G., Zhang, X., Li, C.: Single shot object detection with top-down refinement. In: ICIP, pp. 3360–3364 (2017)

Kang, K., et al.: Object detection in videos with tubelet proposal networks. In: CVPR, pp. 727–735 (2017)

10.

Shelhamer, E., Rakelly, K., Hoffman, J., Darrell, T.: Clockwork convnets for video semantic segmentation. In: Hua, G., Jégou, H. (eds.) ECCV 2016. LNCS, vol. 9915, pp. 852–868. Springer, Cham (2016). https://doi.org/10.1007/978-3-319-49409-8_69CrossRef

11.

Zhu, X., Xiong, Y., Dai, J., Yuan, L., Wei, Y.: Deep feature flow for video recognition. In: CVPR, pp. 2349–2358 (2017)

12.

Han, G., Zhang, X., Li, C.: Semi-supervised DFF: decoupling detection and feature flow for video object detectors. In: ACM Multimedia 2018. New York, NY, USA, pp. 1811–1819 (2018). https://doi.org/10.1145/3240508.3240693

13.

Krizhevsky, A., Sutskever, I., Hinton, G.E.: ImageNet classification with deep convolutional neural networks. In: NIPS, pp. 1097–1105 (2012)

14.

He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016)

15.

Isola, P., Zhu, J.-Y., Zhou, T., Efros, A.A.: Image-to-image translation with conditional adversarial networks. In: CVPR (2017)

16.

Johnson, J., Alahi, A., Fei-Fei, L.: Perceptual losses for real-time style transfer and super-resolution. In: Leibe, B., Matas, J., Sebe, N., Welling, M. (eds.) ECCV 2016. LNCS, vol. 9906, pp. 694–711. Springer, Cham (2016). https://doi.org/10.1007/978-3-319-46475-6_43CrossRef

17.

Mathieu, M., Couprie, C., LeCun, Y.: Deep multi-scale video prediction beyond mean square error. In: ICLR (2016)

18.

Goodfellow, I., et al.: Generative adversarial nets. In: NIPS (2014)

19.

Shrivastava, A., Pfister, T., Tuzel, O., Susskind, J., Wang, W., Webb, R.: Learning from simulated and unsupervised images through adversarial training. In: CVPR (2017)

20.

Zhu, J.-Y., Park, T., Isola, P., Efros, A.A.: Unpaired image-to-image translation using cycle-consistent adversarial networks. In: ICCV (2017)

21.

Wiskott, L., Sejnowski, T.J.: Slow feature analysis: unsupervised learning of invariances. Neural Comput. 14(4), 715–770 (2002)CrossRef

22.

Jayaraman, D., Grauman, K.: Slow and steady feature analysis: higher order temporal coherence in video. In: CVPR (2016)

23.

Wang, X., Gupta, A.: Unsupervised learning of visual representations using videos. In: ICCV 2015 (2015)

24.

Lee, H.-Y., Huang, J.-B., Singh, M., Yang, M.-H.: Unsupervised representation learning by sorting sequences. In: ICCV (2017)

25.

Dosovitskiy, A., et al.. Flownet: learning optical flow with convolutional networks. In: ICCV (2015)

26.

Tran, D., Bourdev, L., Fergus, R., Torresani, L., Paluri, M.: Learning spatiotemporal features with 3D convolutional networks. In: ICCV (2015)

27.

Arjovsky, M., Chintala, S., Bottou, L.: Wasserstein generative adversarial networks. In: ICML (2017)

Titel: Unsupervised Feature Propagation for Fast Video Object Detection Using Generative Adversarial Networks
verfasst von: Xuan Zhang
Guangxing Han
Wenduo He
Verlag: Springer International Publishing
Buch: MultiMedia Modeling
Print ISBN: 978-3-030-37730-4

Electronic ISBN: 978-3-030-37731-1

Copyright-Jahr: 2020
DOI: https://doi.org/10.1007/978-3-030-37731-1_50

Neuer Inhalt

Bildnachweise

VDI-Icon, Profil Icon, inhalt2, Springer Professional Modul/© Springer Fachmedien Wiesbaden GmbH, Nachhaltigkeitsaward Key Visual/© Cometis AG/Global ESG Monitor | Daniel Rupp | Generiert mit KI, Search Icon, Banner Hanser, Interview Entropie Bild 1/© Bernhard Weßling, Joerg Schweinsberg/© Datacore Software, Smart Factory Symbolbild/© TensorSpark | Generated with AI | Getty Images, Zeitschrift Wissensmanagement Cover, PatentFit-Logo/© Springer Fachmedien Wiesbaden GmbH, Sustainibility Finance/© Robert Kneschke / stock.adobe.com / Springer Fachmedien Wiesbaden GmbH, Zukunftswerkstatt Sales Excellence 2024/© AndreyPopov / Getty Images / iStock, 2023_Antrieb/© supervisuell

Springer Professional

Abstract

Bitte loggen Sie sich ein, um Zugang zu Ihrer Lizenz zu erhalten.

Sie haben noch keine Lizenz? Dann Informieren Sie sich jetzt über unsere Produkte:

Springer Professional "Wirtschaft+Technik"

Springer Professional "Technik"

Springer Professional "Wirtschaft"

Neuer Inhalt

Bitte loggen Sie sich ein, um Zugang zu Ihrer Lizenz zu erhalten.

Bitte loggen Sie sich ein, um Zugang zu Ihrer Lizenz zu erhalten.