
2020 | Original Paper | Book Chapter

Deep Learning Architectures for Computer Vision Applications: A Study

Authors: Randheer Bagi, Tanima Dutta, Hari Prabhat Gupta

Published in: Advances in Data and Information Sciences

Publisher: Springer Singapore


Abstract

Deep learning has become one of the most preferred solutions for many complex problems. It shows outstanding performance on computer vision tasks such as image classification, object detection, and image generation. Recently, many research efforts have focused on adapting deep learning architectures to a wide range of application domains. In this paper, we present a comprehensive survey of the issues and challenges faced by deep learning techniques. Furthermore, we analyze different deep learning architectures that address computer vision tasks, along with their importance.
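The convolutional layer is the building block shared by the vision architectures this chapter surveys (classification, detection, and generation networks alike). As an illustration only (a minimal NumPy sketch, not code from the chapter), the core operation of such a layer is a sliding-window cross-correlation:

```python
import numpy as np

def conv2d(image, kernel):
    """Valid-mode 2D cross-correlation, the core operation of a CNN layer.

    image:  2D array (H, W)
    kernel: 2D array (kH, kW) of learned weights
    Returns a (H - kH + 1, W - kW + 1) feature map.
    """
    kh, kw = kernel.shape
    oh = image.shape[0] - kh + 1
    ow = image.shape[1] - kw + 1
    out = np.empty((oh, ow))
    for i in range(oh):
        for j in range(ow):
            # Each output value is the weighted sum of one local patch.
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

# Example: a 2x2 averaging-style kernel over a 4x4 input.
image = np.arange(16, dtype=float).reshape(4, 4)
kernel = np.ones((2, 2))
feature_map = conv2d(image, kernel)
```

In real architectures this loop is replaced by heavily optimized (often GPU-accelerated) implementations, and many such kernels are stacked and learned jointly, but the sliding-window principle is the same.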


Metadata
Title
Deep Learning Architectures for Computer Vision Applications: A Study
Authors
Randheer Bagi
Tanima Dutta
Hari Prabhat Gupta
Copyright year
2020
Publisher
Springer Singapore
DOI
https://doi.org/10.1007/978-981-15-0694-9_56
