2020 | OriginalPaper | Chapter

5. Image Recognition with Deep Neural Networks

Authors: Gerhard Paaß, Dirk Hecker

Published in: Künstliche Intelligenz

Publisher: Springer Fachmedien Wiesbaden

Abstract

Image recognition seeks automatic methods for identifying objects in an image or photo. This involves both classifying the objects in the image and determining their position within it. The vast majority of deep neural networks (DNNs) for image processing are convolutional neural networks (CNNs). Their layers contain, on the one hand, small receptive fields (convolutions) that are shifted across the pixel matrix of the input image and can detect local image features. On the other hand, pooling layers are used to aggregate the results locally. Modern CNNs contain hundreds of these layers, which recognize successively more complex image features. They often make fewer errors in image classification than humans. Special variants have been developed to determine the position of objects in images with pixel accuracy. Finally, models for estimating the uncertainty of image classifications are presented, and the influence of image distortions and deliberate image manipulations on classification accuracy is discussed.
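
The two building blocks named in the abstract, a small receptive field shifted across the pixel matrix and a pooling step that aggregates results locally, can be illustrated with a minimal NumPy sketch. This is not code from the chapter: the function names `conv2d_valid` and `max_pool2d`, the toy 6x6 image, and the hand-set edge kernel are assumptions chosen for illustration, and a real CNN would learn its kernels during training rather than use fixed ones.

```python
import numpy as np

def conv2d_valid(image, kernel):
    """Slide a small kernel over a 2-D image (stride 1, no padding).

    Hypothetical helper for illustration only.
    """
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            # Weighted sum over the local receptive field at (i, j).
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

def max_pool2d(feature_map, size=2):
    """Aggregate locally: keep the maximum of each size x size block."""
    h = feature_map.shape[0] // size * size
    w = feature_map.shape[1] // size * size
    trimmed = feature_map[:h, :w]
    blocks = trimmed.reshape(h // size, size, w // size, size)
    return blocks.max(axis=(1, 3))

# Toy input (assumed): dark left half, bright right half.
image = np.zeros((6, 6))
image[:, 3:] = 1.0

# Hand-set kernel (assumed) that responds to dark-to-bright vertical edges.
kernel = np.array([[-1.0, 0.0, 1.0]] * 3)

features = conv2d_valid(image, kernel)  # 4x4 feature map
pooled = max_pool2d(features)           # 2x2 map after pooling
```

The feature map is nonzero only where the kernel window straddles the edge, which is the sense in which a convolution detects a local image feature; pooling then shrinks the map while keeping the strongest response in each neighbourhood.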
Metadata
Title
Image Recognition with Deep Neural Networks
Authors
Gerhard Paaß
Dirk Hecker
Copyright Year
2020
DOI
https://doi.org/10.1007/978-3-658-30211-5_5
