2020 | OriginalPaper | Book chapter

5. Image Recognition with Deep Neural Networks

Authors: Gerhard Paaß, Dirk Hecker

Published in: Künstliche Intelligenz

Publisher: Springer Fachmedien Wiesbaden


Abstract

Image recognition seeks automatic methods for identifying objects in an image or photograph. The task involves both classifying the objects and determining their position in the image. The vast majority of deep neural networks (DNNs) for image processing are convolutional neural networks (CNNs). Their layers contain small receptive fields (convolutions) that are shifted across the pixel matrix of the input image and can detect local image features. In addition, pooling layers aggregate these results locally. Modern CNNs contain hundreds of such layers, which recognize successively more complex image features. In image classification they often make fewer errors than humans. Special variants have been developed to determine the position of objects in images with pixel-level accuracy. Finally, the chapter presents models for estimating the uncertainty of image classifications and discusses how image distortions and deliberate image manipulations affect classification accuracy.
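The two building blocks named in the abstract, a small filter shifted across the pixel matrix and a pooling step that aggregates the responses locally, can be illustrated with a minimal NumPy sketch. This is not code from the chapter: the function names `conv2d_valid` and `max_pool2d`, the toy 6x6 image and the hand-picked edge filter are all assumptions made for illustration. In a trained CNN the filter weights are learned rather than set by hand.

```python
import numpy as np

def conv2d_valid(image, kernel):
    """Slide `kernel` over `image` (stride 1, no padding) and return the response map.

    Like most deep learning libraries, this computes a cross-correlation,
    i.e. the kernel is not flipped.
    """
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            # Weighted sum over the small receptive field at position (i, j).
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

def max_pool2d(feature_map, size=2):
    """Aggregate non-overlapping size x size blocks by taking their maximum."""
    h = feature_map.shape[0] // size * size  # drop rows that do not fill a block
    w = feature_map.shape[1] // size * size  # drop columns that do not fill a block
    blocks = feature_map[:h, :w].reshape(h // size, size, w // size, size)
    return blocks.max(axis=(1, 3))

# Toy 6x6 image (assumed): dark left half (0), bright right half (1).
image = np.zeros((6, 6))
image[:, 3:] = 1.0

# Hand-picked filter (assumed) that responds to a dark-to-bright vertical edge.
kernel = np.array([[-1.0, 1.0],
                   [-1.0, 1.0]])

features = conv2d_valid(image, kernel)  # 5x5 map, non-zero only at the edge column
pooled = max_pool2d(features, size=2)   # 2x2 summary of the 5x5 map
print(features)
print(pooled)
```

The feature map is non-zero only where the filter straddles the edge, and pooling keeps that response while shrinking the map. Stacking many such layers is what allows later layers to respond to increasingly complex features.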
Metadata
Title
Bilderkennung mit tiefen neuronalen Netzen (Image Recognition with Deep Neural Networks)
Authors
Gerhard Paaß
Dirk Hecker
Copyright year
2020
DOI
https://doi.org/10.1007/978-3-658-30211-5_5
