2018 | OriginalPaper | Chapter

A Shared Encoder DNN for Integrated Recognition and Segmentation of Traffic Scenes

Authors: Malte Oeljeklaus, Frank Hoffmann, Torsten Bertram

Published in: Frontiers in Computational Intelligence

Publisher: Springer International Publishing

Abstract

Detecting traffic-related objects in the vehicle's surroundings is an important task for future automated cars. Visual object recognition and scene labeling from onboard cameras provide valuable information for the driving task. In computer vision, the task of generating meaningful image regions that represent specific object categories, such as cars or road area, is denoted as semantic segmentation. In contrast, scene recognition computes a global label that reflects the overall category of the scene. This contribution presents an efficient deep neural network (DNN) capable of solving both problems. The network topology avoids redundant computations by employing a shared feature encoder stage combined with designated decoders for the two specific tasks. Additionally, element-wise weights in a novel Hadamard layer efficiently exploit spatial priors for the segmentation task. Traffic scene segmentation is examined in conjunction with road topology recognition based on the Cityscapes dataset [2], augmented with manually labeled road topology data.
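
The topology described in the abstract can be illustrated with a deliberately small sketch. It is not the authors' implementation: the pooling "encoder", the thresholded "decoders", and all function names, shapes, and weights are assumptions chosen only to show the two ideas, namely that the features are computed once and reused by both task heads, and that a Hadamard layer multiplies each spatial position by its own weight.

```python
# Illustrative sketch only. Every name, shape, and weight here is an assumption
# for demonstration; none of it is taken from the chapter.

def encoder(image):
    """Stand-in for the shared feature encoder: 2x2 average pooling on a 2D grid."""
    height, width = len(image), len(image[0])
    return [
        [
            (image[r][c] + image[r][c + 1] + image[r + 1][c] + image[r + 1][c + 1]) / 4.0
            for c in range(0, width - 1, 2)
        ]
        for r in range(0, height - 1, 2)
    ]


def hadamard_layer(features, weights):
    """Element-wise (Hadamard) product: one weight per spatial position."""
    return [
        [f * w for f, w in zip(feature_row, weight_row)]
        for feature_row, weight_row in zip(features, weights)
    ]


def segmentation_decoder(features, spatial_weights, threshold=0.5):
    """Toy segmentation head: label each position after spatial weighting."""
    weighted = hadamard_layer(features, spatial_weights)
    return [[1 if value > threshold else 0 for value in row] for row in weighted]


def recognition_decoder(features, threshold=0.5):
    """Toy recognition head: one global label from the mean feature value."""
    values = [value for row in features for value in row]
    return 1 if sum(values) / len(values) > threshold else 0


def forward(image, spatial_weights):
    """Run the encoder once and feed the same features to both decoders."""
    features = encoder(image)
    return segmentation_decoder(features, spatial_weights), recognition_decoder(features)
```

In this sketch the spatial weights play the role of a prior: with small weights in the upper rows and large weights in the lower rows, identical feature values are labeled differently depending on where they occur in the image, while the global scene label is unaffected because it is computed from the unweighted shared features.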


Footnotes
1
This work employs a variant of the architecture published at https://github.com/BVLC/caffe/tree/master/models/bvlc_googlenet. Accessed: 18.01.2017.
 
Literature
1.
Chen LC, Papandreou G, Kokkinos I, Murphy K, Yuille AL (2015) Semantic image segmentation with deep convolutional nets and fully connected crfs. In: 3rd international conference on learning representations. arXiv:1412.7062
2.
Cordts M, Omran M, Ramos S, Rehfeld T, Enzweiler M, Benenson R, Franke U, Roth S, Schiele B (2016) The cityscapes dataset for semantic urban scene understanding. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 3213–3223
3.
Ess A, Müller T, Grabner H, van Gool L (2009) Segmentation-based urban traffic scene understanding. In: Proceedings of the 20th British machine vision conference, pp 84–1
4.
Fritsch J, Kühnl T, Geiger A (2013) A new performance measure and evaluation benchmark for road detection algorithms. In: Proceedings of the 16th IEEE conference on intelligent transportation systems, pp 1693–1700
5.
Glorot X, Bordes A, Bengio Y (2011) Deep sparse rectifier neural networks. In: Aistats, vol 15, p 275
6.
Hariharan B, Arbeláez P, Girshick R, Malik J (2015) Hypercolumns for object segmentation and fine-grained localization. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 447–456
7.
He K, Zhang X, Ren S, Sun J (2016) Deep residual learning for image recognition. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 770–778
8.
Hong S, Noh H, Han B (2015) Decoupled deep neural network for semi-supervised semantic segmentation. In: Advances in neural information processing systems, vol 28. MIT Press, pp 1495–1503
9.
Jia Y, Shelhamer E, Donahue J, Karayev S, Long J, Girshick R, Guadarrama S, Darrell T (2014) Caffe: convolutional architecture for fast feature embedding. In: Proceedings of the 22nd ACM international conference on multimedia. ACM, pp 675–678
10.
Kendall A, Badrinarayanan V, Cipolla R (2015) Bayesian segnet: model uncertainty in deep convolutional encoder-decoder architectures for scene understanding. arXiv:1511.02680
11.
Krizhevsky A, Sutskever I, Hinton G (2012) Imagenet classification with deep convolutional neural networks. In: Advances in neural information processing systems, pp 1097–1105
12.
Lin G, Shen C, van den Hengel A, Reid I (2016) Efficient piecewise training of deep structured models for semantic segmentation. In: Proceedings of the IEEE conference on computer vision and pattern recognition. IEEE, pp 3194–3203
14.
Liu B, He X, Gould S (2015) Multi-class semantic video segmentation with exemplar-based object reasoning. In: Proceedings of the IEEE winter conference on applications of computer vision, pp 1014–1021
15.
Long J, Shelhamer E, Darrell T (2015) Fully convolutional networks for semantic segmentation. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 3431–3440
16.
Mostajabi M, Yadollahpour P, Shakhnarovich G (2015) Feedforward semantic segmentation with zoom-out features. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 3376–3385
17.
Papandreou G, Chen LC, Murphy K, Yuille AL (2015) Weakly-and semi-supervised learning of a dcnn for semantic image segmentation. In: Proceedings of the IEEE international conference on computer vision, pp 648–656
18.
Posada LF, Hoffmann F, Bertram T (2014) Visual semantic robot navigation in indoor environments. In: Proceedings of the 41st international symposium on robotics, VDE, pp 1–7
19.
Posada LF, Narayanan KK, Hoffmann F, Bertram T (2013) Semantic classification of scenes and places with omnidirectional vision. In: Proceedings of the IEEE European conference on mobile robots, pp 113–118
20.
Russakovsky O, Deng J, Su H, Krause J, Satheesh S, Ma S, Huang Z, Karpathy A, Khosla A, Bernstein M, Berg AC, Fei-Fei L (2015) ImageNet large scale visual recognition challenge. Int J Comput Vis 115(3):211–252. doi:10.1007/s11263-015-0816-y
21.
Shuai B, Zuo Z, Wang B, Wang G (2016) Dag-recurrent neural networks for scene labeling. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 3620–3629
22.
Sikirić I, Brkić K, Krapac J, Šegvić S (2014) Image representations on a budget: traffic scene classification in a restricted bandwidth scenario. In: Proceedings of the IEEE intelligent vehicles symposium, pp 845–852
24.
Sutskever I, Martens J, Dahl G, Hinton G (2013) On the importance of initialization and momentum in deep learning. In: Proceedings of the 30th international conference on machine learning, pp 1139–1147
25.
Szegedy C, Ioffe S, Vanhoucke V, Alemi A (2016) Inception-v4, inception-resnet and the impact of residual connections on learning. arXiv:1602.07261
26.
Szegedy C, Liu W, Jia Y, Sermanet P, Reed S, Anguelov D, Erhan D, Vanhoucke V, Rabinovich A (2015) Going deeper with convolutions. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 1–9
27.
Teichmann M, Weber M, Zoellner M, Cipolla R, Urtasun R (2016) Multinet: Real-time joint semantic reasoning for autonomous driving. arXiv:1612.07695
29.
Yosinski J, Clune J, Bengio Y, Lipson H (2014) How transferable are features in deep neural networks? In: Advances in neural information processing systems, vol 27. MIT Press, pp 3320–3328
30.
Yu F, Koltun V (2016) Multi-scale context aggregation by dilated convolutions. In: 4th International conference on learning representations. arXiv:1511.07122
32.
Zhou B, Khosla A, Lapedriza A, Oliva A, Torralba A (2015) Object detectors emerge in deep scene cnns. In: 3rd International conference on learning representations. arXiv:1412.6856
Metadata
Title
A Shared Encoder DNN for Integrated Recognition and Segmentation of Traffic Scenes
Authors
Malte Oeljeklaus
Frank Hoffmann
Torsten Bertram
Copyright Year
2018
DOI
https://doi.org/10.1007/978-3-319-67789-7_7
