Published in: Cognitive Computation 2/2018

27.11.2017

Segmentation of Drivable Road Using Deep Fully Convolutional Residual Network with Pyramid Pooling

Authors: Xiaolong Liu, Zhidong Deng



Abstract

In recent years, self-driving cars have been developing rapidly around the world. Based on deep learning, monocular vision-based environmental perception for either ADAS or self-driving cars is regarded as a feasible and sophisticated solution for achieving human-level performance at low cost. Perceived surroundings generally include lane markings, curbs, drivable roads, intersections, obstacles, traffic signs, and landmarks used for navigation. Reliable detection or segmentation of the drivable road provides a solid foundation for obstacle detection during autonomous driving. This paper proposes the RPP model for monocular vision-based road detection, built on the combination of a fully convolutional network, residual learning, and pyramid pooling; that is, RPP is a deep fully convolutional residual neural network with pyramid pooling. To substantially improve prediction accuracy on the KITTI-ROAD detection task, we present a new strategy that adds road edge labels and introduces appropriate data augmentation, so as to effectively handle the small training set of the KITTI road detection benchmark. The experiments demonstrate that our RPP achieves remarkable results, ranking second on both the unmarked-road and marked-road tasks, fifth on the multiple-marked-lane task, and third on the combined task. In summary, we propose a powerful 112-layer RPP model by incorporating residual connections and pyramid pooling into a fully convolutional neural network framework. For small-training-sample problems such as KITTI-ROAD detection, we present a new strategy based on the addition of road edge labels and data augmentation; this suggests that adding more labels and introducing appropriate data augmentation can help deal with small training sets.
Moreover, larger crop sizes, or the combination with more global information, also improve road segmentation accuracy. Were it not for the computing and memory resources demanded by large-scale networks such as RPP, using raw images instead of crops and selecting a larger batch size would be expected to further increase road detection accuracy.
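The pyramid pooling named in the abstract aggregates context at several grid scales: the feature map is average-pooled into coarse grids, each pooled map is upsampled back to the input resolution, and all maps are concatenated. The sketch below is a minimal NumPy illustration of that idea only; the bin sizes (1, 2, 3, 6) follow the common PSPNet-style choice and are an assumption, not the configuration reported in this paper.

```python
import numpy as np

def pyramid_pooling(feature, bin_sizes=(1, 2, 3, 6)):
    """Pool a C x H x W feature map over several grid scales, upsample
    each pooled map back to H x W, and concatenate along channels."""
    c, h, w = feature.shape
    pooled_maps = [feature]
    for bins in bin_sizes:
        # Edges of an approximately uniform bins x bins grid.
        ys = np.linspace(0, h, bins + 1).astype(int)
        xs = np.linspace(0, w, bins + 1).astype(int)
        out = np.empty((c, h, w), dtype=feature.dtype)
        for i in range(bins):
            for j in range(bins):
                cell = feature[:, ys[i]:ys[i + 1], xs[j]:xs[j + 1]]
                mean = cell.mean(axis=(1, 2), keepdims=True)
                # Nearest-neighbour "upsample": broadcast the bin mean
                # back over the cell's spatial extent.
                out[:, ys[i]:ys[i + 1], xs[j]:xs[j + 1]] = mean
        pooled_maps.append(out)
    return np.concatenate(pooled_maps, axis=0)

feat = np.random.rand(4, 12, 12).astype(np.float32)
ctx = pyramid_pooling(feat)
print(ctx.shape)  # (20, 12, 12): 4 original channels + 4 per pyramid level
```

In a real network each pyramid level would also pass through a 1x1 convolution before concatenation; that learned projection is omitted here to keep the spatial-aggregation step visible.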


Footnotes
1. http://www.cvlibs.net/datasets/kitti/eval_road.php
Table 3 The road detection results of RPP_whole_c3_aug

KITTI        MaxF (%)   AP (%)   PRE (%)   REC (%)   FPR (%)   FNR (%)
UM_ROAD      96.04      89.77    95.61     96.48     2.02      3.52
UMM_ROAD     97.03      92.36    96.36     97.70     4.06      2.30
UU_ROAD      95.47      88.74    95.16     95.77     1.59      4.23
URBAN_ROAD   96.36      90.36    95.85     96.87     2.31      3.13
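The columns in Table 3 are the standard pixel-wise detection quantities; on the KITTI road benchmark, MaxF is the F-measure maximized over the classifier's confidence thresholds, while precision, recall, FPR, and FNR follow directly from the confusion counts at the chosen operating point. A minimal sketch of the single-threshold quantities, with made-up counts for illustration:

```python
def road_metrics(tp, fp, fn, tn):
    """Pixel-wise road detection metrics from confusion counts."""
    pre = tp / (tp + fp)              # precision
    rec = tp / (tp + fn)              # recall
    f1 = 2 * pre * rec / (pre + rec)  # F-measure at this threshold
    fpr = fp / (fp + tn)              # false positive rate
    fnr = fn / (fn + tp)              # false negative rate
    return {"F": f1, "PRE": pre, "REC": rec, "FPR": fpr, "FNR": fnr}

# Hypothetical counts, not taken from the paper.
m = road_metrics(tp=9500, fp=500, fn=500, tn=89500)
print({k: round(v * 100, 2) for k, v in m.items()})
```

AP (average precision) additionally integrates precision over recall levels and cannot be recovered from a single confusion matrix, which is why it is reported as a separate column.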
 
Metadata
Title: Segmentation of Drivable Road Using Deep Fully Convolutional Residual Network with Pyramid Pooling
Authors: Xiaolong Liu, Zhidong Deng
Publication date: 27.11.2017
Publisher: Springer US
Published in: Cognitive Computation / Issue 2/2018
Print ISSN: 1866-9956
Electronic ISSN: 1866-9964
DOI: https://doi.org/10.1007/s12559-017-9524-y
