Published in: International Journal of Machine Learning and Cybernetics 11/2019

28.08.2019 | Original Article

SegFast-V2: Semantic image segmentation with less parameters in deep learning for autonomous driving

By: Swarnendu Ghosh, Anisha Pal, Shourya Jaiswal, K. C. Santosh, Nibaran Das, Mita Nasipuri


Abstract

Semantic image segmentation can be used in various driving applications, such as automatic braking, road-sign alerts, park assist, and pedestrian warnings. At present, AI applications such as autonomous-driving modules are mostly available in expensive vehicles; it would be valuable to make such facilities available at the lower end of the price spectrum. Existing methodologies come with a costly overhead: large numbers of parameters and a need for expensive hardware. Within this scope, the key contribution of this work is to demonstrate the feasibility of compact semantic image segmentation so that AI-based solutions can be deployed in less expensive vehicles. While developing cheap and fast models, one must not compromise reliability and robustness. The proposed work builds on our previous model, "SegFast", and aims to perform a thorough analysis across a multitude of datasets. Besides "spark" modules and depth-wise separable transposed convolutions, kernel factorization is implemented to further reduce the number of parameters. The effect of using MobileNet as an encoder for our model has also been analyzed. The proposed method shows a promising decrease in the number of parameters and a significant gain in runtime, even in a single-CPU environment. Despite these speedups, the proposed approach performs at a similar level to many popular but heavier networks, such as SegNet, UNet, PSPNet, and FCN.
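The parameter savings from depth-wise separable convolutions and kernel factorization mentioned in the abstract can be illustrated with simple counting. The sketch below is illustrative only (hypothetical layer sizes, bias terms ignored; not the paper's exact layer configuration): it compares a standard k × k convolution against a depth-wise separable one, and a factorized variant that splits the k × k depth-wise kernel into k × 1 and 1 × k passes.

```python
def conv_params(k, c_in, c_out):
    """Weights in a standard k x k convolution: one k x k x c_in filter per output channel."""
    return k * k * c_in * c_out

def depthwise_separable_params(k, c_in, c_out):
    """One k x k spatial filter per input channel, followed by a 1x1 pointwise convolution."""
    return k * k * c_in + c_in * c_out

def factorized_separable_params(k, c_in, c_out):
    """Depth-wise kernel factorized into k x 1 and 1 x k passes, then a 1x1 pointwise convolution."""
    return 2 * k * c_in + c_in * c_out

# Hypothetical 3x3 layer mapping 128 channels to 128 channels
print(conv_params(3, 128, 128))                 # 147456
print(depthwise_separable_params(3, 128, 128))  # 17536
print(factorized_separable_params(3, 128, 128)) # 17152
```

For this example, the separable form needs roughly 8× fewer weights than the standard convolution, and factorizing the depth-wise kernel trims a few hundred more, which is the kind of per-layer reduction that compounds into a much smaller network overall.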

References
1. Badrinarayanan V, Kendall A, Cipolla R (2017) SegNet: a deep convolutional encoder-decoder architecture for image segmentation. IEEE Trans Pattern Anal Mach Intell 39(12):2481–2495
2. Brostow GJ, Fauqueur J, Cipolla R (2009) Semantic object classes in video: a high-definition ground truth database. Pattern Recognit Lett 30(2):88–97
3. Chen LC, Papandreou G, Kokkinos I, Murphy K, Yuille AL (2018) DeepLab: semantic image segmentation with deep convolutional nets, atrous convolution, and fully connected CRFs. IEEE Trans Pattern Anal Mach Intell 40(4):834–848
4. Chollet F (2017) Xception: deep learning with depthwise separable convolutions. arXiv preprint arXiv:1610.02357
5. Cordts M, Omran M, Ramos S, Rehfeld T, Enzweiler M, Benenson R, Franke U, Roth S, Schiele B (2016) The Cityscapes dataset for semantic urban scene understanding. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 3213–3223
6. Dahl GE, Sainath TN, Hinton GE (2013) Improving deep neural networks for LVCSR using rectified linear units and dropout. In: 2013 IEEE international conference on acoustics, speech and signal processing, pp 8609–8613. IEEE
7. Haykin S (1994) Neural networks: a comprehensive foundation. Prentice Hall PTR, Upper Saddle River
8. He K, Zhang X, Ren S, Sun J (2016) Deep residual learning for image recognition. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 770–778
9. Howard AG, Zhu M, Chen B, Kalenichenko D, Wang W, Weyand T, Andreetto M, Adam H (2017) MobileNets: efficient convolutional neural networks for mobile vision applications. arXiv preprint arXiv:1704.04861
10. Huang G, Liu Z, Van Der Maaten L, Weinberger KQ (2017) Densely connected convolutional networks. In: CVPR, vol 1, p 3
11. Iandola FN, Han S, Moskewicz MW, Ashraf K, Dally WJ, Keutzer K (2016) SqueezeNet: AlexNet-level accuracy with 50x fewer parameters and <0.5 MB model size. arXiv preprint arXiv:1602.07360
12. Jégou S, Drozdzal M, Vazquez D, Romero A, Bengio Y (2017) The one hundred layers tiramisu: fully convolutional DenseNets for semantic segmentation. In: 2017 IEEE conference on computer vision and pattern recognition workshops (CVPRW), pp 1175–1183. IEEE
13. Kaiser L, Gomez AN, Chollet F (2017) Depthwise separable convolutions for neural machine translation. arXiv preprint arXiv:1706.03059
14. Lin G, Milan A, Shen C, Reid I (2017) RefineNet: multi-path refinement networks for high-resolution semantic segmentation. In: IEEE conference on computer vision and pattern recognition (CVPR)
15. Long J, Shelhamer E, Darrell T (2015) Fully convolutional networks for semantic segmentation. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 3431–3440
16. Nair V, Hinton GE (2010) Rectified linear units improve restricted Boltzmann machines. In: Proceedings of the 27th international conference on machine learning (ICML-10), pp 807–814
17. Neuhold G, Ollmann T, Bulò SR, Kontschieder P (2017) The Mapillary Vistas dataset for semantic understanding of street scenes. In: ICCV, pp 5000–5009
18. Pal A, Jaiswal S, Ghosh S, Das N, Nasipuri M (2018) SegFast: a faster SqueezeNet-based semantic image segmentation technique using depth-wise separable convolutions. In: Proceedings of the 11th Indian conference on computer vision, graphics and image processing (ICVGIP 2018), p 7. ACM
19. Ronneberger O, Fischer P, Brox T (2015) U-Net: convolutional networks for biomedical image segmentation. In: International conference on medical image computing and computer-assisted intervention. Springer, New York, pp 234–241
20. Ros G, Sellart L, Materzynska J, Vazquez D, Lopez AM (2016) The SYNTHIA dataset: a large collection of synthetic images for semantic segmentation of urban scenes. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 3234–3243
21. Simonyan K, Zisserman A (2014) Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556
22. Szegedy C, Vanhoucke V, Ioffe S, Shlens J, Wojna Z (2016) Rethinking the inception architecture for computer vision. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 2818–2826
23. Xu H, Gao Y, Yu F, Darrell T (2017) End-to-end learning of driving models from large-scale video datasets. arXiv preprint
24. Zhao H, Shi J, Qi X, Wang X, Jia J (2017) Pyramid scene parsing network. In: IEEE conference on computer vision and pattern recognition (CVPR), pp 2881–2890
Metadata
Title
SegFast-V2: Semantic image segmentation with less parameters in deep learning for autonomous driving
Authors
Swarnendu Ghosh
Anisha Pal
Shourya Jaiswal
K. C. Santosh
Nibaran Das
Mita Nasipuri
Publication date
28.08.2019
Publisher
Springer Berlin Heidelberg
Published in
International Journal of Machine Learning and Cybernetics / Issue 11/2019
Print ISSN: 1868-8071
Electronic ISSN: 1868-808X
DOI
https://doi.org/10.1007/s13042-019-01005-5
