Skip to main content
Top

2023 | OriginalPaper | Chapter

Camera and LiDAR Fusion for Point Cloud Semantic Segmentation

Authors : Ali Abdelkader, Mohamed Moustafa

Published in: Proceedings of Seventh International Congress on Information and Communication Technology

Publisher: Springer Nature Singapore

Activate our intelligent search to find suitable subject content or patents.

search-config
loading …

Abstract

Perception is a fundamental component of any autonomous driving system. Semantic segmentation is the perception task of assigning semantic class labels to sensor inputs. While autonomous driving systems are currently equipped with a suite of sensors, much focus in the literature has been on semantic segmentation of camera images only. Research in the fusion of different sensor modalities for semantic segmentation has not been investigated as much. Deep learning models based on transformer architectures have proven successful in many tasks in computer vision and natural language processing. This work explores the use of deep learning transformers to fuse information from LiDAR and camera sensors to improve the segmentation of LiDAR point clouds. It also addresses the question of which fusion level in this deep learning framework provides better performance.

Dont have a licence yet? Then find out more about our products and how to get one now:

Springer Professional "Wirtschaft+Technik"

Online-Abonnement

Mit Springer Professional "Wirtschaft+Technik" erhalten Sie Zugriff auf:

  • über 102.000 Bücher
  • über 537 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Maschinenbau + Werkstoffe
  • Versicherung + Risiko

Jetzt Wissensvorsprung sichern!

Springer Professional "Technik"

Online-Abonnement

Mit Springer Professional "Technik" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 390 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Maschinenbau + Werkstoffe




 

Jetzt Wissensvorsprung sichern!

Springer Professional "Wirtschaft"

Online-Abonnement

Mit Springer Professional "Wirtschaft" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 340 Zeitschriften

aus folgenden Fachgebieten:

  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Versicherung + Risiko




Jetzt Wissensvorsprung sichern!

Literature
1.
go back to reference Behley J, Garbade M, Milioto A, Quenzel J, Behnke S, Stachniss C, Gall J (2019) SemanticKITTI: a dataset for semantic scene understanding of LiDAR sequences. In: Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV) Behley J, Garbade M, Milioto A, Quenzel J, Behnke S, Stachniss C, Gall J (2019) SemanticKITTI: a dataset for semantic scene understanding of LiDAR sequences. In: Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV)
2.
go back to reference Dosovitskiy A, Beyer L, Kolesnikov A, Weissenborn D, Zhai X, Unterthiner T, Dehghani M, Minderer M, Heigold G, Gelly S, Uszkoreit J, Houlsby N (2021) An image is worth 16x16 words: transformers for image recognition at scale. In: International conference on learning representations Dosovitskiy A, Beyer L, Kolesnikov A, Weissenborn D, Zhai X, Unterthiner T, Dehghani M, Minderer M, Heigold G, Gelly S, Uszkoreit J, Houlsby N (2021) An image is worth 16x16 words: transformers for image recognition at scale. In: International conference on learning representations
3.
go back to reference Feng D, Haase-Schütz C, Rosenbaum L, Hertlein H, Glaeser C, Timm F, Wiesbeck W, Dietmayer K (2020) Deep multi-modal object detection and semantic segmentation for autonomous driving: datasets, methods, and challenges. IEEE Trans Intelligent Transp Syst 22(3):1341–1360CrossRef Feng D, Haase-Schütz C, Rosenbaum L, Hertlein H, Glaeser C, Timm F, Wiesbeck W, Dietmayer K (2020) Deep multi-modal object detection and semantic segmentation for autonomous driving: datasets, methods, and challenges. IEEE Trans Intelligent Transp Syst 22(3):1341–1360CrossRef
4.
go back to reference Jaritz M, Vu TH, Charette R, Wirbel E, Perez P (2020) xmuda: cross-modal unsupervised domain adaptation for 3d semantic segmentation. In: Proceedings of the IEEE/CVF vonference on Computer Vision and Pattern Recognition (CVPR) Jaritz M, Vu TH, Charette R, Wirbel E, Perez P (2020) xmuda: cross-modal unsupervised domain adaptation for 3d semantic segmentation. In: Proceedings of the IEEE/CVF vonference on Computer Vision and Pattern Recognition (CVPR)
5.
go back to reference Krispel G, Opitz M, Waltner G, Possegger H, Bischof H (2020) Fuseseg: lidar point cloud segmentation fusing multi-modal data. In: Proceedings of the IEEE/CVF winter conference on applications of computer vision, pp 1874–1883 Krispel G, Opitz M, Waltner G, Possegger H, Bischof H (2020) Fuseseg: lidar point cloud segmentation fusing multi-modal data. In: Proceedings of the IEEE/CVF winter conference on applications of computer vision, pp 1874–1883
7.
go back to reference Liu Z, Tang H, Lin Y, Han S (2019) Point-voxel cnn for efficient 3d deep learning. In: Advances in neural information processing systems Liu Z, Tang H, Lin Y, Han S (2019) Point-voxel cnn for efficient 3d deep learning. In: Advances in neural information processing systems
8.
go back to reference Meyer GP, Charland J, Hegde D, Laddha A, Vallespi-Gonzalez C (2019) Sensor fusion for joint 3d object detection and semantic segmentation. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition workshops Meyer GP, Charland J, Hegde D, Laddha A, Vallespi-Gonzalez C (2019) Sensor fusion for joint 3d object detection and semantic segmentation. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition workshops
9.
go back to reference Qi CR, Su H, Mo K, Guibas LJ (2017) Pointnet: deep learning on point sets for 3d classification and segmentation. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 652–660 Qi CR, Su H, Mo K, Guibas LJ (2017) Pointnet: deep learning on point sets for 3d classification and segmentation. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 652–660
10.
go back to reference Su H, Jampani V, Sun D, Maji S, Kalogerakis E, Yang MH, Kautz J (2018) Splatnet: sparse lattice networks for point cloud processing. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 2530–2539 Su H, Jampani V, Sun D, Maji S, Kalogerakis E, Yang MH, Kautz J (2018) Splatnet: sparse lattice networks for point cloud processing. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 2530–2539
11.
go back to reference Tang H, Liu Z, Zhao S, Lin Y, Lin J, Wang H, Han S (2020) Searching efficient 3d architectures with sparse point-voxel convolution. In: European conference on computer vision Tang H, Liu Z, Zhao S, Lin Y, Lin J, Wang H, Han S (2020) Searching efficient 3d architectures with sparse point-voxel convolution. In: European conference on computer vision
12.
go back to reference Tchapmi LP, Choy CB, Armeni I, Gwak J, Savarese S (2017) Segcloud: semantic segmentation of 3d point clouds. In: International conference on 3D Vision (3DV) Tchapmi LP, Choy CB, Armeni I, Gwak J, Savarese S (2017) Segcloud: semantic segmentation of 3d point clouds. In: International conference on 3D Vision (3DV)
13.
go back to reference Touvron H, Cord M, Douze M, Massa F, Sablayrolles A, Jégou H (2021) Training data-efficient image transformers & distillation through attention. In: International conference on machine learning, PMLR, pp 10,347–10,357 Touvron H, Cord M, Douze M, Massa F, Sablayrolles A, Jégou H (2021) Training data-efficient image transformers & distillation through attention. In: International conference on machine learning, PMLR, pp 10,347–10,357
14.
go back to reference Vaswani A, Shazeer N, Parmar N, Uszkoreit J, Jones L, Gomez AN, Kaiser Ł, Polosukhin I (2017) Attention is all you need. In: Advances in neural information processing systems, pp 5998–6008 Vaswani A, Shazeer N, Parmar N, Uszkoreit J, Jones L, Gomez AN, Kaiser Ł, Polosukhin I (2017) Attention is all you need. In: Advances in neural information processing systems, pp 5998–6008
Metadata
Title
Camera and LiDAR Fusion for Point Cloud Semantic Segmentation
Authors
Ali Abdelkader
Mohamed Moustafa
Copyright Year
2023
Publisher
Springer Nature Singapore
DOI
https://doi.org/10.1007/978-981-19-2394-4_45