
04-11-2021

3D Point Convolutional Network for Dense Scene Flow Estimation

Authors: Xuezhi Xiang, Rokia Abdein, Mingliang Zhai, Ning Lv

Published in: Neural Processing Letters | Issue 2/2022


Abstract

Scene flow, the complete 3D motion field of objects in a dynamic scene, is a crucial component of many scene understanding tasks. Most existing scene flow estimation approaches are based on a joint learning framework that casts the problem as dense prediction of optical flow and stereo matching. These approaches must reconstruct the 3D motion from optical flow and disparity computed on 2D stereo image pairs, which makes the estimation indirect. Recently, FlowNet3D has attempted to learn scene flow directly from 3D point clouds, adopting element-wise max-pooling to aggregate features from different points. However, max-pooling retains only the strongest activation within a local or global region, which can discard useful detail and contextual information; for dense estimation tasks, the ability to pass information gradually from coarse to fine layers is important. To address this problem, we investigate a new deep architecture, a 3D point convolutional network, to learn scene flow from 3D point clouds. The architecture uses a multi-layer perceptron (MLP) to approximate the weight function of each convolutional filter and applies a density scale to re-weight the learned weight functions, making the network permutation-invariant and translation-invariant in 3D space, which benefits feature aggregation. Extensive experiments on the FlyingThings3D and KITTI scene flow datasets demonstrate the effectiveness of the proposed approach, which achieves state-of-the-art performance on both.
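To make the layer described above concrete, the following is a minimal NumPy sketch of an MLP-approximated, density-re-weighted point convolution over a single point neighborhood. It is not the authors' implementation: the layer sizes, the Gaussian kernel-density estimate, and the helper names (mlp, density_scale, point_conv) are illustrative assumptions.

```python
# Minimal sketch (assumptions, not the paper's code) of a point convolution
# whose filter weights are produced by an MLP over relative 3D offsets and
# re-weighted by an inverse local-density estimate.
import numpy as np

rng = np.random.default_rng(0)

def mlp(x, w1, b1, w2, b2):
    """Tiny 2-layer MLP mapping relative offsets (K, 3) to filter weights."""
    h = np.maximum(x @ w1 + b1, 0.0)        # ReLU hidden layer
    return h @ w2 + b2                      # (K, C_in * C_out)

def density_scale(neighbors, bandwidth=0.5):
    """Inverse Gaussian kernel-density estimate: sparse points get larger weight."""
    d2 = np.sum((neighbors[:, None, :] - neighbors[None, :, :]) ** 2, axis=-1)
    density = np.exp(-d2 / (2.0 * bandwidth ** 2)).mean(axis=1)   # (K,)
    return 1.0 / (density + 1e-8)

def point_conv(center, neighbors, feats, params, c_out):
    """Aggregate neighbor features with learned, density-scaled weights."""
    k, c_in = feats.shape
    offsets = neighbors - center            # relative coordinates: translation-invariant
    w = mlp(offsets, *params).reshape(k, c_in, c_out)
    s = density_scale(neighbors)            # (K,)
    # Summation over the neighborhood is permutation-invariant.
    return np.einsum('k,kc,kco->o', s, feats, w)

# Toy usage: one center point, 16 neighbors, 8 input / 4 output channels.
K, C_in, C_out, hidden = 16, 8, 4, 32
params = (rng.normal(size=(3, hidden)), np.zeros(hidden),
          rng.normal(size=(hidden, C_in * C_out)), np.zeros(C_in * C_out))
center = rng.normal(size=3)
neighbors = center + 0.1 * rng.normal(size=(K, 3))
feats = rng.normal(size=(K, C_in))
print(point_conv(center, neighbors, feats, params, C_out).shape)  # (4,)
```

The density scale compensates for non-uniform point sampling: neighbors in densely sampled regions are down-weighted, so the summation approximates a continuous convolution over the underlying surface rather than the max-pooled summary used by FlowNet3D.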


Literature
1. Huguet F, Devernay F (2007) A variational method for scene flow estimation from stereo sequences. In: IEEE 11th international conference on computer vision (ICCV), pp 1–7
2. Brox T, Malik J (2011) Large displacement optical flow: descriptor matching in variational motion estimation. IEEE Trans Pattern Anal Mach Intell 33(3):500–513
3. Menze M, Geiger A (2015) Object scene flow for autonomous vehicles. In: 2015 IEEE conference on computer vision and pattern recognition (CVPR), pp 3061–3070
4. Vogel C, Schindler K, Roth S (2015) 3D scene flow estimation with a piecewise rigid scene model. Int J Comput Vis 115(1):1–28
5. Quiroga J, Devernay F, Crowley J (2013) Local/global scene flow estimation. In: IEEE international conference on image processing (ICIP), pp 3850–3854
6. Sun D, Sudderth EB, Pfister H (2015) Layered RGBD scene flow estimation. In: IEEE conference on computer vision and pattern recognition (CVPR), pp 548–556
7. Park J, Oh TH, Jung J, Tai Y-W, Kweon IS (2012) A tensor voting approach for multi-view 3D scene flow estimation and refinement. In: European conference on computer vision (ECCV), pp 288–302
8. Zhou T, Brown M, Snavely N, Lowe DG (2017) Unsupervised learning of depth and ego-motion from video. In: IEEE conference on computer vision and pattern recognition (CVPR), pp 6612–6619
9. Mahjourian R, Wicke M, Angelova A (2018) Unsupervised learning of depth and ego-motion from monocular video using 3D geometric constraints. In: IEEE/CVF conference on computer vision and pattern recognition (CVPR), pp 5667–5675
10. Yang Z, Wang P, Wang Y, Xu W, Nevatia R (2018) LEGO: learning edge with geometry all at once by watching videos. In: IEEE/CVF conference on computer vision and pattern recognition (CVPR), pp 225–234
11. Yang Z, Wang P, Wang Y, Xu W, Nevatia R (2019) Every pixel counts: unsupervised geometry learning with holistic 3D motion understanding. In: European conference on computer vision (ECCV), pp 691–709
12. Yin Z, Shi J (2018) GeoNet: unsupervised learning of dense depth, optical flow and camera pose. In: IEEE/CVF conference on computer vision and pattern recognition (CVPR), pp 1983–1992
13. Ranjan A, Jampani V, Balles L, Kim K, Sun D, Wulff J, Black MJ (2019) Competitive collaboration: joint unsupervised learning of depth, camera motion, optical flow and motion segmentation. In: IEEE conference on computer vision and pattern recognition (CVPR), pp 12240–12249
14. Wang Y, Wang P, Yang Z, Luo C, Yang Y, Xu W (2019) UnOS: unified unsupervised optical-flow and stereo-depth estimation by watching videos. In: IEEE conference on computer vision and pattern recognition (CVPR), pp 8071–8081
15. Lai HY, Tsai YH, Chiu W-C (2019) Bridging stereo matching and optical flow via spatiotemporal correspondence. In: IEEE conference on computer vision and pattern recognition (CVPR), pp 1890–1899
16. Dewan A, Caselitz T, Tipaldi GD, Burgard W (2016) Rigid scene flow for 3D LiDAR scans. In: IEEE/RSJ international conference on intelligent robots and systems (IROS), pp 1765–1770
17. Tombari F, Salti S, Di Stefano L (2010) Unique signatures of histograms for local surface description. In: European conference on computer vision (ECCV), pp 356–369
18. Behl A, Paschalidou D, Donne S, Geiger A (2019) PointFlowNet: learning representations for rigid motion estimation from point clouds. In: IEEE conference on computer vision and pattern recognition (CVPR), pp 7962–7971
19. Liu X, Qi CR, Guibas LJ (2019) FlowNet3D: learning scene flow in 3D point clouds. In: IEEE conference on computer vision and pattern recognition (CVPR), pp 529–537
20. Qi CR, Yi L, Su H, Guibas LJ (2017) PointNet++: deep hierarchical feature learning on point sets in a metric space. In: Advances in neural information processing systems, pp 5099–5108
21. Basha T, Moses Y, Kiryati N (2013) Multi-view scene flow estimation: a view centered variational approach. Int J Comput Vis 101(1):6–21
22. Jaimez M, Souiai M, Gonzalez-Jimenez J, Cremers D (2015) A primal-dual framework for real-time dense RGB-D scene flow. In: 2015 IEEE international conference on robotics and automation (ICRA), pp 98–104
23. Hornáček M, Fitzgibbon A, Rother C (2014) SphereFlow: 6 DoF scene flow from RGB-D pairs. In: 2014 IEEE conference on computer vision and pattern recognition (CVPR), pp 3526–3533
24. Wedel A, Brox T, Vaudrey T, Rabe C, Franke U, Cremers D (2011) Stereoscopic scene flow computation for 3D motion understanding. Int J Comput Vis 95(1):29–51
25. Pons J-P, Keriven R, Faugeras O (2007) Multi-view stereo reconstruction and scene flow estimation with a global image-based matching score. Int J Comput Vis 72(2):179–193
26. Dosovitskiy A, Fischer P, Ilg E, Häusser P, Hazirbas C, Golkov V, Smagt P, Cremers D, Brox T (2015) FlowNet: learning optical flow with convolutional networks. In: 2015 IEEE international conference on computer vision (ICCV), pp 2758–2766
27. Zou Y, Luo Z, Huang J-B (2018) DF-Net: unsupervised joint learning of depth and flow using cross-task consistency. In: European conference on computer vision (ECCV), pp 36–53
28. Lv Z, Kim K, Troccoli A, Sun D, Rehg JM, Kautz J (2018) Learning rigidity in dynamic scenes with a moving camera for 3D motion field estimation. In: European conference on computer vision (ECCV), pp 468–484
29. Chen L, Papandreou G, Kokkinos I, Murphy K, Yuille AL (2018) DeepLab: semantic image segmentation with deep convolutional nets, atrous convolution, and fully connected CRFs. IEEE Trans Pattern Anal Mach Intell 40(4):834–848
30. Wang Z, Li S, Howard-Jenkins H, Prisacariu C, Chen M (2020) FlowNet3D++: geometric losses for deep scene flow estimation. In: IEEE/CVF winter conference on applications of computer vision (WACV), pp 91–98
31. Ouyang B, Raviv D (2021) Occlusion guided scene flow estimation on 3D point clouds. In: IEEE/CVF conference on computer vision and pattern recognition (CVPR), pp 2805–2814
32. Mittal H, Okorn B, Held D (2020) Just go with the flow: self-supervised scene flow estimation. In: IEEE/CVF conference on computer vision and pattern recognition (CVPR), pp 11177–11185
33. Puy G, Boulch A, Marlet R (2020) FLOT: scene flow on point clouds guided by optimal transport. In: European conference on computer vision (ECCV), pp 527–544
34. Battrawy R, Schuster R, Wasenmüller O, Rao Q, Stricker D (2019) LiDAR-Flow: dense scene flow estimation from sparse LiDAR and stereo images. In: 2019 IEEE/RSJ international conference on intelligent robots and systems (IROS), pp 7762–7769
35. Yi L, Kim VG, Ceylan D, Shen I, Yan M, Su H, Lu C, Huang Q, Sheffer A, Guibas L (2016) A scalable active framework for region annotation in 3D shape collections. ACM Trans Graph (TOG) 35(6):1–12
36. Besl PJ, McKay ND (1992) A method for registration of 3-D shapes. IEEE Trans Pattern Anal Mach Intell 14(2):239–256
37. Mayer N, Ilg E, Häusser P, Fischer P, Cremers D, Dosovitskiy A, Brox T (2016) A large dataset to train convolutional networks for disparity, optical flow, and scene flow estimation. In: 2016 IEEE conference on computer vision and pattern recognition (CVPR), pp 4040–4048
38. Geiger A, Lenz P, Urtasun R (2012) Are we ready for autonomous driving? The KITTI vision benchmark suite. In: 2012 IEEE conference on computer vision and pattern recognition (CVPR), pp 3354–3361
39. Girija SS (2016) TensorFlow: large-scale machine learning on heterogeneous distributed systems. Software available from tensorflow.org
Metadata
Title
3D Point Convolutional Network for Dense Scene Flow Estimation
Authors
Xuezhi Xiang
Rokia Abdein
Mingliang Zhai
Ning Lv
Publication date
04-11-2021
Publisher
Springer US
Published in
Neural Processing Letters / Issue 2/2022
Print ISSN: 1370-4621
Electronic ISSN: 1573-773X
DOI
https://doi.org/10.1007/s11063-021-10673-w
