Published in: Neural Processing Letters 4/2023

26 September 2022

DDCNet-Multires: Effective Receptive Field Guided Multiresolution CNN for Dense Prediction

Authors: Ali Salehi, Madhusudhanan Balasubramanian

Abstract

Dense optical flow estimation is challenging when there are large displacements in a scene with heterogeneous motion dynamics, occlusion, and scene homogeneity. Traditional approaches to these challenges include hierarchical and multiresolution processing methods. Learning-based optical flow methods typically use a multiresolution approach with image warping when a broad range of flow velocities and heterogeneous motion are present. The accuracy of such coarse-to-fine methods is affected by ghosting artifacts when images are warped across multiple resolutions and by the vanishing problem in smaller scene extents with higher motion contrast. Previously, we devised strategies for building compact dense prediction networks guided by the effective receptive field (ERF) characteristics of the network (DDCNet). The DDCNet design was intentionally simple and compact, allowing it to be used as a building block for designing more complex yet compact networks. In this work, we extend the DDCNet strategies to handle heterogeneous motion dynamics by cascading DDCNet-based sub-nets with decreasing extents of their ERF. Our DDCNet with multiresolution capability (DDCNet-Multires) is compact and uses no specialized network layers. We evaluate the performance of DDCNet-Multires on standard optical flow benchmark datasets. Our experiments demonstrate that DDCNet-Multires improves over DDCNet-B0 and -B1 and provides optical flow estimates with accuracy comparable to similar lightweight learning-based methods.
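
To make the idea of cascading sub-nets with decreasing receptive-field extents more concrete, the sketch below computes the theoretical receptive field of a stack of stride-1 dilated convolutions. This is only an illustration under stated assumptions: the abstract does not give layer configurations, so the sub-net names, kernel sizes, and dilation rates are hypothetical, and the theoretical receptive field is only an upper bound on the ERF that the authors use to guide their design.

```python
def theoretical_receptive_field(layers):
    """Theoretical receptive field (pixels along one axis) of stacked stride-1 convolutions.

    layers: iterable of (kernel_size, dilation) pairs.
    Each stride-1 layer widens the field by (kernel_size - 1) * dilation.
    """
    rf = 1
    for kernel_size, dilation in layers:
        rf += (kernel_size - 1) * dilation
    return rf


# Hypothetical sub-nets ordered coarse to fine. The dilation rates are
# illustrative assumptions, not values taken from the paper.
subnets = {
    "coarse": [(3, 1), (3, 2), (3, 4), (3, 8), (3, 16)],
    "medium": [(3, 1), (3, 2), (3, 4), (3, 8)],
    "fine": [(3, 1), (3, 2), (3, 4)],
}

extents = {name: theoretical_receptive_field(layers) for name, layers in subnets.items()}

for name, extent in extents.items():
    print(f"{name}: {extent} px")
```

In this toy configuration the extents shrink from 63 to 31 to 15 pixels, so an early sub-net can cover large displacements while later sub-nets refine the estimate over smaller neighborhoods.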

Metadata
Title
DDCNet-Multires: Effective Receptive Field Guided Multiresolution CNN for Dense Prediction
Authors
Ali Salehi
Madhusudhanan Balasubramanian
Publication date
26 September 2022
Publisher
Springer US
Published in
Neural Processing Letters / Issue 4/2023
Print ISSN: 1370-4621
Electronic ISSN: 1573-773X
DOI
https://doi.org/10.1007/s11063-022-11039-6
