Published in: Neural Processing Letters 4/2023

26-09-2022

DDCNet-Multires: Effective Receptive Field Guided Multiresolution CNN for Dense Prediction

Authors: Ali Salehi, Madhusudhanan Balasubramanian

Abstract

Dense optical flow estimation is challenging when a scene contains large displacements together with heterogeneous motion dynamics, occlusion, and homogeneous regions. Traditional approaches to these challenges include hierarchical and multiresolution processing. Learning-based optical flow methods typically use a multiresolution approach with image warping when a broad range of flow velocities and heterogeneous motion is present. The accuracy of such coarse-to-fine methods is degraded by ghosting artifacts that arise when images are warped across multiple resolutions, and by the vanishing problem in small scene extents with high motion contrast. Previously, we devised strategies for building compact dense prediction networks guided by the effective receptive field (ERF) characteristics of the network (DDCNet). The DDCNet design was intentionally simple and compact, allowing it to be used as a building block for more complex yet still compact networks. In this work, we extend the DDCNet strategies to handle heterogeneous motion dynamics by cascading DDCNet-based sub-nets with decreasing ERF extents. Our DDCNet with multiresolution capability (DDCNet-Multires) is compact and uses no specialized network layers. We evaluate DDCNet-Multires on standard optical flow benchmark datasets. Our experiments demonstrate that DDCNet-Multires improves on DDCNet-B0 and DDCNet-B1 and provides optical flow estimates with accuracy comparable to similar lightweight learning-based methods.
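To make the idea of "sub-nets with decreasing ERF extents" concrete, the following minimal sketch computes the theoretical receptive field of a stack of stride-1 dilated convolutions for three cascaded stages. It rests on assumptions that are not from the article: the stage names, the 3x3 kernel size, and the dilation schedules are hypothetical, and the theoretical receptive field is used only as an upper bound on the ERF, which is generally smaller in practice. It is not the authors' architecture.

```python
from typing import Dict, List, Sequence


def theoretical_rf(kernel_sizes: Sequence[int], dilations: Sequence[int]) -> int:
    """Theoretical receptive field (in pixels, along one axis) of a stack of
    stride-1 convolutions: each layer adds (kernel_size - 1) * dilation."""
    if len(kernel_sizes) != len(dilations):
        raise ValueError("kernel_sizes and dilations must have the same length")
    rf = 1
    for k, d in zip(kernel_sizes, dilations):
        rf += (k - 1) * d
    return rf


# Hypothetical dilation schedules for three cascaded sub-nets (assumption,
# chosen only so that the receptive field shrinks from stage to stage).
HYPOTHETICAL_SUBNETS: Dict[str, List[int]] = {
    "coarse": [1, 2, 4, 8, 16],
    "medium": [1, 2, 4, 8],
    "fine": [1, 2, 4],
}


def cascade_extents(subnets: Dict[str, List[int]], kernel_size: int = 3) -> Dict[str, int]:
    """Receptive-field extent of each sub-net, assuming every layer uses the
    same square kernel of side `kernel_size`."""
    return {
        name: theoretical_rf([kernel_size] * len(dils), dils)
        for name, dils in subnets.items()
    }


if __name__ == "__main__":
    for name, extent in cascade_extents(HYPOTHETICAL_SUBNETS).items():
        print(f"{name}: {extent} px")
```

With these assumed schedules the extents are 63, 31, and 15 pixels, so each later stage attends to a smaller neighbourhood, which is the qualitative behaviour the abstract describes for the cascade.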

Metadata
Title
DDCNet-Multires: Effective Receptive Field Guided Multiresolution CNN for Dense Prediction
Authors
Ali Salehi
Madhusudhanan Balasubramanian
Publication date
26-09-2022
Publisher
Springer US
Published in
Neural Processing Letters / Issue 4/2023
Print ISSN: 1370-4621
Electronic ISSN: 1573-773X
DOI
https://doi.org/10.1007/s11063-022-11039-6
