Parallel multiscale context-based edge-preserving optical flow estimation with occlusion detection

https://doi.org/10.1016/j.image.2021.116560

Highlights

  • We construct a parallel multiscale context network for occlusion detection, which extracts multiscale context information to refine the occlusion boundaries.

  • We combine the PMC network with a context network to establish an occlusion detection module and incorporate it into a pyramid, warping, and cost volume network to construct an edge-preserving optical flow model.

  • We design a novel loss function by integrating an edge loss with an EPE-based loss and a binary cross-entropy loss. The proposed loss function supervises the network to estimate the flow field and occlusions simultaneously.

Abstract

Although convolutional neural network (CNN)-based optical flow approaches have exhibited good accuracy and efficiency in recent years, the edge-blurring caused by motion occlusions remains an open issue. In this paper, we propose a parallel multiscale context-based edge-preserving optical flow estimation method with occlusion detection, named PMC-PWC. First, we design a parallel multiscale context (PMC) network for occlusion detection, which aggregates multiscale context information to improve the performance of occlusion detection near motion boundaries. Second, we combine the PMC model with a context network to form an occlusion estimation module and incorporate it into a pyramid, warping, and cost volume model to construct an edge-preserving optical flow computation network. Third, we design a novel loss function comprising an endpoint error (EPE)-based loss, a binary cross-entropy loss and an edge loss to supervise the proposed PMC-PWC network to produce optical flow and occlusion maps simultaneously. Finally, we evaluate the proposed PMC-PWC method on the MPI-Sintel and KITTI datasets in a comprehensive comparison with several state-of-the-art approaches. The experimental results indicate that the proposed PMC-PWC method performs well in terms of both accuracy and robustness, with significant benefits in edge preservation and occlusion handling.

Introduction

Optical flow computation is a research focus in image processing, computer vision and pattern recognition. It has been applied to many vision-related applications, e.g., expression recognition [1], [2], action recognition [3], [4], [5], video object segmentation [6], [7], tracking [8], [9] and medical image analysis [10], [11].

After the seminal works of Horn & Schunck [12] and Lucas & Kanade [13], variational optical flow estimation was the dominant method used in previous studies because it can produce an accurate and dense flow field [14], [15], [16]. However, these variational approaches usually require numerous iterations to minimize the energy function, which may dramatically increase the computational complexity and time consumption.

With the great success of convolutional neural network (CNN) modeling in recent years, CNN-based methods have become increasingly popular in optical flow estimation [17], [18], [19]. Although CNN-based models have been shown to achieve high accuracy and good robustness on several public optical flow benchmarks, e.g., MPI-Sintel [20] and KITTI [21], the edge-blurring caused by motion occlusions remains a challenge for most CNN-based optical flow methods. As shown in Fig. 1, although the PWC-Net+ [22] method performed best on the MPI-Sintel online benchmark when it was published, its estimated flow field exhibits obvious edge blurring near image and motion boundaries.

To address the edge-blurring caused by motion occlusions, we propose a parallel multiscale context-based pyramid, warping and cost volume network with occlusion detection for edge-preserving optical flow estimation, called PMC-PWC. As shown in Fig. 1, our PMC-PWC method produces a better flow field with markedly sharper edges, especially in occluded regions. Our main contributions are summarized as follows.

  • We construct a parallel multiscale context network for occlusion detection, in which the proposed PMC network extracts multiscale context information to refine the occlusion boundaries.

  • We combine the PMC network with a context network to establish an occlusion detection module and then incorporate the occlusion module into a pyramid, warping, and cost volume network to construct an edge-preserving optical flow model. The presented PMC-PWC method can preserve the image and motion edges around the occlusion areas.

  • We design a novel loss function by integrating an edge loss with an EPE-based loss and a binary cross-entropy loss; the proposed loss function supervises the network to estimate the flow field and occlusions simultaneously.
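The hybrid loss described in the contributions above can be sketched as follows. This is a minimal NumPy illustration, not the authors' implementation: the exact edge-loss formulation and the weighting factors `lam_occ` and `lam_edge` are assumptions.

```python
import numpy as np

def epe_loss(flow_pred, flow_gt):
    # Endpoint error: mean Euclidean distance between flow vectors,
    # arrays shaped (batch, 2, H, W) with u/v on axis 1.
    return np.linalg.norm(flow_pred - flow_gt, axis=1).mean()

def bce_loss(occ_prob, occ_gt, eps=1e-7):
    # Binary cross-entropy on the predicted occlusion probability map.
    p = np.clip(occ_prob, eps, 1 - eps)
    return -(occ_gt * np.log(p) + (1 - occ_gt) * np.log(1 - p)).mean()

def edge_loss(flow_pred, flow_gt):
    # Illustrative edge term (assumed form): L1 difference between the
    # spatial gradients of predicted and ground-truth flow, which
    # penalizes smoothed-out motion boundaries.
    def grads(f):
        return f[..., :, 1:] - f[..., :, :-1], f[..., 1:, :] - f[..., :-1, :]
    px, py = grads(flow_pred)
    gx, gy = grads(flow_gt)
    return np.abs(px - gx).mean() + np.abs(py - gy).mean()

def total_loss(flow_pred, flow_gt, occ_prob, occ_gt,
               lam_occ=0.5, lam_edge=0.1):  # weights are assumptions
    # Hybrid supervision: EPE + weighted BCE + weighted edge term.
    return (epe_loss(flow_pred, flow_gt)
            + lam_occ * bce_loss(occ_prob, occ_gt)
            + lam_edge * edge_loss(flow_pred, flow_gt))
```

The three terms let a single backward pass supervise both the flow decoder and the occlusion branch, which is what allows the network to produce flow and occlusion maps simultaneously.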

The remainder of this paper is organized as follows. Section 2 briefly reviews previous work. In Section 3, we describe the proposed multiscale context-based optical flow network. The experimental results and discussion are presented in Section 4. Finally, we conclude the paper in Section 5.

Section snippets

Related work

Optical flow estimation is a popular research area in image processing and computer vision; a large number of publications and related studies have appeared in recent decades [23], [24], [25], [26]. However, summarizing all of these studies is beyond the scope of this article. To present a targeted overview, we review and discuss only the most closely related studies on CNN-based optical flow technologies.

Tracing back to the early studies of CNN-based optical flow computation,

Parallel multiscale context network for occlusion detection

As a remarkable CNN-based optical flow approach, PWC-Net [41] constructs a compact yet effective model by applying a context network to a feature pyramid network. The context network is a feed-forward CNN built by stacking several convolutional layers with different dilation rates, and it is employed to refine the output flow field. Although PWC-Net achieves good performance on large displacements, it usually fails near motion occlusions. To
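Stacking 3×3 convolutions with increasing dilation rates is what gives such a context network a large receptive field at low cost. A quick sketch of the arithmetic, using the dilation schedule (1, 2, 4, 8, 16, 1, 1) reported in the PWC-Net paper:

```python
def receptive_field(dilations, kernel=3):
    # Each k×k convolution with dilation d widens the receptive field
    # by (kernel - 1) * d pixels; stacked layers compose additively.
    rf = 1
    for d in dilations:
        rf += (kernel - 1) * d
    return rf

# Dilation rates of PWC-Net's context network (per the PWC-Net paper)
print(receptive_field([1, 2, 4, 8, 16, 1, 1]))  # → 67
```

A 67-pixel receptive field from only seven layers is what lets the refinement stage aggregate context well beyond a single motion boundary.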

Implementation details

Datasets. Ground-truth optical flow for real scenes is difficult to obtain, so training flow estimation networks on synthetic datasets is common practice. To achieve good performance on both synthetic and real scenes, we maintain the same training schedule as previous studies [22], [29], [41], [52]. We first train the PMC-PWC network on the FlyingChairsOCC dataset for 108 epochs, then fine-tune it on the FlyingThings3D subset for 50 epochs, and finally
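A minimal sketch of the staged schedule described above. The per-epoch training step is a placeholder, and any learning-rate settings or later dataset-specific fine-tuning stages (the snippet is cut off) are intentionally omitted:

```python
# Staged training as described in the text: pretrain on FlyingChairsOCC,
# then fine-tune on the FlyingThings3D subset. Epoch counts come from the
# text; everything else here is a placeholder.
SCHEDULE = [
    ("FlyingChairsOCC", 108),       # epochs of pretraining
    ("FlyingThings3D subset", 50),  # epochs of fine-tuning
]

def run_schedule(schedule, train_one_epoch):
    """Run each training stage in order; train_one_epoch is a stand-in
    for one pass over the named dataset."""
    completed = []
    for dataset, epochs in schedule:
        for epoch in range(epochs):
            train_one_epoch(dataset, epoch)
        completed.append((dataset, epochs))
    return completed
```

Pretraining on the simpler synthetic data before fine-tuning on harder scenes is the standard curriculum used by the cited prior work.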

Conclusion

In this paper, we presented a parallel multiscale context-based edge-preserving optical flow estimation network with occlusion detection and a hybrid loss function: (1) a parallel multiscale context network, which aggregates multiscale context information from the input frames to improve occlusion detection in regions near image and motion boundaries; (2) an edge-preserving optical flow estimation network with occlusion detection. We presented a novel occlusion estimation module by

CRediT authorship contribution statement

Congxuan Zhang: Conceptualization, Methodology, Writing – original draft. Cheng Feng: Data curation, Software, Experiments. Zhen Chen: Conceptualization, Methodology, Funding acquisition. Weiming Hu: Writing – review & editing. Ming Li: Methodology.

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgments

This work was supported in part by the National Key Research and Development Program of China [2020YFC2003800], the National Natural Science Foundation of China [61772255, 61866026 and 61866025], the Advantage Subject Team Project of Jiangxi Province, China [20165BCB19007], the Outstanding Young Talents Program of Jiangxi Province, China [20192BCB23011], the National Natural Science Foundation of Jiangxi Province, China [20202ACB214007], the Aeronautical Science Foundation of China [2018ZC56008

References (65)

  • X. Song et al., STC-flow: Spatio-temporal context-aware optical flow estimation, Signal Process., Image Commun. (2021)

  • H. et al., Automatic generation of dense non-rigid optical flow, Comput. Vis. Image Underst. (2021)

  • F. Solari et al., What can we expect from a V1-MT feedforward architecture for optical flow estimation?, Signal Process., Image Commun. (2015)

  • M. Zhai et al., Optical flow estimation using channel attention mechanism and dilated convolutional neural networks, Neurocomputing (2019)

  • Z. Tu et al., A survey of variational and CNN-based optical flow techniques, Signal Process., Image Commun. (2019)

  • Z. Ren et al., Unsupervised learning of optical flow with patch consistency and occlusion estimation, Pattern Recognit. (2020)

  • M. Zhai et al., Optical flow and scene flow estimation: A survey, Pattern Recognit. (2021)

  • W. Lei et al., A weighted feature extraction method based on temporal accumulation of optical flow for micro-expression recognition, Signal Process., Image Commun. (2019)

  • P. Martin et al., Optimal choice of motion estimation methods for fine-grained action classification with 3D convolutional networks

  • X. Wang et al., Aerial infrared object tracking via an improved long-term correlation filter with optical flow estimation and SURF matching, Infrared Phys. Technol. (2019)

  • B.D. Lucas et al., An iterative image registration technique with an application to stereo vision

  • C. Zhang et al., Robust non-local TV-L1 optical flow estimation with occlusion detection, IEEE Trans. Image Process. (2017)

  • C. Zhang et al., Refined TV-L1 optical flow estimation using joint filtering, IEEE Trans. Multimedia (2020)

  • C. Zhang et al., Self-attention-based multiscale feature learning optical flow with occlusion feature map prediction, IEEE Trans. Multimedia (2021)

  • D.J. Butler et al., A naturalistic open source movie for optical flow evaluation

  • M. Menze et al., Object scene flow for autonomous vehicles

  • D. Sun et al., Models matter, so does training: An empirical study of CNNs for optical flow estimation, IEEE Trans. Pattern Anal. Mach. Intell. (2019)

  • Y. Zheng et al., Optical flow in the dark

  • A. Dosovitskiy et al., FlowNet: Learning optical flow with convolutional networks

  • O. Ronneberger et al., U-Net: Convolutional networks for biomedical image segmentation

  • E. Ilg et al., FlowNet 2.0: Evolution of optical flow estimation with deep networks

  • T.W. Hui et al., LiteFlowNet: A lightweight convolutional neural network for optical flow estimation