Parallel multiscale context-based edge-preserving optical flow estimation with occlusion detection
Introduction
Optical flow computation is a research focus in image processing, computer vision and pattern recognition. It has been applied to many vision-related applications, e.g., expression recognition [1], [2], action recognition [3], [4], [5], video object segmentation [6], [7], tracking [8], [9] and medical image analysis [10], [11].
After the seminal works of Horn & Schunck [12] and Lucas & Kanade [13], variational optical flow estimation dominated earlier research because it can produce an accurate and dense flow field [14], [15], [16]. However, these variational approaches usually require numerous iterations to minimize an energy function, which dramatically increases computational complexity and time consumption.
With the great success of convolutional neural networks (CNNs) in recent years, CNN-based methods have become increasingly popular in optical flow estimation [17], [18], [19]. Although CNN-based models have been shown to achieve high accuracy and good robustness on several public optical flow benchmarks, e.g., MPI-Sintel [20] and KITTI [21], the edge-blurring caused by motion occlusions remains a challenge for most CNN-based optical flow methods. As shown in Fig. 1, although the PWC-Net+ [22] method ranked best on the MPI-Sintel online benchmark when it was published, its optical flow result exhibits obvious edge-blurring near image and motion boundaries.
To address the edge-blurring caused by motion occlusions, we propose in this paper a parallel multiscale context-based pyramid, warping and cost volume network with occlusion detection for edge-preserving optical flow estimation, called PMC-PWC. As shown in Fig. 1, our PMC-PWC method produces a better flow field with markedly improved edge preservation, especially in occluded regions. Our main contributions are summarized as follows.
- We construct a parallel multiscale context (PMC) network for occlusion detection, in which multiscale context information is extracted to refine the occlusion boundaries.
- We combine the PMC network with a context network to build an occlusion detection module, and then incorporate this module into a pyramid, warping, and cost volume network to construct an edge-preserving optical flow model. The resulting PMC-PWC method preserves image and motion edges around occluded areas.
- We design a novel loss function that integrates an edge loss with an EPE-based loss and a binary cross-entropy loss, enabling the network to be supervised to estimate the flow field and occlusions simultaneously.
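The hybrid loss above can be sketched as follows. This is a minimal illustration in PyTorch, assuming the common forms of each term: an average-endpoint-error term for flow supervision, a binary cross-entropy term for occlusion supervision, and a gradient-difference penalty as the edge term. The weighting coefficients and the exact form of the edge loss are assumptions, not the authors' implementation.

```python
import torch
import torch.nn.functional as F

def epe_loss(flow_pred, flow_gt):
    # Average endpoint error: Euclidean distance between predicted
    # and ground-truth flow vectors, averaged over all pixels.
    return torch.norm(flow_pred - flow_gt, p=2, dim=1).mean()

def edge_loss(flow_pred, flow_gt):
    # Penalize differences between the spatial gradients of predicted
    # and ground-truth flow, emphasizing motion boundaries.
    def grad(x):
        dx = x[:, :, :, 1:] - x[:, :, :, :-1]
        dy = x[:, :, 1:, :] - x[:, :, :-1, :]
        return dx, dy
    pdx, pdy = grad(flow_pred)
    gdx, gdy = grad(flow_gt)
    return (pdx - gdx).abs().mean() + (pdy - gdy).abs().mean()

def hybrid_loss(flow_pred, flow_gt, occ_logits, occ_gt,
                w_epe=1.0, w_occ=1.0, w_edge=1.0):
    # Weighted sum of the three terms; the weights are illustrative.
    l_epe = epe_loss(flow_pred, flow_gt)
    l_occ = F.binary_cross_entropy_with_logits(occ_logits, occ_gt)
    l_edge = edge_loss(flow_pred, flow_gt)
    return w_epe * l_epe + w_occ * l_occ + w_edge * l_edge
```

In practice such a loss would be applied at every pyramid level with per-level weights, but a single-level version is enough to show how the three supervision signals combine.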
The remainder of this paper is organized as follows. Section 2 briefly reviews previous work. In Section 3, we describe the proposed multiscale context-based optical flow network. The experimental results and discussion are presented in Section 4. Finally, we conclude the paper in Section 5.
Related work
Optical flow estimation is a popular research area in image processing and computer vision, and a large number of publications and related studies have appeared in recent decades [23], [24], [25], [26]. However, it is beyond the scope of this article to summarize all of them. To present a targeted overview, we only review and discuss the most closely related studies, which focus on CNN-based optical flow technologies.
Tracing back to the early studies of CNN-based optical flow computation,
Parallel multiscale context network for occlusion detection
As a notable CNN-based optical flow approach, PWC-Net [41] constructs a compact but effective model by applying a context network to a feature pyramid network. In PWC-Net, the context network is a feed-forward CNN built by stacking several convolutional layers with different dilation rates, and it is employed to refine the output flow field. Although PWC-Net achieves good performance on large displacements, it usually struggles with motion occlusions. To
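A context network of the kind described above can be sketched as follows: a stack of 3×3 convolutions whose growing dilation rates enlarge the receptive field without downsampling, predicting a residual that refines the current flow estimate. The channel widths, dilation rates, and input feature size here are illustrative assumptions, not the exact PWC-Net configuration.

```python
import torch
import torch.nn as nn

class ContextNetwork(nn.Module):
    """Illustrative feed-forward refinement network in the spirit of
    PWC-Net's context network. Dilated 3x3 convolutions grow the
    receptive field while preserving spatial resolution."""
    def __init__(self, in_ch=34):  # e.g., 2 flow channels + 32 feature channels
        super().__init__()
        dilations = [1, 2, 4, 8, 16, 1]
        channels = [128, 128, 128, 96, 64, 32]
        layers, c_prev = [], in_ch
        for c, d in zip(channels, dilations):
            # padding == dilation keeps the spatial size for 3x3 kernels
            layers += [nn.Conv2d(c_prev, c, 3, padding=d, dilation=d),
                       nn.LeakyReLU(0.1)]
            c_prev = c
        layers.append(nn.Conv2d(c_prev, 2, 3, padding=1))  # residual flow
        self.net = nn.Sequential(*layers)

    def forward(self, flow, feat):
        # Refine the current flow estimate with a predicted residual.
        x = torch.cat([flow, feat], dim=1)
        return flow + self.net(x)
```

Because dilation enlarges the receptive field exponentially across the stack, the final layers can integrate context far beyond the local cost-volume neighborhood, which is what makes this refinement effective for large displacements.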
Implementation details
Datasets. Ground-truth optical flow for real scenes is difficult to obtain, so training flow estimation networks on synthetic datasets is common practice. To achieve good performance on both synthetic and real scenes, we follow the same training schedule as previous studies [22], [29], [41], [52]. We first train the PMC-PWC network on the FlyingChairsOCC dataset for 108 epochs, then fine-tune on the FlyingThings3D subset for 50 epochs, and finally
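The staged schedule described above can be sketched as follows. This is a hedged, minimal skeleton: each stage reuses the weights from the previous one but restarts the optimizer, typically with a smaller learning rate for fine-tuning. The loader names, optimizer choice, loss, and learning rates are placeholders, not the authors' exact settings.

```python
import torch

def train_stage(model, loader, epochs, lr):
    # One stage of a pretrain-then-finetune schedule. The model keeps
    # its weights across stages; only the optimizer state is reset.
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    for _ in range(epochs):
        for frames, flow_gt in loader:
            opt.zero_grad()
            # Placeholder supervision: L1 flow loss stands in for the
            # full hybrid flow/occlusion loss.
            loss = (model(frames) - flow_gt).abs().mean()
            loss.backward()
            opt.step()
    return model

# Staged schedule in the spirit of the paper: long pretraining on one
# synthetic dataset, shorter fine-tuning on a second (loaders and
# learning rates are hypothetical):
# model = train_stage(model, chairs_occ_loader, epochs=108, lr=1e-4)
# model = train_stage(model, things3d_loader, epochs=50, lr=1e-5)
```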
Conclusion
In this paper, we presented a parallel multiscale context-based edge-preserving optical flow estimation network with occlusion detection and a hybrid loss function: (1) Parallel multiscale context network, which aggregates multiscale context information from the input frames to improve the performance of occlusion detection in regions of image and motion boundaries. (2) Edge-preserving optical flow estimation network with occlusion detection. We presented a novel occlusion estimation module by
CRediT authorship contribution statement
Congxuan Zhang: Conceptualization, Methodology, Writing – original draft. Cheng Feng: Data curation, Software, Experiments. Zhen Chen: Conceptualization, Methodology, Funding acquisition. Weiming Hu: Writing – review & editing. Ming Li: Methodology.
Declaration of Competing Interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
Acknowledgments
This work was supported in part by the National Key Research and Development Program of China [2020YFC2003800], the National Natural Science Foundation of China [61772255, 61866026 and 61866025], the Advantage Subject Team Project of Jiangxi Province, China [20165BCB19007], the Outstanding Young Talents Program of Jiangxi Province, China [20192BCB23011], the National Natural Science Foundation of Jiangxi Province, China [20202ACB214007], the Aeronautical Science Foundation of China [2018ZC56008
References (65)
- et al., Micro-expression recognition using advanced genetic algorithm, Signal Process., Image Commun. (2021)
- et al., Pose-guided inflated 3D ConvNet for action recognition in videos, Signal Process., Image Commun. (2021)
- et al., Correlation Net: Spatiotemporal multimodal deep learning for action recognition, Signal Process., Image Commun. (2020)
- et al., Fast pixel-matching for video object segmentation, Signal Process., Image Commun. (2021)
- et al., A variational image segmentation method exploring both intensity means and texture patterns, Signal Process., Image Commun. (2019)
- et al., Occlusion detection and drift-avoidance framework for 2D visual object tracking, Signal Process., Image Commun. (2021)
- et al., Adaptive local-fitting-based active contour model for medical image segmentation, Signal Process., Image Commun. (2019)
- et al., Segmentation of left ventricle on dynamic MRI sequences for blood flow cancellation in thermotherapy, Signal Process., Image Commun. (2017)
- et al., Determining optical flow, Artificial Intelligence (1981)
- et al., Robust optical flow estimation via edge preserving filtering, Signal Process., Image Commun. (2021)
- STC-flow: Spatio-temporal context-aware optical flow estimation, Signal Process., Image Commun.
- Automatic generation of dense non-rigid optical flow, Comput. Vis. Image Underst.
- What can we expect from a V1-MT feedforward architecture for optical flow estimation?, Signal Process., Image Commun.
- Optical flow estimation using channel attention mechanism and dilated convolutional neural networks, Neurocomputing
- A survey of variational and CNN-based optical flow techniques, Signal Process., Image Commun.
- Unsupervised learning of optical flow with patch consistency and occlusion estimation, Pattern Recognit.
- Optical flow and scene flow estimation: A survey, Pattern Recognit.
- A weighted feature extraction method based on temporal accumulation of optical flow for micro-expression recognition, Signal Process., Image Commun.
- Optimal choice of motion estimation methods for fine-grained action classification with 3D convolutional networks
- Aerial infrared object tracking via an improved long-term correlation filter with optical flow estimation and SURF matching, Infrared Phys. Technol.
- An iterative image registration technique with an application to stereo vision
- Robust non-local TV-L1 optical flow estimation with occlusion detection, IEEE Trans. Image Process.
- Refined TV-L1 optical flow estimation using joint filtering, IEEE Trans. Multimedia
- Self-attention-based multiscale feature learning optical flow with occlusion feature map prediction, IEEE Trans. Multimedia
- A naturalistic open source movie for optical flow evaluation
- Object scene flow for autonomous vehicles
- So does training: An empirical study of CNNs for optical flow estimation, IEEE Trans. Pattern Anal. Mach. Intell.
- Optical flow in the dark
- FlowNet: Learning optical flow with convolutional networks
- U-net: Convolutional networks for biomedical image segmentation
- Flownet 2.0: Evolution of optical flow estimation with deep networks
- LiteFlowNet: A lightweight convolutional neural network for optical flow estimation