Published in: Memetic Computing 3/2023

28.08.2023 | Regular research paper

Learning to estimate optical flow using dual-frequency paradigm

Authors: Yujin Zheng, Chu He, Yan Huang, Shenghua Fan, Min Jiang, Dingwen Wang, Yang Yi


Abstract

Deep learning-based optical flow estimation has achieved impressive success, with faster inference and superior performance. However, optical flow estimation networks are usually treated as black boxes that rely on large amounts of synthetic training data, so their generalization and robustness in real-world applications remain a challenge. To overcome these problems, a dual-frequency paradigm is proposed for optical flow estimation. The proposed dual-frequency encoder captures discriminative features with both high-frequency and low-frequency biases. Experiments demonstrate that our method generalizes better even when pre-trained only on FlyingChairs. Furthermore, our method improves optical flow prediction in occluded regions by enhancing the perception of high-frequency features, which further improves the robustness of the network. Compared to the state-of-the-art RAFT, our approach improves the average end-point error by 10.6% on the Sintel Clean dataset and by 11.7% on the challenging Sintel Final dataset.
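The average end-point error quoted in the abstract is conventionally the mean Euclidean distance between predicted and ground-truth flow vectors, and a percentage improvement is the relative reduction of that error against a baseline. The following is a minimal sketch of both quantities under that standard definition. The function names `average_epe` and `relative_improvement` and the toy flow values are assumptions made for illustration; they are not taken from the paper or its code.

```python
import math


def average_epe(pred, gt):
    """Mean Euclidean distance between predicted and ground-truth flow vectors.

    Both arguments are sequences of (u, v) displacement pairs, one per pixel.
    """
    if len(pred) != len(gt) or not pred:
        raise ValueError("flow fields must be non-empty and of equal length")
    total = 0.0
    for (pu, pv), (gu, gv) in zip(pred, gt):
        # End-point error of one pixel: length of the difference vector.
        total += math.hypot(pu - gu, pv - gv)
    return total / len(pred)


def relative_improvement(baseline_error, new_error):
    """Fractional reduction of an error metric relative to a baseline."""
    return (baseline_error - new_error) / baseline_error


# Toy values, made up for illustration only (not results from the paper).
gt = [(1.0, 0.0), (0.0, 2.0), (3.0, 4.0)]
pred = [(1.0, 0.0), (0.0, 0.0), (0.0, 0.0)]

aepe = average_epe(pred, gt)  # per-pixel errors are 0, 2 and 5
print(f"average end-point error: {aepe:.4f}")
print(f"relative improvement: {relative_improvement(2.0, 1.5):.1%}")
```

Under this reading, a 10.6% improvement means the new error is roughly 0.894 times the baseline error on the same dataset.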


Metadata
Title
Learning to estimate optical flow using dual-frequency paradigm
Authors
Yujin Zheng
Chu He
Yan Huang
Shenghua Fan
Min Jiang
Dingwen Wang
Yang Yi
Publication date
28.08.2023
Publisher
Springer Berlin Heidelberg
Published in
Memetic Computing / Issue 3/2023
Print ISSN: 1865-9284
Electronic ISSN: 1865-9292
DOI
https://doi.org/10.1007/s12293-023-00395-y
