Published in: The Journal of Supercomputing 4/2024

22-09-2023

Saliency-based dual-attention network for unsupervised video object segmentation

Authors: Guifang Zhang, Hon-Cheng Wong

Abstract

This paper addresses unsupervised video object segmentation (UVOS), the task of segmenting the objects of interest throughout a video without any annotation. Many UVOS methods have been proposed in recent years. Although they perform well, they rely on heavyweight networks, which often leads to large model sizes. To reduce model size while keeping performance competitive, we propose a saliency-based dual-attention (SDA) method for UVOS. Our method takes video frames and optical flow as inputs and extracts appearance information from the video frames and motion information from the optical flow. We design a two-branch network, with one branch for appearance information and one for motion information. The information from the two branches is fused by a saliency-based dual-attention module to segment the primary object in one path. The module is composed of saliency attention and saliency-based reverse attention. To demonstrate the effectiveness of our network, we tested it on the DAVIS-2016 and SegtrackV2 datasets. Experimental results show that our method achieves competitive results in terms of accuracy and model size.
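
The abstract names the two components of the fusion module but does not give their formulas. As a rough illustration only, the sketch below uses a formulation that is common in the saliency literature: saliency attention weights a feature map by a sigmoid-activated saliency map s, and reverse attention weights it by (1 - s). The function names, the 2D-list feature layout, and the weighting rule itself are all assumptions made for this example, not the authors' implementation.

```python
import math


def sigmoid(x):
    """Logistic function mapping a raw saliency score to (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def saliency_attention(feature, saliency_logits):
    """Weight each feature value by s = sigmoid(logit).

    Assumed formulation (not from the paper): emphasizes locations
    that the saliency map marks as likely foreground.
    """
    return [
        [sigmoid(l) * f for f, l in zip(f_row, l_row)]
        for f_row, l_row in zip(feature, saliency_logits)
    ]


def reverse_attention(feature, saliency_logits):
    """Weight each feature value by (1 - s).

    Assumed formulation (not from the paper): emphasizes locations
    that the saliency map currently rates as non-salient.
    """
    return [
        [(1.0 - sigmoid(l)) * f for f, l in zip(f_row, l_row)]
        for f_row, l_row in zip(feature, saliency_logits)
    ]


# Hypothetical 2x2 feature map and saliency logits, for illustration only.
feature = [[2.0, 4.0], [6.0, 8.0]]
logits = [[0.0, 50.0], [-50.0, 0.0]]

attended = saliency_attention(feature, logits)
reversed_attended = reverse_attention(feature, logits)
```

Under this assumed formulation the two weightings are complementary: at every location the attended and reverse-attended values sum to the original feature. How the paper combines the two attentions across the appearance and motion branches is not stated in the abstract.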

Metadata
Title: Saliency-based dual-attention network for unsupervised video object segmentation
Authors: Guifang Zhang, Hon-Cheng Wong
Publication date: 22-09-2023
Publisher: Springer US
Published in: The Journal of Supercomputing, Issue 4/2024
Print ISSN: 0920-8542
Electronic ISSN: 1573-0484
DOI: https://doi.org/10.1007/s11227-023-05637-x
