Published in: Pattern Analysis and Applications 4/2023

04-10-2023 | Theoretical Advances

\(\mathcal{L}\mathcal{O}^2\)net: Global–Local Semantics Coupled Network for scene-specific video foreground extraction with less supervision

Authors: Tao Ruan, Shikui Wei, Yao Zhao, Baoqing Guo, Zujun Yu



Abstract

Video foreground extraction has been widely applied in quantitative fields and attracts great attention worldwide. Nevertheless, the performance of such a method can easily degrade in cluttered environments. To tackle this problem, global semantics (e.g., background statistics) and local semantics (e.g., boundary areas) can be utilized to better distinguish foreground objects from a complex background. In this paper, we investigate how to effectively leverage these two kinds of semantics. For global semantics, two convolutional modules are designed to exploit data-level background priors and feature-level multi-scale characteristics, respectively; for local semantics, a third module is proposed to capture the semantic edges between foreground and background. The three modules are intertwined with one another, yielding a simple yet effective deep framework named the g\(\mathcal{L}\mathcal{O}\)bal–\(\mathcal{L}\mathcal{O}\)cal Semantics Coupled Network (\(\mathcal{L}\mathcal{O}^2\)Net), which is end-to-end trainable in a scene-specific manner. Benefiting from the \(\mathcal{L}\mathcal{O}^2\)Net, we achieve superior performance on multiple public datasets with less supervision than several state-of-the-art methods.
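The abstract's three-module coupling can be pictured as a small PyTorch sketch. This is a hypothetical illustration only: all class names, channel widths, and layer choices below are our assumptions, not the authors' implementation. It shows one plausible reading of the design — a data-level background-prior module and a feature-level multi-scale module for global semantics, an edge head for local semantics, and a mask head that consumes both.

```python
# Hypothetical sketch of the global-local coupling described in the abstract.
# All module names and sizes are illustrative assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F

class BackgroundPriorModule(nn.Module):
    """Global semantics, data level: fuse the frame with a background prior."""
    def __init__(self, out_ch=16):
        super().__init__()
        # 6 input channels: RGB frame stacked with an RGB background estimate
        self.conv = nn.Conv2d(6, out_ch, kernel_size=3, padding=1)

    def forward(self, frame, background):
        return F.relu(self.conv(torch.cat([frame, background], dim=1)))

class MultiScaleModule(nn.Module):
    """Global semantics, feature level: parallel dilated convolutions."""
    def __init__(self, ch=16):
        super().__init__()
        self.branches = nn.ModuleList(
            nn.Conv2d(ch, ch, kernel_size=3, padding=d, dilation=d)
            for d in (1, 2, 4))

    def forward(self, x):
        # Sum the multi-scale responses (an assumed fusion choice)
        return F.relu(sum(b(x) for b in self.branches))

class EdgeAwareModule(nn.Module):
    """Local semantics: predict foreground/background boundary evidence."""
    def __init__(self, ch=16):
        super().__init__()
        self.edge_head = nn.Conv2d(ch, 1, kernel_size=1)

    def forward(self, x):
        return torch.sigmoid(self.edge_head(x))

class LO2NetSketch(nn.Module):
    """Toy coupling of the three modules; trained end-to-end per scene."""
    def __init__(self, ch=16):
        super().__init__()
        self.prior = BackgroundPriorModule(ch)
        self.multi_scale = MultiScaleModule(ch)
        self.edge = EdgeAwareModule(ch)
        self.mask_head = nn.Conv2d(ch + 1, 1, kernel_size=1)

    def forward(self, frame, background):
        feat = self.multi_scale(self.prior(frame, background))
        edge = self.edge(feat)  # boundary map from local semantics
        mask = torch.sigmoid(self.mask_head(torch.cat([feat, edge], dim=1)))
        return mask, edge

net = LO2NetSketch()
frame = torch.rand(1, 3, 64, 64)
background = torch.rand(1, 3, 64, 64)
mask, edge = net(frame, background)
print(mask.shape, edge.shape)
```

The dilated-convolution branches stand in for whatever multi-scale mechanism the paper actually uses; the point of the sketch is only that the background prior enters at the data level while the edge map is injected back into the final mask prediction.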


Metadata
Title
\(\mathcal{L}\mathcal{O}^2\)net: Global–Local Semantics Coupled Network for scene-specific video foreground extraction with less supervision
Authors
Tao Ruan
Shikui Wei
Yao Zhao
Baoqing Guo
Zujun Yu
Publication date
04-10-2023
Publisher
Springer London
Published in
Pattern Analysis and Applications / Issue 4/2023
Print ISSN: 1433-7541
Electronic ISSN: 1433-755X
DOI
https://doi.org/10.1007/s10044-023-01193-5
