2021 | OriginalPaper | Chapter

Attentive Convolution Network-Based Video Summarization

Authors: Deeksha Gupta, Akashdeep Sharma

Published in: Applications of Artificial Intelligence and Machine Learning

Publisher: Springer Singapore

Abstract

The availability of smartphones with built-in video capture and large storage capacity has led to the generation of a vast number of videos. This deluge has drawn the attention of the computer vision research community to the problem of efficiently browsing, indexing, and retrieving a desired video. Video summarization has emerged as a solution to these problems: it generates a short summary video containing the important information from the original. This paper proposes a supervised attentive convolution network for summarization (ACN-SUM), a framework for binary labeling of video frames. ACN-SUM is based on an encoder–decoder architecture in which the encoder is an attention-aware convolution network module and the decoder is a deconvolution network module. In ACN-SUM, the self-attention module captures long-range temporal dependencies among frames, and concatenating the feature maps of the convolution network and the attention module results in more informative encoded frame descriptors. These encoded features are passed to the deconvolution module to generate frame labels for keyframe selection. The performance of the proposed network has been evaluated on two benchmark datasets, and the experimental results demonstrate the efficiency of the proposed model against state-of-the-art methods.
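The encoder idea described in the abstract (temporal convolution features concatenated with self-attention features, followed by a per-frame binary label) can be illustrated with a small sketch. This is not the authors' ACN-SUM implementation. The abstract does not give layer sizes, projections, or the decoder design, so everything below is an assumption chosen for illustration: identity query/key/value projections, a fixed smoothing kernel in place of learned convolution weights, and a thresholded sigmoid standing in for the learned deconvolution decoder. All function names are hypothetical.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of scores."""
    peak = max(row)
    exps = [math.exp(v - peak) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(frames):
    """Scaled dot-product self-attention over T frame vectors of size D.

    Assumption: queries, keys, and values are the raw features
    (identity projections); a trained model would learn these.
    """
    dim = len(frames[0])
    attended = []
    for query in frames:
        scores = [
            sum(q * k for q, k in zip(query, key)) / math.sqrt(dim)
            for key in frames
        ]
        weights = softmax(scores)  # every frame attends to all frames
        attended.append(
            [sum(w * v[j] for w, v in zip(weights, frames)) for j in range(dim)]
        )
    return attended


def temporal_conv(frames, kernel=(0.25, 0.5, 0.25)):
    """1-D convolution along the time axis with zero padding.

    Assumption: one fixed kernel shared by all channels, standing in
    for learned convolution filters.
    """
    steps, dim = len(frames), len(frames[0])
    half = len(kernel) // 2
    out = []
    for t in range(steps):
        row = [0.0] * dim
        for k, weight in enumerate(kernel):
            idx = t + k - half
            if 0 <= idx < steps:
                for j in range(dim):
                    row[j] += weight * frames[idx][j]
        out.append(row)
    return out


def encode(frames):
    """Concatenate convolution and attention features per frame (T x 2D)."""
    conv = temporal_conv(frames)
    attended = self_attention(frames)
    return [c + a for c, a in zip(conv, attended)]


def label_frames(encoded, threshold=0.5):
    """Hypothetical stand-in for the decoder: sigmoid of the mean feature,
    thresholded to a 0/1 keyframe label."""
    labels = []
    for row in encoded:
        score = 1.0 / (1.0 + math.exp(-sum(row) / len(row)))
        labels.append(1 if score > threshold else 0)
    return labels


if __name__ == "__main__":
    toy_frames = [[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]]  # made-up features
    encoded = encode(toy_frames)
    print(len(encoded), len(encoded[0]))
    print(label_frames(encoded))
```

The point of the concatenation step is that each encoded frame carries both local temporal context (from the convolution) and a global summary of the whole sequence (from attention), which is the property the abstract credits for more informative frame descriptors.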


Metadata

Title: Attentive Convolution Network-Based Video Summarization
Authors: Deeksha Gupta, Akashdeep Sharma
Copyright Year: 2021
Publisher: Springer Singapore
DOI: https://doi.org/10.1007/978-981-16-3067-5_25
