2013 | OriginalPaper | Chapter

1. On the Use of Audio Events for Improving Video Scene Segmentation

Authors: Panagiotis Sidiropoulos, Vasileios Mezaris, Ioannis Kompatsiaris, Hugo Meinedo, Miguel Bugalho, Isabel Trancoso

Published in: Analysis, Retrieval and Delivery of Multimedia Content

Publisher: Springer New York

Abstract

This work addresses the automatic temporal segmentation of a video into elementary semantic units known as scenes. Its novelty lies in the use of high-level audio information, in the form of audio events, to improve scene segmentation performance. More specifically, the proposed technique builds on a recent audio-visual scene segmentation approach that constructs multiple scene transition graphs (STGs), each exploiting information from a different modality. In the extension presented here, audio event detection results are introduced into the definition of an audio-based STG, while a visual-based STG is defined independently. The results of these two types of STGs are subsequently combined. Applying the proposed technique to broadcast videos demonstrates the usefulness of audio events for scene segmentation and highlights the importance of introducing additional high-level information into scene segmentation algorithms.
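The abstract does not say how the results of the audio-based and visual-based STGs are combined. The sketch below shows one plausible scheme, given purely as an illustration: each STG contributes a set of candidate scene boundaries, the fraction of STGs voting for each boundary is computed per modality, and a weighted sum of the two fractions is thresholded. The function names, the weight `w_visual`, and the `threshold` are all hypothetical and are not taken from the chapter.

```python
from typing import Iterable, List, Set


def vote_fraction(stg_outputs: Iterable[Set[int]], num_shots: int) -> List[float]:
    """Fraction of STGs that mark each shot boundary as a scene boundary.

    Boundary i lies between shot i and shot i + 1, so there are
    num_shots - 1 boundaries. Each element of stg_outputs is the set of
    boundary indices that one STG declared to be scene boundaries.
    An out-of-range index raises IndexError.
    """
    outputs = list(stg_outputs)
    counts = [0] * (num_shots - 1)
    if not outputs:
        return [0.0] * (num_shots - 1)
    for boundaries in outputs:
        for b in boundaries:
            counts[b] += 1
    return [c / len(outputs) for c in counts]


def combine_stgs(
    visual_stgs: Iterable[Set[int]],
    audio_stgs: Iterable[Set[int]],
    num_shots: int,
    w_visual: float = 0.5,   # assumed weight of the visual modality
    threshold: float = 0.5,  # assumed acceptance threshold
) -> List[int]:
    """Return the boundary indices whose weighted vote reaches the threshold."""
    p_visual = vote_fraction(visual_stgs, num_shots)
    p_audio = vote_fraction(audio_stgs, num_shots)
    w_audio = 1.0 - w_visual
    return [
        i
        for i, (pv, pa) in enumerate(zip(p_visual, p_audio))
        if w_visual * pv + w_audio * pa >= threshold
    ]


if __name__ == "__main__":
    # Made-up outputs of three visual STGs and two audio STGs for 10 shots.
    visual = [{2, 5}, {2}, {2, 7}]
    audio = [{2, 4}, {4, 5}]
    print(combine_stgs(visual, audio, num_shots=10))  # prints [2, 4]
```

In this toy input, boundary 2 is supported by both modalities and boundary 4 by every audio STG, so both reach the threshold; boundaries 5 and 7 receive only scattered votes and are rejected.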

Footnotes
1. Netherlands Institute for Sound and Vision, http://www.instituut.beeldengeluid.nl/
2. A recent version of the GD component achieved first place in the Interspeech 2010 Paralinguistic Challenge in the category of Male/Female/Child classification [15].

Literature
1. Tsamoura E, Mezaris V, Kompatsiaris I (2008) Gradual transition detection using color coherence and other criteria in a video shot meta-segmentation framework. In: Proceedings of IEEE international conference on image processing, workshop on multimedia information retrieval (ICIP-MIR 2008), pp 45–48
2. Hanjalic A, Lagendijk RL, Biemond J (1999) Automated high-level movie segmentation for advanced video-retrieval systems. IEEE Trans Circ Syst Video Technol 9(4):580–588
3. Yeung M, Yeo BL, Liu B (1998) Segmentation of video by clustering and graph analysis. Comput Vis Image Understand 71(1):94–109
4. Chasanis V, Likas A, Galatsanos N (2009) Scene detection in videos using shot clustering and sequence alignment. IEEE Trans Multimed 11(1):89–100
5. Nitanda N, Haseyama M, Kitajima H (2005) Audio signal segmentation and classification for scene-cut detection. In: Proc IEEE Int Symp Circ Syst 4:4030–4033
6. Chianese A, Moscato V, Penta A, Picariello A (2008) Scene detection using visual and audio attention. In: Proceedings of Ambi-Sys workshop on ambient media delivery and interactive television
7. Wilson K, Divakaran A (2009) Discriminative genre-independent audio-visual scene change detection. In: Proceedings of SPIE conference on multimedia content access: algorithms and systems III, vol 7255
8. Wang J, Duan L, Liu Q, Lu H, Jin J (2008) A multimodal scheme for program segmentation and representation in broadcast video streams. IEEE Trans Multimed 10(3):393–408
9. Sidiropoulos P, Mezaris V, Kompatsiaris I, Meinedo H, Trancoso I (2009) Multi-modal scene segmentation using scene transition graphs. In: Proceedings of ACM Multimedia, pp 665–668
10. Amaral R, Meinedo H, Caseiro D, Trancoso I, Neto J (2007) A prototype system for selective dissemination of broadcast news in European Portuguese. EURASIP J Adv Sig Proces 2007:1–11
11. Meinedo H (2008) Audio pre-processing and speech recognition for Broadcast News. PhD thesis, IST, Technical University of Lisbon
12. Trancoso I, Pellegrini T, Portelo J, Meinedo H, Bugalho M, Abad A, Neto J (2009) Audio contributions to semantic video search. In: Proceedings of IEEE international conference on multimedia and expo, pp 630–633
13. Bugalho M, Portelo J, Trancoso I, Pellegrini T, Abad A (2009) Detecting audio events for semantic video search. In: Proceedings of Interspeech 2009
15. Meinedo H, Trancoso I (2010) Age and gender classification using fusion of acoustic and prosodic features. In: Proceedings of Interspeech 2010
16. Vendrig J, Worring M (2002) Systematic evaluation of logical story unit segmentation. IEEE Trans Multimed 4(4):492–499
Metadata
Title: On the Use of Audio Events for Improving Video Scene Segmentation
Authors: Panagiotis Sidiropoulos, Vasileios Mezaris, Ioannis Kompatsiaris, Hugo Meinedo, Miguel Bugalho, Isabel Trancoso
Copyright Year: 2013
Publisher: Springer New York
DOI: https://doi.org/10.1007/978-1-4614-3831-1_1