Abstract
The recent expansion of broadband Internet access has led to an exponential increase in potential consumers of video on the Web. The huge success of video upload websites shows that the online world, with its virtually unlimited possibilities for active user participation, is an ideal complement to traditional consumption-only media like TV and DVD. It is evident that users are willing to interact with content-providing systems in order to get the content they desire. In parallel to these developments, innovative tools for producing interactive, non-linear audio-visual content are being created. They support the authoring process alongside the management of media and metadata, enabling on-demand assembly of videos based on the consumer's wishes. The quality of such a dynamic video remixing system depends mainly on the expressiveness of the associated metadata. To eliminate the need for manual input as far as possible, we aim to design a system that continuously and automatically enriches its own media and metadata repositories. Currently, video content remixing is available on the Web mostly in very basic forms. Most platforms offer upload and simple modification of content. Although several implementations exist, to the best of our knowledge no solution uses metadata to its full extent to dynamically render a video stream based on consumers' wishes. With the research presented in this paper, we propose a novel approach to interactive video assembly on the Web. In this approach, consumers describe the desired content using a set of domain-specific parameters. Based on the metadata with which the video clips are annotated, the system chooses clips that fit the user's criteria. The selected clips are arranged in an aesthetically pleasing manner, and the user can interactively influence content selection at any time during playback. We use a practical example to clarify the concept and outline what it takes to implement such a system.
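The selection step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the `Clip` class, the tag vocabulary, the `select_clips` function, and the ranking rule (number of matching tags, then a duration budget) are all assumptions made for illustration. Aesthetic ordering and interaction during playback are not modelled here.

```python
from dataclasses import dataclass, field


@dataclass
class Clip:
    """A video clip with hypothetical annotation fields (assumed, not from the paper)."""
    clip_id: str
    duration_s: float
    tags: set = field(default_factory=set)


def select_clips(clips, wanted_tags, max_total_s):
    """Pick clips that share at least one tag with the request, within a duration budget.

    Clips are ranked by the number of matching tags (most matches first).
    Ties keep repository order because sorted() is stable.
    """
    matching = [c for c in clips if c.tags & wanted_tags]
    ranked = sorted(matching, key=lambda c: len(c.tags & wanted_tags), reverse=True)
    chosen, total = [], 0.0
    for clip in ranked:
        # Greedily add clips as long as the total duration stays within the budget.
        if total + clip.duration_s <= max_total_s:
            chosen.append(clip)
            total += clip.duration_s
    return chosen


# Hypothetical repository of annotated basketball clips.
repository = [
    Clip("c1", 12.0, {"dunk", "team:A"}),
    Clip("c2", 8.0, {"three-pointer", "team:B"}),
    Clip("c3", 15.0, {"dunk", "team:B"}),
    Clip("c4", 6.0, {"free-throw", "team:A"}),
]

playlist = select_clips(repository, {"dunk", "team:A"}, max_total_s=30.0)
print([c.clip_id for c in playlist])
```

A greedy budget fill is used here only because it is short. A real system would also need to order the chosen clips and re-run selection when the user changes parameters during playback.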
Notes
http://myvideo.nba.com (see also: http://www.gotuit.com)
http://www.ist-nm2.org/media_productions.html (“Accidental Lovers”)
An NBA regular season comprises 82 games for each team.
In certain game situations, e.g. time-outs, free throws, and game breaks, the game clock is stopped.
An NBA game consists of four quarters of 12 minutes each plus potential overtimes of 5 minutes each, but raw material will also contain game breaks.
Acknowledgements
This work was performed within the Integrated Project TA2, Together Anytime, Together Anywhere (website: http://www.ta2-project.eu). TA2 receives funding from the European Commission under the EU’s Seventh Framework Programme, grant agreement number 214793. The authors gratefully acknowledge the European Commission’s financial support and the productive collaboration with the other TA2 consortium partners. We would like to gratefully acknowledge Werner Bailer for his support, discussion, and feedback on various revisions of this paper.
Cite this article
Kaiser, R., Hausenblas, M. & Umgeher, M. Metadata-driven interactive web video assembly. Multimed Tools Appl 41, 437–467 (2009). https://doi.org/10.1007/s11042-008-0242-z
DOI: https://doi.org/10.1007/s11042-008-0242-z