2018 | Book

MediaSync

Handbook on Multimedia Synchronization

Editors: Mario Montagud, Pablo Cesar, Fernando Boronat, Jack Jansen

Publisher: Springer International Publishing

About this book

This book provides an approachable overview of the most recent advances in the fascinating field of media synchronization (mediasync), gathering contributions from the most representative and influential experts. Understanding the challenges of this field in the current multi-sensory, multi-device, and multi-protocol world is not an easy task. The book revisits the foundations of mediasync, including theoretical frameworks and models, highlights ongoing research efforts, like hybrid broadband broadcast (HBB) delivery and users’ perception modeling (i.e., Quality of Experience or QoE), and paves the way for the future (e.g., towards the deployment of multi-sensory and ultra-realistic experiences).

Although many advances around mediasync have been devised and deployed, this area of research is receiving renewed attention in order to overcome the remaining challenges of the next-generation (heterogeneous and ubiquitous) media ecosystem. Given the significant advances in this research area, its current relevance, and the multiple disciplines it involves, a reference book on mediasync has become necessary. This book fills that gap. In particular, it addresses key aspects and reviews the most relevant contributions within the mediasync research space, from different perspectives.

MediaSync: Handbook on Multimedia Synchronization is the perfect companion for scholars and practitioners who want to acquire strong knowledge about this research area, and who want to approach the challenges behind ensuring the best mediated experiences by providing adequate synchronization between the media elements that constitute those experiences.

Table of Contents

Frontmatter

Foundations

Frontmatter
1. Introduction to Media Synchronization (MediaSync)
Abstract
Media synchronization is a core research area in multimedia systems. This chapter introduces the area by providing key definitions, classifications, and examples. It also discusses the relevance of different types of media synchronization for ensuring satisfactory Quality of Experience (QoE) levels and highlights their necessity by comparing the magnitudes of delay differences in real scenarios to the tolerable limits of human perception. The chapter concludes with a brief description of the main aspects and components of media synchronization solutions, with the goal of providing a better understanding of this timely research area.
Mario Montagud, Pablo Cesar, Fernando Boronat, Jack Jansen
2. Evolution of Temporal Multimedia Synchronization Principles
Abstract
Ever since the invention of the world’s first telephone in the nineteenth century, the evolution of multimedia applications has drastically changed human life and behaviors, and has introduced new demands for multimedia synchronization. In this chapter, we present a historical view of temporal synchronization efforts with a focus on continuous multimedia (i.e., sequences of time-correlated multimedia data). We demonstrate how the development of multimedia systems has advanced the research on synchronization, and what additional challenges have been imposed by next-generation multimedia technologies. We conclude with a new application-dependent multilocation multi-demand synchronization framework to address these new challenges.
Zixia Huang, Klara Nahrstedt, Ralf Steinmetz
3. Theoretical Foundations: Formalized Temporal Models for Hyperlinked Multimedia Documents
Abstract
Consistent linking and accurate synchronization of multimedia elements in hypervideos or multimedia documents are essential to provide a good quality of experience to viewers. Temporal models are needed to define relationships and constraints between multimedia elements and create an appealing presentation. However, no commonly used description language for temporal models exists, which makes existing temporal models harder to understand, compare, and transform from one model to another. A formal description is more accurate than the commonly used textual descriptions or figures of temporal models. This abstract representation makes it easier to precisely define algorithms and constraints for delivery and buffering, as well as the behavior of users and/or multimedia documents. The use of a common formalism for all temporal models makes it possible to define synchronization constraints and media management. The same variables and terminology can then be used for describing algorithms that are applied to the documents, for example, to implement pre-fetching or download and cache management in order to increase the quality of experience for users. In this chapter, we give an overview of different existing temporal models for linked and temporally synchronized multimedia documents, such as point-based, event-based, and interval-based temporal models. We analyze their common features and formally define their elementary components. We then give formal definitions for each temporal model covering essential features. These can then be used to computationally solve existing problems. We show this by defining basic functions that can be used in algorithms. We also show how user interaction and the resulting video behavior can be precisely defined.
Britta Meixner
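As a concrete illustration of the interval-based models surveyed in this chapter, the following minimal sketch encodes presentation intervals and a few relations in the spirit of Allen's interval algebra. The names and relations are illustrative only, not the chapter's own formalism.

```python
# Minimal sketch of an interval-based temporal model, in the spirit of
# Allen's interval algebra. Names and structure are illustrative.
from dataclasses import dataclass

@dataclass
class Interval:
    """A media item's presentation interval on a common timeline (seconds)."""
    start: float
    end: float

def meets(a: Interval, b: Interval) -> bool:
    """a ends exactly when b begins (sequential presentation)."""
    return a.end == b.start

def during(a: Interval, b: Interval) -> bool:
    """a lies strictly inside b (e.g., a caption shown during a video)."""
    return b.start < a.start and a.end < b.end

def overlaps(a: Interval, b: Interval) -> bool:
    """a starts first and ends while b is still running."""
    return a.start < b.start < a.end < b.end

video = Interval(0.0, 60.0)
caption = Interval(12.0, 18.0)
assert during(caption, video)  # the caption must stay inside the video
```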
4. Time, Frequency and Phase Synchronisation for Multimedia—Basics, Issues, Developments and Opportunities
Abstract
In this chapter, we provide a comprehensive overview of timing. We describe the underlying concepts that comprise timing through examples and then present a range of mature, standardised and evolving techniques to improve the so-called time awareness across the full Information and Communications Technology (ICT) infrastructure over which multimedia applications operate. Although the media synchronisation community is already acutely aware of timing issues, this chapter offers some valuable insights through its holistic approach to timing.
Hugh Melvin, Jonathan Shannon, Kevin Stanton
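Much of the time awareness discussed in this chapter builds on clock synchronisation protocols in the NTP family. As a hedged sketch of the underlying arithmetic, the classic four-timestamp offset/delay computation looks as follows; the variable names are illustrative, and a roughly symmetric network path is assumed.

```python
# Sketch of the NTP-style clock offset/round-trip delay computation
# from four timestamps, assuming a roughly symmetric network path.
def ntp_offset_delay(t1, t2, t3, t4):
    """
    t1: client send time, t2: server receive time,
    t3: server send time,  t4: client receive time.
    Returns (estimated clock offset, round-trip delay) in seconds.
    """
    offset = ((t2 - t1) + (t3 - t4)) / 2.0
    delay = (t4 - t1) - (t3 - t2)
    return offset, delay

# Example: the client clock is ~50 ms behind the server's.
off, d = ntp_offset_delay(t1=0.000, t2=0.060, t3=0.061, t4=0.021)
print(f"offset = {off * 1000:.1f} ms, RTT = {d * 1000:.1f} ms")
```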

Applications, Use Cases, and Requirements

Frontmatter
5. Simultaneous Output-Timing Control in Networked Games and Virtual Environments
Abstract
In this chapter, we survey techniques for simultaneous output-timing control, which adjusts the output timing of media streams among multiple terminals in networked games and virtual environments. When media units (MUs, the information units for media synchronization, such as a video frame or a voice packet) are transmitted over non-guaranteed Quality of Service (QoS) networks like the Internet, the receiving times of each MU at the terminals may differ from each other owing to network delays and delay jitter. As a result, fairness among players may be compromised in networked games, and collaborative work may not be carried out efficiently among users in virtual environments. It is important that multiple players/users engage in networked games or collaborative work while watching the same displayed images simultaneously. To solve these problems, simultaneous output-timing control, such as media synchronization control and causality control, is needed. In this chapter, we mainly handle the group (or inter-destination) synchronization control, which is a type of media synchronization control, the adaptive Δ-causality control, and the dynamic local lag control. We also discuss the similarities and differences among the three types of control. Generally, the group synchronization control or adaptive Δ-causality control can be employed to keep fairness and/or consistency in good condition among multiple terminals in networked games and virtual environments, while the dynamic local lag control is used for sound synchronization in networked virtual ensembles. However, interactivity may be seriously degraded under such types of control. Therefore, we introduce prediction control to improve the interactivity. Through Quality of Experience (QoE) assessment, we demonstrate that prediction control improves the interactivity and that there is an optimal prediction time depending on the network delay. Finally, we discuss the future directions of simultaneous output-timing control in networked games and virtual environments.
Pingguo Huang, Yutaka Ishibashi
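The adaptive Δ-causality control mentioned above can be sketched as follows: each MU carries its generation time and is output Δ seconds later at every terminal, while MUs missing that deadline are discarded. This is a simplified illustration that assumes the terminals share a synchronized clock; the Δ value and function names are hypothetical.

```python
# Sketch of Δ-causality control: each media unit (MU) is output Δ
# seconds after its generation time; late MUs are discarded so that
# all terminals present surviving MUs at the same instant.
# Assumes terminals share a synchronized clock (here: time.monotonic).
import time

DELTA = 0.100  # Δ = 100 ms output deadline (a tunable parameter)

def output(payload: bytes) -> None:
    print(f"rendering MU at {time.monotonic():.3f}: {payload!r}")

def handle_mu(generation_time: float, payload: bytes) -> None:
    deadline = generation_time + DELTA
    now = time.monotonic()
    if now > deadline:
        return                  # too late: discard to preserve causality
    time.sleep(deadline - now)  # wait so every terminal outputs the MU
    output(payload)             # at the same (generation + Δ) instant

# Example: an MU generated 30 ms ago is rendered 70 ms from now.
handle_mu(time.monotonic() - 0.030, b"game-event")
```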
6. Automated Video Mashups: Research and Challenges
Abstract
The proliferation of video cameras, such as those embedded in smartphones and wearable devices, has made it increasingly easy for users to film interesting events (such as public performances, family events, and vacation highlights) in their daily lives. Moreover, there are often multiple cameras capturing the same event at the same time, from different views. Concatenating segments of the videos produced by these cameras along the event timeline forms a video mashup, which can depict the event in a less monotonous and more informative manner. It is, however, inefficient and costly to manually create a video mashup. This chapter aims to introduce the problem of automated video mashup to the readers, survey the state-of-the-art research work in this area, and outline the set of open challenges that remain to be solved. It provides a comprehensive introduction for practitioners, researchers, and graduate students who are interested in the research and challenges of automated video mashup.
Mukesh Kumar Saini, Wei Tsang Ooi
7. MediaSync Issues for Computer-Supported Cooperative Work
Abstract
Computer-Supported Cooperative Work (CSCW) systems are often complex distributed applications that incorporate a number of different tools working in concert to support goal- and task-driven collaborations that involve multiple sites and participants. These systems exhibit a wide variety of different system and communication architectures both within and between participating sites. In this chapter, we describe and characterize CSCW systems with respect to media synchronization requirements, identify a number of common challenges that arise as a result, and review a variety of protocol coordination techniques and mechanisms that can be brought to bear.
Ketan Mayer-Patel

User Experience and Evaluation Methodologies

Frontmatter
8. Perceiving, Interacting and Playing with Multimedia Delays
Abstract
Just like interactions with the physical world, humans prefer to interact with computers without noticeable delay. However, in the digital world, the time between cause and effect can far exceed what we are accustomed to in real life. System processes, network transmission and rendering all add to the delay between an input action and an output response. While we expect and accept that a computer system takes some time to process a request, the acceptable duration depends on the nature of the task. Highly interactive tasks, such as control systems, computer games, design software and even word processing, have stringent temporal requirements, where any sustained delay can be detrimental to performance. Research interest in the human capability to operate under delay has grown over the past decades, in at least three separate fields. In this chapter, we review relevant work from cognitive psychology, human-computer interaction and multimedia research.
Ragnhild Eg, Kjetil Raaen
9. Methods for Human-Centered Evaluation of MediaSync in Real-Time Communication
Abstract
In an ideal world, people interacting over real-time multimedia links would experience perfectly synchronized media and no transmission latency: the interlocutors would hear and see each other with no delay. Methods to achieve the former are discussed in other chapters of this book, but for a variety of practical and physical reasons, delay-free communication will never be possible. In some cases, the delay will be very obvious, since it will be possible to observe the reaction time of the listeners modified by the delay, or there may be some acoustic echo from the listeners’ audio equipment. However, in the absence of echo, the users themselves do not always explicitly notice the presence of delay, even for quite large values. Typically, they notice something is wrong (for example, “we kept interrupting each other!”), but are unable to define what it is. Some useful insights into the impact of delay on a conversation can be obtained from the linguistic discipline of Conversation Analysis, and especially the analysis of “turn-taking” in a conversation. This chapter gives an overview of the challenges in evaluating media synchronicity in real-time communications, outlining appropriate tasks and methods for subjective testing and how in-depth analysis of such tests can be performed to gain a deep understanding of the effects of delay. The insights are based on recent studies of audio and audiovisual communication, but we also show examples from other media synchronization applications, such as networked music interaction.
Gunilla Berndtsson, Marwin Schmitt, Peter Hughes, Janto Skowronek, Katrin Schoenenberg, Alexander Raake
10. Synchronization for Secondary Screens and Social TV: User Experience Aspects
Abstract
This chapter provides an in-depth discussion on the impact of synchronization of TV-related applications on the user experience. After all, applications meant to be used in conjunction with TV watching are created for the benefit and pleasure of the viewer. In order to explain the main user-related aspects of media synchronization, we will first sketch how the television and media landscape has evolved and which timing and synchronization aspects are important for the makers of TV-related applications. Then, we will delve into the core topic of this chapter by presenting the main user experience aspects involved in the creation of second-screen applications, a specific type of TV-related application that has gained a lot of attention in recent years. (Attentive readers will notice that ‘second screen’ is sometimes written with, and sometimes without, a hyphen: without a hyphen it is used as a noun (the second screen), and with a hyphen as an adjective (second-screen applications).) Finally, we highlight the impact of synchronization issues on the user experience for Social TV applications. Our insights are gathered from our earlier publications in this area, results of research with users from the European research project TV-RING, and related literature. The chapter will, therefore, serve as an overview of media synchronization from the perspective of the viewer, based on our own insights and experiences, and complemented with the current state of the art.
Jeroen Vanattenhoven, David Geerts
11. Media Synchronization in Networked Multisensory Applications with Haptics
Abstract
In this chapter, we explain the present status of studies on media synchronization in networked multisensory applications with haptics. We also specify the characteristics of haptic media and how haptic media differ from other media such as olfactory, auditory, and visual media. By using such other media together with haptic media, we can achieve a more realistic sensation when the applications are used for various purposes such as remote education, entertainment, and networked games. When multisensory media streams are transmitted over a network like the Internet, the temporal relationships among the media streams may be disturbed owing to network delay, delay jitter, and packet loss. Thus, the quality of experience (QoE) may be seriously degraded. To solve this problem, we need to carry out media synchronization control. To achieve a high quality of media synchronization, a number of media synchronization algorithms have been proposed so far. In networked multisensory applications in particular, the algorithms need to take account of human perception of media synchronization errors, because the requirements on media synchronization quality depend on the types of media. Some algorithms, such as Virtual-Time Rendering (VTR), take human perception of the errors into account. For example, VTR tries to accomplish synchronization by changing the buffering time of each media stream dynamically according to the network delay jitter, using several threshold values for the errors. In these algorithms, instead of synchronizing the output timings of media streams exactly, ranges of human perception are taken into account for the sake of high synchronization quality: the allowable range, in which users feel that the synchronization error is allowable; the imperceptible range, in which users cannot perceive the error; and the operation range, which is narrower than the imperceptible range and should usually be maintained. In this chapter, we explain the algorithms that take account of human perception of media synchronization errors, and we enhance other algorithms, such as the group (or inter-destination) synchronization control and the adaptive ∆-causality control algorithms for simultaneous output-timing control among multiple terminals, by taking the perception into account. It is indispensable to clarify these ranges by QoE assessment in networked multisensory applications, so we further survey studies on such assessments. Finally, we discuss the future directions of media synchronization in networked multisensory applications with haptics.
Pingguo Huang, Mya Sithu, Yutaka Ishibashi
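A minimal sketch of the VTR-style buffering adjustment described above: the target buffering time is held, nudged, or corrected outright depending on which perceptual range the measured synchronization error falls into. The threshold values and the update rule below are hypothetical placeholders, not the chapter's parameters.

```python
# Illustrative sketch of the Virtual-Time Rendering idea: a stream's
# buffering time is expanded or contracted according to the observed
# synchronization error, keeping it within perceptual thresholds.
# Thresholds and update rule are hypothetical.
OPERATION_RANGE = 0.020      # keep error under 20 ms in normal operation
IMPERCEPTIBLE_RANGE = 0.080  # users cannot perceive errors below ~80 ms
ALLOWABLE_RANGE = 0.160      # errors beyond this are judged not allowable

def update_buffering(current_buffer: float, sync_error: float) -> float:
    """Return a new target buffering time (seconds) for the stream."""
    if abs(sync_error) <= OPERATION_RANGE:
        return current_buffer                       # in range: hold steady
    if abs(sync_error) <= IMPERCEPTIBLE_RANGE:
        return current_buffer + 0.25 * sync_error   # gentle correction
    return current_buffer + sync_error              # large error: jump

# Example: a 100 ms error (perceptible) triggers a full correction.
print(update_buffering(current_buffer=0.200, sync_error=0.100))  # 0.3
```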
12. Olfaction-Enhanced Multimedia Synchronization
Abstract
This chapter introduces olfaction-enhanced multimedia synchronization and focuses on two key aspects: first, the specification of olfaction-enhanced multimedia, including the temporal relations between the media components; and second, the implementation of synchronized delivery of olfaction-enhanced multimedia. The relevance of this topic is supported by the fact that multimedia researchers have recently begun to work with several new media components such as olfaction, haptics, and gustation. The characteristics of these multisensory media differ significantly from traditional media; multisensory media components cannot be classified as being continuous or discrete. Olfaction, the sense of smell, in particular raises numerous research challenges. Synchronization, perceptual variability, and sensor and display development are just some of the many avenues that require efforts from the research community. In terms of synchronization, implementing synchronized delivery as part of transmission across constrained networks is not the key research challenge (although adaptive mulsemedia delivery can play an important role here). Rather, the principal problem, from a synchronization perspective, is understanding the experiential attributes of olfaction with respect to its temporal relations with other media components and the effect of these on the user’s perceived Quality of Experience (QoE). This task is non-trivial. There are many facets unique to olfaction that need to be understood in order to design and execute even the most basic of evaluations. In this chapter, we present and discuss the results of a subjective study which considered the above-mentioned “specification” and “implementation” challenges. In particular, we focus on analysing the user’s ability to detect synchronization error and the resultant annoyance levels of synchronization error.
Niall Murray, Gabriel-Miro Muntean, Yuansong Qiao, Brian Lee

Document Formats and Standards

Frontmatter
13. SMIL: Synchronized Multimedia Integration Language
Abstract
The period from 1995 to 2010 can be considered to be networked multimedia’s Golden Age: Many formats were defined that allowed content to be captured, stored, retrieved, and presented in a networked, distributed environment. The Golden Age happened because network infrastructures had enough bandwidth available to meet the presentation needs for intramedia synchronization, and content codecs were making even complex audio/video objects storable on network servers. This period marked the end of the CD-ROM era for multimedia content distribution. Unlike the relative simplicity of CD-ROM multimedia, where timing constraints were well-understood and pre-delivery content customization was relatively simple, the network multimedia era demanded new languages that would allow content to be defined as a collection of independent media components that needed to be located, fetched, synchronized, and presented on a large collection of user devices (under greatly varying network characteristics). One of the most ambitious projects to define an open and commonly available multimedia content integration language was W3C’s SMIL. In a period of approximately ten years, SMIL grew from a simple synchronization language to a full content integration and scheduling facility for a wide range of Web documents. This chapter considers the timing and synchronization aspects of SMIL.
Dick C. A. Bulterman
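SMIL's core time containers, seq and par, schedule their children sequentially or in parallel. The toy resolver below illustrates how scheduled begin times fall out of such a composition; it is a sketch of the idea only, not SMIL's actual timing semantics, which also cover indefinite durations, events, and interactive begin values.

```python
# A toy resolver for SMIL-style <seq> and <par> time containers,
# illustrating how scheduled begin times fall out of the composition.
def schedule(node, begin=0.0, out=None):
    """node = ('media', name, dur) | ('seq', children) | ('par', children).
    Returns (end_time, {name: begin_time})."""
    out = out if out is not None else {}
    kind = node[0]
    if kind == 'media':
        _, name, dur = node
        out[name] = begin
        return begin + dur, out
    _, children = node
    if kind == 'seq':                 # children play one after another
        t = begin
        for child in children:
            t, out = schedule(child, t, out)
        return t, out
    # 'par': children start together; container ends with the longest child
    end = begin
    for child in children:
        child_end, out = schedule(child, begin, out)
        end = max(end, child_end)
    return end, out

doc = ('seq', [('media', 'intro', 5.0),
               ('par', [('media', 'video', 30.0),
                        ('media', 'captions', 28.0)])])
print(schedule(doc))  # intro begins at 0.0; video and captions both at 5.0
```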
14. Specifying Intermedia Synchronization with a Domain-Specific Language: The Nested Context Language (NCL)
Abstract
This chapter reports on the intermedia synchronization features of Nested Context Language (NCL), an XML-based domain-specific language (DSL) to support declarative specification of hypermedia applications. NCL takes media synchronization as a core aspect for the specification of hypermedia applications. Interestingly, NCL deals with media synchronization in a broad sense, by allowing for a uniform declaration of spatiotemporal relationships where user interactivity is included as a particular case. Following the W3C trends in modular XML specifications, NCL has been specified in a modular way, aiming at combining its modules into language profiles. Among the main NCL profiles are those targeting the domain of Digital TV (DTV) applications. Indeed, NCL and its standardized player named Ginga are part of ITU-T Recommendations for IPTV, Integrated Broadcast–Broadband (IBB) and DTV services, and Integrated Services Digital Broadcasting—Terrestrial (ISDB-T) International standards. This chapter discusses the main reasons that make NCL a comprehensive solution for the authoring of interactive multimedia applications. It also discusses the aspects of its conceptual model, the Nested Context Model (NCM), which defines an intrinsic support for easily specifying spatiotemporal synchronization among components (e.g., media and input assets).
Marcio Ferreira Moreno, Romualdo M. de R. Costa, Marcelo F. Moreno
15. Time and Timing Within MPEG Standards
Abstract
This chapter focuses on the time system used by the decoder at the end-user side to replicate the encoder’s clock system and accomplish synchronized media play-out. The time system usually relies on tools such as clock references and timestamps coded within the media stream. This chapter does not go into detail on the protocols used in IP networks to perform media delivery, but it explains in detail the time-related fields coded within the media streams, which are used by the decoder at the user side to provide synchronized media play-out. The principal Moving Picture Experts Group (MPEG) media standards and their time systems are described in this chapter, including MPEG-2 Transport Streams (MP2T), MPEG-4, MPEG Dynamic Adaptive Streaming over HTTP (MPEG-DASH), and MPEG Media Transport (MMT). This chapter also details the time information used by Digital Video Broadcasting (DVB), which is broadly used in multiple media delivery systems, hand in hand with MPEG standards. First, this chapter describes the synchronization between video and audio media streams (lip-sync) within a program in MP2T, as well as the synchronization between multiple programs delivered within a multiplexed single MP2T stream. Second, it describes timing issues in MPEG-4, which, being an object-oriented multimedia standard, utilizes a different timeline system from MP2T to implement clock references and timestamps. Third, it describes timing issues in MPEG-DASH, an adaptive streaming over HTTP protocol widely used over the Internet. Additionally, this chapter describes time transmission in DVB systems, which use MP2T as a media container; the tools used are the DVB Service Information (DVB SI) and MPEG-2 Program-Specific Information (MPEG-2 PSI) tables. Finally, this chapter introduces the latest MPEG standard for media delivery, MMT, which aims to be a single media delivery standard for heterogeneous networks and for the broadband technologies used in Internet TV and IPTV (in this chapter, Internet TV refers to media delivery over a public, non-managed IP network such as the Internet, and IPTV to media delivery over a private, managed IP network). MMT has also been proposed for broadcast (DVB) media delivery.
Lourdes Beloqui Yuste
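As a hedged illustration of the MP2T timing tools described above: the decoder reconstructs the encoder's System Time Clock from 27 MHz Program Clock Reference (PCR) samples and presents each access unit when that clock reaches the unit's 90 kHz Presentation Time Stamp (PTS). The sketch below shows the basic arithmetic and deliberately ignores details such as 33-bit timestamp wrap-around.

```python
# Sketch of MPEG-2 Transport Stream timing arithmetic: PCR samples run
# at 27 MHz, PTS/DTS timestamps at 90 kHz (PCR_HZ / 300).
PCR_HZ = 27_000_000   # Program Clock Reference resolution
PTS_HZ = 90_000       # Presentation Time Stamp resolution

def pts_to_seconds(pts_ticks: int) -> float:
    return pts_ticks / PTS_HZ

def presentation_wait(pcr_ticks: int, pts_ticks: int) -> float:
    """Seconds the decoder should wait before presenting an access unit,
    given the latest PCR sample and the unit's PTS. A real decoder must
    also handle wrap-around of the 33-bit timestamp fields, ignored here."""
    stc_seconds = pcr_ticks / PCR_HZ  # System Time Clock estimate
    return max(0.0, pts_to_seconds(pts_ticks) - stc_seconds)

# Example: STC at 10.000 s, audio unit stamped for 10.040 s -> wait 40 ms.
print(presentation_wait(pcr_ticks=270_000_000, pts_ticks=903_600))
```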
16. Synchronization in MPEG-4 Systems
Abstract
The MPEG-4 standard was defined in the early days of broadband Internet, after successful deployments of digital television networks, with the goal of unifying broadcast and broadband media architectures and protocols in a single standard, tackling natural media (audio, video, images) as well as synthetic 2D or 3D graphics and audio. As such, MPEG-4 can be seen as one of the first attempts at building the so-called convergence of Web and TV. Some parts of the standard have changed the media world forever (AAC audio and AVC|H.264 video compression, the MP4 file format), and while other parts have not always met their markets successfully, they paved the way for more recent work, including HTML5 media. In this chapter, we explain how the MPEG-4 standard manages playback and synchronization of audio-visual streams and graphics animations, and how multiple timelines can be used to provide rich interactive presentations over broadband and broadcast.
Jean Le Feuvre, Cyril Concolato
17. Media Synchronization on the Web
Abstract
The Web is a natural platform for multimedia, with universal reach, powerful backend services, and a rich selection of components for capture, interactivity, and presentation. In addition, with a strong commitment to modularity, composition, and interoperability, the Web should allow advanced media experiences to be constructed by harnessing the combined power of simpler components. Unfortunately, with timed media this may be complicated, as media components require synchronization to provide a consistent experience. This is particularly the case for distributed media experiences. In this chapter, we focus on temporal interoperability on the Web: how to allow heterogeneous media components to operate consistently together, synchronized to a common timeline and subject to shared media control. A programming model based on external timing is presented, enabling modularity, interoperability, and precise timing among media components, in single-device as well as multi-device media experiences. The model has been proposed within the W3C Multi-device Timing Community Group as a new standard; this could establish temporal interoperability as one of the foundations of the Web platform.
Ingar M. Arntzen, Njål T. Borch, François Daoust
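The external-timing model described above can be approximated by a small state vector (position, velocity, timestamp) from which every media component deterministically computes the current timeline position. Below is a minimal sketch with illustrative API names; it is not the Timing Object interface proposed to the W3C.

```python
# Minimal sketch of "external timing": media components slave to a shared
# timing object whose position is computed deterministically from a small
# state vector, so every component derives the same timeline position.
import time

class TimingObject:
    def __init__(self):
        self._p0 = 0.0              # position at the last update
        self._v = 0.0               # velocity (1.0 = normal playback)
        self._t0 = time.monotonic() # when the vector was last set

    def query(self) -> float:
        """Current timeline position, extrapolated from the state vector."""
        return self._p0 + self._v * (time.monotonic() - self._t0)

    def update(self, position=None, velocity=None):
        """Shared media control: pause (velocity=0), seek, change rate."""
        self._p0 = self.query() if position is None else position
        self._v = self._v if velocity is None else velocity
        self._t0 = time.monotonic()

timing = TimingObject()
timing.update(position=0.0, velocity=1.0)  # start playback
# A video component would periodically compare its own currentTime
# against timing.query() and nudge its playback rate to stay in sync.
```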
18. Media Synchronisation for Television Services Through HbbTV
Abstract
Media synchronisation is getting renewed attention as ecosystems of smart televisions and connected devices enable novel media consumption paradigms. Social TV, hybrid TV and companion screens are examples that are enabling people to consume multiple media streams on multiple devices together. These novel use cases place a number of demands on the synchronisation architecture. Systems for media synchronisation have to cope with delay differences between the various distribution channels for television broadcast (terrestrial, cable, satellite) and Internet-delivered streaming media. They also need to handle the different content formats in use. Broadcasters have started using proprietary solutions for over-the-top media synchronisation, such as media fingerprinting or media watermarking technologies. Given the commercial interest in media synchronisation and the disadvantages of proprietary technologies, consumer equipment manufacturers, broadcasters, as well as telecom and cable operators have started developing a new wave of television products, services and international standards that support media synchronisation from multiple sources. This chapter provides an overview of media synchronisation in a television context as specified in version 2 of the Hybrid Broadcast Broadband Television (HbbTV) specification, which builds upon the Companion Screens and Streams (CSS) specifications by the Digital Video Broadcasting (DVB) group. In addition, we discuss solutions compatible with legacy HbbTV devices. Use cases include synchronisation of audio, video and data streams from multiple sources composed on a TV or across multiple devices, including consumer devices like smartphones.
M. Oskar van Deventer, Michael Probst, Christoph Ziegler

Algorithms, Protocols and Techniques

Frontmatter
19. Video Delivery and Challenges: TV, Broadcast and Over The Top
Abstract
The TV production and broadcasting industry predates the ubiquitous computing and IP technologies of today. However, just as these advances have revolutionised other industries, they are also causing production and broadcasting to change. Here, we outline the opportunities that general computing and IP delivery offer this industry, and discuss how the precise synchronisation required by TV services could be implemented using these more generic technologies, and how this in turn could lead to newer ways of delivering TV-like services. We first discuss how today’s TV industry has been shaped by its analogue roots, and how the terminology and working practices still in some ways reflect the analogue world. We briefly cover TV history from the 1950s and the evolution of public-service broadcasting in the UK, before considering how newer technologies such as digital TV, satellite and video streaming have enabled new services but have also thrown up new issues around delay and synchronisation. We propose that some of these issues could be mitigated by moving to an IP delivery model, with media elements composed at the client device rather than globally time-locked to precise, system-wide clocks. Finally, we discuss some of the IP delivery technologies, such as multicast, adaptive streaming and the newer protocols that are replacing traditional HTTP.
Tim Stevens, Stephen Appleby
20. Camera Synchronization for Panoramic Videos
Abstract
Multi-camera systems are frequently used in applications such as panorama video creation, free-viewpoint rendering, and 3D reconstruction. A critical aspect for visual quality in these systems is that the cameras are closely synchronized. In our research, we require high-definition panorama videos generated in real time using several cameras in parallel. This is an essential part of our sports analytics system called Bagadus, which has several synchronization requirements. The system is currently in use for soccer games at the Alfheim stadium for Tromsø IL and at the Ullevaal stadium for the Norwegian national soccer team. Each Bagadus installation is capable of combining the video from five 2K cameras into a single 50 fps cylindrical panorama video. Due to proper camera synchronization, the produced panoramas exhibit neither ghosting effects nor other visual inconsistencies at the seams. Our panorama videos are designed to support several members of the trainer team at the same time. Using our system, they are able to pan, tilt, and zoom interactively and independently over the entire field, from an overview shot to close-ups of individual players in arbitrary locations. To create such panoramas, each of our cameras covers one part of the field with small overlapping regions, and the individual frames are transformed and stitched together into a single view. We faced two main synchronization challenges in the panorama generation process. First, to stitch frames together without visual artifacts and inconsistencies due to motion, the shutters in the cameras had to be synchronized with sub-millisecond accuracy. Second, to circumvent the need for software readjustment of color and brightness around the seams between cameras, the exposure settings were synchronized. This chapter describes these synchronization mechanisms as designed, implemented, evaluated, and integrated in the Bagadus system.
Vamsidhar R. Gaddam, Ragnar Langseth, Håkon K. Stensland, Carsten Griwodz, Michael Riegler, Tomas Kupka, Håvard Espeland, Dag Johansen, Håvard D. Johansen, Pål Halvorsen
21. Merge and Forward: A Self-Organized Inter-Destination Media Synchronization Scheme for Adaptive Media Streaming over HTTP
Abstract
In this chapter, we present Merge and Forward, a distributed inter-destination media synchronization (IDMS) control scheme for adaptive HTTP streaming that adopts the MPEG-DASH standard as its representation format. We introduce so-called IDMS sessions and describe how an unstructured peer-to-peer overlay can be created from the session information carried via MPEG-DASH. We objectively assess the performance of Merge and Forward with respect to convergence time (the time needed until all clients hold the same reference timestamp) and scalability. After the negotiation of a reference timestamp, the clients have to synchronize their multimedia playback to the agreed reference timestamp. To achieve this, we propose a new adaptive media playout approach that minimizes the impact of playback synchronization on the QoE. The proposed adaptive media playout is assessed subjectively using crowdsourcing. We further propose a crowdsourcing methodology for conducting subjective quality assessments in the field of IDMS by utilizing games with a purpose (GWAP). We validate the applicability of our methodology by investigating the lower asynchronism threshold for IDMS in scenarios such as online quiz games.
Benjamin Rainer, Stefan Petscharnig, Christian Timmerer
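As a sketch of the adaptive media playout idea described above: rather than skipping or pausing, a client nudges its playback rate within small bounds so that its playout position converges to the negotiated reference timestamp. The rate bounds and control gain below are hypothetical; determining perceptually acceptable values is exactly what the chapter's subjective assessments address.

```python
# Sketch of an adaptive media playout policy for IDMS: the client adjusts
# its playback rate, within perceptual bounds, until its playout position
# converges to the reference timestamp agreed among all clients.
MAX_RATE_DEVIATION = 0.05  # at most ±5 % playback-rate change (hypothetical)
GAIN = 0.5                 # fraction of the asynchronism corrected per second

def playback_rate(local_position: float, reference_position: float) -> float:
    """Return the playback rate to use for the next control interval."""
    asynchronism = reference_position - local_position  # >0: we lag behind
    rate = 1.0 + GAIN * asynchronism
    return max(1.0 - MAX_RATE_DEVIATION,
               min(1.0 + MAX_RATE_DEVIATION, rate))

# Example: 80 ms behind the reference -> play slightly faster (rate 1.04).
print(playback_rate(local_position=42.30, reference_position=42.38))
```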
22. Watermarking and Fingerprinting
Abstract
An important task in media synchronisation is to find out the playback position of a running media stream. Only with this information is it possible to provide additional information or additional streams synchronised to that running stream. This chapter gives an overview of two techniques for solving this basic task: watermarking and fingerprinting. In the former, synchronisation information is embedded imperceptibly in the media stream. In the latter, an excerpt of the media stream is identified based on an index of compact representations of known media content. In both cases, there are specific approaches for audio, image, and video signals, and in all cases, the robustness of the methods may be increased by using error-correcting codes.
Rolf Bardeli
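A toy illustration of the fingerprinting approach described above: compact representations of known media are indexed together with their playback positions, and a query excerpt is matched against the index to recover the stream position. A real system would use a robust perceptual hash rather than the exact byte hash used here for brevity.

```python
# Toy fingerprint index: compact hashes of media excerpts map to
# (content id, playback position); a query excerpt recovers the position.
from collections import defaultdict

index = defaultdict(list)  # fingerprint -> [(content_id, position_s)]

def fingerprint(excerpt: bytes) -> int:
    return hash(excerpt)   # placeholder for a robust perceptual hash

def ingest(content_id: str, frames: list[bytes], frame_duration: float):
    """Index every frame of a known stream with its playback position."""
    for i, frame in enumerate(frames):
        index[fingerprint(frame)].append((content_id, i * frame_duration))

def locate(excerpt: bytes):
    """Return candidate (content_id, position) pairs for a live excerpt."""
    return index.get(fingerprint(excerpt), [])

ingest("movie-01", [b"frameA", b"frameB", b"frameC"], frame_duration=0.04)
print(locate(b"frameB"))   # -> [('movie-01', 0.04)]
```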
23. Network Delay and Bandwidth Estimation for Cross-Device Synchronized Media
Abstract
Driven by the growth in mobile phone and tablet ownership, recent years have witnessed an increasing trend towards coordinated media experiences across multiple devices. The quality of experience (QoE) of such new-generation applications is dictated by the quality of service (QoS) of the underlying networks. Inevitable network delay and bandwidth fluctuations affect the communications and media synchronization between connected devices. Therefore, network measurement is becoming key to providing essential information for the QoE assurance of cross-device synchronized media. Amongst the many network measurement techniques, packet probing is considered the most effective for end-to-end evaluations. Packet probing may seem straightforward, but it requires a good understanding of the methodologies and of how the results should be interpreted. This chapter provides a guide and some best practices in packet probing, accompanied by a use case in which delay measurement enhances cross-device media synchronization and the QoE of an immersive media application.
Mu Mu, Hans Stokking, Frank den Hartog
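A minimal packet-probing sketch in the spirit of this chapter: round-trip delays are measured by timestamping small UDP probes sent to a cooperating echo endpoint. The host and port below are hypothetical, and real measurement practice adds many safeguards the chapter discusses, such as probe trains for bandwidth estimation and careful jitter statistics.

```python
# Minimal packet-probing sketch: timestamp small UDP probes against a
# cooperating echo endpoint and collect round-trip times.
import socket, statistics, time

ECHO_HOST, ECHO_PORT = "192.0.2.10", 7  # hypothetical echo service
N_PROBES = 10

def probe_rtts() -> list[float]:
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.settimeout(1.0)
    rtts = []
    for seq in range(N_PROBES):
        payload = seq.to_bytes(4, "big")
        t_send = time.monotonic()
        sock.sendto(payload, (ECHO_HOST, ECHO_PORT))
        try:
            data, _ = sock.recvfrom(64)
            if data[:4] == payload:               # match reply to probe
                rtts.append(time.monotonic() - t_send)
        except socket.timeout:
            pass                                  # lost probe: skip it
    return rtts

rtts = probe_rtts()
if rtts:
    print(f"min {min(rtts) * 1000:.1f} ms, "
          f"median {statistics.median(rtts) * 1000:.1f} ms "
          f"over {len(rtts)} probes")
```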
Backmatter
Metadata
Title
MediaSync
Editors
Mario Montagud
Pablo Cesar
Fernando Boronat
Jack Jansen
Copyright Year
2018
Electronic ISBN
978-3-319-65840-7
Print ISBN
978-3-319-65839-1
DOI
https://doi.org/10.1007/978-3-319-65840-7