About this book

This book constitutes the refereed proceedings of the 8th International Conference on Adaptive Multimedia Retrieval, AMR 2010, held in Linz, Austria, in August 2010. The 14 revised full papers and the invited contribution presented were carefully reviewed and selected from numerous submissions. The papers are organized in topical sections on Context-based personalization; media information fusion; video retrieval; audio and music retrieval; adaptive similarities; and finding and organizing.

Table of contents

Frontmatter

Invited Contribution

Towards a Storytelling Approach for Novel Artist Recommendations

The Semantic Web offers huge amounts of structured and linked data about many kinds of resources. We propose to use this data for music recommender systems following a storytelling approach. Beyond similarity of audio content and user preference profiles, recommender systems based on Semantic Web data offer opportunities to detect similarities between artists based on their biographies, musical activities, etc. In this paper we present an approach determining similar artists based on freely available metadata from the Semantic Web. An evaluation experiment has shown that our approach leads to more high-quality novel artist recommendations than well-known systems such as Last.fm and Echo Nest. However, the overall recommendation accuracy leaves room for further improvement.
Stephan Baumann, Rafael Schirru, Bernhard Streit

Context-Based Personalization

A Survey of Context-Aware Cross-Digital Library Personalization

The constant interaction of users with different Digital Libraries (DLs) and the subsequent scattering of user information across them raise the need not only for Digital Library interoperability but also for cross-Digital Library personalization. The latter calls for sharing and combining user information across different DL systems so that a DL system may take advantage of data from others. To achieve this goal, DL systems should be able to maintain compliant and interoperable user models and profiles that enable propagation and reconciliation of user information across different DLs. In this paper, we motivate the need for cross-Digital Library personalization, we define and examine user model, profile, and context interoperability, and we survey and discuss existing user model interoperability approaches.
Ana Nika, Tiziana Catarci, Yannis Ioannidis, Akrivi Katifori, Georgia Koutrika, Natalia Manola, Andreas Nürnberger, Manfred Thaller

An Ontology-Based Approach of Multimedia Information Personalized Search

This paper discusses and provides a solution for the problem of adopting ontologies in order to model users and multimedia documents and to develop personalized search functionalities. First, the existing approaches that enable ontology-based semantic description of multimedia content are discussed. Then, current ontology-based solutions for personalized search functionalities inside adaptive hypermedia systems are presented. Our solution is then described, including the multimedia document model, the user profile development and the algorithmic solution that provides personalized results for a user query.
Mihaela Brut, Florence Sedes

Media Information Fusion

Approaching Multimedia Retrieval from a Polyrepresentative Perspective

Multimedia documents such as videos, images, or music are characterized by a number of different qualities that can become relevant during a search task. These qualities are seldom reflected as a whole by retrieval models. Thus, we present a new query model, which fully supports the principle of polyrepresentation by taking advantage of quantum logic. We offer means to model document relevance as a cognitive overlap from various features describing a multimedia document internally. Using our query model, the combination of the aforementioned polyrepresentative features is supported by the mechanisms of a Boolean algebra. In addition, these overlaps can be personalized by user preferences during a relevance feedback process supported by machine learning. The input for the relevance feedback is based on qualitative judgments between documents, which are known from daily life, to keep the cognitive load on users low.
We further discuss how our model contributes to the unification of different aspects of polyrepresentation into one sound theory.
David Zellhöfer, Ingo Schmitt

Knowledge Based Multimodal Result Fusion for Distributed and Heterogeneous Multimedia Environments: Concept and Ideas

Distributed multimedia retrieval (DMR) is a key issue in today’s information systems. One problem in DMR is the fusion of results retrieved from multiple locations, which is required in order to present the results in an integrated, consolidated and aligned form. This paper sketches a distributed multimedia retrieval scenario in which requirements, a conceptualization and an implementation of a knowledge-based data fusion approach are proposed. The approach is to be used together with the MPEG Query Format and is supported by benchmarks and classifications to derive knowledge used in result fusion.
Florian Stegmaier, Tobias Bürger, Mario Döller, Harald Kosch

Video Retrieval

A Contour-Color-Action Approach to Automatic Classification of Several Common Video Genres

We address the issue of automatic video genre retrieval. We propose three categories of content descriptors, extracted at the temporal, color and structural levels. At the temporal level, video content is described with visual rhythm, action content and amount of gradual transitions. Colors are globally described with statistics of color distribution, elementary hues, color properties and relationship. Finally, structural information is extracted at image level and histograms are built to describe contour segments and their relations. The proposed parameters are used to classify 7 common video genres, namely: animated movies/cartoons, commercials, documentaries, movies, music clips, news and sports. Experimental tests using several classification techniques and more than 91 hours of video footage prove the potential of these parameters for the indexing task: despite the similarity in semantic content of several genres, we achieve detection ratios ranging between 80% and 100%.
Bogdan E. Ionescu, Christoph Rasche, Constantin Vertan, Patrick Lambert

Differences in Video Search Behavior between Novices and Archivists

Improving users’ interactions with a video retrieval system requires the examination of the search behavior of real users. This paper presents a study that examines and compares the video search behavior of professional archivists and novice users. The comparison focuses on the use and effectiveness of different state-of-the-art video search methods offered by the VITALAS retrieval system, and also on the behavior of the two user groups during their interactions with the retrieval results. We conducted our experiments in the context of TRECVID’s 2009 interactive search task, using the provided collection and topics for our evaluation. The findings are based on a qualitative questionnaire analysis and a quantitative examination of the logged user actions on the search interface. The experimental results indicate that today’s visual search techniques have improved in effectiveness, confirming a trend found in previous user studies. To our surprise, professional archivists used visual concept search in many of their searches. Queries containing visual concepts were more effective, resulting in more relevant shots found than the alternative methods. Overall, we conclude that professional archivists are more focused on recall when carrying out their search tasks and are better at reflecting on their own search performance.
Henning Rode, Theodora Tsikrika, Arjen P. de Vries

An Affect-Based Video Retrieval System with Open Vocabulary Querying

Content-based video retrieval (CBVR) systems are creating new search and browse capabilities using metadata describing significant features of the data. An often overlooked aspect of human interpretation of multimedia data is the affective dimension. Incorporating affective information into multimedia metadata can potentially enable search using this alternative interpretation of multimedia content. Recent work has described methods to automatically assign affective labels to multimedia data using various approaches. However, the subjective and imprecise nature of affective labels makes it difficult to bridge the semantic gap between system-detected labels and user expression of information requirements in multimedia retrieval. We present a novel affect-based video retrieval system incorporating an open-vocabulary query stage based on WordNet, enabling search using an unrestricted query vocabulary. The system performs automatic annotation of video data with labels of well-defined affective terms. In retrieval, annotated documents are ranked using the standard Okapi retrieval model based on open-vocabulary text queries. We present experimental results examining the behaviour of the system for retrieval of a collection of automatically annotated feature films of different genres. Our results indicate that affective annotation can potentially provide useful augmentation to more traditional objective content description in multimedia retrieval.
Ching Hau Chan, Gareth J. F. Jones
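
As a rough illustration of the ranking step mentioned in this abstract, the following minimal sketch scores documents, each represented as a list of affective labels, with an Okapi BM25 formula. It is not the authors' implementation: the label sets, the query, and the parameter values k1 = 1.2 and b = 0.75 are assumptions chosen for illustration, the IDF term uses one common smoothed variant, and the WordNet-based mapping from free-text queries to labels is replaced here by a hard-coded, hypothetical query.

```python
import math
from collections import Counter


def bm25_scores(query_terms, docs, k1=1.2, b=0.75):
    """Score each document (a list of label tokens) against the query terms."""
    n_docs = len(docs)
    avg_len = sum(len(d) for d in docs) / n_docs
    # Document frequency: number of documents containing each term.
    df = Counter(term for d in docs for term in set(d))
    scores = []
    for d in docs:
        tf = Counter(d)
        score = 0.0
        for term in query_terms:
            if term not in tf:
                continue
            # Smoothed IDF (one common variant; always non-negative).
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            # Term-frequency saturation with document-length normalization.
            norm = tf[term] + k1 * (1 - b + b * len(d) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


# Hypothetical affective annotations for three films (illustrative only).
films = [
    ["fear", "tension", "fear", "sadness"],
    ["joy", "amusement", "joy"],
    ["sadness", "tension", "joy"],
]
# Hypothetical result of mapping a free-text query onto the label vocabulary.
query = ["fear", "tension"]

scores = bm25_scores(query, films)
# Film indices ordered from best to worst match.
ranking = sorted(range(len(films)), key=lambda i: scores[i], reverse=True)
print(ranking)
```

In this toy collection the first film matches both query labels, the third matches one, and the second matches none, so its score is zero.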

Audio and Music Retrieval

A Comparison of Human, Automatic and Collaborative Music Genre Classification and User Centric Evaluation of Genre Classification Systems

In this paper two sets of evaluation experiments are conducted. First, we compare state-of-the-art automatic music genre classification algorithms to human performance on the same dataset, via a listening experiment. This will show that the improvements of content-based systems over the last years have reduced the gap between automatic and human classification performance, but could not yet close this gap. As an important extension to previous work in this context, we will also compare the automatic and human classification performance to a collaborative approach. Second, we propose two evaluation metrics, called user scores, that are based on the votes of the participants of the listening experiment. This user-centric evaluation approach removes the need for predefined ground truth annotations and accounts for the ambiguous human perception of musical genre. Taking genre ambiguities into account is an important advantage with respect to the evaluation of content-based systems, especially since the dataset compiled in this work (both the audio files and collected votes) is publicly available.
Klaus Seyerlehner, Gerhard Widmer, Peter Knees

Clubmixer: A Presentation Platform for MIR Projects

Evaluating solutions to many music IR problems (such as playlist generation or music similarity) in the absence of formal evaluation measures frequently requires user studies to establish the benefits of one solution over the other. Building a suitable application framework to deploy and test user responses is a cumbersome and complex task. We present Clubmixer, an advanced client-server based audio system that could serve MIR researchers as a presentation and prototyping platform. The project aims at providing a software framework that minimizes the effort of creating MIR-based solutions. The open architecture and the use of open standards provide high flexibility for several MIR-related areas (e.g. content-based retrieval, collaborative retrieval, etc.). We describe the current state of the system and outline the main functionality as well as the advantages of Clubmixer for MIR research.
Alexander Schindler, Andreas Rauber

Adaptive Similarities

Similarity Adaptation in an Exploratory Retrieval Scenario

Sometimes users of a multimedia retrieval system are not able to explicitly state their information need. Instead, they want to browse a collection in order to get an overview and to discover interesting content. Exploratory retrieval tools support users in such search scenarios, where the retrieval goal cannot be stated explicitly as a query. In previous work, we have presented Adaptive SpringLens, an interactive visualization technique building upon popular neighborhood-preserving projections of multimedia collections. It uses a complex multi-focus fish-eye distortion of a projection to visualize a neighborhood that is automatically adapted to the user’s current focus of interest. This paper investigates to what extent knowledge about the retrieval task collected during interaction can be used to adapt the underlying similarity measure that defines the neighborhoods.
Sebastian Stober, Andreas Nürnberger

Similarity Query Postprocessing by Ranking

Current multimedia search technology is, especially in commercial applications, heavily based on text annotations. However, there are many applications such as image hosting web sites (e.g. Flickr or Picasa) where the text metadata are of poor quality in general. Searching such collections only by text usually gives rather unsatisfactory results. On the other hand, multimedia retrieval systems based purely on content can retrieve visually similar results but lag behind in their ability to grasp the semantics expressed by text annotations. In this paper, we propose various ranking techniques that can be transparently applied to any content-based retrieval system in order to improve the quality of search results and user satisfaction. We demonstrate the usefulness of the approach on two large real-life datasets indexed by the MUFIN system. The improvement of the ranked results was evaluated by real users using an online survey.
Petra Budikova, Michal Batko, Pavel Zezula

Finding and Organizing

Proximity-Based Order-Respecting Intersection for Searching in Image Databases

As the volume of non-textual data, such as images and other multimedia data, available on the Internet increases, the issue of identifying data items based on query containment rather than query equality is becoming more and more important. In this paper, we propose a solution to this problem. We assume local descriptors are extracted from data items, so the aforementioned problem reduces to finding data items that share as many local descriptors as possible with the query. In particular, we define a new ε-intersection for this purpose. Local descriptors usually contain their locations, so the proposed solution takes these into account to increase the effectiveness of searching. We evaluate the ε-intersection on two real-life image collections using SIFT and SURF local descriptors from both effectiveness and efficiency points of view. Moreover, we study the influence of individual parameters of the ε-intersection on query results.
Tomas Homola, Vlastislav Dohnal, Pavel Zezula

Experiences with Shape Classification through Fuzzy c-Means Using Geometrical and Moments Descriptors

Due to the growing diffusion of digital media, most real-world applications have data with multiple modalities, from multiple sources and in multiple formats. The modelling of information coming from multimedia sources represents an important issue for applications that perform multimedia mining activities. In particular, the last decades have witnessed great interest in image processing by “mining” visual information for object recognition and retrieval. Some studies have revealed that image disambiguation based on shape produces better results than features such as color or texture; moreover, the classification of objects extracted from an image database appears more intuitively formulated as a shape classification task.
This paper presents an approach for 2D shape classification. The approach is based on the combined use of geometrical and moments features extracted from a given collection of images and achieves shape-based classification exploiting fuzzy clustering techniques.
Ugo Erra, Sabrina Senatore
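
To make the fuzzy clustering idea in this abstract concrete, here is a minimal pure-Python sketch of the standard fuzzy c-means update rules applied to small feature vectors. It is only an illustration under stated assumptions, not the authors' pipeline: the two-dimensional descriptors (imagined here as circularity and elongation), the number of clusters, and the fuzzifier m = 2 are all hypothetical choices.

```python
import random


def fuzzy_c_means(points, c=2, m=2.0, iters=100, seed=0):
    """Return (centers, memberships) for a list of equal-length feature vectors."""
    rng = random.Random(seed)
    n, dim = len(points), len(points[0])
    # Random initial memberships, each row normalized to sum to 1.
    u = []
    for _ in range(n):
        row = [rng.random() + 1e-9 for _ in range(c)]
        total = sum(row)
        u.append([v / total for v in row])
    centers = []
    for _ in range(iters):
        # Centers are membership-weighted means, with weights u^m.
        centers = []
        for j in range(c):
            w = [u[i][j] ** m for i in range(n)]
            wsum = sum(w)
            centers.append(
                [sum(w[i] * points[i][d] for i in range(n)) / wsum
                 for d in range(dim)]
            )
        # Memberships depend on relative distances to all centers.
        for i in range(n):
            dist = [
                max(1e-12, sum((points[i][d] - centers[j][d]) ** 2
                               for d in range(dim)) ** 0.5)
                for j in range(c)
            ]
            for j in range(c):
                u[i][j] = 1.0 / sum(
                    (dist[j] / dist[k]) ** (2.0 / (m - 1.0)) for k in range(c)
                )
    return centers, u


# Hypothetical shape descriptors: [circularity, elongation] (illustrative only).
shapes = [
    [0.95, 0.10], [0.90, 0.15], [0.92, 0.12],  # round-ish shapes
    [0.30, 0.80], [0.25, 0.85], [0.35, 0.78],  # elongated shapes
]
centers, memberships = fuzzy_c_means(shapes, c=2)
# Harden the fuzzy result by picking the cluster with the highest membership.
labels = [max(range(2), key=lambda j: row[j]) for row in memberships]
print(labels)
```

Unlike hard clustering, every shape keeps a degree of membership in each cluster, and the final labels are only one way to read the result. Which cluster receives index 0 depends on the random initialization.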

Quantum Logic Based MPEG Query Format Algebra

The need for fast processing of query requests in multimedia retrieval systems is apparent. One basis for optimization is the formalization of the corresponding query language by a respective algebra. Furthermore, an algebra is important for demonstrating the profoundness and validity of a query language. In this context, the article contributes a formal semantics model for the novel standardized MPEG Query Format for multimedia search. In addition to the specification of its syntax and semantics, our quantum logic approach for fuzzy retrieval on behalf of the formal model is discussed. Besides demonstrating the validity of our formalization on some examples, we discuss the advantages as well as the shortcomings of the query format.
Mario Döller, Sebastian Lehrack, Harald Kosch, Ingo Schmitt

Backmatter
