
2010 | Book

Advances in Music Information Retrieval

Edited by: Zbigniew W. Raś, Alicja A. Wieczorkowska

Publisher: Springer Berlin Heidelberg

Book series: Studies in Computational Intelligence


About this book

Sound waves propagate through various media and enable communication and entertainment for us humans. The music we hear or create can be perceived in terms of rhythm, melody, harmony, timbre, or mood. All these elements of music can be of interest to users of music information retrieval systems. Since vast music repositories are available to everyone in everyday use (both in private collections and on the Internet), browsing music collections by content is desirable and increasingly necessary. Music information retrieval is therefore potentially of interest to every user of computers and the Internet. A great deal of research is being conducted in the music information retrieval domain, and its outcomes, as well as the trends in this research, are certainly worth popularizing. This idea motivated us to prepare this book on Advances in Music Information Retrieval.

It is divided into four sections: MIR Methods and Platforms, Harmony, Music Similarity, and Content-Based Identification and Retrieval. A glossary of basic terms is given at the end of the book to familiarize readers with the vocabulary of music information retrieval.

Table of Contents

Frontmatter

Music Information Retrieval: Indexing, Representations, and Platforms

Frontmatter
Indexing Techniques for Non-metric Music Dissimilarity Measures
Abstract
Many dissimilarity measures suitable for music retrieval do not satisfy all properties of a metric. This rules out the use of many established indexing structures, most of which rely on metricity. In this chapter, we give an overview of some existing approaches to building an indexing structure that makes efficient retrieval possible even if the underlying dissimilarity measure is not a metric.
For symmetric prametrics with metric subspaces, a tunneling technique allows one to search a non-metric space efficiently without false negatives; we give a detailed example for this case. In a query-by-example scenario, if queries are already part of a collection and the triangle inequality is violated, one can enforce it within subsets of the collection by adding a small constant to the distance measure (Linear Constant Embedding). By embedding a non-metric distance function into a metric space in a way that preserves the ordering induced by the function for any query, one can make indexing methods applicable that usually work only for metrics (TriGen). We also present several probabilistic methods, including distance-based hashing (DBH), clustering (DynDex), and a tree structure with pointers to near neighbours (SASH).
Rainer Typke, Agatha Walczak-Typke
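The constant-addition idea mentioned above can be sketched in a few lines. This is a minimal illustration on hypothetical toy data, not the chapter's implementation: on a finite point set, the smallest constant c that repairs every triangle-inequality violation is the largest violation itself, because adding c to all three sides of a triangle relaxes the inequality by exactly c.

```python
from itertools import permutations

def min_repair_constant(d, points):
    """Smallest c >= 0 such that d'(x, y) = d(x, y) + c (for x != y)
    satisfies the triangle inequality on all triples of `points`.
    With the shift, d(x,z)+c <= d(x,y)+d(y,z)+2c, which holds exactly
    when the violation d(x,z) - d(x,y) - d(y,z) is at most c."""
    worst = 0.0
    for x, y, z in permutations(points, 3):
        worst = max(worst, d[(x, z)] - d[(x, y)] - d[(y, z)])
    return worst

# Toy symmetric dissimilarity on three items: a-b and b-c are close,
# but a-c is disproportionately far, violating the triangle inequality.
d = {}
for (p, q), v in {("a", "b"): 1.0, ("b", "c"): 1.0, ("a", "c"): 5.0}.items():
    d[(p, q)] = d[(q, p)] = v

c = min_repair_constant(d, ["a", "b", "c"])  # 5 - 1 - 1 = 3.0
```

After the shift, d'(a,c) = 8 and d'(a,b) + d'(b,c) = 8, so the worst triple holds with equality and all others with slack.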
Clustering Driven Cascade Classifiers for Multi-indexing of Polyphonic Music by Instruments
Abstract
Recognition and separation of sounds played by various instruments is very useful for labeling audio files with semantic information. Numerous approaches to acoustic feature extraction have already been proposed for timbre recognition. Unfortunately, none of these monophonic timbre estimation algorithms can be successfully applied to polyphonic sounds, which are the more usual case in real-world music. This has stimulated research on a hierarchically structured cascade classification system inspired by the human perceptual process. The cascade classification system first makes an estimate at the higher level of the decision attribute, which stands for the musical instrument family; further estimation is then done within that specific family. However, traditional hierarchical structures were constructed according to human semantics, which are meaningful from a human perspective but not appropriate for the cascade system. We introduce a new hierarchical instrument schema based on clustering of the acoustic features. This new schema better describes the similarity among different instruments and among different playing techniques of the same instrument. The classification results show higher accuracy for the cascade system with the new schema than with the traditional schemas.
Wenxin Jiang, Zbigniew W. Raś, Alicja A. Wieczorkowska
Representations of Music in Ranking Rhythmic Hypotheses
Abstract
The chapter first presents the main issues in the music information retrieval (MIR) domain. Within this domain there exists a variety of approaches: musical instrument recognition, musical phrase classification, melody classification (e.g., query-by-humming systems), rhythm retrieval, retrieval of high-level musical features such as emotions in music or differences in expressiveness, music search based on listeners' preferences, etc. The objective of this study is to propose a method for retrieving hypermetric rhythm on the basis of melody. A stream of sounds in MIDI format is fed to the system input. On the basis of the musical content, the method retrieves the hypermetric rhythmic structure of a musical piece, consisting of rhythmic motives, phrases, and sentences. On the basis of the retrieved hypermetric structure, a system capable of creating an automatic drum accompaniment to a given melody, supporting composition, is proposed. The method does not use any information about rhythm (such as the time signature), which is often included in MIDI data; neither rhythmic tracks nor harmonic information are used. The only information analyzed is the melody, which may be monophonic or polyphonic. The analysis starts after the entire piece has been played. Recurrence of melodic and rhythmic patterns and the rhythmic salience of sounds are combined in an algorithm that finds the metric structure of rhythm in a given melody.
Jaroslaw Wojcik, Bozena Kostek
Mid-level Representations of Musical Audio Signals for Music Information Retrieval
Abstract
In this chapter, we introduce mid-level representations of music for content-based music information retrieval (MIR). Although low-level features such as spectral and cepstral features have been widely used for audio-based MIR, the necessity of developing more musically meaningful representations has recently been recognized. Here, we review attempts to explore new representations of music based on this motivation. Such representations are called mid-level representations because their level of abstraction lies between that of waveform representations and that of MIDI-like symbolic representations.
Tetsuro Kitahara
The Music Information Retrieval Evaluation eXchange: Some Observations and Insights
Abstract
Advances in the science and technology of Music Information Retrieval (MIR) systems and algorithms are dependent on the development of rigorous measures of accuracy and performance, such that meaningful comparisons among current and novel approaches can be made. This is the motivating principle driving the efforts of the International Music Information Retrieval Systems Evaluation Laboratory (IMIRSEL) and the annual Music Information Retrieval Evaluation eXchange (MIREX). Since it started in 2005, MIREX has fostered great advancements not only in many specific areas of MIR, but also in our general understanding of how MIR systems and algorithms are to be evaluated. This chapter outlines some of the major highlights of the past four years of MIREX evaluations, including its organizing principles, the selection of evaluation metrics, and the evolution of evaluation tasks. The chapter concludes with a brief introduction to how MIREX plans to expand in the future, using a suite of Web 2.0 technologies to automate MIREX evaluations.
J. Stephen Downie, Andreas F. Ehmann, Mert Bay, M. Cameron Jones

Harmony

Frontmatter
Chord Analysis Using Ensemble Constraints
Abstract
Many applications in music information retrieval require analysis of the harmonic structure of a music piece. In Western music, the harmonic structure can often be well illustrated by the chord structure and sequence. This chapter presents a disambiguation technique for chord recognition based on a priori knowledge of the probabilities of chord voicings in a specific musical medium. The main motivating example is guitar chord recognition, where the physical layout and structure of the instrument, along with human physical and temporal constraints, make certain chord voicings and chord sequences more likely than others, and make some impossible. Pitch classes are extracted, and chords are then recognized using pattern recognition techniques. The chord information is then analyzed using an array of voicing vectors indicating the likelihood of chord voicings based on the constraints of the instrument. Chord sequence analysis is used to reinforce the accuracy of individual chord estimates. The specific notes of the chord are then inferred by combining the chord information with the best estimated voicing of the chord.
David Gerhard, Xinglin Zhang
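The pitch-class recognition step can be sketched as simple template matching over a 12-bin chroma vector. This is a hedged, minimal illustration only: the templates and the input frame below are hypothetical toy values, and the chapter additionally weights candidates by voicing likelihood and chord-sequence analysis, which this sketch omits.

```python
def chord_template_match(chroma, templates):
    """Score each candidate chord template against a 12-bin pitch-class
    (chroma) vector by dot product and return the best-scoring label.
    Templates mark the chord tones with 1.0, a common simplification."""
    def score(t):
        return sum(c * v for c, v in zip(chroma, t))
    return max(templates, key=lambda name: score(templates[name]))

# Pitch classes: C=0, C#=1, ..., B=11. Hypothetical toy templates.
templates = {
    "C:maj": [1, 0, 0, 0, 1, 0, 0, 1, 0, 0, 0, 0],  # C E G
    "A:min": [1, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 0],  # A C E
    "G:maj": [0, 0, 1, 0, 0, 0, 0, 1, 0, 0, 0, 1],  # G B D
}
# A frame dominated by C, E, and G energy, with a little noise:
frame = [0.9, 0.0, 0.05, 0.0, 0.8, 0.0, 0.0, 0.85, 0.0, 0.1, 0.0, 0.05]
best = chord_template_match(frame, templates)  # "C:maj"
```

Note that A minor shares two of its three tones with C major; it is exactly this kind of ambiguity that the chapter's voicing-vector analysis is designed to resolve.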
BREVE: An HMPerceptron-Based Chord Recognition System
Abstract
Tonal harmony analysis is a sophisticated task. It combines general knowledge with contextual cues, and it is concerned with faceted and evolving objects such as musical language, execution style, and taste. We present Breve, a system for performing a particular kind of harmony analysis, chord recognition: music is encoded as a sequence of sounding events, and the system assigns the appropriate chord label to each event. The proposed solution relies on a conditional model in which domain knowledge is encoded in the form of Boolean features. Breve exploits the recently proposed CarpeDiem algorithm to obtain significant computational gains in solving the optimization problem underlying the classification process. The implemented system has been validated on a corpus of chorales by J.S. Bach: we report and discuss the learnt weights, point out the errors committed, and elaborate on the correlation between errors and growth in classification times in places where the music is less clearly asserted.
Daniele P. Radicioni, Roberto Esposito
Analysis of Chord Progression Data
Abstract
Harmony is an important component of music. Chord progressions, which represent the harmonic changes of music in an understandable notation, have long been used in popular music and jazz. This article explores whether a chord progression can be summarized for music retrieval. Various possibilities for chord progression simplification schemes, N-gram construction schemes, and distance functions are explored. Experiments demonstrate that such N-gram profiles can be used for artist grouping and for composition retrieval via top-k queries.
Brandt Absolu, Tao Li, Mitsunori Ogihara

Content-Based Identification and Retrieval of Musical Information

Frontmatter
Statistical Music Modeling Aimed at Identification and Alignment
Abstract
This paper describes a methodology for the statistical modeling of music works. Starting from either the representation of the symbolic score or the audio recording of a performance, a hidden Markov model is built to represent the corresponding music work. The model can be used to identify unknown recordings and to align them with the corresponding score. Experimental evaluation using a collection of classical music recordings showed that this approach is effective in terms of both identification and alignment. The methodology can be exploited as the core component for a set of tools aimed at accessing and actively listening to a music collection.
Riccardo Miotto, Nicola Montecchio, Nicola Orio
Harmonic and Percussive Sound Separation and Its Application to MIR-Related Tasks
Abstract
In this chapter, we present a simple and fast method to separate a monaural audio signal into harmonic and percussive components, which leads to a useful pre-processing for MIR-related tasks. Exploiting the anisotropies of the power spectrograms of harmonic and percussive components, we define objective functions based on spectrogram gradients, and, applying to them the auxiliary function approach, we derive simple and fast update equations which guarantee the decrease of the objective function at each iteration. We show experimental results for sound separation on popular and jazz music pieces, and also present the application of the proposed technique to automatic chord recognition and rhythm-pattern extraction.
Nobutaka Ono, Kenichi Miyamoto, Hirokazu Kameoka, Jonathan Le Roux, Yuuki Uchiyama, Emiru Tsunoo, Takuya Nishimoto, Shigeki Sagayama
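The authors derive auxiliary-function update rules for their objective; as a rough illustration of the same anisotropy intuition only, here is a much simpler binary-mask separation based on median filtering (a related, simplified HPSS variant, explicitly not the chapter's algorithm), run on a hypothetical toy spectrogram.

```python
from statistics import median

def hpss_masks(S, k=3):
    """Split a magnitude spectrogram S[f][t] into harmonic and
    percussive parts via binary masking. Harmonic energy is smooth
    along time, so each frequency row is median-filtered over time;
    percussive energy is smooth along frequency, so each time column
    is median-filtered over frequency. Each bin is assigned to
    whichever smoothed estimate is larger."""
    F, T = len(S), len(S[0])
    pad = k // 2

    def medfilt(seq):
        padded = [seq[0]] * pad + list(seq) + [seq[-1]] * pad
        return [median(padded[i:i + k]) for i in range(len(seq))]

    H = [medfilt(row) for row in S]  # smoothed along time
    cols = [medfilt([S[f][t] for f in range(F)]) for t in range(T)]
    P = [[cols[t][f] for t in range(T)] for f in range(F)]  # smoothed along frequency
    harm = [[S[f][t] if H[f][t] >= P[f][t] else 0.0 for t in range(T)] for f in range(F)]
    perc = [[S[f][t] if H[f][t] < P[f][t] else 0.0 for t in range(T)] for f in range(F)]
    return harm, perc

# Toy spectrogram: a sustained tone (horizontal ridge at frequency bin 1)
# plus a drum hit (vertical ridge at time frame 2).
S = [[0.0] * 5 for _ in range(4)]
for t in range(5):
    S[1][t] = 1.0  # sustained tone
for f in range(4):
    S[f][2] = 1.0  # percussive click
harm, perc = hpss_masks(S)
```

On this toy input, the horizontal ridge survives in `harm` and the vertical ridge (outside the tone's bin) in `perc`, mirroring the harmonic/percussive split the chapter obtains by optimizing spectrogram-gradient objectives.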
Violin Sound Quality: Expert Judgements and Objective Measurements
Abstract
Searching for objective and subjective parameters is very important in the automatic classification of multimedia databases containing recordings of musical instrument sounds. This paper describes these parameters and methods of obtaining them for violin sound. The objective parameters are the violin modes with their properties: frequency and mutual energy factor. The subjective parameter is the evaluation of sound quality by experts. Based on the violin mode parameters, expert judgements, and harmony perception, a sound quality classifier was created. The estimated sound quality evaluation is consistent with expert judgements for 75.5% of the instruments in the AMATI multimedia database, which contains recordings of violins from the 10th International Henryk Wieniawski Violin Maker Competition.
Piotr Wrzeciono, Krzysztof Marasek
Emotion Based MIDI Files Retrieval System
Abstract
This chapter presents a query answering system (QAS) associated with a MIDI music database and a query language whose atomic expressions represent various types of emotions. A system for automatic indexing of music by emotions is one of the main modules of the QAS. Its construction required building a training database, manually indexing the learning instances, finding a collection of features describing musical segments, and finally building classifiers. A hierarchical model of emotions consisting of two levels, L1 and L2, was used. A collection of harmonic and rhythmic attributes extracted from music files allowed emotion detection in music with an average accuracy of 83% at level L1. The presented QAS is a collection of personalized search engines (PSE), each based on a personalized system for automatic indexing of music by emotions. To use the QAS, a user profile has to be built and compared to the representative profiles of the PSEs; the nearest one is identified and used to answer the user's query.
Jacek Grekow, Zbigniew W. Raś
On Search for Emotion in Hindusthani Vocal Music
Abstract
Emotions give meaning to our lives. No aspect of our mental life is more important to the quality and meaning of our existence than emotions. They make life worth living, or sometimes ending. The English word 'emotion' is derived from the French word mouvoir, which means 'move'. The great classical philosophers (Plato, Aristotle, Spinoza, Descartes) conceived of emotion as a response to certain sorts of events, triggering bodily changes and typically motivating characteristic behavior. It is difficult to find a consensus on the definition of emotion [9].
Alicja A. Wieczorkowska, Ashoke Kumar Datta, Ranjan Sengupta, Nityananda Dey, Bhaswati Mukherjee

Music Similarity

Frontmatter
Audio Cover Song Identification and Similarity: Background, Approaches, Evaluation, and Beyond
Abstract
A cover version is an alternative rendition of a previously recorded song. Given that a cover may differ from the original song in timbre, tempo, structure, key, arrangement, or language of the vocals, automatically identifying cover songs in a given music collection is a rather difficult task. The music information retrieval (MIR) community has paid much attention to this task in recent years and many approaches have been proposed. This chapter comprehensively summarizes the work done in cover song identification while encompassing the background related to this area of research. The most promising strategies are reviewed and qualitatively compared under a common framework, and their evaluation methodologies are critically assessed. A discussion on the remaining open issues and future lines of research closes the chapter.
Joan Serrà, Emilia Gómez, Perfecto Herrera
Multimodal Aspects of Music Retrieval: Audio, Song Lyrics – and Beyond?
Abstract
Music retrieval is predominantly seen as a problem to be tackled in the acoustic domain. With the exception of symbolic music retrieval and score-based systems, which form rather separate sub-disciplines of their own, most approaches to retrieving recordings of music by content rely on features extracted from the audio signal. Music is subsequently retrieved by similarity matching, or classified into genre, instrumentation, artist, or other categories. Yet music is an inherently multimodal type of data. Apart from purely instrumental pieces, the lyrics associated with the music are as essential to the reception and the message of a song as the audio. Album covers are carefully designed by artists to convey a message that is consistent with the message sent by the music on the album, as well as with the image of the band in general. Music videos, fan sites, and other sources of information add to this in a usually coherent manner. This paper takes a look at recent developments in the multimodal analysis of music. It discusses the different types of information sources available, stressing the multimodal character of music. It then reviews features that may be extracted from those sources, focussing particularly on audio and lyrics. Experimental results on different collections and categorisation tasks round off the chapter, showing the merits of, and the open issues to be addressed in, fully benefiting from the rich and complex information space that music creates.
Rudolf Mayer, Andreas Rauber
Melodic Grouping in Music Information Retrieval: New Methods and Applications
Abstract
We introduce the MIR task of segmenting melodies into phrases, summarise the musicological and psychological background to the task and review existing computational methods before presenting a new model, IDyOM, for melodic segmentation based on statistical learning and information-dynamic analysis. The performance of the model is compared to several existing algorithms in predicting the annotated phrase boundaries in a large corpus of folk music. The results indicate that four algorithms produce acceptable results: one of these is the IDyOM model which performs much better than naive statistical models and approaches the performance of the best-performing rule-based models. Further slight performance improvement can be obtained by combining the output of the four algorithms in a hybrid model, although the performance of this model is moderate at best, leaving a great deal of room for improvement on this task.
Marcus T. Pearce, Daniel Müllensiefen, Geraint A. Wiggins
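The boundary criterion used by information-dynamic segmentation can be sketched in miniature. This is a hedged toy illustration, not IDyOM itself: IDyOM uses far richer multiple-viewpoint statistical models, while this sketch estimates bigram probabilities from the melody alone and places a boundary before each unexpectedly improbable note.

```python
from collections import Counter
from math import log2

def surprisal_boundaries(notes, threshold):
    """Mark a phrase boundary before each note whose surprisal
    -log2 P(note | previous note) exceeds `threshold`, with the
    bigram model estimated from the sequence itself (so every
    observed bigram has nonzero probability)."""
    bigrams = Counter(zip(notes, notes[1:]))
    context = Counter(notes[:-1])
    boundaries = []
    for i in range(1, len(notes)):
        p = bigrams[(notes[i - 1], notes[i])] / context[notes[i - 1]]
        if -log2(p) > threshold:
            boundaries.append(i)
    return boundaries

# Hypothetical toy melody: a motif C-D-E repeated, then a surprising F.
melody = list("CDECDECDEF")
bounds = surprisal_boundaries(melody, threshold=1.0)  # boundary before the F
```

In the toy melody, P(F | E) = 1/3 gives a surprisal of about 1.58 bits, exceeding the threshold, while every repetition of the motif is predictable and produces no boundary.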
Automatic Musical Genre Classification and Artificial Immune Recognition System
Abstract
The Artificial Immune Recognition System (AIRS) has been shown to be an effective classifier for several machine learning problems. In this study, AIRS is investigated as a classifier for musical genres from differing cultures. Musical data from two cultures were used: Traditional Malay Music (TMM) and Latin Music (LM). The performance of AIRS for the classification of these genres was compared with the performance of several commonly used classifiers. The best classification accuracy for TMM was obtained using AIRS and was comparable to the performance of the popular classifiers. However, the performance of AIRS for LM genre classification was shown to be less promising.
Shyamala Doraisamy, Shahram Golzari
Backmatter
Metadata
Title
Advances in Music Information Retrieval
Edited by
Zbigniew W. Raś
Alicja A. Wieczorkowska
Copyright year
2010
Publisher
Springer Berlin Heidelberg
Electronic ISBN
978-3-642-11674-2
Print ISBN
978-3-642-11673-5
DOI
https://doi.org/10.1007/978-3-642-11674-2
