About This Book

This book constitutes the refereed proceedings of the 13th International Symposium on Computer Music Multidisciplinary Research (CMMR 2017), "Music Technology with Swing", held in Matosinhos, Portugal, in September 2017. The 44 full papers presented were selected from 64 submissions. The papers are grouped into eight sections: music information retrieval, automatic recognition, estimation and classification; electronic dance music and rhythm; computational musicology; sound in practice: auditory guidance and feedback in the context of motor learning and motor adaptation; human perception in multimodal context; cooperative music networks and musical HCIs; virtual and augmented reality; and research and creation: spaces and modalities.



Music Information Retrieval, Automatic Recognition, Estimation and Classification


Automatic Recognition of Sound Categories from Their Vocal Imitation Using Audio Primitives Automatically Found by SI-PLCA and HMM

In this paper we study the automatic recognition of sound categories (such as fridge, mixer, or sawing sounds) from their vocal imitations. Vocal imitations consist of a temporal succession of sounds produced using vocal mechanisms that can differ greatly from those used in speech. We develop a recognition approach inspired by automatic-speech-recognition systems, with an acoustic model (which maps the audio signal to a set of probabilities over “phonemes”) and a language model (which represents the expected succession of “phonemes” for each sound category). Since the underlying “phonemes” of vocal imitations are unknown, we propose to estimate them automatically using Shift-Invariant Probabilistic Latent Component Analysis (SI-PLCA) applied to a dataset of vocal imitations. The kernel distributions of the SI-PLCA are considered the “phonemes” of vocal imitation, and its impulse distributions are used to compute the emission probabilities of the states of a set of Hidden Markov Models (HMMs). To evaluate our proposal, we test it on the task of automatically recognizing 12 sound categories from their vocal imitations.

Enrico Marchetto, Geoffroy Peeters

Automatic Estimation of Harmonic Tension by Distributed Representation of Chords

The buildup and release of a sense of tension is one of the most essential aspects of the process of listening to music. A veridical computational model of perceived musical tension would be an important ingredient for many music informatics applications [27]. The present paper presents a new approach to modelling harmonic tension based on a distributed representation of chords. The starting hypothesis is that harmonic tension as perceived by human listeners is related, among other things, to the expectedness of harmonic units (chords) in their local harmonic context. We train a word2vec-type neural network to learn a vector space that captures contextual similarity and expectedness, and define a quantitative measure of harmonic tension on top of this. To assess the veridicality of the model, we compare its outputs on a number of well-defined chord classes and cadential contexts to results from pertinent empirical studies in music psychology. Statistical analysis shows that the model’s predictions conform very well with empirical evidence obtained from human listeners.

Ali Nikrang, David R. W. Sears, Gerhard Widmer
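The paper's core idea, that tension tracks how unexpected a chord is in its local context, can be illustrated with a minimal sketch. This is not the authors' implementation: the toy three-dimensional "chord embeddings" below are invented for illustration, whereas the paper learns a real vector space with a word2vec-type network.

```python
import numpy as np

def harmonic_tension(chord_vec, context_vecs):
    """Toy tension measure: cosine distance between a chord's
    embedding and the centroid of its local harmonic context.
    A chord far from the centroid is treated as less expected,
    hence more tense."""
    centroid = np.mean(context_vecs, axis=0)
    cos_sim = np.dot(chord_vec, centroid) / (
        np.linalg.norm(chord_vec) * np.linalg.norm(centroid))
    return 1.0 - cos_sim  # 0 = fully expected, 2 = maximally unexpected

# Hypothetical 3-d "chord embeddings" (illustrative values only)
C, G, F, Db = (np.array(v, dtype=float) for v in
               ([1, 0.1, 0], [0.9, 0.2, 0.1], [0.8, 0.3, 0], [0, 0.1, 1]))
context = np.stack([C, G, F])
# A distant chord scores higher tension than a contextually close one
assert harmonic_tension(Db, context) > harmonic_tension(G, context)
```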

Automatic Music Genre Classification in Small and Ethnic Datasets

Automatic music genre classification commonly relies on a large amount of well-recorded data for model fitting. These conditions are frequently not met in ethnic music collections due to low media availability and poor recording environments. In this paper, we propose an automatic genre classification technique designed specifically for small, noisy datasets. The proposed technique uses handcrafted features and a vote-based aggregation process. Its performance was evaluated on a Brazilian ethnic music dataset, showing that the proposed technique produces higher F1 measures than traditional data augmentation methods and state-of-the-art, deep-learning-based methods. Our method can therefore be used in automatic classification processes for small datasets, which can be helpful in the organization of ethnic music collections.

Tiago Fernandes Tavares, Juliano Henrique Foleiss
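The vote-based aggregation step mentioned above can be sketched as follows. This is an assumption about the general scheme (per-segment predictions combined by majority vote at track level), not the paper's exact procedure, and the genre labels are hypothetical placeholders.

```python
from collections import Counter

def aggregate_votes(segment_predictions):
    """Majority vote over per-segment genre predictions for one track.
    Ties are broken by first occurrence in the prediction list."""
    return Counter(segment_predictions).most_common(1)[0][0]

# Each track is split into short segments; a classifier labels each
# segment, and the track-level genre is the most frequent label.
preds = ["forro", "forro", "samba", "forro", "frevo"]
assert aggregate_votes(preds) == "forro"
```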

Music Genre Classification Revisited: An In-Depth Examination Guided by Music Experts

Despite their many identified shortcomings, music genres are still often used as ground truth and as a proxy for music similarity. In this work we therefore take another in-depth look at genre classification, this time with the help of music experts. In contrast to existing work, we aim to include the viewpoints of different stakeholders, investigating whether musicians and end-user music taxonomies agree on genre ground truth through a user study among 20 professional and semi-professional music protagonists. We then compare the results of their genre judgments with different commercial taxonomies and with those of computational genre classification experiments, and discuss individual cases in detail. Our findings coincide with existing work and provide further evidence that a simple classification taxonomy is insufficient.

Haukur Pálmason, Björn Þór Jónsson, Markus Schedl, Peter Knees

Exploring Trends in Trinidad Steelband Music Through Computational Ethnomusicology

We present an interdisciplinary case study combining traditional and computational methodologies to study Trinidad steelband music in a collection of recordings of the annual Panorama competition spanning over 50 years. In particular, the ethnomusicology literature identifies a number of trends and hypotheses about this musical practice involving tempo, tuning, and dynamic range. Some of these are difficult to address with traditional, manual methodologies. We investigate them through the computational lens of Music Information Retrieval (MIR) methods. We find that the tempo range measured on our corpus is consistent with values reported in the ethnomusicological literature, and we add further details about how tempo has changed for the best-judged performances at Panorama. With respect to the use of dynamics, we find limited usefulness of standardised measures of loudness on these recordings. When it comes to judging the tuning frequency of the acoustic recordings, we find what appears to be a narrowing of the range, but this finding might be unreliable given the diversity of recording media over the past decades.

Elio Quinton, Florabelle Spielmann, Bob L. Sturm

k-Best Unit Selection Strategies for Musical Concatenative Synthesis

Concatenative synthesis is a sample-based approach to sound creation used frequently in speech synthesis and, increasingly, in musical contexts. Unit selection, a key component, is the process by which sounds are chosen from the corpus of samples. With their ability to match target units as well as preserve continuity, Hidden Markov Models are often chosen for this task, but one common criticism is their single-path output, which is considered too restrictive when variations are desired. In this article, we propose considering the problem in terms of k-best path solving to generate ranked lists of alternative candidate solutions, and we summarise our implementations along with some practical examples.

Cárthach Ó Nuanáin, Perfecto Herrera, Sergi Jordá
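The k-best unit-selection idea can be illustrated with a small sketch. Real systems use efficient k-best Viterbi or Yen-style algorithms over large corpora; the exhaustive search below is an assumption-laden toy that only shows the cost structure (target cost per step plus concatenation cost between consecutive units) and the ranked output.

```python
import heapq
from itertools import product

def k_best_paths(target_costs, concat_costs, k):
    """Enumerate unit-selection paths through a tiny trellis and
    return the k lowest-cost ones.  target_costs[t][u] is the cost
    of unit u at step t; concat_costs[u][v] is the cost of joining
    unit u to unit v."""
    n_steps, n_units = len(target_costs), len(target_costs[0])
    scored = []
    for path in product(range(n_units), repeat=n_steps):
        cost = sum(target_costs[t][u] for t, u in enumerate(path))
        cost += sum(concat_costs[a][b] for a, b in zip(path, path[1:]))
        scored.append((cost, path))
    return heapq.nsmallest(k, scored)

target = [[0.0, 1.0], [2.0, 0.0], [0.0, 2.0]]
concat = [[0.0, 0.5], [0.5, 0.0]]
best3 = k_best_paths(target, concat, 3)
# The best path trades concatenation penalties against target costs,
# and the next-best paths provide the desired variations.
assert best3[0] == (1.0, (0, 1, 0))
```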

Electronic Dance Music and Rhythm


Groove on the Brain

A unique feature of music is its potential to make us want to move our feet and bodies to the rhythm of the musical beat. Even though the ability to synchronize our movements to music feels like a completely natural music-related behavior to most humans (but see [1, 2] for rare cases of so-called beat-deafness in humans), this ability is rarely observed in animals [3] and usually depends on specific training regimes [4]. Our brains structure the musical beat into strong and weak beats even without any such information present in the auditory stimulus [5]. Furthermore, the tendency to move to a regular beat, with isochronous intervals, may persist even if the music we listen to emphasizes musical events that lie between these beats, as in syncopated rhythms [6] or in the case of polyrhythm [7, 8]. This indicates a cognitive discrepancy between what is heard (the rhythm) and the brain’s internal structuring of the beat, which in musicology is termed the meter. In the present paper, I shall argue that this discrepancy: (1) is related to prediction as a fundamental principle of brain processing, (2) gives rise to prediction error between lower (possibly sensory) and higher (possibly motor) levels in the hierarchically organized brain, and (3) that perception, learning, and our inclination to move to the beat depend on the right balance between predictability and surprise. This predictive coding understanding of the brain mechanisms involved in movement-related musical behavior may help us understand brain processes related to aesthetic experiences in general and aid in designing strategies for clinical intervention for patients with movement disorders.

Peter Vuust

Finding Drum Breaks in Digital Music Recordings

DJs and producers of sample-based electronic dance music (EDM) use breakbeats as an essential building block and rhythmic foundation for their artistic work. The practice of reusing and resequencing sampled drum breaks critically influenced modern musical genres such as hip hop, drum’n’bass, and jungle. While EDM artists have primarily sourced drum breaks from funk, soul, and jazz recordings from the 1960s to 1980s, they can potentially be sampled from music of any genre. In this paper, we introduce and formalize the task of automatically finding suitable drum breaks in music recordings. By adapting an approach previously used for singing voice detection, we establish a first baseline for drum break detection. Besides a quantitative evaluation, we discuss benefits and limitations of our procedure by considering a number of challenging examples.

Patricio López-Serrano, Christian Dittmar, Meinard Müller

Drum Rhythm Spaces: From Global Models to Style-Specific Maps

This paper presents two experiments carried out to find rhythm descriptors that allow the organization of drum patterns in spaces resembling subjects’ sensations of similarity. We revisit rhythm spaces published by Alf Gabrielsson in 1973, based on subjects’ similarity ratings of drum rhythms from an early drum machine, and construct a new rhythm space based on similarity judgments using contemporary electronic dance music (EDM) patterns. We observe how a specific set of descriptors can be used to reconstruct both Gabrielsson’s space and the new EDM space, suggesting that the descriptors capture drum similarity sensations in very different contexts. The set of descriptors and the methods employed are explained in detail, and the possibility of a method for organizing rhythm patterns automatically is discussed.

Daniel Gómez-Marín, Sergi Jordà, Perfecto Herrera
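Rhythm spaces of the kind revisited here are typically recovered from pairwise similarity ratings by multidimensional scaling. As a hedged sketch, and not the paper's actual analysis pipeline, classical MDS can be applied to a toy dissimilarity matrix between drum patterns:

```python
import numpy as np

def classical_mds(D, dims=2):
    """Classical multidimensional scaling: embed items in `dims`
    dimensions so that Euclidean distances approximate the given
    dissimilarity matrix D."""
    n = D.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n       # centering matrix
    B = -0.5 * J @ (D ** 2) @ J               # double-centered Gram matrix
    eigvals, eigvecs = np.linalg.eigh(B)
    idx = np.argsort(eigvals)[::-1][:dims]    # largest eigenvalues first
    return eigvecs[:, idx] * np.sqrt(np.maximum(eigvals[idx], 0))

# Toy pairwise dissimilarities between four drum patterns:
# patterns 0 and 1 are similar, as are 2 and 3
D = np.array([[0, 1, 4, 4],
              [1, 0, 4, 4],
              [4, 4, 0, 1],
              [4, 4, 1, 0]], dtype=float)
X = classical_mds(D)
# Similar patterns end up close together in the recovered space
dist = np.linalg.norm
assert dist(X[0] - X[1]) < dist(X[0] - X[2])
```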

Modulated Swing: Dynamic Rhythm Synthesis by Means of Frequency Modulation

Listening to swinging music, one often wants to move along with the rhythm. We pose the question: how might the production of the microtiming that characterizes swing be modelled? A fundamental idea in the present paper is to apply interacting oscillators to achieve alterations of frequency that create timing deviations typical of live performances of rhythm. Dynamic, time-dependent features are introduced and implemented in a model based on rhythmic frequency modulation (RFM), previously developed by the authors of this paper. We exemplify the potential of this new, extended model by simulating various performances of swing in jazz, and we also indicate how the computer implementation of the RFM model might be an interesting tool for electro-acoustic music. Moreover, we discuss our model construction within the framework of event-based and emergent timing.

Carl Haakon Waadeland, Sigurd Saue
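The general idea of producing timing deviations by modulating an underlying pulse frequency can be sketched as follows. This is an illustrative stand-in, not the authors' RFM model: the sinusoidal modulator and its parameters are assumptions chosen only to show how a steady pulse acquires systematic microtiming.

```python
import math

def modulated_onsets(n, base_ioi=0.5, depth=0.08, mod_freq=0.25):
    """Generate n onset times whose inter-onset intervals (IOIs) are
    modulated by a slow sinusoid, producing systematic timing
    deviations around a steady pulse."""
    t, onsets = 0.0, []
    for _ in range(n):
        onsets.append(t)
        # instantaneous rate modulated around the nominal tempo
        ioi = base_ioi * (1.0 + depth * math.sin(2 * math.pi * mod_freq * t))
        t += ioi
    return onsets

onsets = modulated_onsets(8)
iois = [b - a for a, b in zip(onsets, onsets[1:])]
# intervals deviate from the nominal 0.5 s pulse but stay close to it
assert any(abs(i - 0.5) > 1e-6 for i in iois)
assert all(abs(i - 0.5) <= 0.5 * 0.08 + 1e-9 for i in iois)
```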

A Hierarchical Harmonic Mixing Method

We present a hierarchical harmonic mixing method for assisting users in the process of music mashup creation. Our main contributions are metrics for computing the harmonic compatibility between musical audio tracks at small- and large-scale structural levels, which combine and reassess existing perceptual relatedness (i.e., chroma vector similarity and key affinity) and dissonance-based approaches. Underpinning our harmonic compatibility metrics are harmonic indicators from the perceptually-motivated Tonal Interval Space, which we adapt to describe musical audio. An interactive visualization shows hierarchical harmonic compatibility viewpoints across all tracks in a large musical audio collection. An evaluation of our harmonic mixing method shows that our adaptation of the Tonal Interval Space robustly describes harmonic attributes of musical instrument sounds irrespective of timbral differences, and demonstrates that the harmonic compatibility metrics comply with the principles embodied in Western tonal harmony to a greater extent than previous approaches.

Gilberto Bernardes, Matthew E. P. Davies, Carlos Guedes

Games Without Frontiers: Audio Games for Music Production and Performance

The authors explain a method by which electronic dance music can be produced in a similar manner to producing game audio, in which the timing of sound-events is relative to the actions of the player and the state of the game environment. An interactive or live piece can be considered an audio game, in which interacting sound-machines generate patterns of sound-events that place the performer/player in a virtual space. The performer/player pursues musical goals in non-linear time while maintaining the ability to arrange pieces in a coherent mix.

Jason Hockman, Joseph Thibodeau

Computational Musicology


Autonomous Composition as Search in a Conceptual Space: A Computational Creativity View

Computational Creativity (CC) is an emerging field of research that focuses on the study and exploitation of computers’ potential to act as autonomous creators and co-creators. The field is a confluence point for contributions from multiple disciplines, such as Artificial Intelligence, which provides most of its methodological framework, as well as Cognitive Science, Psychology, the Social Sciences, and Philosophy, and creative domains like the Arts, Music, Design, and Poetry. In this text, we briefly introduce some basic concepts and terminology of the field, as well as abstract models for characterising some common modes of creativity. We illustrate how these concepts have been applied in recent times in the development of creative systems, particularly in the music domain. With this paper, we hope to facilitate communication between the CMMR and CC communities and foster synergies between them.

F. Amílcar Cardoso

On Linear Algebraic Representation of Time-span and Prolongational Trees

In constructive music theory, such as Schenkerian analysis and the Generative Theory of Tonal Music (GTTM), the hierarchical importance of pitch events is conveniently represented by a tree structure. Although a tree is easy to recognize and highly visible, such an intuitive representation is difficult to treat in a mathematical formalization. Especially in GTTM, the conjunction height of two branches is often arbitrary, contrary to the notion of hierarchy. Since a tree is a kind of graph, and a graph is often represented by a matrix, we present a linear algebraic representation of trees that specifies conjunction heights. Thereafter, we explain the ‘reachability’ between pitch events (corresponding to information about reduction) via the multiplication of matrices. In addition, we discuss multiplication with vectors representing a sequence of harmonic functions, and suggest a notion of stability. Finally, we discuss operations between matrices to model compositional processes with simple algebraic operations.

Satoshi Tojo, Alan Marsden, Keiji Hirata
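The matrix encoding of a reduction tree and the reachability-by-multiplication idea can be sketched concretely. The tree below is a made-up four-event example, not one from the paper; the point is only that summing matrix powers of the subordination matrix yields the reduction relation.

```python
import numpy as np

# A time-span tree over four pitch events (0..3): event 2 is the head,
# events 0 and 3 are directly subordinate to it, and event 1 to event 0.
# A[i, j] = 1 encodes "event j is directly subordinate to event i".
A = np.array([[0, 1, 0, 0],
              [0, 0, 0, 0],
              [1, 0, 0, 1],
              [0, 0, 0, 0]])

def reachability(A):
    """Transitive closure by summing matrix powers: R[i, j] = 1
    iff event j reduces, in one or more steps, to event i."""
    n = A.shape[0]
    R, P = np.zeros_like(A), A.copy()
    for _ in range(n - 1):
        R += P
        P = P @ A
    return (R > 0).astype(int)

R = reachability(A)
assert R[2, 1] == 1   # event 1 reduces to the head via event 0
assert R[1, 2] == 0   # reduction is not symmetric
```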

Four-Part Harmonization: Comparison of a Bayesian Network and a Recurrent Neural Network

In this paper, we compare four-part harmonization produced using two different machine learning models: a Bayesian network (BN) and a recurrent neural network (RNN). Four-part harmonization is widely known as a fundamental problem in harmonization, and various methods have been proposed, especially ones based on probabilistic models such as hidden Markov models, weighted finite-state transducers, and BNs. Recently, a method using an RNN has also been proposed. In this paper, we conducted an experiment on four-part harmonization using the same data with both a BN and an RNN and investigated the differences in the results between the models. The results show that the models have different tendencies: the BN’s harmonies contain less dissonance but its bass melodies in particular are monotonous, while the RNN’s harmonies contain more dissonance but its bass melodies are smoother.

Tatsuro Yamada, Tetsuro Kitahara, Hiroaki Arie, Tetsuya Ogata

On Hierarchical Clustering of Spectrogram

We propose a new method of applying the Generative Theory of Tonal Music directly to a spectrogram of music to produce a time-span segmentation as hierarchical clustering. We first treat a vertically long rectangle of the spectrogram (a bin) as a pitch event, and the spectrogram as a sequence of bins. The texture feature of a bin is extracted using a gray-level co-occurrence matrix to generate a sequence of texture features. The proximity and change of phrases are calculated from the distance between adjacent bins’ texture features. Global structures such as parallelism and repetition are detected by a self-similarity matrix over the sequence of bins. We develop an algorithm that, given a sequence of boundary strengths between adjacent bins, iteratively merges adjacent bins in a bottom-up manner and finally generates a dendrogram corresponding to a time-span segmentation. We conducted an experiment using Mozart’s K. 331 and K. 550 as input and obtained promising results, although the algorithm takes almost no musical knowledge, such as pitch and harmony, into account.

Shun Sawada, Yoshinari Takegawa, Keiji Hirata
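The bottom-up merging step described above can be sketched in a few lines. This is an assumed simplification: real boundary strengths come from texture-feature distances and the self-similarity matrix, whereas here they are given as a toy list.

```python
def agglomerate(boundary_strengths):
    """Bottom-up time-span segmentation sketch: start with one segment
    per bin and repeatedly merge the two adjacent segments separated by
    the weakest boundary, recording each merge.  The merge list is a
    dendrogram read from the leaves upward."""
    segments = [(i,) for i in range(len(boundary_strengths) + 1)]
    boundaries = list(boundary_strengths)
    merges = []
    while boundaries:
        i = boundaries.index(min(boundaries))       # weakest boundary
        merges.append((segments[i], segments[i + 1]))
        segments[i:i + 2] = [segments[i] + segments[i + 1]]
        del boundaries[i]                           # boundary consumed
    return merges

# Bins 0|1 and 2|3 are separated by weak boundaries, 1|2 by a strong one
merges = agglomerate([0.1, 0.9, 0.2])
assert merges[0] == ((0,), (1,))        # weakest boundary merged first
assert merges[-1] == ((0, 1), (2, 3))   # strongest boundary merged last
```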

deepGTTM-III: Multi-task Learning with Grouping and Metrical Structures

This paper describes an analyzer that simultaneously learns grouping and metrical structures on the basis of the generative theory of tonal music (GTTM) by using a deep learning technique. GTTM is composed of four modules connected in series and includes a feedback loop in which an earlier module uses the result of a later one. However, because each module was implemented independently in previous GTTM analyzers, they did not form this feedback loop. For example, deepGTTM-I and deepGTTM-II independently learn grouping and metrical structures using a deep learning technique. In light of this, we present deepGTTM-III, a new analyzer that includes the concept of feedback and enables simultaneous learning of grouping and metrical structures by integrating the deepGTTM-I and deepGTTM-II networks. The experimental results revealed that deepGTTM-III outperformed deepGTTM-I and performed similarly to deepGTTM-II.

Masatoshi Hamanaka, Keiji Hirata, Satoshi Tojo

Analyzing Music to Music Perceptual Contagion of Emotion in Clusters of Survey-Takers, Using a Novel Contagion Interface: A Case Study of Hindustani Classical Music

Music has strong potential to convey and elicit emotions, which depend on both context and antecedent stimuli. However, there is little research available on the impact of antecedent musical stimuli on emotion perception in consequent musical pieces, when one listens to a sequence of music clips with insignificant time lag. This work attempts to (a) understand how the perception of one music clip is affected by the perception of its antecedent clip and (b) find whether there are any inherent patterns in the way people respond when exposed to music in sequence, with special reference to Hindustani Classical Music (HCM). We call this phenomenon of varying perceptions the perceptual contagion of emotion in music. Findings suggest that when happy clips are preceded by sad and calm clips, perceived happiness increases, and when sad clips are preceded by happy and calm clips, perceived sadness increases. Calm clips are perceived as happy or sad when preceded by happy or sad clips, respectively. This suggests that antecedent musical stimuli have the capacity to influence the perception of the music that follows. It is also found that approximately 85%–95% of people on average are affected by perceptual contagion while listening to music in sequence, with varying degrees of influence.

Sanga Chaki, Sourangshu Bhattacharya, Raju Mullick, Priyadarshi Patnaik

Sound in Practice: Auditory Guidance and Feedback in the Context of Motor Learning and Motor Adaptation


Inscribing Bodies: Notating Gesture

This paper focuses on methods of transcribing the functions and activities of gesture, with a specific focus on embodiment, or how the interrelated roles of environment and the body shape mental process and experience. Through a process of repeated experiments with our own custom open-source 3D printed sensors, we illuminate the complexities of tracking even a single point over time, and distinguish between casual gesture and choreographed motion.

Emily Beattie, Margaret Schedel

Auditory Modulation of Multisensory Representations

Motor control and motor learning, as well as interpersonal coordination, are based on motor perception and emergent perceptuomotor representations. At least in their early stages, motor learning and interpersonal coordination rely heavily on visual information: observing others and transforming that information into internal representations that guide one’s own behavior. As learning progresses, other perceptual modalities are added while a new motor pattern is established through repeated physical exercise. In contrast to the vast majority of publications on motor learning and interpersonal coordination, which refer to a single perceptual modality, here we regard the perceptual system as a unitary system that coordinates and unifies the information of all involved perceptual modalities. The relations between perceptual streams of different modalities, and the intermodal processing and multisensory integration of information as a basis for motor control and learning, are the main focus of this contribution. Multi-/intermodal processing of perceptual streams results in multimodal representations and opens up new approaches to supporting motor learning and interpersonal coordination: by creating an additional perceptual stream, auditory movement information can be generated that is suitable for integration with information from other modalities, thereby modulating the resulting perceptuomotor representations without the need for attention and higher cognition. Here, the concept of movement-defined real-time acoustics is used to serve the auditory system with an additional movement-auditory stream. Before the computational approach of kinematic real-time sonification is finally described, a special focus is placed on the adaptation modules of the internal models, and this concept is compared with different approaches to additional acoustic movement information.
Moreover, a perspective is given on a broad spectrum of new applications supporting motor control and learning in sports and motor rehabilitation, as well as on joint action and interpersonal coordination between humans and in human-robot interaction.

Alfred O. Effenberg, Tong-Hun Hwang, Shashank Ghai, Gerd Schmitz

Music and Musical Sonification for the Rehabilitation of Parkinsonian Dysgraphia: Conceptual Framework

Music has been shown to enhance motor control in patients with Parkinson’s disease (PD). Notably, musical rhythm is perceived as an external auditory cue that helps PD patients better control their movements. The rationale for such effects is that motor control based on auditory guidance activates a compensatory brain network that minimizes the recruitment of the defective pathway involving the basal ganglia. Would associating music with movement improve its perception and control in PD? Musical sonification consists of modifying the playback of preselected music in real time according to movement parameters. The validation of such a method is underway for handwriting in PD patients. If confirmed, this study will strengthen the clinical case for musical sonification in motor control and (re)learning in PD.

Lauriane Véron-Delor, Serge Pinto, Alexandre Eusebio, Jean-Luc Velay, Jérémy Danna

Investigating the Role of Auditory Feedback in a Multimodal Biking Experience

In this paper, we investigate the role of auditory feedback in affecting the perception of effort while biking in a virtual environment. Subjects biked on a stationary chair bike while exposed to 3D renditions of a recumbent bike inside a virtual environment (VE). The VE simulated a park and was created in the Unity5 engine. While biking, subjects were exposed to nine kinds of auditory feedback (three amplitude levels combined with three different filters), triggered continuously according to pedal speed and representing the sound of the wheels and bike/chain mechanics. Subjects were asked to rate their perceived exertion using the Borg RPE scale. Results of the experiment showed that most subjects perceived a difference in mechanical resistance from the bike between conditions, but did not consciously notice the variations in the auditory feedback, even though these varied significantly. This points towards the potential of subliminally perceived auditory feedback for VR exercise applications.

Jon Ram Bruun Pedersen, Stefania Serafin, Francesco Grani

Considerations for Developing Sound in Golf Putting Experiments

This chapter presents the core interests and challenges of using sound for learning motor skills and describes the development of sonification techniques for three separate golf-putting experiments. These studies are part of the ANR SoniMove project, which aims to develop new Human Machine Interfaces (HMI) that provide gestural control of sound in the areas of sports and music. After a brief introduction to sonification and sound-movement studies, the chapter addresses the ideas and sound synthesis techniques developed for each experiment.

Benjamin O’Brien, Brett Juhas, Marta Bieńkiewicz, Laurent Pruvost, Frank Buloup, Lionel Bringnoux, Christophe Bourdin

Human Perception in Multimodal Context


Feel It in My Bones: Composing Multimodal Experience Through Tissue Conduction

We outline here the feasibility of coherently utilising tissue conduction for spatial audio and tactile input. Compositional concerns specific to tissue-conduction display are discussed; it is hypothesised that the qualia available through this medium differ substantively from those of conventional artificial means of appealing to auditory spatial perception. The implications are that spatial music experienced in this manner constitutes a new kind of experience, and that the ground rules of composition are yet to be established. We refer to results from listening experiences with one hundred listeners in an unstructured attribute elicitation exercise, in which prominent themes such as “strange”, “weird”, “positive”, “spatial” and “vibrations” emerged. We speculate on future directions aimed at taking maximal advantage of the principle of multimodal perception to broaden the informational bandwidth of the display system. Some implications of composition for the hearing-impaired are elucidated.

Peter Lennox, Ian McKenzie, Michael Brown

Enriching Musical Interaction on Tactile Feedback Surfaces with Programmable Friction

In recent years, great interest has emerged in utilizing tactile interfaces for musical interaction. These interfaces can be enhanced with tactile feedback on the user’s fingertip through various technologies, including programmable friction techniques. In this study, we use a qualitative approach to investigate the potential influence of these tactile feedback interfaces on users’ musical interaction. We experimented with three different mappings between sound parameters and tactile feedback in order to study users’ experiences of a given task. Our preliminary findings suggest that friction-based tactile feedback is a useful tool for enriching musical interaction and learning.

Farzan Kalantari, Florent Berthaut, Laurent Grisoni

Assessing Sound Perception Through Vocal Imitations of Sounds that Evoke Movements and Materials

In this paper we study a new approach to investigating sound perception. Assuming that a sound contains specific morphologies, called invariants, that convey the perceptually relevant information responsible for its recognition, we explore a new method to determine such invariants using vocal imitation. We conducted an experiment asking participants to imitate sounds evoking movements and materials generated with a sound synthesizer. Given that the sounds produced by the synthesizer were based on invariant structures, we aimed at retrieving this information from the imitations. Results showed that the participants were able to correctly imitate the dynamics of the sounds, i.e. the action-related information evoked by the sound, whereas texture-related information evoking the material of the sound source was less easily imitated.

Thomas Bordonné, Manuel Dias-Alves, Mitsuko Aramaki, Sølvi Ystad, Richard Kronland-Martinet

Exploration of Sonification Strategies for Guidance in a Blind Driving Game

This paper explores the use of continuous auditory display for a dynamic guidance task. Through a driving game with blindfolded players, we observe the success and efficiency of a lane-keeping task in which no visual feedback is provided. The results highlight the importance of the displayed information and reveal that task-related rather than error-related feedback should be used to enable the driver to finish the circuit. In terms of sound strategies, a first experiment explores the effect of two complex strategies (pitch and modulations) combined with a basic stereo strategy that informs the user about the distance and direction to the target. The second experiment examines the influence of morphological sound attributes on performance compared to the use of spatial sound attributes alone. The results reveal the advantage of using morphological sound attributes for such applications.

Gaëtan Parseihian, Mitsuko Aramaki, Sølvi Ystad, Richard Kronland-Martinet

DIGIT: A Digital Foley System to Generate Footstep Sounds

We present DIGItal sTeps (DIGIT), a system for assisting in the creation of footstep sounds in a post-production foley context—the practice of recreating all diegetic sounds for a moving image. The novelty behind DIGIT is the use of the acoustic (haptic) response of a gesture on a tangible interface as a means of navigating and retrieving similar matches from a large database of annotated footstep sounds. While capturing the tactile expressiveness of traditional foley practice in the exploration of physical objects, DIGIT streamlines the workflow of the audio post-production environment for film or games by reducing its costly and time-consuming requirements.

Luis Aly, Rui Penha, Gilberto Bernardes

Cooperative Music Networks and Musical HCIs


Composing and Improvising. In Real Time

This paper presents a summary of my keynote address discussing the differences between real-time composition (RTC) and improvisation. A definition of real-time composition is presented, along with a summary discussion of its theoretical framework. Finally, a comparison between RTC and improvisation is made, taking into account Richard Ashley’s discussion of improvisation from a psychological perspective [1], which provides interesting insight into this distinction. RTC is then redefined as improvised composition with computers, and the possibilities of RTC existing outside of computer music are also briefly addressed.

Carlos Guedes

bf-pd: Enabling Mediated Communication and Cooperation in Improvised Digital Orchestras

Digital musical instruments enable new musical collaboration possibilities, extending those of acoustic ensembles. However, the use of these new possibilities remains constrained due to the lack of a common terminology and technical framework for implementing them. Bf-pd is a new software library built in the PureData (Pd) language which enables communication and cooperation between digital instruments. It is based on the BOEUF conceptual framework, which consists of a classification of modes of collaboration used in collective music performance and a set of components which afford them. Bf-pd can be integrated into any digital instrument built in Pd, and provides a “collaboration window” from which musicians can easily view each other’s activity and share control of instrument parameters and other musical data. We evaluate the implementation and design of bf-pd through workshops and a preliminary study and discuss its impact on collaboration within improvised ensembles of digital instruments.

Luke Dahl, Florent Berthaut, Antoine Nau, Patricia Plenacoste

Visual Representations for Music Understanding Improvement

Classical music appreciation is non-trivial. Visual representations can aid music teaching and learning processes. In this sense, we propose a set of visual representations based on musical note features such as note type, octave, velocity, and timbre. In our system, the visual elements appear along with their corresponding musical elements in order to improve the perception of musical structures. The visual representations we use to enhance the comprehension of a composition could be extended to performing-arts scenarios, for instance as motion graphics during live musical performances. We have developed several videos to illustrate our method, as well as an ear-training quiz and a research questionnaire. This material is available at .

Leandro Cruz, Vitor Rolla, Juliano Kestenberg, Luiz Velho

Melody Transformation with Semiotic Patterns

This paper presents a music generation method based on the extraction of a semiotic structure from a template piece, followed by generation into this semiotic structure using a statistical model of a corpus. To describe the semiotic structure of a template piece, a pattern discovery method is applied, covering the template piece with significant patterns using melodic viewpoints at varying levels of abstraction. Melodies are generated into this structure using a stochastic optimization method. A selection of melodies was performed in a public concert, and audience evaluation results show that the method generates coherent melodies.

Izaro Goienetxea, Darrell Conklin

The Comprovisador’s Real-Time Notation Interface (Extended Version)

Comprovisador is a system designed to enable real-time mediated soloist-ensemble interaction through machine listening, algorithmic procedures, and dynamic staff-based notation. It uses multiple networked computers – one host and several clients – to perform algorithmic compositional procedures with the music material improvised by a soloist and to coordinate the musical response of an ensemble. Algorithmic parameters are manipulated by a conductor/composer who mediates the interaction between soloist and ensemble, making compositional decisions in real time. The present text, an extended version of a paper presented at CMMR 2017 in Matosinhos, focuses on the notation interface of this system, after overviewing its concept and structure. We then discuss how rehearsals and live performances influenced the development of the interface.

Pedro Louzeiro

JamSketch: Improvisation Support System with GA-Based Melody Creation from User’s Drawing

Improvisation is an enjoyable form of music performance but requires advanced musical skills and knowledge because the player has to create melodies immediately during the performance. To support improvisation by people without such skills or knowledge, we have to develop (1) a human interface that can be used without musical training and (2) automatic melody generation from user input that may be musically abstract or incomplete. In this paper, we develop an improvisation support system based on melodic outlines, which represent the overall contour of melodies, with a function for melody generation using a genetic algorithm (GA). Once the user draws a melodic outline on the piano-roll display with the mouse or touch screen, the system immediately generates a melody using a GA with a fitness function based on similarity to the outline, an N-gram probability, and entropy. The generated melody is performed expressively based on expression parameters calculated with a machine learning approach. The results of listening tests comparing human performances and the system’s performances suggest that the generated melodies are similar in quality to performances by non-expert human players.

Tetsuro Kitahara, Sergio Giraldo, Rafael Ramírez
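The GA described in the abstract above can be illustrated with a minimal sketch. All names, pitch ranges, weights, and the toy bigram model below are assumptions for illustration only, not the authors' implementation; the fitness simply combines outline similarity, a bigram (N-gram) log-probability, and pitch entropy, as the abstract outlines.

```python
import math
import random

random.seed(0)

PITCH_RANGE = list(range(60, 73))  # MIDI C4..C5, an assumed working range


def outline_similarity(melody, outline):
    # Negative mean absolute distance between melody and the drawn outline.
    return -sum(abs(m - o) for m, o in zip(melody, outline)) / len(melody)


def bigram_logprob(melody, bigrams):
    # Log-probability of the note sequence under a toy bigram model.
    return sum(math.log(bigrams.get((a, b), 1e-3))
               for a, b in zip(melody, melody[1:]))


def entropy(melody):
    # Shannon entropy of the pitch distribution (rewards melodic variety).
    n = len(melody)
    counts = {p: melody.count(p) for p in set(melody)}
    return -sum(c / n * math.log2(c / n) for c in counts.values())


def fitness(melody, outline, bigrams, w=(1.0, 0.2, 0.3)):
    # Weighted sum of the three criteria; weights are illustrative guesses.
    return (w[0] * outline_similarity(melody, outline)
            + w[1] * bigram_logprob(melody, bigrams)
            + w[2] * entropy(melody))


def evolve(outline, bigrams, pop_size=40, generations=60, mut_rate=0.1):
    length = len(outline)
    pop = [[random.choice(PITCH_RANGE) for _ in range(length)]
           for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=lambda m: fitness(m, outline, bigrams), reverse=True)
        elite = pop[:pop_size // 2]           # keep the fitter half
        children = []
        while len(elite) + len(children) < pop_size:
            a, b = random.sample(elite, 2)
            cut = random.randrange(1, length)  # one-point crossover
            child = a[:cut] + b[cut:]
            child = [random.choice(PITCH_RANGE)
                     if random.random() < mut_rate else p
                     for p in child]           # point mutation
            children.append(child)
        pop = elite + children
    return max(pop, key=lambda m: fitness(m, outline, bigrams))
```

For instance, `evolve([60, 62, 64, 65, 67, 65, 64, 62], bigrams)` evolves an eight-note melody toward the drawn contour while the bigram term nudges it toward stepwise motion.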

Virtual and Augmented Reality


Mobile Devices as Musical Instruments - State of the Art and Future Prospects

Mobile music making has established itself as a maturing field of inquiry, scholarship, and artistic practice. While mobile devices were used before the advent of the iPhone, its introduction no doubt drastically accelerated the field. We take a look at the current state of the art of mobile devices as musical instruments, and explore future prospects for research in the field.

Georg Essl, Sang Won Lee

Augmenting Virtuality with a Synchronized Dynamic Musical Instrument: A User Evaluation of a Mixed Reality MIDI Keyboard

As virtual reality gains popularity, technology which better integrates the user’s physical experience in the virtual environment is needed. Researchers have shown that by including real physical objects to interact with, the experience can be made significantly more convincing and user-friendly. To explore physically connecting the user to the virtual environment, we designed and developed a mixed reality MIDI keyboard. This project explores maintaining the user’s physical connection to the real world in order to align their senses with their true state of augmented virtuality. A user study of the keyboard was performed which verified its ability to improve the VR experience and identified areas for further research. In addition to producing a mixed reality MIDI keyboard, this project serves as a roadmap and foundation for future developments and investigations in integrating physical and virtual environments to improve immersion and presence.

John Desnoyers-Stewart, David Gerhard, Megan L. Smith

Toward Augmented Familiarity of the Audience with Digital Musical Instruments

The diversity and complexity of Digital Musical Instruments often lead to a reduced appreciation of live performances by the audience. This can be linked to their lack of familiarity with the instruments. We propose to increase this familiarity through a transdisciplinary approach in which signals from both the musician and the audience are extracted, familiarity is analyzed, and augmentations are dynamically added to the instruments. We introduce a new decomposition of familiarity and the concept of correspondences between musical gestures and results. This paper is both a review of research that paves the way for the realization of a pipeline for augmented familiarity, and a call for future research on the identified challenges that remain before it can be implemented.

Olivier Capra, Florent Berthaut, Laurent Grisoni

Use the Force: Incorporating Touch Force Sensors into Mobile Music Interaction

The musical possibilities of force sensors on touchscreen devices are explored, using Apple’s 3D Touch. Three functions are selected to be controlled by force: (a) excitation, (b) modification (aftertouch), and (c) mode change. Excitation starts a note, modification alters a playing note, and mode change controls binary on/off sound parameters. Four instruments are designed using different combinations of force-sound mapping strategies. ForceKlick is a single-button instrument that plays consecutive notes within one touch by varying touch force and detecting force down-peaks. The iPhone 6s/7 Ocarina features force-sensitive fingerholes that raise the octave under high force. Force Trombone continuously controls gain by force. Force Synth is a trigger-pad array featuring all functions in one button: touch starts a note, force controls vibrato, and an abrupt burst of force toggles octaves. A simple user test suggests that adding force features to well-known instruments makes them friendlier and more usable.

Edward Jangwon Lee, Sangeon Yong, Soonbeom Choi, Liwei Chan, Roshan Peiris, Juhan Nam
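The down-peak trigger behind ForceKlick can be sketched as follows. This is an illustrative reading of "down-peak" (a sufficiently strong press whose force then falls by some margin); the function name and the `drop`/`min_peak` thresholds are assumptions, not values from the paper, and a real implementation would run on the device's force event stream rather than a list.

```python
def detect_down_peaks(forces, drop=0.15, min_peak=0.3):
    """Return the indices at which a note should trigger.

    A note fires each time the normalized touch force (0..1) reaches at
    least `min_peak` and then falls by `drop` below its running maximum,
    so several notes can be played within a single continuous touch.
    """
    peaks = []
    current_max = 0.0
    armed = False
    for i, f in enumerate(forces):
        if f > current_max:
            current_max = f
            if f >= min_peak:
                armed = True          # press strong enough to count as a peak
        if armed and current_max - f >= drop:
            peaks.append(i)           # force fell off the peak: trigger a note
            current_max = f           # re-arm for the next press in the same touch
            armed = False
    return peaks
```

For example, a force trace with two presses inside one touch, `[0.0, 0.2, 0.5, 0.4, 0.3, 0.6, 0.7, 0.5, 0.4]`, yields two triggers, one per down-peak.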

Sonifying Twitter’s Emotions Through Music

Sonification is a scientific field that seeks to explore the potential of sound as an instrument to convey and interpret data. Its techniques have developed significantly with the growth of technology and supporting hardware and software, which have spread into our daily environment. This has enabled new communication tools for sharing information, opinions, and feelings as part of our daily routine. The aim of this project was to unite the social media phenomenon with sonification, using Twitter data to extract users’ emotions and translate them into musical compositions. The focus was to explore the potential of music in translating data as personal and subjective as human emotions, developing a musically complex and captivating mapping based on the rules of Western music. The music is accompanied by a simple visualization, so that emotions are heard and seen alongside the corresponding tweets, in a multimodal experience that represents Twitter’s emotional reality. The mapping was tested through an online survey, and despite a few misunderstandings, the results were generally positive, expressing the efficiency and impact of the developed system.

Mariana Seiça, Rui (Buga) Lopes, Pedro Martins, F. Amílcar Cardoso
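An emotion-to-music mapping of the kind the abstract above describes can be sketched minimally. The emotion labels, parameter values, and intensity-scaled tempo below are hypothetical choices for illustration, not the mapping developed in the paper.

```python
# Illustrative emotion-to-music parameter table (hypothetical values).
EMOTION_MAP = {
    "joy":     {"mode": "major", "tempo_bpm": 140, "register": "high"},
    "sadness": {"mode": "minor", "tempo_bpm": 60,  "register": "low"},
    "anger":   {"mode": "minor", "tempo_bpm": 160, "register": "mid"},
    "fear":    {"mode": "minor", "tempo_bpm": 110, "register": "high"},
}


def sonify(emotion_scores):
    """Choose musical parameters from the dominant emotion in a tweet batch.

    `emotion_scores` maps emotion labels to aggregate intensities in 0..1,
    e.g. as produced by a sentiment classifier over recent tweets.
    """
    dominant = max(emotion_scores, key=emotion_scores.get)
    params = dict(EMOTION_MAP[dominant])
    # Scale tempo by intensity so stronger emotions play faster (a design choice).
    intensity = emotion_scores[dominant]
    params["tempo_bpm"] = round(params["tempo_bpm"] * (0.8 + 0.4 * intensity))
    return dominant, params
```

Calling `sonify({"joy": 0.9, "sadness": 0.1})` selects a fast major-mode setting, whereas a batch dominated by sadness yields a slow minor-mode one.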

Research and Creation: Spaces and Modalities


Reflections on the Use of Musical Software in Compositional Processes for Light Art

Music and the visual arts are both temporal and spatial arts. My artistic research focuses on the use of light as a medium, and in this respect my interests are both spatial and temporal: they revolve around phenomenological questions related to the perception of volume and space, but also around the problem of how temporal sequences may be arranged and written into a score. After a short presentation of my artistic production and of my theoretical (and phenomenological) interests, I will try to explain and show how and why musical software is well suited to my light installations, especially because they presuppose the activity of autonomous composition in time.

Charlotte Beaufort

SELFHOOD: An Evolutionary and Interactive Experience Synthesizing Images and Sounds

The SELFHOOD installation was conceived to instigate a reflection on the self through a practical and interactive experience. A representation of each participant is created in the form of a cloud of points and a sound drone, suggesting their self. The dynamics of the visitors’ movements are sonified in such a way that colours and sound textures are fused in a surrounding hexaphonic system. CromaCrono≈, the system for immersive improvisation that produces digitally synthesized sounds in real time, is described. Philosophical concepts concerning notions of the Self are presented. We propose that the notion of Presence can be induced by virtual and/or physical sources of stimulation governed by a number of principles that underlie human experience, creativity, and discovery. From a methodological point of view, the notion of Presence indicates that there are essential inputs for the construction of self-referral agents.

Jonatas Manzolli, Artemis Moroni, Guilherme A. Valarini

Music(s), Musicology and Science: Towards an Interscience Network

The Example of the Deaf Musical Experience

This contribution traces the history of musicology in order to delimit its object. The history of the discipline is clear: born as historical musicology, it flourished as an interdisciplinary discipline over the second half of the 20th century, with the development of new musicology and critical musicology. Defining the scope of musicology, however, is challenging, since it encompasses various aspects of music: music as sound, as a historical fact, as text. Music therefore oscillates between the natural sciences, the humanities, philosophy, and aesthetics, shifting in identity between a quantifiable sound, the meaningful object of miscellaneous debates, and the purpose of boundless interpretations. These observations lead contemporary musicologists to elaborate an intersciences project, which is presented in this paper. To make our remarks concrete, we take as an interscientific musicological object a specific situation: the Deaf musical experience.

Sylvain Brétéché, Christine Esclapez

Reading Early Music Today: Between Reenactment and New Technologies

Since the revival of Historically Informed Performance in the 1960s, the interpretation of Early Music has continuously raised questions, many of which remain unanswered. Nevertheless, understanding of earlier practice continues to grow, and performers have long surpassed the strict historical urtext approach that initially prevailed, largely due to the growing body of evidence that instrumentalists of earlier times relied heavily on aurally transmitted improvised musical traditions, which can only be re-imagined today. The modern musician must also improvise in order to reconstitute or re-invent missing elements belonging to a long-forgotten tradition. From a philosophical point of view, therefore, the performance of Early Music today is closely related to hermeneutics and the multiple questions involved in the interpretative process.

Julien Ferrando

ArtDoc - An Experimental Archive and a Tool for Artistic Research

ArtDoc is an experimental archive primarily for documenting artistic practice. One of its ambitions is to address the question of how artistic practice may be documented in a manner that makes visible the processes in action. ArtDoc has its roots in research and artistic practice that began over ten years ago, and preliminary tests show it to be a useful complement to other means of documenting musical works and artistic processes. The particular case of open-form works, i.e. works that are in some respects negotiated between the different agents involved (composers, musicians, and members of the audience), was a point of departure and has guided the development to a significant degree. The underlying structure of documentation classes is presented and some of the design choices are discussed. ArtDoc is still under construction, but a working proof of concept will be released in 2018.

Henrik Frisk
