
About this book

Sound is almost always around us, anywhere, at any time, reaching our ears and stimulating our brains for better or worse. Sound can be the disturbing noise of a drill, a merry little tune sung by a friend, the song of a bird in the morning or a clap of thunder at night. The science of sound, or acoustics, studies all types of sounds and therefore covers a wide range of scientific disciplines, from pure to applied acoustics. Research dealing with acoustics requires a sound to be recorded, analyzed, manipulated and, possibly, changed. This is particularly, but not exclusively, the case in bioacoustics and ecoacoustics, two life sciences disciplines that attempt to understand and to eavesdrop on the sound produced by animals. Sound analysis and synthesis can be challenging for students, researchers and practitioners who have few skills in mathematics or physics. However, deciphering the structure of a sound can be useful in behavioral and ecological research – and also very amusing.

This book is dedicated to anyone who wants to practice acoustics but does not know much about sound. Acoustic analysis and synthesis are possible, with little effort, using the free and open-source software R with a few specific packages. Combining a bit of theory, a lot of step-by-step examples and a few case studies, this book shows beginners and experts alike how to record, read, play, decompose, visualize, parametrize, change, and synthesize sound with R, opening a new way of working in bioacoustics and ecoacoustics but also in other acoustic disciplines.

Table of Contents


Chapter 1. Introduction

A short introduction to the book content and structure.
Audio files: None
Jérôme Sueur

Chapter 2. What Is Sound?

The main properties of sound are reviewed without reference to R. Sound is first introduced as a mechanical wave with reference to the essential features commonly used for signal description, i.e., amplitude, phase, duration, and frequency. Sound is then considered in a statistical way as a time series, in an electronic framework as a digital object, and in reference to a communication system as a carrier of information.
Audio files: None
Jérôme Sueur

Chapter 3. What Is R?

The use of the software R is introduced so that the R code shown in the following chapters can be understood and reproduced. This introduction is a friendly guide to quickly learning the R language, covering topics from the local installation of R on a personal computer to script writing for batch processing. More specifically, objects, operators, functions, indexes, conditions, graphics, and packages dedicated to sound are treated.
Audio files: theremin.wav, theremin-slow.wav
Jérôme Sueur

Chapter 4. Playing with Sound

This first contact with sound within R reveals the peculiarities of R objects that can contain sound. Sound-specific classes are introduced in detail and the different solutions to read (load), play (listen), record (sample), and write (save) sounds are reported.
Audio files: tuning-fork.wav
Jérôme Sueur

Chapter 5. Display of the Wave

Sound must be displayed so that it can be explored and described. The most intuitive way to visualize sound is to build a time × amplitude plot. Several options to produce an oscillogram are given, so that high-level graphics including color tuning, text annotations, and image overlays can be obtained. The computation and display of the absolute and analytic amplitude envelopes are also explained. The key principle of a window sliding along the sound used for smoothing or discretization of the waveform is also detailed.
Audio files: tico.wav
Jérôme Sueur

Chapter 6. Edition

Detailed instructions are given to edit a sound, that is, to manipulate it in time and/or amplitude. These editing facilities include resampling, channel management, time editing (cutting, deleting, pasting, repeating, reversing, managing silence sections) and amplitude changes (removing offset, changing the amplitude level, fading in and out).
Audio files: tico.wav, tuning-fork.wav
Jérôme Sueur

Chapter 7. Amplitude Parametrization

The options to assess the amplitude features of a sound are reviewed. This includes, in particular, details about the different amplitude scales (linear or logarithmic, relative or absolute), the parameters characterizing the amplitude, such as the crest, the root-mean-square, and the signal-to-noise ratio, and the dB unit. The delicate questions of attenuation and calibration are also addressed.
Audio files: tico.wav
Jérôme Sueur

Chapter 8. Time-Amplitude Parametrization

The techniques to take time-amplitude measurements first consist in measuring signal and pause durations, either through a manual and visual process or with the help of an automatic segmentation process. Because time variations are tightly linked to amplitude variations, amplitude modulation properties can also be estimated through a Fourier analysis. The calling songs of a Mediterranean cicada and of a Caribbean frog are used to illustrate how useful these manual and automatic techniques can be for sound description.
Audio files: Eleutherodactylus_martinicensis.wav, orni.wav
Jérôme Sueur

Chapter 9. Introduction to Frequency Analysis: The Fourier Transformation

The Fourier transformation is a key mathematical tool that connects the time and frequency domains such that sound can be parametrized in terms of frequency. The theory of the different Fourier transforms, including the inverse transform, is presented to facilitate the reading of the following chapters. Each mathematical equation is translated into R so that the basic principles can be understood and demystified. This discovery of the Fourier transformation is accompanied by the presentation of the frequency spectrum, the phase spectrum, the different frequency scales, the Fourier window shapes, and the cepstrum.
Audio files: None
Jérôme Sueur

Chapter 10. Frequency, Quefrency, and Phase in Practice

The options to compute, display, and describe the frequency spectrum are reviewed. This includes the use of different frequency and amplitude scales, the automatic detection of frequency peaks (in particular the fundamental frequency peak and the dominant frequency peak), the identification of harmonic series, the principle of symbolic aggregate approximation, and the use of other spectrum parametrizations. The quefrency cepstrum and the phase portrait are also introduced.
Audio files: Loxodonta_africana.wav, tico.wav, peewit.wav, sheep.wav, orni.wav, pellucens.wav
Jérôme Sueur

Chapter 11. Spectrographic Visualization

Variations of amplitude and frequency according to time are commonly visualized through a time × frequency × amplitude density plot, named the spectrogram. The theory of the short-time discrete Fourier transform (and its inverse function), which is behind the spectrogram output, is introduced with particular attention paid to the uncertainty principle. Practical solutions are given to display, tune, decorate, annotate, describe, animate, and print a 2D/3D spectrogram. The computation of a mean spectrum and a soundscape spectrum, both based on the short-time Fourier transform, is also introduced.
Audio files: synth-face.wav, Elliptorhina_chopardi.wav, forest.wav, tico.wav, peewit.wav, sheep.wav, orni.wav
Jérôme Sueur

Chapter 12. Mel-Frequency Cepstral and Linear Predictive Coefficients

Mel-frequency cepstral coefficients (MFCCs) and linear predictive coefficients (LPCs) are features used to describe sound according to time, frequency, and amplitude. These techniques, which are mainly used in speech analysis, are reviewed step by step to support both understanding and practice.
Audio files: hello.wav
Jérôme Sueur

Chapter 13. Frequency and Energy Tracking

Frequency variations according to time can be estimated by tracking specific features over time. Solutions are proposed to follow the time variation of the dominant frequency, the fundamental frequency, and speech formants. The Hilbert analytic signal and the zero-crossing method are used to estimate the instantaneous frequency, and the Teager-Kaiser energy operator is described as a way to track energy variations.
Audio files: Pipistrellus_kuhlii.wav, theremin.wav, hello.wav, tico.wav, sheep.wav
Jérôme Sueur

Chapter 14. Frequency Filters

Guidance is provided on applying frequency filters to remove unwanted sounds according to frequency. This covers filters with predefined frequency transfer functions, such as the preemphasis and Butterworth filters, and filters whose cutoff frequencies can be defined by the user, that is, filters based on the short-time Fourier transform and finite impulse response (FIR) filters.
Audio files: Allobates_femoralis.wav, Alytes_obstetricans.wav, hello.wav, noise.wav, peewit.wav
Jérôme Sueur

Chapter 15. Other Modifications

Several sound modifications are reported in the frequency domain but also in the time and amplitude domains. This covers changing the amplitude envelope, adding echoes and reverberations, and independently changing the frequency and time content through the use of the inverse short-time Fourier transform or the Hilbert analytic signal.
Audio files: Allobates_femoralis.wav, hello.wav, tuning-fork.wav, orni.wav, tico.wav
Jérôme Sueur

Chapter 16. Indices for Ecoacoustics

Ecoacoustics aims at studying acoustic populations, communities, and landscapes for research in ecology. Ecoacoustics uses several acoustic indices to characterize outdoor recordings. The main α and β acoustic indices are reviewed one by one, and statistical solutions are provided to analyze dissimilarity matrices built with β indices.
Audio files: forest.wav, M-XV_20101125_000000.wav, M-XV_20101125_010000.wav, M-XV_20101125_020000.wav, M-XV_20101125_030000.wav, M-XV_20101125_040000.wav, M-XV_20101125_050000.wav, M-XV_20101125_060000.wav, M-XV_20101125_070000.wav, M-XV_20101125_080000.wav, M-XV_20101125_090000.wav, M-XV_20101125_100000.wav, M-XV_20101125_110000.wav, M-XV_20101125_120000.wav, M-XV_20101125_130000.wav, M-XV_20101125_140000.wav, M-XV_20101125_150000.wav, M-XV_20101125_160000.wav, M-XV_20101125_170000.wav, M-XV_20101125_180000.wav, M-XV_20101125_190000.wav, M-XV_20101125_200000.wav, M-XV_20101125_210000.wav, M-XV_20101125_220000.wav, M-XV_20101125_230000.wav
Jérôme Sueur

Chapter 17. Comparison and Automatic Detection

Cross-correlations of amplitude envelopes, frequency spectra, and spectrograms are presented, together with the computation of frequency coherence, as solutions to compare two sounds. The dynamic time warping technique, which seeks the best alignment of time series of unequal length, is also covered. A recipe is provided to run a supervised binary automatic identification over a series of recordings, that is, to search automatically for a sound of interest in a large audio dataset.
Audio files: M-XV_20101125_150000.wav, Allobates_femoralis_2015-11-10_161500_GFT.wav, Allobates_femoralis.wav
Jérôme Sueur

Chapter 18. Synthesis

Sound synthesis is possible thanks to a few functions that can generate noises, pulses, square signals, sawtooth signals, triangle signals, pure tones, chirps, harmonic series, and amplitude- and/or frequency-modulated sounds. Additive synthesis, modulation synthesis, tonal synthesis, and speech synthesis are reviewed.
Audio files: tico.wav, peewit.wav, pellucens.wav, Eleutherodactylus_martinicensis.wav, synth-am-fm-1.wav, synth-am-fm-2.wav, synth-am-fm-3.wav, synth-am-fm-4.wav, synth-am-fm-5.wav, synth-am-fm-6.wav, synth-am-fm-7.wav, synth-am-fm-8.wav, synth-am-fm-9.wav, synth-AM-sidebands-1.wav, synth-AM-sidebands-2.wav, synth-AM-sidebands-3.wav, synth-AM-sidebands-4.wav, synth-chirp-combination.wav, synth-chirp-harmonics.wav, synth-chirp-linear.wav, synth-chirp-logarithmic.wav, synth-chirp-quadratic.wav, synth-C-major-scale.wav, synth-eleutherodactylus_martinicensis.wav, synth-face.wav, synth-FM-sidebands-1.wav, synth-FM-sidebands-2.wav, synth-FM-sidebands-3.wav, synth-FM-sidebands-4.wav, synth-noise-pink.wav, synth-noise-power.wav, synth-noise-red.wav, synth-noise-white.wav, synth-numsound-euler.wav, synth-numsound-golden-ratio.wav, synth-numsound-pi.wav, synth-numsound-rational.wav, synth-numsound-test.wav, synth-oecanthus-pellucens.wav, synth-peewit.wav, synth-pulse-1.wav, synth-pulse-2.wav, synth-pulse-3.wav, synth-pulse-4.wav, synth-risset-glissando.wav, synth-saw-1.wav, synth-saw-2.wav, synth-saw-3.wav, synth-shepard-scale.wav, synth-silence.wav, synth-sine-440-1.wav, synth-sine-440-2.wav, synth-sine-440-3.wav, synth-sine-440-880-stereo.wav, synth-sine-exponential-amplitude.wav, synth-sine-harmonics-1.wav, synth-sine-harmonics-2.wav, synth-sine-harmonics-3.wav, synth-sine-harmonics-4.wav, synth-sine-harmonics-5.wav, synth-sine-harmonics-6.wav, synth-sine-harmonics-7.wav, synth-sine-linear-amplitude.wav, synth-sine-sinusoidal-amplitude.wav, synth-square-1.wav, synth-square-2.wav, synth-square-3.wav, synth-square-4.wav, synth-tico-silence.wav, synth-tonal-1.wav, synth-tonal-2.wav, synth-tonal-3.wav, synth-tonal-4.wav, synth-tonal-5.wav, synth-tonal-6.wav, synth-triangle.wav, synth-vowels-phontools.wav, synth-vowels-soundgen.wav
Jérôme Sueur

