
2021 | OriginalPaper | Chapter

A Multi-objective Evolutionary Approach to Identify Relevant Audio Features for Music Segmentation

Authors: Igor Vatolkin, Marcel Koch, Meinard Müller

Published in: Artificial Intelligence in Music, Sound, Art and Design

Publisher: Springer International Publishing


Abstract

The goal of automatic music segmentation is to calculate boundaries between musical parts or sections that are perceived as semantic entities. Such sections are often characterized by specific musical properties such as instrumentation, dynamics, tempo, or rhythm. Recent data-driven approaches often phrase music segmentation as a binary classification problem, where musical cues for identifying boundaries are learned implicitly. Complementary to such methods, we present in this paper an approach for identifying relevant audio features that explain the presence of musical boundaries. In particular, we describe a multi-objective evolutionary feature selection strategy, which simultaneously optimizes two objectives. In a first setting, we reduce the number of features while maximizing an F-measure. In a second setting, we jointly maximize precision and recall values. Furthermore, we present extensive experiments based on six different feature sets covering different musical aspects. We show that feature selection allows for reducing the overall dimensionality while increasing the segmentation quality compared to full feature sets, with timbre-related features performing best.
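The two-objective setting described in the abstract can be illustrated with a minimal sketch. It assumes the balanced F-measure (harmonic mean of precision and recall), which the abstract does not specify, and it shows only the Pareto dominance test for the first setting (fewer features, higher F-measure), not the evolutionary search itself. The function names and the candidate values are hypothetical and are not taken from the chapter.

```python
from typing import List, Tuple


def f_measure(precision: float, recall: float) -> float:
    """Balanced F-measure: harmonic mean of precision and recall (assumed variant)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def pareto_front(candidates: List[Tuple[int, float]]) -> List[Tuple[int, float]]:
    """Return the non-dominated (num_features, f_measure) pairs.

    A candidate dominates another if it uses no more features and reaches
    at least the same F-measure, and is strictly better in one of the two.
    """
    front = []
    for n_a, f_a in candidates:
        dominated = any(
            n_b <= n_a and f_b >= f_a and (n_b < n_a or f_b > f_a)
            for n_b, f_b in candidates
        )
        if not dominated:
            front.append((n_a, f_a))
    return sorted(set(front))


# Hypothetical feature subsets: (number of selected features, F-measure).
candidates = [(40, 0.52), (12, 0.55), (5, 0.48), (12, 0.50), (3, 0.48)]
print(pareto_front(candidates))
```

In this toy example the full 40-dimensional set is dominated by a 12-dimensional subset with a higher F-measure, which mirrors the kind of trade-off the abstract reports. For the second setting, the same dominance test would apply to (precision, recall) pairs with both objectives maximized.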


Footnotes
1
The terminology used in this paper is as follows: feature selection keeps individual feature dimensions (e.g., the 2nd MFCC) from feature vectors (e.g., a 13-dimensional MFCC vector), each of which belongs exclusively to one feature group such as timbre. A feature set selected for music segmentation is thus built from various dimensions of various features, all of which belong to the same group in the current setup; combining features from different groups remains promising future work.
 
Metadata
Title: A Multi-objective Evolutionary Approach to Identify Relevant Audio Features for Music Segmentation
Authors: Igor Vatolkin, Marcel Koch, Meinard Müller
Copyright Year: 2021
DOI: https://doi.org/10.1007/978-3-030-72914-1_22
