2021 | OriginalPaper | Chapter

Towards Deep Learning Strategies for Transcribing Electroacoustic Music

Authors: Matthias Nowakowski, Christof Weiß, Jakob Abeßer

Published in: Perception, Representations, Image, Sound, Music

Publisher: Springer International Publishing

Abstract

Electroacoustic music is experienced primarily through auditory perception, since it is usually not based on a prescriptive score. To analyze such pieces, transcriptions are sometimes created that illustrate events and processes graphically in a readily comprehensible way. These transcriptions are usually based on the spectrogram of the recording. Although creating them manually is often time-consuming, they provide a useful starting point for anyone interested in a work. Deep-learning algorithms that learn to recognize characteristic spectral patterns through supervised learning are a promising technology for automating this task. This paper investigates the labeling of sound objects in electroacoustic music recordings. We test several neural-network architectures that enable classification of sound objects using musicological and signal-processing methods. We also outline how our results could be improved and applied to a new gradient-based visualization approach.

Footnotes
4. We have requested this dataset, but unfortunately it was no longer provided by the creators.

Metadata
Title
Towards Deep Learning Strategies for Transcribing Electroacoustic Music
Authors
Matthias Nowakowski
Christof Weiß
Jakob Abeßer
Copyright Year
2021
DOI
https://doi.org/10.1007/978-3-030-70210-6_3