2017 | Original Paper | Book Chapter

deepGTTM-I&II: Local Boundary and Metrical Structure Analyzer Based on Deep Learning Technique

Written by: Masatoshi Hamanaka, Keiji Hirata, Satoshi Tojo

Published in: Bridging People and Sound

Publisher: Springer International Publishing

Abstract

This paper describes an analyzer that detects local grouping boundaries and generates metrical structures of music pieces based on the generative theory of tonal music (GTTM). Systems for automatically detecting local grouping boundaries and generating metrical structures, such as the full automatic time-span tree analyzer, have been proposed, but they produce numerous errors, so musicologists have to correct the boundaries or strong beat positions. In light of this, we use a deep learning technique to detect local boundaries and generate metrical structures of music pieces based on the GTTM. Because we have only 300 pieces of music whose local grouping boundaries and metrical structures have been analyzed by a musicologist, directly learning the relationship between scores and metrical structures is difficult owing to the lack of training data. To solve this problem, we propose a multi-task learning analyzer called deepGTTM-I&II, based on the above deep learning technique, that learns the relationship between scores and metrical structures in the following three steps. First, we conduct unsupervised pre-training of a network using 15,000 pieces of music in a non-labeled dataset. Second, the network undergoes supervised fine-tuning by back propagation from the output to the input layers using a half-labeled dataset, which consists of 15,000 pieces of music labeled with an automatic analyzer that we previously constructed. Finally, the network undergoes supervised fine-tuning using a labeled dataset. The experimental results indicate that deepGTTM-I&II outperformed previous GTTM analyzers in terms of the F-measure for generating metrical structures.
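
The abstract reports results as an F-measure but does not say how predictions are matched against the musicologist's analysis. The sketch below is a minimal illustration of one common way to compute it, assuming each candidate position (for example, each beat or each note transition) carries a binary label, with 1 marking a strong beat or a boundary. The function name `f_measure` and the example label sequences are hypothetical and are not taken from the paper.

```python
from typing import Sequence, Tuple


def f_measure(predicted: Sequence[int],
              reference: Sequence[int]) -> Tuple[float, float, float]:
    """Return (precision, recall, F-measure) for binary per-position labels.

    Assumption: 1 marks a strong beat (or boundary), 0 marks anything else.
    """
    if len(predicted) != len(reference):
        raise ValueError("label sequences must have the same length")

    pairs = list(zip(predicted, reference))
    true_pos = sum(1 for p, r in pairs if p == 1 and r == 1)
    false_pos = sum(1 for p, r in pairs if p == 1 and r == 0)
    false_neg = sum(1 for p, r in pairs if p == 0 and r == 1)

    # Guard against division by zero when nothing is predicted or annotated.
    precision = true_pos / (true_pos + false_pos) if true_pos + false_pos else 0.0
    recall = true_pos / (true_pos + false_neg) if true_pos + false_neg else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0

    # F-measure is the harmonic mean of precision and recall.
    return precision, recall, 2 * precision * recall / (precision + recall)


# Hypothetical labels for eight beat positions.
reference = [1, 0, 0, 1, 0, 0, 1, 0]  # analysis by a musicologist
predicted = [1, 0, 1, 1, 0, 0, 0, 0]  # output of an analyzer
precision, recall, f = f_measure(predicted, reference)
print(f"precision={precision:.3f} recall={recall:.3f} F={f:.3f}")
```

In this example the analyzer finds two of the three annotated strong beats and adds one spurious beat, so precision, recall, and the F-measure all come out to 2/3.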

Metadata
Title
deepGTTM-I&II: Local Boundary and Metrical Structure Analyzer Based on Deep Learning Technique
Written by
Masatoshi Hamanaka
Keiji Hirata
Satoshi Tojo
Copyright Year
2017
DOI
https://doi.org/10.1007/978-3-319-67738-5_1