2018 | Original Paper | Book Chapter

k-Best Unit Selection Strategies for Musical Concatenative Synthesis

Written by: Cárthach Ó Nuanáin, Perfecto Herrera, Sergi Jordá

Published in: Music Technology with Swing

Publisher: Springer International Publishing

Abstract

Concatenative synthesis is a sample-based approach to sound creation used frequently in speech synthesis and, increasingly, in musical contexts. Unit selection, a key component, is the process by which sounds are chosen from the corpus of samples. Because they can match target units while preserving continuity, Hidden Markov Models are often chosen for this task, but a common criticism is that their single-path output is too restrictive when variations are desired. In this article, we propose treating the problem as one of k-Best path solving, which generates lists of alternative candidate solutions, and we summarise our implementations along with some practical examples.
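
To make the idea of k-Best path solving concrete, the following is a minimal sketch and not the authors' implementation. It assumes unit selection is posed as minimising additive costs, with a hypothetical `target_cost[t][u]` (how poorly corpus unit `u` matches target step `t`) and a hypothetical `concat_cost[u][v]` (how poorly unit `v` follows unit `u`). At every step it keeps up to `k` candidates per unit instead of the single best one, in the style of a list Viterbi search.

```python
import heapq


def k_best_paths(target_cost, concat_cost, k):
    """Return up to k lowest-cost unit sequences as (cost, path) tuples.

    target_cost[t][u]: assumed cost of using unit u at target step t.
    concat_cost[u][v]: assumed cost of playing unit v right after unit u.
    """
    n_units = len(target_cost[0])
    # beams[u] holds up to k (cost, path) candidates that end in unit u.
    beams = [[(target_cost[0][u], (u,))] for u in range(n_units)]

    for step_costs in target_cost[1:]:
        new_beams = []
        for v in range(n_units):
            # Extend every surviving candidate with unit v.
            extended = (
                (cost + concat_cost[u][v] + step_costs[v], path + (v,))
                for u in range(n_units)
                for cost, path in beams[u]
            )
            # Keep only the k cheapest candidates ending in v.
            new_beams.append(heapq.nsmallest(k, extended))
        beams = new_beams

    # Merge the per-unit lists into the overall k best sequences.
    finals = (cand for beam in beams for cand in beam)
    return heapq.nsmallest(k, finals)


# Toy example: 3 target steps, 2 corpus units (all values are made up).
target_cost = [[0, 2], [1, 0], [0, 3]]
concat_cost = [[0, 1], [1, 0]]
best_three = k_best_paths(target_cost, concat_cost, 3)
```

With `k = 1` this reduces to an ordinary single-path search; larger values of `k` return the ranked alternatives that the abstract describes as desirable when variations are wanted.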

Footnotes
3
A heap queue is a binary tree with the special condition that every parent has a value less than or equal to that of its children (this describes a minimum queue; a maximum queue is simply the inverse). The important function in our case is the push function, which adds items to the tree and maintains the heap property in O(log n) time.
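
As an illustration only, the behaviour described in this footnote can be sketched with Python's standard `heapq` module, which implements a minimum heap on top of a plain list. The cost values below are made up for the example.

```python
import heapq

# Hypothetical path costs arriving in arbitrary order.
costs = [7, 2, 9, 4, 1]

heap = []
for c in costs:
    # Each push restores the heap property in O(log n) time.
    heapq.heappush(heap, c)

# The smallest item always sits at the root of the heap.
smallest = heap[0]

# Popping repeatedly yields the items in ascending order.
ordered = [heapq.heappop(heap) for _ in range(len(heap))]
```

Note that the underlying list is not fully sorted at any point; only the parent-child ordering is guaranteed, which is enough to retrieve the cheapest candidate quickly.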
 
4
In fact, the system is sufficiently decoupled that any of these logical stages can be performed separately for its own purpose. For example, the tool can be used solely for slicing sounds, or for performing batch feature analysis on a library for the purposes of MIR.
 
Metadata
Title
k-Best Unit Selection Strategies for Musical Concatenative Synthesis
Written by
Cárthach Ó Nuanáin
Perfecto Herrera
Sergi Jordá
Copyright year
2018
DOI
https://doi.org/10.1007/978-3-030-01692-0_6
