
2014 | Original Paper | Book Chapter

Capturing the Temporal Domain in Echonest Features for Improved Classification Effectiveness

Authors: Alexander Schindler, Andreas Rauber

Published in: Adaptive Multimedia Retrieval: Semantics, Context, and Adaptation

Publisher: Springer International Publishing

Abstract

This paper proposes Temporal Echonest Features to harness the information available from the beat-aligned vector sequences of the features provided by The Echo Nest. Rather than aggregating these sequences by simple averaging, the statistics of their temporal variations are analyzed and used to represent the audio content. We evaluate the performance on four traditional music genre classification test collections and compare it to that of state-of-the-art audio descriptors. Experiments reveal that exploiting the temporal variability of beat-aligned vector sequences, and combining different descriptors, improves classification accuracy. Compared with established conventional audio descriptors used as benchmarks, Temporal Echonest Features perform well, often significantly outperforming their predecessors, and can be used effectively for large-scale music genre classification.
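The difference between averaging a vector sequence and describing its temporal variation can be illustrated with a minimal sketch. This is not the authors' feature set: the abstract does not say which statistics are used, so the choice below (mean, variance, minimum, maximum, plus mean and variance of frame-to-frame differences) is an assumption, and the function names and toy data are hypothetical.

```python
from statistics import fmean, pvariance


def mean_only(frames):
    """Baseline: collapse the sequence to its per-dimension average."""
    return [fmean(values) for values in zip(*frames)]


def temporal_summary(frames):
    """Summarize a sequence of equal-length vectors per dimension.

    `frames` is a hypothetical list of beat-aligned feature vectors.
    For each dimension the result holds: mean, variance, min, max,
    then mean and variance of the frame-to-frame differences.
    The selection of statistics is an illustrative assumption.
    """
    if len(frames) < 2:
        raise ValueError("need at least two frames to measure variation")
    summary = []
    for values in zip(*frames):  # one tuple of values per dimension
        deltas = [b - a for a, b in zip(values, values[1:])]
        summary.extend([
            fmean(values), pvariance(values), min(values), max(values),
            fmean(deltas), pvariance(deltas),
        ])
    return summary


# Two toy sequences with identical averages but different dynamics.
steady = [[1.0, 0.0], [1.0, 0.0], [1.0, 0.0], [1.0, 0.0]]
pulsing = [[0.0, 0.0], [2.0, 0.0], [0.0, 0.0], [2.0, 0.0]]

print(mean_only(steady) == mean_only(pulsing))
print(temporal_summary(steady) == temporal_summary(pulsing))
```

In this toy case the averaged descriptors of the two sequences are indistinguishable, while the summaries that include variation statistics differ, which is the kind of information that simple averaging discards.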

Metadata
Title
Capturing the Temporal Domain in Echonest Features for Improved Classification Effectiveness
Authors
Alexander Schindler
Andreas Rauber
Copyright Year
2014
DOI
https://doi.org/10.1007/978-3-319-12093-5_13
