Published in: Discover Computing 2/2014

1 April 2014

Learning music similarity from relative user ratings

Written by: Daniel Wolff, Tillman Weyde

Abstract

Computational modelling of music similarity is an increasingly important part of personalisation and optimisation in music information retrieval and of research in music perception and cognition. The use of relative similarity ratings is a new and promising approach to modelling similarity that avoids well-known problems with absolute ratings. In this article, we use relative ratings from the MagnaTagATune dataset with new and existing variants of state-of-the-art algorithms and provide the first comprehensive and rigorous evaluation of this approach. We compare metric learning based on support vector machines (SVMs) and metric-learning-to-rank (MLR), including a diagonal variant (DMLR) and a novel weighted variant, and relative distance learning with neural networks (RDNN). We further evaluate the effectiveness of different high- and low-level audio features and genre data, as well as dimensionality reduction methods, weighting of similarity ratings, and different sampling methods. Our results show that music similarity measures learnt on relative ratings can be significantly better than a standard Euclidean metric, depending on the choice of learning algorithm, feature sets and application scenario. MLR and SVM outperform DMLR and RDNN, while MLR with weighted ratings leads to no further performance gain. Timbral and music-structural features are most effective, and all features jointly are significantly better than any other combination of feature sets. Sharing audio clips (but not the similarity ratings) between test and training sets improves performance, in particular for the SVM-based methods, which is useful for some application scenarios. A testing framework has been implemented in Matlab and made publicly available at http://mi.soi.city.ac.uk/datasets/ir2012framework so that these results are reproducible.
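
To make the idea of relative ratings concrete, the following is a minimal Python sketch, not the authors' Matlab framework. It assumes each rating can be written as a triplet (i, j, k) meaning "clip i is more similar to clip j than to clip k", and it scores a distance measure by the fraction of triplets it satisfies. The toy features, the triplets, and the hand-picked weights are all hypothetical; nothing is learnt here, and the weighted distance only illustrates the kind of diagonal metric a learning method might produce.

```python
import numpy as np


def weighted_distance(x, y, w):
    """Squared Euclidean distance with one non-negative weight per feature."""
    d = x - y
    return float(np.sum(w * d * d))


def fraction_satisfied(features, triplets, w):
    """Share of triplets (i, j, k) where clip i is closer to clip j than to clip k."""
    hits = 0
    for i, j, k in triplets:
        d_ij = weighted_distance(features[i], features[j], w)
        d_ik = weighted_distance(features[i], features[k], w)
        if d_ij < d_ik:
            hits += 1
    return hits / len(triplets)


# Hypothetical toy data: 4 clips described by 2 features each.
features = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 2.0], [2.0, 1.0]])

# Hypothetical relative ratings: (i, j, k) reads
# "clip i is more similar to clip j than to clip k".
triplets = [(0, 2, 1), (0, 1, 3), (2, 0, 3)]

euclidean = np.ones(2)              # unweighted baseline
reweighted = np.array([4.0, 0.1])   # hand-picked weights, for illustration only

baseline_score = fraction_satisfied(features, triplets, euclidean)
reweighted_score = fraction_satisfied(features, triplets, reweighted)
print(baseline_score, reweighted_score)
```

On this toy data the unweighted baseline satisfies two of the three triplets, while the reweighted distance satisfies all three. A learning method would have to find such weights from training triplets and then be evaluated on held-out ones.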

Metadata
Title
Learning music similarity from relative user ratings
Written by
Daniel Wolff
Tillman Weyde
Publication date
1 April 2014
Publisher
Springer Netherlands
Published in
Discover Computing / Issue 2/2014
Print ISSN: 2948-2984
Electronic ISSN: 2948-2992
DOI
https://doi.org/10.1007/s10791-013-9229-0