Published in: Neural Computing and Applications 13/2021

12-11-2020 | Original Article

Music genre profiling based on Fisher manifolds and Probabilistic Quantum Clustering

Authors: Raúl V. Casaña-Eslava, Ian H. Jarman, Sandra Ortega-Martorell, Paulo J. G. Lisboa, José D. Martín-Guerrero

Abstract

Probabilistic classifiers induce a similarity metric at each location in the data space, and this metric is given by the Fisher Information Matrix. Pairwise distances in the resulting Riemannian space, calculated along geodesic paths, can be used to generate a similarity map of the data. The novelty of the paper is twofold: it improves the methodology for visualising data structures in low-dimensional manifolds, and it illustrates, through an application to music data, the value of inferring that structure from a probabilistic classifier by metric learning. This leads to the discovery of new structures and song similarities beyond the original genre classification labels. These similarities cannot be observed directly by measuring Euclidean distances between features in the original space; they require a metric that reflects similarity based on genre. The results quantify the extent to which music from bands typically associated with one particular genre can, in fact, cross over strongly to another genre.
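
To make the idea of a classifier-induced metric concrete, here is a minimal sketch in Python. It is not the authors' implementation. It assumes a hypothetical multinomial logistic model with weights `W` and biases `b`, for which the Fisher Information Matrix with respect to the input is F(x) = sum_c p(c|x) g_c g_c^T, where g_c is the gradient of log p(c|x) with respect to x. The helper `straight_path_length` integrates the metric along a straight segment, which only gives an upper bound on the geodesic distance mentioned in the abstract.

```python
import math

def softmax_probs(W, b, x):
    """Class posteriors p(c|x) of a multinomial logistic model (assumed classifier)."""
    scores = [sum(wi * xi for wi, xi in zip(w, x)) + bc for w, bc in zip(W, b)]
    m = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - m) for s in scores]
    z = sum(exps)
    return [e / z for e in exps]

def fisher_matrix(W, b, x):
    """F(x) = sum_c p_c g_c g_c^T, with g_c = w_c - sum_k p_k w_k for this model."""
    p = softmax_probs(W, b, x)
    d = len(x)
    w_bar = [sum(pk * w[j] for pk, w in zip(p, W)) for j in range(d)]
    F = [[0.0] * d for _ in range(d)]
    for pc, w in zip(p, W):
        g = [w[j] - w_bar[j] for j in range(d)]
        for i in range(d):
            for j in range(d):
                F[i][j] += pc * g[i] * g[j]
    return F

def straight_path_length(W, b, x0, x1, steps=100):
    """Length of the straight segment x0 -> x1 under the Fisher metric (midpoint rule)."""
    d = len(x0)
    dx = [(x1[j] - x0[j]) / steps for j in range(d)]
    total = 0.0
    for s in range(steps):
        mid = [x0[j] + (s + 0.5) * dx[j] for j in range(d)]
        F = fisher_matrix(W, b, mid)
        q = sum(dx[i] * F[i][j] * dx[j] for i in range(d) for j in range(d))
        total += math.sqrt(max(q, 0.0))  # guard against tiny negative round-off
    return total

# Hypothetical two-class model in 2-D: only the first feature carries class information.
W = [[2.0, 0.0], [-2.0, 0.0]]
b = [0.0, 0.0]
across = straight_path_length(W, b, [-1.0, 0.0], [1.0, 0.0])  # crosses the decision boundary
along = straight_path_length(W, b, [0.0, -1.0], [0.0, 1.0])   # moves parallel to the boundary
```

Both segments have Euclidean length 2, yet in this toy model the Fisher length of `along` is zero because moving along the second feature never changes the class posteriors, while `across` has a strictly positive length. This is the sense in which the abstract says that Euclidean distances in the original feature space do not reflect class-based similarity.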

Metadata
Title
Music genre profiling based on Fisher manifolds and Probabilistic Quantum Clustering
Authors
Raúl V. Casaña-Eslava
Ian H. Jarman
Sandra Ortega-Martorell
Paulo J. G. Lisboa
José D. Martín-Guerrero
Publication date
12-11-2020
Publisher
Springer London
Published in
Neural Computing and Applications / Issue 13/2021
Print ISSN: 0941-0643
Electronic ISSN: 1433-3058
DOI
https://doi.org/10.1007/s00521-020-05499-x
