2017 | OriginalPaper | Book chapter

Structured Prediction of Music Mood with Twin Gaussian Processes

Authors: Santosh Chapaneri, Deepak Jayaswal

Published in: Pattern Recognition and Machine Intelligence

Publisher: Springer International Publishing

Abstract

Music mood is one of the most frequently used descriptors when people search for music, but its subjective nature makes it difficult to estimate accurately. In this work, we propose a structured prediction framework that models the valence and arousal dimensions of mood jointly, without requiring multiple regressors. A confidence-interval-based consensus is first estimated from crowdsourced annotations, together with the reliabilities of the individual annotators, to serve as the ground truth; it is shown to perform better than using the average annotation values. A variational Bayesian approach is used to learn the Gaussian mixture model representation of the acoustic features. Using an efficient implementation of Twin Gaussian processes for structured regression, the proposed work achieves an improvement in \(R^2\) of \(9.3\%\) for arousal and \(18.2\%\) for valence relative to state-of-the-art techniques.
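To make the idea of a reliability-aware consensus more concrete, the sketch below contrasts it with a plain average. It is not the confidence-interval-based estimator described in the abstract, whose details are not given here. It is a simplified stand-in that assumes every annotator rates every song and uses inverse mean squared deviation as a hypothetical reliability score. The function name `weighted_consensus` and its parameters are illustrative only.

```python
def weighted_consensus(ratings, n_iter=20, eps=1e-6):
    """Estimate a per-song consensus and per-annotator weights.

    ratings[a][s] is annotator a's rating of song s on one mood
    dimension (e.g. valence). Assumes a complete rating matrix.
    """
    n_annot = len(ratings)
    n_songs = len(ratings[0])
    # Equal weights reproduce the plain average on the first pass.
    weights = [1.0 / n_annot] * n_annot
    consensus = [0.0] * n_songs
    for _ in range(n_iter):
        # Consensus: reliability-weighted mean of the ratings per song.
        consensus = [
            sum(w * r[s] for w, r in zip(weights, ratings))
            for s in range(n_songs)
        ]
        # Reliability (assumed form): inverse of the annotator's mean
        # squared deviation from the consensus; eps keeps it finite.
        raw = []
        for r in ratings:
            mse = sum((r[s] - consensus[s]) ** 2 for s in range(n_songs)) / n_songs
            raw.append(1.0 / (mse + eps))
        total = sum(raw)
        weights = [x / total for x in raw]
    return consensus, weights


# Two annotators roughly agree; the third is an outlier.
ratings = [
    [0.5, 0.2, -0.3],
    [0.6, 0.1, -0.2],
    [-0.9, 0.9, 0.8],
]
consensus, weights = weighted_consensus(ratings)
```

In this toy example the outlying annotator ends up with the smallest weight, so the consensus stays close to the two annotators who agree, whereas a plain average would be pulled toward the outlier. A simple scheme like this can over-trust a single annotator after many iterations, which is one reason more careful estimators are used in practice.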

Metadata
Title
Structured Prediction of Music Mood with Twin Gaussian Processes
Authors
Santosh Chapaneri
Deepak Jayaswal
Copyright year
2017
DOI
https://doi.org/10.1007/978-3-319-69900-4_82