
2019 | Original Paper | Book Chapter

Multimodel Music Emotion Recognition Using Unsupervised Deep Neural Networks

Authors: Jianchao Zhou, Xiaoou Chen, Deshun Yang

Published in: Proceedings of the 6th Conference on Sound and Music Technology (CSMT)

Publisher: Springer Singapore


Abstract

In most studies on multimodal music emotion recognition, the different modalities are combined in a simple way and used for supervised training. The resulting improvement in performance indicates that the modalities are correlated, yet few studies explicitly model the relationships between the modal data. In this paper, we propose to model the relationships between modalities (i.e., lyric and audio data) with deep learning methods for multimodal music emotion recognition. Several deep networks are first applied to perform unsupervised feature learning over the multiple modalities. We then design a series of music emotion recognition experiments to evaluate the learned features. The results show that the deep networks perform well at unsupervised feature learning for multimodal data and model the cross-modal relationships effectively. In addition, we demonstrate a unimodal enhancement experiment, in which better features for one modality (e.g., lyrics) are learned by the proposed deep network when the other modality (e.g., audio) is also present during unsupervised feature learning.
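
As a rough illustration of the kind of unsupervised multimodal feature learning described in the abstract, the sketch below trains a bimodal autoencoder with a shared hidden layer on audio and lyric feature vectors. This is a minimal sketch under assumed settings, not the authors' architecture: the names (BimodalAutoencoder, train_step), the feature dimensions (AUDIO_DIM, LYRIC_DIM, SHARED_DIM), and the modality-dropout trick used to mimic the unimodal enhancement setting are all illustrative choices.

# Minimal bimodal autoencoder sketch (illustrative only; assumed
# dimensions and training details, not the paper's exact setup).
import torch
import torch.nn as nn

AUDIO_DIM, LYRIC_DIM, SHARED_DIM = 128, 300, 64  # hypothetical feature sizes

class BimodalAutoencoder(nn.Module):
    def __init__(self):
        super().__init__()
        # Modality-specific encoders feeding one shared representation.
        self.enc_audio = nn.Sequential(nn.Linear(AUDIO_DIM, 128), nn.ReLU())
        self.enc_lyric = nn.Sequential(nn.Linear(LYRIC_DIM, 128), nn.ReLU())
        self.shared = nn.Sequential(nn.Linear(256, SHARED_DIM), nn.ReLU())
        # Decoders reconstruct both modalities from the shared code.
        self.dec_audio = nn.Linear(SHARED_DIM, AUDIO_DIM)
        self.dec_lyric = nn.Linear(SHARED_DIM, LYRIC_DIM)

    def forward(self, audio, lyric):
        h = self.shared(torch.cat([self.enc_audio(audio),
                                   self.enc_lyric(lyric)], dim=1))
        return self.dec_audio(h), self.dec_lyric(h), h

def train_step(model, opt, audio, lyric, p_drop=0.5):
    # Randomly zeroing a modality at the input while still reconstructing
    # both pushes cross-modal information into the shared code, which is
    # what makes the unimodal-enhancement setting possible.
    a_in = audio * (torch.rand(audio.size(0), 1) > p_drop).float()
    l_in = lyric * (torch.rand(lyric.size(0), 1) > p_drop).float()
    rec_a, rec_l, _ = model(a_in, l_in)
    loss = (nn.functional.mse_loss(rec_a, audio)
            + nn.functional.mse_loss(rec_l, lyric))
    opt.zero_grad()
    loss.backward()
    opt.step()
    return loss.item()

# Usage with random stand-in data; the learned shared code h would then be
# fed to an emotion classifier or regressor in a separate supervised stage.
model = BimodalAutoencoder()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
audio_batch, lyric_batch = torch.randn(32, AUDIO_DIM), torch.randn(32, LYRIC_DIM)
print(train_step(model, opt, audio_batch, lyric_batch))

In the unimodal enhancement setting, the trained shared encoder would be applied at test time with only one modality available (e.g., lyric features), the missing modality being fed as zeros.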


Metadata
Title
Multimodel Music Emotion Recognition Using Unsupervised Deep Neural Networks
Authors
Jianchao Zhou
Xiaoou Chen
Deshun Yang
Copyright Year
2019
Publisher
Springer Singapore
DOI
https://doi.org/10.1007/978-981-13-8707-4_3
