Published in: World Wide Web

18.02.2019

A general framework for learning prosodic-enhanced representation of rap lyrics

Authors: Hongru Liang, Haozheng Wang, Qian Li, Jun Wang, Guandong Xu, Jiawei Chen, Jin-Mao Wei, Zhenglu Yang

Published in: World Wide Web | Issue 6/2019

Abstract

Learning and analyzing rap lyrics is a significant basis for many Web applications, such as music recommendation, automatic music categorization, and music information retrieval, owing to the abundance of digital music on the World Wide Web. Although numerous studies have explored the topic, knowledge in this field is far from satisfactory, because critical issues, such as prosodic information and its effective representation, as well as the appropriate integration of various features, are usually ignored. In this paper, we propose a hierarchical attention variational autoencoder framework (HAVAE), which simultaneously considers semantic and prosodic features for rap lyrics representation learning. Specifically, the representation of the prosodic features is encoded from phonetic transcriptions with a novel and effective strategy (i.e., rhyme2vec). Moreover, a feature aggregation strategy is proposed to appropriately integrate the various features and generate a prosodic-enhanced representation. A comprehensive empirical evaluation demonstrates that the proposed framework outperforms state-of-the-art approaches under various metrics in different rap lyrics learning tasks.

Monorhyme is a rhyme scheme in which each line has an identical rhyme.

In alternate rhyme, the rhyme is on alternate lines.
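The two rhyme schemes defined in these notes can be illustrated with a minimal Python sketch. It assumes each line of a verse has already been assigned a rhyme label (for example "A" or "B"); the function names and the label encoding are hypothetical and are not taken from the paper or its source code. For simplicity the alternate-rhyme check treats a single stanza of the form ABAB.

```python
def is_monorhyme(labels):
    """Return True if every line carries the same rhyme label (e.g. AAAA).

    `labels` is a hypothetical per-line rhyme labelling, one label per line.
    """
    return len(labels) > 1 and len(set(labels)) == 1


def is_alternate_rhyme(labels):
    """Return True if the rhyme falls on alternate lines (e.g. ABAB).

    Simplifying assumption: a single stanza in which lines 1, 3, 5, ...
    share one rhyme and lines 2, 4, 6, ... share a different one.
    """
    if len(labels) < 4:
        return False
    odd_lines = set(labels[0::2])   # labels of lines 1, 3, 5, ...
    even_lines = set(labels[1::2])  # labels of lines 2, 4, 6, ...
    return len(odd_lines) == 1 and len(even_lines) == 1 and odd_lines != even_lines


if __name__ == "__main__":
    print(is_monorhyme(list("AAAA")))
    print(is_alternate_rhyme(list("ABAB")))
    print(is_alternate_rhyme(list("AABB")))
```

In practice the labels would have to come from a rhyme detector working on phonetic transcriptions; that step is outside the scope of this sketch.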

The dimension of û is equal to that of u.

http://ohhla.com/.

The source code and dataset can be viewed at https://github.com/q9s5c1/HAVAE.

A number of rap songs in the tested corpus lack component labels, and it is difficult to extract verses from these songs without professional knowledge. To eliminate the bias incurred by non-expert labeling and to make fair comparisons, only the songs with explicit "verse" labels are extracted as the experimental dataset.

In the current paper, we report the results from the original work and reproduce them on the crawled dataset.

Title: A general framework for learning prosodic-enhanced representation of rap lyrics
Authors: Hongru Liang
Haozheng Wang
Qian Li
Jun Wang
Guandong Xu
Jiawei Chen
Jin-Mao Wei
Zhenglu Yang
Publication date: 18.02.2019
Publisher: Springer US
Published in: World Wide Web / Issue 6/2019
Print ISSN: 1386-145X
Electronic ISSN: 1573-1413
DOI: https://doi.org/10.1007/s11280-019-00672-2
