Published in: World Wide Web, Issue 6/2019

18 February 2019

A general framework for learning prosodic-enhanced representation of rap lyrics

Authors: Hongru Liang, Haozheng Wang, Qian Li, Jun Wang, Guandong Xu, Jiawei Chen, Jin-Mao Wei, Zhenglu Yang

Abstract

Learning and analyzing rap lyrics is a significant basis for many Web applications, such as music recommendation, automatic music categorization, and music information retrieval, due to the abundant source of digital music on the World Wide Web. Although numerous studies have explored the topic, knowledge in this field is far from satisfactory, because critical issues, such as prosodic information and its effective representation, as well as appropriate integration of various features, are usually ignored. In this paper, we propose a hierarchical attention variational autoencoder framework (HAVAE), which simultaneously considers semantic and prosodic features for rap lyrics representation learning. Specifically, the representation of the prosodic features is encoded by phonetic transcriptions with a novel and effective strategy (i.e., rhyme2vec). Moreover, a feature aggregation strategy is proposed to appropriately integrate various features and generate prosodic-enhanced representation. A comprehensive empirical evaluation demonstrates that the proposed framework outperforms the state-of-the-art approaches under various metrics in different rap lyrics learning tasks.
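
For orientation, the generic objective that variational autoencoders optimize can be written as below. This is the standard evidence lower bound, offered only as background; it is not the HAVAE loss, which the abstract does not specify. The symbols are assumed notation for illustration: x is an input (for example, a lyric feature vector), z is the latent representation, and φ and θ are the encoder and decoder parameters.

```latex
\mathcal{L}(\theta, \phi; x)
  = \mathbb{E}_{q_\phi(z \mid x)}\bigl[\log p_\theta(x \mid z)\bigr]
  - D_{\mathrm{KL}}\bigl(q_\phi(z \mid x) \,\|\, p(z)\bigr)
```

The first term rewards accurate reconstruction of x from z, and the second keeps the encoder's distribution close to the prior p(z). How the hierarchical attention and the prosodic features enter this objective is described in the full paper rather than on this page.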


Footnotes
1
Monorhyme is a rhyme scheme in which each line has an identical rhyme.
 
2
In alternate rhyme, the rhyme is on alternate lines.
 
3
The dimension of û is equal to that of u.
 
5
The source code and dataset can be viewed at https://github.com/q9s5c1/HAVAE.
 
6
A number of rap songs in the tested corpus lack component labels, and it is difficult to extract verses from these songs without professional knowledge. To eliminate the bias incurred by non-expert labeling and to make fair comparisons, only the songs with explicit "verse" labels are extracted as the experimental dataset.
 
7
In the current paper, we report the results from the original work and also reproduce them on the crawled dataset.
 
Metadata
Title
A general framework for learning prosodic-enhanced representation of rap lyrics
Authors
Hongru Liang
Haozheng Wang
Qian Li
Jun Wang
Guandong Xu
Jiawei Chen
Jin-Mao Wei
Zhenglu Yang
Publication date
18 February 2019
Publisher
Springer US
Published in
World Wide Web / Issue 6/2019
Print ISSN: 1386-145X
Electronic ISSN: 1573-1413
DOI
https://doi.org/10.1007/s11280-019-00672-2
