A Probabilistic Model to Combine Tags and Acoustic Similarity for Music Retrieval

Authors: Riccardo Miotto (University of Padova), Nicola Orio (University of Padova)

ACM Transactions on Information Systems, Volume 30, Issue 2, Article No. 8, pp. 1–29. https://doi.org/10.1145/2180868.2180870

Published: 01 May 2012

Abstract

The rise of the Internet has led the music industry to a transition from physical media to online products and services. As a consequence, current online music collections store millions of songs and are constantly being enriched with new content. This has created a need for music technologies that allow users to interact with these extensive collections efficiently and effectively. Music search and discovery may be carried out using tags, matching user interests, and exploiting content-based acoustic similarity. One major issue in music information retrieval is how to combine such noisy and heterogeneous information sources in order to improve retrieval effectiveness. With this aim in mind, the article explores a novel music retrieval framework based on combining tags and acoustic similarity through a probabilistic graph-based representation of a collection of songs. The retrieval function highlights the path across the graph that most likely observes a user query and is used to improve state-of-the-art music search and discovery engines by delivering more relevant ranked lists. Indeed, by means of an empirical evaluation, we show how the proposed approach leads to better performance than retrieval strategies that rank songs according to individual information sources alone or that use a combination of them.
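The idea of finding the most likely path across a song graph can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes a hidden-Markov-style reading of the abstract: each song is a state, transition probabilities are derived from acoustic similarity, and each song describes tags with some probability. The function name `most_likely_path`, the toy matrices, the uniform starting prior, and the use of Viterbi decoding are all assumptions made for illustration.

```python
import numpy as np

def most_likely_path(trans, emit, query_tags, length):
    """Viterbi-style search for the song sequence that best explains a query.

    trans[i, j]: probability of moving from song i to song j (rows sum to 1),
                 assumed here to come from normalized acoustic similarity.
    emit[i, t]:  probability that song i is described by tag t (assumed given).
    query_tags:  indices of the tags that make up the query.
    length:      number of songs in the returned path.
    """
    n = trans.shape[0]
    eps = 1e-12  # avoids log(0) for missing edges or tags

    # Log-score of each song for the query: sum of log tag probabilities.
    obs = np.log(emit[:, query_tags] + eps).sum(axis=1)
    log_trans = np.log(trans + eps)

    # Assumed uniform prior over the first song in the path.
    delta = np.log(np.full(n, 1.0 / n)) + obs
    back = np.zeros((length, n), dtype=int)

    for step in range(1, length):
        # scores[i, j]: best log-probability of a path ending in i, then moving to j.
        scores = delta[:, None] + log_trans
        back[step] = scores.argmax(axis=0)
        delta = scores.max(axis=0) + obs

    # Backtrack from the best final song.
    path = [int(delta.argmax())]
    for step in range(length - 1, 0, -1):
        path.append(int(back[step, path[-1]]))
    return path[::-1]

# Hypothetical toy collection: 3 songs, 2 tags.
trans = np.array([[0.0, 0.7, 0.3],
                  [0.6, 0.0, 0.4],
                  [0.5, 0.5, 0.0]])
emit = np.array([[0.9, 0.1],
                 [0.8, 0.3],
                 [0.1, 0.9]])
path = most_likely_path(trans, emit, query_tags=[0], length=2)
```

In this toy case the path starts at the song that best matches tag 0 and then moves to an acoustically close song that also matches it. Plain Viterbi decoding may revisit a song on longer paths; a usable ranked list would need to prevent repeats, which this sketch does not attempt.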


Index Terms

1. Information systems
  1. Information retrieval
    1. Information retrieval query processing
  2. Information systems applications

Published in

ACM Transactions on Information Systems Volume 30, Issue 2
May 2012
245 pages
ISSN:1046-8188
EISSN:1558-2868
DOI:10.1145/2180868

Copyright © 2012 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]
Publisher
Association for Computing Machinery
New York, NY, United States
Publication History
- Published: 1 May 2012
- Accepted: 1 December 2011
- Revised: 1 September 2011
- Received: 1 February 2011
Author Tags
Music information retrieval
acoustic similarity
graph structure
music discovery
probabilistic model
tags
Qualifiers
- research-article
- Research
- Refereed