A novel framework for efficient automated singer identification in large music databases

Authors:
Jialie Shen, Singapore Management University, Singapore
John Shepherd, The University of New South Wales, Sydney, Australia
Bin Cui, Peking University, Beijing, China
Kian-Lee Tan, National University of Singapore, Kent Ridge, Singapore

ACM Transactions on Information Systems, Volume 27, Issue 3, Article No. 18, pp. 1–31. https://doi.org/10.1145/1508850.1508856

Published: 19 May 2009

Abstract

Over the past decade, there has been explosive growth in the availability of multimedia data, particularly image, video, and music. Because of this, content-based music retrieval has attracted attention from the multimedia database and information retrieval communities. Content-based music retrieval requires us to be able to automatically identify particular characteristics of music data. One such characteristic, useful in a range of applications, is the identification of the singer in a musical piece. Unfortunately, existing approaches to this problem suffer from either low accuracy or poor scalability. In this article, we propose a novel scheme, called Hybrid Singer Identifier (HSI), for efficient automated singer recognition. HSI uses multiple low-level features extracted from both vocal and nonvocal music segments to enhance the identification process; it achieves this via a hybrid architecture that builds profiles of individual singer characteristics based on statistical mixture models. An extensive experimental study on a large music database demonstrates the superiority of our method over state-of-the-art approaches in terms of effectiveness, efficiency, scalability, and robustness.
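
The profile-building idea in the abstract can be illustrated with a minimal sketch. It assumes a single scalar feature per audio frame and fits a one-dimensional Gaussian mixture per singer with EM (Gaussian mixture models and the EM algorithm both appear among the article's author tags). The function names, the synthetic data, and the maximum-likelihood decision rule are illustrative assumptions, not the authors' HSI implementation, which combines several features from vocal and nonvocal segments.

```python
import math
import random


def normal_pdf(x, mean, var):
    """Density of a 1-D Gaussian with the given mean and variance."""
    return math.exp(-((x - mean) ** 2) / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def fit_gmm_1d(values, n_components=2, n_iter=50, seed=0):
    """Fit a 1-D Gaussian mixture by EM; returns (weights, means, variances)."""
    rng = random.Random(seed)
    means = rng.sample(values, n_components)  # initialise means from the data
    grand_mean = sum(values) / len(values)
    grand_var = sum((v - grand_mean) ** 2 for v in values) / len(values)
    variances = [max(grand_var, 1e-6)] * n_components
    weights = [1.0 / n_components] * n_components
    for _ in range(n_iter):
        # E-step: responsibility of each component for each value.
        resp = []
        for v in values:
            dens = [w * normal_pdf(v, m, s)
                    for w, m, s in zip(weights, means, variances)]
            total = sum(dens) or 1e-300
            resp.append([d / total for d in dens])
        # M-step: re-estimate weight, mean, and variance of each component.
        for k in range(n_components):
            nk = sum(r[k] for r in resp) or 1e-300
            weights[k] = nk / len(values)
            means[k] = sum(r[k] * v for r, v in zip(resp, values)) / nk
            var_k = sum(r[k] * (v - means[k]) ** 2
                        for r, v in zip(resp, values)) / nk
            variances[k] = max(var_k, 1e-6)  # floor avoids a collapsed component
    return weights, means, variances


def log_likelihood(values, model):
    """Total log-likelihood of the values under a fitted mixture."""
    weights, means, variances = model
    total = 0.0
    for v in values:
        p = sum(w * normal_pdf(v, m, s)
                for w, m, s in zip(weights, means, variances))
        total += math.log(max(p, 1e-300))
    return total


def identify(values, profiles):
    """Return the name of the profile that best explains the values."""
    return max(profiles, key=lambda name: log_likelihood(values, profiles[name]))


def synthetic_frames(centres, n, seed):
    """Hypothetical stand-in for per-frame features extracted from audio."""
    rng = random.Random(seed)
    return [rng.gauss(rng.choice(centres), 1.0) for _ in range(n)]


# One mixture-model profile per (hypothetical) singer, built from training frames.
profiles = {
    "singer_a": fit_gmm_1d(synthetic_frames([0.0, 5.0], 200, seed=1)),
    "singer_b": fit_gmm_1d(synthetic_frames([20.0, 25.0], 200, seed=2)),
}
query = synthetic_frames([0.0, 5.0], 50, seed=3)
print(identify(query, profiles))
```

A real system would replace `synthetic_frames` with multidimensional acoustic features and would need a rule for combining the scores from several feature types; the abstract does not describe that rule, so none is assumed here.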

Index Terms

1. Applied computing
  1. Arts and humanities
    1. Performing arts
    2. Sound and music computing
2. Information systems
  1. Information retrieval
    1. Evaluation of retrieval results
    2. Retrieval models and ranking

Reviews

Reviewer: Rosa Michaelson

It is very difficult to automate the recognition of music and musicians by content rather than by bibliographical detail. Recent work on singer identification takes voice data, often extracted from a more complicated musical source, as the most representative feature, and applies some form of statistical modeling and machine learning to the singer's characteristics so that further examples can be tested and categorized. Shen et al. regard other information, such as beat and timbre, together with what they call the genre of the piece (the accompanying instrumental music), as being as important as the vocal data. In this paper, the authors present a new method, the Hybrid Singer Identifier (HSI), which they claim is more robust than previous techniques. HSI is a multifaceted method: the four features noted above are extracted from a piece of music, and a statistical profile of a particular singer is constructed for each feature from sample performances. This profile is then used to classify further songs. Several assumptions are made in creating each profile, namely that singers tend to play with the same backing band and that the type of instrumentation does not change from recording to recording. These assumptions are not realistic, since session musicians are used extensively in recordings by major artists, who may also change style, and hence instrumentation, across a range of musical genres. Problems also occur in the first stage of profiling: using datasets drawn from complete albums biases the "learning" toward the style of an album rather than of a singer, although HSI attempts to overcome this bias. The authors provide a useful overview of a range of associated research, which is often conducted on small datasets; introduce a benchmarking system for such research that was set up in 2005; and demonstrate that HSI performs better than comparable methods on a number of factors, such as robustness and scalability.
Their experimental work uses a large dataset of commercial popular singers of the late 20th century; it would be interesting to see how HSI fares when applied to different types of music and to what extent it can apply to other forms of multimedia.

Online Computing Reviews Service

Published in

ACM Transactions on Information Systems Volume 27, Issue 3
May 2009
206 pages
ISSN:1046-8188
EISSN:1558-2868
DOI:10.1145/1508850

Copyright © 2009 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]
Publisher
Association for Computing Machinery
New York, NY, United States
Publication History
- Published: 19 May 2009
- Accepted: 1 September 2008
- Revised: 1 May 2007
- Received: 1 September 2006
Author Tags
EM algorithm
Gaussian mixture models
Music retrieval
classification
evaluation
singer identification
statistical modeling
Qualifiers
- research-article
- Research
- Refereed
Article Metrics
- Total citations: 35
- Total downloads: 854
- Downloads (last 12 months): 11
- Downloads (last 6 weeks): 1