Abstract
Over the past decade, there has been explosive growth in the availability of multimedia data, particularly image, video, and music. Because of this, content-based music retrieval has attracted attention from the multimedia database and information retrieval communities. Content-based music retrieval requires us to be able to automatically identify particular characteristics of music data. One such characteristic, useful in a range of applications, is the identification of the singer in a musical piece. Unfortunately, existing approaches to this problem suffer from either low accuracy or poor scalability. In this article, we propose a novel scheme, called Hybrid Singer Identifier (HSI), for efficient automated singer recognition. HSI uses multiple low-level features extracted from both vocal and nonvocal music segments to enhance the identification process; it achieves this via a hybrid architecture that builds profiles of individual singer characteristics based on statistical mixture models. An extensive experimental study on a large music database demonstrates the superiority of our method over state-of-the-art approaches in terms of effectiveness, efficiency, scalability, and robustness.
Index Terms
- A novel framework for efficient automated singer identification in large music databases