Article

Hierarchical filtering method for content-based music retrieval via acoustic input

Authors:
Jyh-Shing Roger Jang, National Tsing Hua University, Taiwan
Hong-Ru Lee, National Tsing Hua University, Taiwan

MULTIMEDIA '01: Proceedings of the ninth ACM international conference on Multimedia, October 2001, Pages 401–410. https://doi.org/10.1145/500141.500201

Published: 01 October 2001

ABSTRACT

This paper presents an implementation of a content-based music retrieval system that takes a user's acoustic input (an 8-second clip of singing or humming) via a microphone and retrieves the intended song from a database containing over 3000 candidate songs. The system, known as Super MBox, demonstrates the feasibility of real-time music retrieval with a high success rate. Super MBox first converts the user's acoustic input into a pitch vector. A hierarchical filtering method (HFM) then filters out 80% of the candidates as unlikely matches and compares the query with the remaining 20% in detail. The output of Super MBox is a list of songs ranked by the computed similarity scores. A brief mathematical analysis of the two-step HFM is given in the paper to explain how to derive the optimum parameters of the comparison engine. The proposed HFM and its analysis framework can be directly applied to other multimedia information retrieval systems. We have tested Super MBox extensively and found that the top-20 success rate is over 85%, based on a dataset of about 2000 singing/humming clips from people with mediocre singing skills. Our studies demonstrate the feasibility of using Super MBox as a prototype for music search engines over the Internet and/or query engines in digital music libraries.
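
The two-step idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not name the two comparison measures, so the cheap first-stage distance (Euclidean distance between mean-subtracted pitch vectors resampled to a fixed length) and the detailed second-stage distance (dynamic time warping, which appears among the author tags) are assumptions, as are all function names, the semitone pitch scale, and the `keep_ratio` and `top_k` defaults, which merely mirror the 20% and top-20 figures quoted above.

```python
import math
from typing import Dict, List, Sequence, Tuple


def normalize(pitch: Sequence[float]) -> List[float]:
    """Subtract the mean so a query sung in a different key still matches."""
    mean = sum(pitch) / len(pitch)
    return [p - mean for p in pitch]


def resample(pitch: Sequence[float], n: int) -> List[float]:
    """Linearly interpolate a pitch vector to n points (n >= 2)."""
    if len(pitch) == 1:
        return [float(pitch[0])] * n
    out = []
    for i in range(n):
        pos = i * (len(pitch) - 1) / (n - 1)
        lo = int(pos)
        hi = min(lo + 1, len(pitch) - 1)
        frac = pos - lo
        out.append(pitch[lo] * (1 - frac) + pitch[hi] * frac)
    return out


def coarse_distance(query: Sequence[float], song: Sequence[float], n: int = 16) -> float:
    """Assumed cheap first-stage measure: Euclidean distance after resampling."""
    q = resample(normalize(query), n)
    s = resample(normalize(song), n)
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(q, s)))


def dtw_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Assumed detailed second-stage measure: classic dynamic time warping."""
    inf = float("inf")
    prev = [inf] * (len(b) + 1)
    prev[0] = 0.0
    for x in a:
        curr = [inf] * (len(b) + 1)
        for j, y in enumerate(b, start=1):
            cost = abs(x - y)
            curr[j] = cost + min(prev[j], prev[j - 1], curr[j - 1])
        prev = curr
    return prev[len(b)]


def hierarchical_search(
    query: Sequence[float],
    database: Dict[str, Sequence[float]],
    keep_ratio: float = 0.2,
    top_k: int = 20,
) -> List[Tuple[str, float]]:
    """Return up to top_k (song, distance) pairs, best match first."""
    # Stage 1: rank every song with the cheap measure and keep the best fraction.
    ranked = sorted(database, key=lambda name: coarse_distance(query, database[name]))
    n_keep = max(1, math.ceil(keep_ratio * len(ranked)))
    survivors = ranked[:n_keep]
    # Stage 2: run the expensive comparison only on the survivors.
    q = normalize(query)
    scored = [(name, dtw_distance(q, normalize(database[name]))) for name in survivors]
    scored.sort(key=lambda item: item[1])
    return scored[:top_k]
```

Calling `hierarchical_search(query, database)` with pitch vectors expressed in semitones returns a ranked list of song names with their second-stage distances. The point of the structure is that the expensive comparison runs on only a fraction of the database; lowering `keep_ratio` saves time but raises the risk of discarding the intended song in the first stage.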

Index Terms

1. Computer systems organization
  1. Embedded and cyber-physical systems
  2. Real-time systems
2. Information systems
  1. Information retrieval
    1. Information retrieval query processing
    2. Retrieval models and ranking

Published in
MULTIMEDIA '01: Proceedings of the ninth ACM international conference on Multimedia
October 2001
664 pages
ISBN: 1581133944
DOI: 10.1145/500141
Conference Chairs: Nicolas D. Georganas (University of Ottawa), Radu Popescu-Zeletin (GMD Fokus)
Copyright © 2001 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee.
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
audio indexing and retrieval
audio signal processing
content-based music retrieval
dynamic programming
dynamic time warping
nearest neighbor search
pattern recognition
query by singing
Qualifiers
- Article
Conference

Acceptance Rates
Overall Acceptance Rate: 995 of 4,171 submissions, 24%