Abstract
This paper studies how to reduce the dimensionality of MPEG-7 audio signature descriptors. Dimension-reduced descriptors can significantly shorten the comparison time needed to detect copyrighted audio. The methods studied are block average, principal component analysis (PCA), the Hadamard transform, the Haar transform, and the CDF (Cohen-Daubechies-Feauveau) 9/7 wavelet transform. For the latter four methods, we also examine whether the partition method affects accuracy. The simulation results show that different reduction methods need different partition strategies to reach their best accuracy. We also compare the computational complexity of the methods. The experimental results show that all methods except CDF 9/7 yield comparable accuracy for undistorted and MP3-coded audio. When computational complexity is also taken into account, the block average method is the better choice.
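To make the block average idea concrete, the following is a minimal sketch rather than the authors' implementation. It assumes a descriptor is a sequence of equal-length frames and that reduction averages each coefficient over consecutive, non-overlapping blocks of frames. The function names, the block size, the handling of leftover frames, and the squared-distance comparison are all illustrative assumptions; the abstract does not specify them.

```python
from typing import List, Sequence


def block_average(frames: Sequence[Sequence[float]], block_size: int) -> List[List[float]]:
    """Average each coefficient over non-overlapping blocks of frames.

    Hypothetical helper: the block size and the choice to drop a
    trailing partial block are assumptions, not taken from the paper.
    """
    if block_size <= 0:
        raise ValueError("block_size must be positive")
    reduced = []
    # Step through whole blocks only; leftover frames are dropped.
    for start in range(0, len(frames) - block_size + 1, block_size):
        block = frames[start:start + block_size]
        width = len(block[0])
        reduced.append(
            [sum(frame[k] for frame in block) / block_size for k in range(width)]
        )
    return reduced


def squared_distance(a: Sequence[Sequence[float]], b: Sequence[Sequence[float]]) -> float:
    """Sum of squared differences between two equally shaped descriptors."""
    return sum(
        (x - y) ** 2
        for row_a, row_b in zip(a, b)
        for x, y in zip(row_a, row_b)
    )


# Five toy frames with two coefficients each (made-up values).
frames = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0], [7.0, 8.0], [9.0, 10.0]]
reduced = block_average(frames, 2)
print(reduced)  # [[2.0, 3.0], [6.0, 7.0]]
```

With a block size of 2, five frames shrink to two, so a comparison such as `squared_distance` touches fewer than half as many values. This is the kind of saving in comparison time that the abstract refers to.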
Acknowledgments
The authors thank the National Science Council of Taiwan for providing grants (NSC 94-2213-E-027-042 and NSC 101-2221-E-027-127) for this research.
Cite this article
You, S.D., Chen, WH. Comparative study of methods for reducing dimensionality of MPEG-7 audio signature descriptors. Multimed Tools Appl 74, 3579–3598 (2015). https://doi.org/10.1007/s11042-013-1670-y
DOI: https://doi.org/10.1007/s11042-013-1670-y