Comparative study of methods for reducing dimensionality of MPEG-7 audio signature descriptors

Multimedia Tools and Applications

Abstract

This paper studies how to reduce the dimensionality of MPEG-7 audio signature descriptors. Dimension-reduced descriptors significantly shorten the comparison time needed to detect copyrighted audio. The methods studied are block average, principal component analysis (PCA), the Hadamard transform, the Haar transform, and the CDF (Cohen-Daubechies-Feauveau) 9/7 wavelet transform. For the latter four methods, we also examine whether the choice of partition method affects accuracy. The simulation results show that different reduction methods require different partition strategies for best accuracy. We also compare the computational complexity of these methods. The experimental results show that, except for the CDF 9/7 method, the other four methods yield comparable accuracy for undistorted and MP3-coded audio. When computational complexity is also taken into account, the block average method is the better choice.
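To make the two simplest reduction methods named in the abstract concrete, the following Python sketch shows a block average and a Haar low-pass reduction. It is only an illustration under stated assumptions: a descriptor is treated as a flat sequence of numbers, and the function names, block size, and sample values are hypothetical and are not taken from the paper, whose exact partitioning and parameters are not given in the abstract.

```python
from typing import List, Sequence


def block_average(values: Sequence[float], block_size: int) -> List[float]:
    """Replace each run of `block_size` consecutive values with its mean.

    Hypothetical helper: the paper's actual block layout is not specified here.
    """
    if block_size <= 0:
        raise ValueError("block_size must be positive")
    if len(values) % block_size != 0:
        raise ValueError("length must be a multiple of block_size")
    return [
        sum(values[i:i + block_size]) / block_size
        for i in range(0, len(values), block_size)
    ]


def haar_approximation(values: Sequence[float], levels: int) -> List[float]:
    """Keep only the low-pass coefficients of an orthonormal Haar transform.

    Each level halves the length by combining neighbouring pairs as
    (a + b) / sqrt(2); the detail coefficients are discarded.
    """
    out = list(values)
    for _ in range(levels):
        if len(out) % 2 != 0:
            raise ValueError("length must be divisible by 2 at every level")
        out = [(out[i] + out[i + 1]) / 2 ** 0.5 for i in range(0, len(out), 2)]
    return out


def squared_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Squared Euclidean distance, used here as a stand-in matching cost."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


# Made-up 8-dimensional descriptor, reduced to 2 dimensions.
descriptor = [1.0, 3.0, 2.0, 4.0, 6.0, 8.0, 5.0, 7.0]
reduced_avg = block_average(descriptor, 4)        # [2.5, 6.5]
reduced_haar = haar_approximation(descriptor, 2)  # about [5.0, 13.0]
```

In this sketch the two-level Haar approximation equals the block average over four values scaled by 2 (the square root of the block size), so the two reductions carry the same information. Comparing two reduced descriptors with `squared_distance` touches 2 values instead of 8, which is the source of the shorter comparison time.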



Acknowledgments

The authors thank the National Science Council of Taiwan for supporting this research through grants NSC 94-2213-E-027-042 and NSC 101-2221-E-027-127.

Author information


Corresponding author

Correspondence to Shingchern D. You.


About this article

Cite this article

You, S.D., Chen, WH. Comparative study of methods for reducing dimensionality of MPEG-7 audio signature descriptors. Multimed Tools Appl 74, 3579–3598 (2015). https://doi.org/10.1007/s11042-013-1670-y


  • DOI: https://doi.org/10.1007/s11042-013-1670-y
