
Audio hash function based on non-negative matrix factorisation of mel-frequency cepstral coefficients


IET Information Security


A robust audio hash function defines a feature vector that characterises the audio signal independently of content-preserving manipulations such as MP3 compression, amplitude boosting/cutting and low-pass filtering. In this study, the authors propose a new audio hash function based on the non-negative matrix factorisation (NMF) of mel-frequency cepstral coefficients (MFCCs). Their work is motivated by the fact that the orthogonality constraints in the singular value decomposition (SVD) cause the low-rank singular vectors of audio signals with distinct local differences to be the same. Thus, the available hash function based on the SVD of MFCCs cannot achieve satisfactory discrimination. In contrast, the non-negativity constraints of NMF yield a basis that captures the local features of the audio, thereby significantly reducing misclassification. Experimental results over large audio databases demonstrate that the proposed scheme outperforms the available SVD-MFCCs-based hash function in terms of both perceptual robustness and discrimination.
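The abstract does not give the algorithmic details, so the following is only a minimal sketch of the general idea, not the authors' scheme. It factorises a small non-negative matrix with the standard multiplicative-update rules for NMF and then turns the coefficient matrix into bits. Everything specific here is an assumption made for illustration: the toy matrix `v` (standing in for an MFCC matrix that has already been shifted to be non-negative), the rank of 2, the median-threshold binarisation, and all function names.

```python
import random


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    cols_b = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols_b] for row in a]


def transpose(a):
    return [list(col) for col in zip(*a)]


def nmf(v, rank, iters=200, seed=0, eps=1e-9):
    """Approximate V by W @ H with non-negative W and H, using
    multiplicative updates for the squared Frobenius error."""
    rng = random.Random(seed)
    n, m = len(v), len(v[0])
    w = [[rng.random() + eps for _ in range(rank)] for _ in range(n)]
    h = [[rng.random() + eps for _ in range(m)] for _ in range(rank)]
    for _ in range(iters):
        # H <- H * (W^T V) / (W^T W H), element-wise
        wt = transpose(w)
        num = matmul(wt, v)
        den = matmul(matmul(wt, w), h)
        h = [[h[i][j] * num[i][j] / (den[i][j] + eps) for j in range(m)]
             for i in range(rank)]
        # W <- W * (V H^T) / (W H H^T), element-wise
        ht = transpose(h)
        num = matmul(v, ht)
        den = matmul(w, matmul(h, ht))
        w = [[w[i][j] * num[i][j] / (den[i][j] + eps) for j in range(rank)]
             for i in range(n)]
    return w, h


def frobenius_error(v, w, h):
    wh = matmul(w, h)
    return sum((a - b) ** 2 for rv, rw in zip(v, wh) for a, b in zip(rv, rw)) ** 0.5


def binarise(h):
    """Hypothetical hash step (an assumption, not from the paper):
    one bit per coefficient, set when it exceeds the median."""
    flat = sorted(x for row in h for x in row)
    median = flat[len(flat) // 2]
    return [1 if x > median else 0 for row in h for x in row]


def hamming(a, b):
    """Number of positions at which two bit lists differ."""
    return sum(x != y for x, y in zip(a, b))


# Toy non-negative matrix: rows play the role of cepstral coefficients,
# columns the role of frames. Real MFCCs can be negative and would need
# a non-negative mapping first (assumed here, not taken from the paper).
v = [
    [5.0, 4.8, 0.2, 0.1, 5.1, 0.3],
    [4.9, 5.2, 0.1, 0.3, 4.7, 0.2],
    [0.2, 0.1, 3.9, 4.1, 0.3, 4.0],
    [0.1, 0.3, 4.2, 3.8, 0.2, 4.1],
]
w, h = nmf(v, rank=2)
bits = binarise(h)
print("reconstruction error:", round(frobenius_error(v, w, h), 4))
print("hash bits:", bits)
```

In a sketch like this, two clips would be compared by the Hamming distance between their bit strings, with a small distance suggesting perceptually similar content. How the paper actually derives and compares its hash values may differ.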

http://iet.metastore.ingenta.com/content/journals/10.1049/iet-ifs.2010.0097