Published in: International Journal of Speech Technology 4/2015

1 September 2015

Sensitivity of automatic speaker identification to SVD digital audio watermarking

Authors: Fathi E. Abd El-Samie, Amira Shafik, Hala S. El-sayed, Said M. Elhalafawy, Salaheldin M. Diab, Bassiouny M. Sallam, Osama S. Faragallah


Abstract

This paper proposes the use of SVD digital audio watermarking to increase the security of automatic speaker identification (ASI) systems and presents a study of the effect of watermarking on ASI system performance. The SVD audio watermarking algorithm can be implemented on audio signals in the time domain or in another appropriate transform domain, and it can be applied to the audio signal as a whole or on a segment-by-segment basis. The speaker recognition system works by generating a database of speakers' features using the MFCCs and polynomial shape coefficients extracted from each speaker after they are lexicographically ordered into 1-D signals. For any new speaker, a matching process using a trained neural network determines whether the speaker belongs to the database. Experimental results show that SVD audio watermarking does not severely degrade ASI system performance, so it can be used with ASI to increase security. The results also show that segment-by-segment watermarking in the time domain achieves the highest detectability of the watermark. SVD segment-by-segment audio watermarking is therefore recommended for ASI systems that use features extracted from the DCT or the DWT.
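The abstract does not spell out the embedding steps, so the following is only a minimal sketch of one widely used SVD watermarking scheme applied to a time-domain signal, both as a single block and segment by segment. It may differ from the authors' algorithm. The function names (`embed`, `extract`, `embed_segments`), the square reshaping of each segment, and the strength parameter `k` are assumptions made for illustration.

```python
import numpy as np

def embed(signal, watermark, k=0.01):
    """Embed a square (n x n) watermark into the first n*n samples.

    Assumed scheme: add k * watermark to the singular-value matrix of the
    reshaped signal, then rebuild the signal from the singular values of
    that modified matrix.
    """
    n = watermark.shape[0]
    out = np.asarray(signal, dtype=float).copy()
    A = out[: n * n].reshape(n, n)            # 1-D segment -> 2-D matrix
    U, s, Vt = np.linalg.svd(A)
    D = np.diag(s) + k * watermark            # perturb the singular values
    Uw, sw, Vwt = np.linalg.svd(D)
    out[: n * n] = (U @ np.diag(sw) @ Vt).reshape(-1)
    key = (Uw, Vwt, s)                        # side information for detection
    return out, key

def extract(marked, key, k=0.01):
    """Recover the watermark from a marked signal using the stored key."""
    Uw, Vwt, s = key
    n = s.shape[0]
    Am = np.asarray(marked, dtype=float)[: n * n].reshape(n, n)
    sm = np.linalg.svd(Am, compute_uv=False)  # singular values only
    D = Uw @ np.diag(sm) @ Vwt
    return (D - np.diag(s)) / k

def embed_segments(signal, watermark, k=0.01):
    """Embed the same watermark in every full n*n-sample segment."""
    seg = watermark.shape[0] ** 2
    out = np.asarray(signal, dtype=float).copy()
    keys = []
    for start in range(0, len(out) - seg + 1, seg):
        marked, key = embed(out[start:start + seg], watermark, k)
        out[start:start + seg] = marked
        keys.append(key)
    return out, keys                          # trailing partial segment is left unmarked
```

In this sketch the segment-by-segment variant stores one key per segment, so a detector can attempt extraction from each segment independently. That is one plausible reason a segment-wise scheme could make the watermark easier to detect after local distortions, though the abstract itself does not give the explanation.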


Metadata
Title
Sensitivity of automatic speaker identification to SVD digital audio watermarking
Authors
Fathi E. Abd El-Samie
Amira Shafik
Hala S. El-sayed
Said M. Elhalafawy
Salaheldin M. Diab
Bassiouny M. Sallam
Osama S. Faragallah
Publication date
1 September 2015
Publisher
Springer US
Published in
International Journal of Speech Technology / Issue 4/2015
Print ISSN: 1381-2416
Electronic ISSN: 1572-8110
DOI
https://doi.org/10.1007/s10772-015-9292-6
