
Published in: International Journal of Speech Technology

Publication date: 1 September 2015

Sensitivity of automatic speaker identification to SVD digital audio watermarking

Authors: Fathi E. Abd El-Samie, Amira Shafik, Hala S. El-sayed, Said M. Elhalafawy, Salaheldin M. Diab, Bassiouny M. Sallam, Osama S. Faragallah

Published in: International Journal of Speech Technology | Issue 4/2015


Abstract

This paper proposes using SVD digital audio watermarking to increase the security of automatic speaker identification (ASI) systems and studies the effect of watermarking on ASI system performance. The SVD audio watermarking algorithm can be implemented on audio signals in the time domain or in another appropriate transform domain, and it can be applied to the audio signal as a whole or on a segment-by-segment basis. The speaker recognition system generates a database of speakers' features using the MFCCs and polynomial shape coefficients extracted from each speaker after they are lexicographically ordered into 1-D signals. For any new speaker, a matching process using a trained neural network determines whether the speaker belongs to the database. Experimental results show that SVD audio watermarking does not severely degrade ASI system performance, so it can be used with ASI to increase security. The results also show that segment-by-segment watermarking in the time domain achieves the highest detectability of the watermark. It is therefore recommended to use SVD segment-by-segment audio watermarking with ASI systems implementing features extracted from the DCT or the DWT.
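The abstract does not spell out the embedding steps, so the following is only a minimal sketch of one common SVD watermarking scheme applied to a time-domain signal treated as a whole. It assumes the first n*n samples are reshaped into an n-by-n matrix, that the watermark is added to the diagonal matrix of singular values with a strength `k`, and that the detector holds a side-information key. The function names, the value of `k`, and the key layout are illustrative assumptions, not the authors' implementation.

```python
import numpy as np


def embed_svd_watermark(signal, watermark, k=0.01):
    """Embed an n-by-n watermark into the first n*n samples of a 1-D signal.

    Assumed scheme (not taken from the paper): the watermark is added to
    the singular-value matrix of the reshaped signal block.
    """
    n = watermark.shape[0]
    marked = np.asarray(signal, dtype=float).copy()
    block = marked[: n * n].reshape(n, n)

    # SVD of the original block: block = U @ diag(s) @ Vt
    U, s, Vt = np.linalg.svd(block)

    # Add the scaled watermark to the singular-value matrix, then decompose again.
    D = np.diag(s) + k * watermark
    Uw, sw, Vwt = np.linalg.svd(D)

    # Rebuild the block with the modified singular values.
    marked[: n * n] = (U @ np.diag(sw) @ Vt).ravel()

    # Side information the detector needs (hypothetical key layout).
    key = (Uw, Vwt, s)
    return marked, key


def extract_svd_watermark(marked, key, n, k=0.01):
    """Recover the watermark from a (possibly distorted) marked signal."""
    Uw, Vwt, s = key
    block = np.asarray(marked, dtype=float)[: n * n].reshape(n, n)

    # Singular values of the received block stand in for sw.
    sw_received = np.linalg.svd(block, compute_uv=False)

    # Invert the embedding: D = Uw @ diag(sw) @ Vwt, W = (D - diag(s)) / k
    D_received = Uw @ np.diag(sw_received) @ Vwt
    return (D_received - np.diag(s)) / k


# Usage on synthetic data: a random "audio" vector and a binary watermark.
rng = np.random.default_rng(0)
audio = rng.standard_normal(300)
mark = rng.integers(0, 2, size=(16, 16)).astype(float)
marked_audio, secret_key = embed_svd_watermark(audio, mark)
recovered = extract_svd_watermark(marked_audio, secret_key, n=16)
```

A segment-by-segment variant would, under the same assumptions, split the signal into consecutive blocks of n*n samples and call the embedding function once per block, keeping one key per segment.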



Title: Sensitivity of automatic speaker identification to SVD digital audio watermarking
Authors: Fathi E. Abd El-Samie, Amira Shafik, Hala S. El-sayed, Said M. Elhalafawy, Salaheldin M. Diab, Bassiouny M. Sallam, Osama S. Faragallah
Publication date: 1 September 2015
Publisher: Springer US
Published in: International Journal of Speech Technology / Issue 4/2015
Print ISSN: 1381-2416
Electronic ISSN: 1572-8110
DOI: https://doi.org/10.1007/s10772-015-9292-6
