International Journal of Speech Technology

Published: 29 August 2015

Hybrid speech enhancement with empirical mode decomposition and spectral subtraction for efficient speaker identification

Authors: Samia Abd El-Moneim, Moawad I. Dessouky, Fathi E. Abd El-Samie, M. A. Nassar, Mohammed Abd El-Naby

Published in: International Journal of Speech Technology | Issue 4/2015

Abstract

Speech enhancement is a very important pre-processing step in various speech processing applications such as speech recognition, speaker identification, speech coding, and speech synthesis. In this paper, we focus on speech enhancement prior to speaker identification, because the degradations of the speech signals may cause difficulties in hearing, understanding, and speaker recognition. The paper presents a hybrid speech enhancement method based on empirical mode decomposition combined with spectral subtraction to improve the quality of speech signals prior to speaker identification. Simulation results show an improvement in speaker recognition rates with the proposed speech enhancement method as a pre-processing step.
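The abstract does not say how the empirical mode decomposition and spectral subtraction stages are combined, so the following is only a minimal sketch of the spectral subtraction half, applied to a single frame. It assumes the noise magnitude spectrum is already known (in practice it would be estimated from speech-free segments), reuses the noisy phase, and clips negative magnitudes. The function names `dft`, `idft`, and `spectral_subtract`, the `floor` parameter, and the test signals are illustrative choices, not taken from the paper. The naive transform stands in for an FFT to keep the example dependency-free.

```python
import cmath
import math


def dft(x):
    """Naive O(N^2) discrete Fourier transform (stand-in for an FFT)."""
    n = len(x)
    return [
        sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def idft(spectrum):
    """Inverse of dft(); returns the real part of each time sample."""
    n = len(spectrum)
    return [
        sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)).real / n
        for t in range(n)
    ]


def spectral_subtract(frame, noise_mag, floor=0.0):
    """Subtract an estimated noise magnitude spectrum from one frame.

    The phase of the noisy frame is kept. Magnitudes that would drop
    below floor * (noisy magnitude) are clipped to that value.
    """
    cleaned = []
    for value, noise in zip(dft(frame), noise_mag):
        mag = abs(value)
        new_mag = max(mag - noise, floor * mag)
        cleaned.append(cmath.rect(new_mag, cmath.phase(value)))
    return idft(cleaned)


# Hypothetical test signal: a tone plus a weaker interfering tone ("hum").
n = 64
tone = [math.sin(2 * math.pi * 4 * t / n) for t in range(n)]
hum = [0.3 * math.sin(2 * math.pi * 13 * t / n) for t in range(n)]
noisy = [s + h for s, h in zip(tone, hum)]

# Assumption: the noise magnitude spectrum is known exactly here.
noise_mag = [abs(v) for v in dft(hum)]

enhanced = spectral_subtract(noisy, noise_mag)
max_err = max(abs(a - b) for a, b in zip(enhanced, tone))
print("largest deviation from the clean tone:", max_err)
```

Real speech is processed in short overlapping windowed frames, and an imperfect noise estimate leaves residual "musical noise", which is one motivation for pairing spectral subtraction with a second stage such as empirical mode decomposition.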

Title: Hybrid speech enhancement with empirical mode decomposition and spectral subtraction for efficient speaker identification
Authors: Samia Abd El-Moneim, Moawad I. Dessouky, Fathi E. Abd El-Samie, M. A. Nassar, Mohammed Abd El-Naby
Publication date: 29 August 2015
Publisher: Springer US
Published in: International Journal of Speech Technology, Issue 4/2015
Print ISSN: 1381-2416
Electronic ISSN: 1572-8110
DOI: https://doi.org/10.1007/s10772-015-9293-5

