2015 | OriginalPaper | Book chapter

Improved Language Identification in Presence of Speech Coding

Authors: Ravi Kumar Vuddagiri, Hari Krishna Vydana, Jiteesh Varma Bhupathiraju, Suryakanth V. Gangashetty, Anil Kumar Vuppala

Published in: Mining Intelligence and Knowledge Exploration

Publisher: Springer International Publishing

Abstract

Automatically identifying the spoken language from speech plays a vital role in multilingual speech processing applications. The rapid growth in the use of mobile communication devices has made it necessary for speech processing applications to operate in mobile environments. Performance degradation in speech processing applications is mainly due to varying background environments, speech coding, and transmission errors. In this work, we focus on developing a language identification (LID) system that is robust to the degradations caused by speech coding, in the Indian context. Sonorant regions of speech are regions that are perceptually loud and carry a clear pitch. The quality of coded speech is higher in high-sonority regions than in less sonorant regions. Spectral features (MFCC) extracted from high-sonority regions of speech are therefore used for language identification. A GMM-UBM based modelling technique is employed to develop the LID system. The study is carried out on the IITKGP-MLILSC speech database.
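The pipeline outlined in the abstract (keep only high-sonority frames, then score them against language models built around a universal background model) might look roughly like the following sketch. It is not the authors' implementation: the two-dimensional "features" stand in for MFCC vectors, the per-frame `sonority` scores and the `threshold` are hypothetical placeholders for whatever sonority detector is actually used, and all model parameters are made up for illustration. Training the UBM and adapting it to each language are not shown.

```python
import math

def log_gauss_diag(x, mean, var):
    """Log density of a diagonal-covariance Gaussian at feature vector x."""
    return sum(
        -0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, mean, var)
    )

def gmm_frame_loglik(x, gmm):
    """Log-likelihood of one frame under a GMM given as (weight, mean, var) tuples."""
    terms = [math.log(w) + log_gauss_diag(x, m, v) for w, m, v in gmm]
    top = max(terms)
    # log-sum-exp over mixture components for numerical stability
    return top + math.log(sum(math.exp(t - top) for t in terms))

def select_sonorant(frames, sonority, threshold):
    """Keep frames whose (hypothetical) sonority score meets the threshold."""
    return [f for f, s in zip(frames, sonority) if s >= threshold]

def identify_language(frames, language_gmms, ubm):
    """Score each language by the average per-frame log-likelihood ratio vs. the UBM."""
    scores = {}
    for lang, gmm in language_gmms.items():
        llr = [gmm_frame_loglik(x, gmm) - gmm_frame_loglik(x, ubm) for x in frames]
        scores[lang] = sum(llr) / len(llr)  # assumes at least one frame was kept
    return max(scores, key=scores.get), scores

# Toy models (assumed values): each language model differs from the UBM
# only in the mean of its first component.
ubm = [(0.5, [0.0, 0.0], [1.0, 1.0]), (0.5, [3.0, 3.0], [1.0, 1.0])]
language_gmms = {
    "lang_a": [(0.5, [-1.0, 0.0], [1.0, 1.0]), (0.5, [3.0, 3.0], [1.0, 1.0])],
    "lang_b": [(0.5, [1.0, 0.0], [1.0, 1.0]), (0.5, [3.0, 3.0], [1.0, 1.0])],
}

# Toy utterance: four frames with made-up sonority scores in [0, 1].
frames = [[-1.1, 0.2], [5.0, 5.0], [-0.9, -0.1], [-1.0, 0.1]]
sonority = [0.9, 0.1, 0.8, 0.7]

kept = select_sonorant(frames, sonority, threshold=0.5)
best, scores = identify_language(kept, language_gmms, ubm)
print(best)  # prints: lang_a
```

In a closed-set decision like this one, the UBM term is identical for every language and does not change which language wins; it mainly serves to normalise the scores, for example if a rejection threshold were added.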

Title: Improved Language Identification in Presence of Speech Coding
Authors: Ravi Kumar Vuddagiri, Hari Krishna Vydana, Jiteesh Varma Bhupathiraju, Suryakanth V. Gangashetty, Anil Kumar Vuppala
Publisher: Springer International Publishing
Book: Mining Intelligence and Knowledge Exploration
Print ISBN: 978-3-319-26831-6
Electronic ISBN: 978-3-319-26832-3
Copyright year: 2015
DOI: https://doi.org/10.1007/978-3-319-26832-3_30
