2015 | Original Paper | Book Chapter

Improved Language Identification in Presence of Speech Coding

Authors: Ravi Kumar Vuddagiri, Hari Krishna Vydana, Jiteesh Varma Bhupathiraju, Suryakanth V. Gangashetty, Anil Kumar Vuppala

Published in: Mining Intelligence and Knowledge Exploration

Publisher: Springer International Publishing

Abstract

Automatically identifying the language being spoken plays a vital role in multilingual speech processing applications. The rapid growth in the use of mobile communication devices has made it necessary to operate speech processing applications in mobile environments. The performance of such applications degrades mainly because of varying background environments, speech coding, and transmission errors. In this work, we focus on developing a language identification (LID) system that is robust to the degradations introduced by speech coding, in the Indian scenario. Sonorant regions of speech are regions that are perceptually loud and carry a clear pitch. The quality of coded speech is higher in high-sonority regions than in less sonorant regions. Spectral features (MFCC) extracted from high-sonority regions of speech are therefore used for language identification. A GMM-UBM based modelling technique is employed to develop the LID system. The study is carried out on the IITKGP-MLILSC speech database.
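
The abstract names GMM-UBM modelling but gives no implementation details. The sketch below shows one common way such a system can be set up, under assumptions that are not taken from the chapter: diagonal-covariance mixtures, adaptation of the means only, and a relevance factor of 16. The feature matrices stand in for MFCC frames that have already been selected from high-sonority regions; how those regions are detected is not covered here. All function names (`map_adapt_means`, `identify`, and so on) are hypothetical.

```python
import numpy as np


def log_gauss_diag(x, means, variances):
    """Log density of every frame under every diagonal-covariance component.

    x: (T, D) feature frames; means, variances: (K, D). Returns (T, K).
    """
    diff = x[:, None, :] - means[None, :, :]
    log_norm = np.sum(np.log(2.0 * np.pi * variances), axis=1)  # (K,)
    mahal = np.sum(diff ** 2 / variances[None, :, :], axis=2)   # (T, K)
    return -0.5 * (log_norm[None, :] + mahal)


def frame_log_likelihood(x, weights, means, variances):
    """Per-frame mixture log-likelihood, computed with log-sum-exp."""
    log_p = np.log(weights)[None, :] + log_gauss_diag(x, means, variances)
    peak = log_p.max(axis=1, keepdims=True)
    return (peak + np.log(np.exp(log_p - peak).sum(axis=1, keepdims=True))).ravel()


def map_adapt_means(x, weights, means, variances, relevance=16.0):
    """Mean-only MAP adaptation of a UBM towards the frames in x.

    relevance=16 is an assumed value, not one reported in the chapter.
    """
    log_p = np.log(weights)[None, :] + log_gauss_diag(x, means, variances)
    log_p -= log_p.max(axis=1, keepdims=True)
    post = np.exp(log_p)
    post /= post.sum(axis=1, keepdims=True)        # responsibilities, (T, K)
    counts = post.sum(axis=0)                      # soft frame counts, (K,)
    data_means = (post.T @ x) / np.maximum(counts, 1e-10)[:, None]
    alpha = (counts / (counts + relevance))[:, None]
    # Components that saw little data stay close to the UBM means.
    return alpha * data_means + (1.0 - alpha) * means


def identify(x, weights, variances, language_means):
    """Return the language whose adapted model gives the highest mean log-likelihood."""
    scores = {
        lang: frame_log_likelihood(x, weights, mu, variances).mean()
        for lang, mu in language_means.items()
    }
    return max(scores, key=scores.get), scores


# Toy usage with synthetic 2-D "features" (placeholders for real MFCC frames).
rng = np.random.default_rng(0)
ubm_weights = np.array([0.5, 0.5])
ubm_means = np.array([[0.0, 0.0], [6.0, 6.0]])
ubm_vars = np.ones((2, 2))

train = {
    "A": rng.normal([-1.0, -1.0], 0.5, size=(200, 2)),
    "B": rng.normal([1.0, 1.0], 0.5, size=(200, 2)),
}
models = {
    lang: map_adapt_means(feats, ubm_weights, ubm_means, ubm_vars)
    for lang, feats in train.items()
}
test_frames = rng.normal([1.0, 1.0], 0.5, size=(50, 2))
best_language, language_scores = identify(test_frames, ubm_weights, ubm_vars, models)
```

In this sketch every language model shares the UBM weights and variances and differs only in its adapted means, which keeps the per-language training data requirement small. A real system would train the UBM on pooled data from all languages and use many more mixture components than the two shown here.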

Metadata
Title
Improved Language Identification in Presence of Speech Coding
Authors
Ravi Kumar Vuddagiri
Hari Krishna Vydana
Jiteesh Varma Bhupathiraju
Suryakanth V. Gangashetty
Anil Kumar Vuppala
Copyright Year
2015
DOI
https://doi.org/10.1007/978-3-319-26832-3_30