2017 | Original Paper | Book Chapter

Enhanced Automatic Speech Recognition with Non-acoustic Parameters

Authors: N. S. Sreekanth, N. K. Narayanan

Published in: Proceedings of the International Conference on Signal, Networks, Computing, and Systems

Publisher: Springer India

Abstract

This paper discusses a novel method for improving the accuracy of automatic speech recognition (ASR) systems by adding non-acoustic parameters. Gestural features, which are commonly co-expressive with speech, are used to improve the accuracy of an ASR system in noisy environments. Both dynamic and static gestures are integrated with the speech recognition system and tested under various environmental conditions, i.e., noise levels. The accuracy of a continuous speech recognition system and of an isolated word recognition system is tested with and without gestures under these noise conditions. The addition of visual features keeps recognition accuracy stable as the environmental noise affecting the acoustic signal varies.
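The abstract does not say how the gesture and speech outputs are combined. As an illustration only, the sketch below assumes a weighted log-linear late fusion of per-word scores from two independent recognizers. This is a common multimodal technique, not necessarily the authors' method. The function names, the `gesture_weight` parameter, and the example scores are all hypothetical.

```python
import math


def fuse_scores(acoustic_logp, gesture_logp, gesture_weight=0.5):
    """Weighted sum of per-word log-scores (assumed late-fusion scheme).

    Both inputs are dicts mapping candidate words to log-probabilities.
    Only words scored by both recognizers are fused.
    """
    if not 0.0 <= gesture_weight <= 1.0:
        raise ValueError("gesture_weight must lie in [0, 1]")
    shared = acoustic_logp.keys() & gesture_logp.keys()
    if not shared:
        raise ValueError("the two recognizers share no candidate words")
    return {
        word: (1.0 - gesture_weight) * acoustic_logp[word]
        + gesture_weight * gesture_logp[word]
        for word in shared
    }


def recognize(acoustic_logp, gesture_logp, gesture_weight=0.5):
    """Return the candidate word with the highest fused score."""
    fused = fuse_scores(acoustic_logp, gesture_logp, gesture_weight)
    return max(fused, key=fused.get)


# Hypothetical scores: noise makes the acoustic recognizer slightly prefer
# the wrong word, while the gesture recognizer is confident.
acoustic = {"left": math.log(0.40), "right": math.log(0.45), "stop": math.log(0.15)}
gesture = {"left": math.log(0.80), "right": math.log(0.10), "stop": math.log(0.10)}

print(recognize(acoustic, gesture, gesture_weight=0.0))  # acoustic evidence only
print(recognize(acoustic, gesture, gesture_weight=0.5))  # equal weighting
```

In a scheme like this, the weight would normally be raised as the estimated noise level increases, so that the non-acoustic channel dominates when the audio is unreliable.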

Title: Enhanced Automatic Speech Recognition with Non-acoustic Parameters
Authors: N. S. Sreekanth, N. K. Narayanan
Publisher: Springer India
Book: Proceedings of the International Conference on Signal, Networks, Computing, and Systems
Print ISBN: 978-81-322-3590-3
Electronic ISBN: 978-81-322-3592-7
Copyright year: 2017
DOI: https://doi.org/10.1007/978-81-322-3592-7_10
