2010 | OriginalPaper | Book chapter

1. History and Development of Speech Recognition

Written by: Sadaoki Furui

Published in: Speech Technology

Publisher: Springer US

Abstract

Speech is the primary means of communication between humans. For reasons ranging from technological curiosity about the mechanisms for mechanical realization of human speech capabilities to the desire to automate simple tasks which necessitate human–machine interactions, research in automatic speech recognition by machines has attracted a great deal of attention for five decades.



Title: History and Development of Speech Recognition
Written by: Sadaoki Furui
Publisher: Springer US
Book: Speech Technology
Print ISBN: 978-0-387-73818-5

Electronic ISBN: 978-0-387-73819-2

Copyright year: 2010
DOI: https://doi.org/10.1007/978-0-387-73819-2_1
