2010 | OriginalPaper | Book chapter

6. Cognitive Approaches to Spoken Language Technology

Written by: Roger K. Moore

Published in: Speech Technology

Publisher: Springer US


Abstract

As evidenced by the contributions of the other authors in this volume, spoken language technology (SLT) has made great strides over the past 20 or so years. The introduction of data-driven machine-learning approaches to building statistical models for automatic speech recognition (ASR), unit selection inventories for text-to-speech synthesis (TTS) or interaction strategies for spoken language dialogue systems (SLDS) has given rise to a steady year-on-year improvement in system capabilities. Such continued incremental progress has also been underpinned by a regime of public benchmark testing sponsored by national funding agencies, such as DARPA, coupled with an ongoing increase in available computer power.


Footnotes
1
Current spoken language technology is literally ‘mindless’!
 
Metadata
Title
Cognitive Approaches to Spoken Language Technology
Written by
Roger K. Moore
Copyright year
2010
Publisher
Springer US
DOI
https://doi.org/10.1007/978-0-387-73819-2_6