2015 | OriginalPaper | Chapter

5. HALEF: An Open-Source Standard-Compliant Telephony-Based Modular Spoken Dialog System: A Review and An Outlook

Authors: David Suendermann-Oeft, Vikram Ramanarayanan, Moritz Teckenbrock, Felix Neutatz, Dennis Schmidt

Published in: Natural Language Dialog Systems and Intelligent Assistants

Publisher: Springer International Publishing

Abstract

We describe completed and ongoing research on HALEF, a telephony-based, open-source spoken dialog system that can be used with different plug-and-play back-end modules. We present two examples of such modules: one classifies whether the caller is intoxicated, and the other is a question-answering application. The system complies with World Wide Web Consortium (W3C) and related industry standards while maintaining an open codebase, which encourages progressive development and provides a common, standard testbed for spoken dialog system development and benchmarking. The system can be deployed for a wide range of potential applications, including intelligent tutoring, language learning, and assessment.


Footnotes
1. Popular grammar formats include JSGF (Java Speech Grammar Format), SRGS (Speech Recognition Grammar Specification), and ARPA (Advanced Research Projects Agency) formats.

2. Because the data collected during the different ALC (Alcohol Language Corpus) experiments are not balanced in terms of class and gender, we removed all speakers who were recorded in only one of the classification states. We then discarded as many male speakers (selected at random) as necessary to achieve gender balance.
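
The balancing procedure in footnote 2 can be illustrated with a minimal Python sketch. It is not the authors' code: the function name `balance_speakers`, the `(speaker_id, gender, state)` record layout, and the `"m"`/`"f"` gender labels are all assumptions made for this example, and it assumes male speakers are the surplus group, as the footnote implies.

```python
import random
from collections import defaultdict


def balance_speakers(recordings, seed=0):
    """Return the speaker IDs kept after the two balancing steps.

    `recordings` is a list of (speaker_id, gender, state) tuples.
    This layout and the "m"/"f" labels are hypothetical.
    """
    states_by_speaker = defaultdict(set)
    gender_of = {}
    for speaker_id, gender, state in recordings:
        states_by_speaker[speaker_id].add(state)
        gender_of[speaker_id] = gender

    # Step 1: remove speakers recorded in only one classification state.
    kept = sorted(s for s, states in states_by_speaker.items() if len(states) > 1)

    males = [s for s in kept if gender_of[s] == "m"]
    females = [s for s in kept if gender_of[s] == "f"]

    # Step 2: discard randomly selected male speakers until the
    # number of male and female speakers is equal.
    rng = random.Random(seed)
    surplus = max(0, len(males) - len(females))
    dropped = set(rng.sample(males, surplus))
    return [s for s in kept if s not in dropped]
```

Fixing the seed makes the random selection reproducible, which matters if the resulting split is used for benchmarking.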
 
Metadata
Copyright Year
2015
DOI
https://doi.org/10.1007/978-3-319-19291-8_5