
2016 | Book

The Conversational Interface

Talking to Smart Devices


About this book

This book provides a comprehensive introduction to the conversational interface, which is becoming the main mode of interaction with virtual personal assistants, smart devices, various types of wearables, and social robots. The book consists of four parts. Part I presents the background to conversational interfaces, examining past and present work on spoken language interaction with computers. Part II covers the various technologies that are required to build a conversational interface along with practical chapters and exercises using open source tools. Part III looks at interactions with smart devices, wearables, and robots, and discusses the role of emotion and personality in the conversational interface. Part IV examines methods for evaluating conversational interfaces and discusses future directions.

Table of Contents

Frontmatter
Chapter 1. Introducing the Conversational Interface
Abstract
Conversational interfaces enable people to interact with smart devices using conversational spoken language. This book describes the technologies behind the conversational interface. Following a brief introduction, we describe the intended readership of the book and how the book is organized. The final section lists the apps and code that have been developed to illustrate the technology of conversational interfaces and to enable readers to gain hands-on experience using open-source software.
Michael McTear, Zoraida Callejas, David Griol

Conversational Interfaces: Preliminaries

Frontmatter
Chapter 2. The Dawn of the Conversational Interface
Abstract
With a conversational interface, people can speak to their smartphones and other smart devices in a natural way in order to obtain information, access Web services, issue commands, and engage in general chat. This chapter presents some examples of conversational interfaces and reviews technological advances that have made conversational interfaces possible. Following this, there is an overview of the technologies that make up a conversational interface.
Michael McTear, Zoraida Callejas, David Griol
Chapter 3. Toward a Technology of Conversation
Abstract
Conversation is a natural and intuitive mode of interaction. As humans, we engage all the time in conversation without having to think about how conversation actually works. In this chapter, we examine the key features of conversational interaction that will inform us as we develop conversational interfaces for a range of smart devices. In particular, we describe how utterances in a conversation can be viewed as actions that are performed in the pursuit of a goal; how conversation is structured; how participants in conversation collaborate to make conversation work; what the language of conversation looks like; and the implications for developers of applications that engage in conversational interaction with humans.
Michael McTear, Zoraida Callejas, David Griol
Chapter 4. Conversational Interfaces: Past and Present
Abstract
Conversational interfaces have a long history, starting in the 1960s with text-based dialog systems for question answering and chatbots that simulated casual conversation. Speech-based dialog systems began to appear in the late 1980s, and spoken dialog technology became a key area of research within the speech and language communities. At the same time, commercially deployed spoken dialog systems, known in the industry as voice user interfaces (VUIs), began to emerge. Embodied conversational agents (ECAs) and social robots were also being developed. These systems combine facial expression, body stance, hand gestures, and speech in order to provide a more human-like and more engaging interaction. In this chapter, we review developments in spoken dialog systems, VUIs, embodied conversational agents, social robots, and chatbots, and outline findings and achievements from this work that will be important for the next generation of conversational interfaces.
Michael McTear, Zoraida Callejas, David Griol

Developing a Speech-Based Conversational Interface

Frontmatter
Chapter 5. Speech Input and Output
Abstract
When a user speaks to a conversational interface, the system has to be able to recognize what was said. The automatic speech recognition (ASR) component processes the acoustic signal that represents the spoken utterance and outputs a sequence of word hypotheses, thus transforming the speech into text. The other side of the coin is text-to-speech synthesis (TTS), in which written text is transformed into speech. There has been extensive research in both these areas, and striking improvements have been made over the past decade. In this chapter, we provide an overview of the processes of ASR and TTS.
Michael McTear, Zoraida Callejas, David Griol
Chapter 6. Implementing Speech Input and Output
Abstract
There are a number of different open-source tools that allow developers to add speech input and output to their apps. In this chapter, we describe two different technologies that can be used for conversational systems, one for systems running on the Web and the other for systems running on mobile devices. For the Web, we will focus on the HTML5 Web Speech API (Web SAPI), while for mobile devices we will describe the Android Speech APIs.
Michael McTear, Zoraida Callejas, David Griol
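
The following is a minimal sketch of the kind of speech input and output the chapter's exercises cover, assuming a browser that exposes the (possibly prefixed) Web Speech API; it recognizes a single utterance and speaks the transcript back. It is an illustration rather than the book's own example code.

// Minimal sketch of speech input and output with the browser Web Speech API.
// Assumes a browser that exposes webkitSpeechRecognition and speechSynthesis.
const SpeechRecognitionImpl =
  (window as any).SpeechRecognition || (window as any).webkitSpeechRecognition;

const recognizer = new SpeechRecognitionImpl();
recognizer.lang = "en-US";
recognizer.onresult = (event: any) => {
  // Take the top hypothesis returned by the recognizer (speech to text).
  const transcript = event.results[0][0].transcript;
  console.log("Recognized:", transcript);
  // Echo it back using text-to-speech synthesis.
  window.speechSynthesis.speak(new SpeechSynthesisUtterance(transcript));
};
recognizer.start();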
Chapter 7. Creating a Conversational Interface Using Chatbot Technology
Abstract
Conversational interfaces can be built using a variety of technologies. This chapter shows how to create a conversational interface using chatbot technology in which pattern matching is used to interpret the user’s input and templates are used to provide the system’s output. Numerous conversational interfaces have been built in this way, initially to develop systems that could engage in conversation in a human-like way but also more recently to create automated online assistants to complement or even replace human-provided services in call centers. In this chapter, some working examples of conversational interfaces using the Pandorabots platform are presented, along with a tutorial on AIML, a markup language for specifying conversational interactions.
Michael McTear, Zoraida Callejas, David Griol
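
The sketch below illustrates the pattern-and-template idea in simplified form; it is a hypothetical example written for this overview and does not reproduce AIML's full wildcard and recursion (srai) semantics or the Pandorabots engine.

// Simplified sketch of chatbot-style pattern matching with response templates.
// Real AIML supports wildcards, recursion (<srai>), and context; this only
// matches a pattern with a single trailing "*" wildcard and fills the capture
// into the template.
interface Category { pattern: string; template: string }

const categories: Category[] = [
  { pattern: "HELLO", template: "Hi there! How can I help you?" },
  { pattern: "MY NAME IS *", template: "Nice to meet you, *." },
];

function respond(input: string): string {
  const normalized = input.toUpperCase().replace(/[^A-Z ]/g, "").trim();
  for (const { pattern, template } of categories) {
    if (pattern.endsWith("*")) {
      const prefix = pattern.slice(0, -1);
      if (normalized.startsWith(prefix)) {
        const captured = normalized.slice(prefix.length).trim();
        return template.replace("*", captured);
      }
    } else if (normalized === pattern) {
      return template;
    }
  }
  return "I'm not sure I understand.";
}

console.log(respond("My name is Ada")); // -> "Nice to meet you, ADA."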
Chapter 8. Spoken Language Understanding
Abstract
Spoken language understanding (SLU) involves taking the output of the speech recognition component and producing a representation of its meaning that can be used by the dialog manager (DM) to decide what to do next in the interaction. As systems have become more conversational, allowing the user to express their commands and queries in a more natural way, SLU has become a hot topic for the next generation of conversational interfaces. SLU embraces a wide range of technologies that can be used for various tasks involving the processing of text. In this chapter, we provide an overview of these technologies, focusing in particular on those that are relevant to the conversational interface.
Michael McTear, Zoraida Callejas, David Griol
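
A common way to picture the output of SLU is a semantic frame consisting of an intent plus slot values. The sketch below uses a hypothetical flight-booking frame and a toy keyword rule; it merely stands in for the statistical models discussed in the chapter.

// Hypothetical semantic frame: SLU maps the recognizer's text output to an
// intent plus slot values that the dialog manager can act on.
interface SemanticFrame {
  intent: string;
  slots: Record<string, string>;
  confidence: number;
}

// Toy rule-based interpreter; statistical SLU replaces such rules with models
// trained on annotated utterances.
function interpret(utterance: string): SemanticFrame {
  const text = utterance.toLowerCase();
  if (text.includes("flight")) {
    const match = text.match(/to (\w+)/);
    return {
      intent: "book_flight",
      slots: match ? { destination: match[1] } : {},
      confidence: 0.7, // placeholder; real systems derive a score from the model
    };
  }
  return { intent: "unknown", slots: {}, confidence: 0.0 };
}

console.log(interpret("book a flight to london tomorrow"));
// -> { intent: "book_flight", slots: { destination: "london" }, confidence: 0.7 }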
Chapter 9. Implementing Spoken Language Understanding
Abstract
There is a wide range of tools that support various spoken language processing tasks, some of which are particularly relevant for spoken language understanding in conversational interfaces. Here, the main task is to detect the user’s intent and to extract any further information that is required to understand the utterance. This chapter provides a tutorial on the Api.ai platform, which has been widely used to support the development of applications for mobile and wearable devices as well as for smart homes and automobiles. The chapter also reviews some similar tools provided by Wit.ai, Amazon Alexa, and Microsoft LUIS, and looks briefly at other tools that have been widely used in natural language processing and that are potentially relevant for conversational interfaces.
Michael McTear, Zoraida Callejas, David Griol
Chapter 10. Dialog Management
Abstract
One of the core aspects in the development of conversational interfaces is to design the dialog management strategy. The dialog management strategy defines the system’s conversational behaviors in response to user utterances and environmental states. The design of this strategy is usually carried out in industry by handcrafting dialog strategies that are tightly coupled to the application domain in order to optimize the behavior of the conversational interface in that context. More recently, the research community has proposed ways of automating the design of dialog strategies by using statistical models trained with real conversations. This chapter describes the main challenges and tasks in dialog management. We also analyze the main approaches that have been proposed for developing dialog managers and the most important methodologies and standards that can be used for the practical implementation of this important component of a conversational interface.
Michael McTear, Zoraida Callejas, David Griol
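
As a small illustration of the handcrafted, domain-specific style of strategy described in the abstract above, the following frame-based sketch (a hypothetical flight-booking domain, not an example from the book) asks for whichever required slot is still missing and then confirms.

// Minimal sketch of a handcrafted, frame-based dialog management strategy:
// keep asking for whichever required slot is still missing, then confirm.
type Frame = { origin?: string; destination?: string; date?: string };

const prompts: Record<keyof Frame, string> = {
  origin: "Where are you flying from?",
  destination: "Where would you like to go?",
  date: "What day would you like to travel?",
};

function nextSystemAction(frame: Frame): string {
  for (const slot of Object.keys(prompts) as (keyof Frame)[]) {
    if (!frame[slot]) {
      return prompts[slot]; // request the first missing slot
    }
  }
  // All slots filled: move to an explicit confirmation.
  return `So that's ${frame.origin} to ${frame.destination} on ${frame.date}. Is that correct?`;
}

console.log(nextSystemAction({ destination: "london" }));
// -> "Where are you flying from?"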
Chapter 11. Implementing Dialog Management
Abstract
There is a wide range of tools that support the generation of rule-based dialog managers for conversational interfaces. However, it is not as easy to find toolkits to develop statistical dialog managers based on reinforcement learning and/or corpus-based techniques. In this chapter, we have selected the VoiceXML standard to put into practice the handcrafted approach, given that this standard is used widely in industry to develop voice user interfaces. The second part of the chapter describes the use of a statistical dialog management technique to show the application of this kind of methodology for the development of practical conversational interfaces.
Michael McTear, Zoraida Callejas, David Griol
Chapter 12. Response Generation
Abstract
Once the dialog manager has interpreted the user’s input and decided how to respond, the next step for the conversational interface is to determine the content of the response and how best to express it. This stage is known as response generation (RG). The system’s verbal output is generated as a stretch of text and passed to the text-to-speech component to be rendered as speech. In this chapter, we provide an overview of the technology of RG and discuss tools and other resources.
Michael McTear, Zoraida Callejas, David Griol
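
In its simplest form, response generation is template filling: the dialog manager's chosen act and content are rendered as a text string and passed to the TTS component. The sketch below uses hypothetical dialog act names and is only an illustration of that idea.

// Minimal template-based response generation: map a dialog act plus content
// to a text string that the text-to-speech component will render as speech.
type DialogAct =
  | { act: "inform_weather"; city: string; forecast: string }
  | { act: "request_slot"; slot: string };

function generate(response: DialogAct): string {
  switch (response.act) {
    case "inform_weather":
      return `The forecast for ${response.city} is ${response.forecast}.`;
    case "request_slot":
      return `Could you tell me the ${response.slot}?`;
  }
}

console.log(generate({ act: "inform_weather", city: "Belfast", forecast: "light rain" }));
// -> "The forecast for Belfast is light rain."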

Conversational Interfaces and Devices

Frontmatter
Chapter 13. Conversational Interfaces: Devices, Wearables, Virtual Agents, and Robots
Abstract
We are surrounded by a plethora of smart objects such as devices, wearables, virtual agents, and social robots that should help to make our life easier in many different ways by fulfilling various needs and requirements. A conversational interface is the best way to communicate with this wide range of smart objects. In this chapter, we cover the special requirements of conversational interaction with smart objects, describing the main development platforms, the possibilities offered by different types of device, and the relevant issues that need to be considered in interaction design.
Michael McTear, Zoraida Callejas, David Griol
Chapter 14. Emotion, Affect, and Personality
Abstract
Affect is a key factor in human conversation. It allows us to fully understand each other, be socially competent, and show that we care. As such, in order to build conversational interfaces that display credible and expressive behaviors, we should endow them with the capability to recognize, adapt to, and render emotion. In this chapter, we explain the background to how emotional aspects and personality are conceptualized in artificial systems and outline the benefits of endowing the conversational interface with the ability to recognize and display emotions and personality.
Michael McTear, Zoraida Callejas, David Griol
Chapter 15. Affective Conversational Interfaces
Abstract
In order to build artificial conversational interfaces that display behaviors that are credible and expressive, we should endow them with the capability to recognize, adapt to, and render emotion. In this chapter, we explain how the recognition of emotional aspects is managed within conversational interfaces, including modeling and representation, emotion recognition from physiological signals, acoustics, text, facial expressions, and gestures and how emotion synthesis is managed through expressive speech and multimodal embodied agents. We also cover the main open tools and databases available for developers wishing to incorporate emotion into their conversational interfaces.
Michael McTear, Zoraida Callejas, David Griol
Chapter 16. Implementing Multimodal Conversational Interfaces Using Android Wear
Abstract
When they first appeared, conversational systems were developed as speech-only interfaces accessible usually via landline phones. Currently, they are employed in a wide variety of devices such as smartphones and wearables, with different input and output capabilities. Traditional speech-based multimodal interfaces were designed for Web and desktop applications, but current devices pose particular restrictions and challenges for multimodal interaction that must be tackled differently. In this chapter, we discuss these issues and show how they can be solved practically by building several apps for smartwatches using Android Wear that demonstrate the different alternatives available.
Michael McTear, Zoraida Callejas, David Griol

Evaluation and Future Directions

Frontmatter
Chapter 17. Evaluating the Conversational Interface
Abstract
The evaluation of conversational interfaces is a continuously evolving research area that encompasses a rich variety of methodologies, techniques, and tools. As conversational interfaces become more complex, their evaluation has become multifaceted. Furthermore, evaluation involves paying attention not only to the different components in isolation, but also to interrelations between the components and the operation of the system as a whole. This chapter discusses the main measures that are employed for evaluating conversational interfaces from a variety of perspectives.
Michael McTear, Zoraida Callejas, David Griol
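
One widely used component-level measure in this area is the word error rate of the speech recognizer: the number of word substitutions, insertions, and deletions between the reference transcript and the recognizer's hypothesis, divided by the length of the reference. The sketch below computes it with a standard edit distance and is an illustration rather than part of the book's evaluation material.

// Word error rate (WER): edit distance between reference and hypothesis word
// sequences (substitutions + insertions + deletions) divided by reference length.
function wordErrorRate(reference: string, hypothesis: string): number {
  const ref = reference.toLowerCase().split(/\s+/).filter(Boolean);
  const hyp = hypothesis.toLowerCase().split(/\s+/).filter(Boolean);
  // Dynamic-programming table for word-level Levenshtein distance.
  const d: number[][] = Array.from({ length: ref.length + 1 }, (_, i) =>
    Array.from({ length: hyp.length + 1 }, (_, j) => (i === 0 ? j : j === 0 ? i : 0))
  );
  for (let i = 1; i <= ref.length; i++) {
    for (let j = 1; j <= hyp.length; j++) {
      const cost = ref[i - 1] === hyp[j - 1] ? 0 : 1;
      d[i][j] = Math.min(d[i - 1][j] + 1, d[i][j - 1] + 1, d[i - 1][j - 1] + cost);
    }
  }
  return d[ref.length][hyp.length] / ref.length;
}

console.log(wordErrorRate("set an alarm for seven", "set the alarm for eleven")); // 0.4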
Chapter 18. Future Directions
Abstract
As a result of advances in technology, particularly in areas such as cognitive computing and deep learning, the conversational interface is becoming a reality. Given the vast number of devices that will be connected in the so-called Internet of Things, a uniform interface will be necessary both for users and for developers. We describe current developments in technology and review a number of application areas that will benefit from conversational interfaces, including smart environments, health care, care of the elderly, and conversational toys and educational assistants for children. We also discuss the need for developers of conversational interfaces to focus on bridging the digital divide for under-resourced languages.
Michael McTear, Zoraida Callejas, David Griol
Backmatter
Metadata
Title
The Conversational Interface
Authors
Michael McTear
Zoraida Callejas
David Griol
Copyright Year
2016
Electronic ISBN
978-3-319-32967-3
Print ISBN
978-3-319-32965-9
DOI
https://doi.org/10.1007/978-3-319-32967-3
