2019 | Book

Developing Enterprise Chatbots

Learning Linguistic Structures

Author: Boris Galitsky

Publisher: Springer International Publishing

Part of: Springer Professional "Wirtschaft+Technik" , Springer Professional "Technik" , Springer Professional "Wirtschaft"

About this book

A chatbot is expected to be capable of supporting a cohesive and coherent conversation and be knowledgeable, which makes it one of the most complex intelligent systems being designed nowadays. Designers have to learn to combine intuitive, explainable language understanding and reasoning approaches with high-performance statistical and deep learning technologies.

Today, there are two popular paradigms for chatbot construction:

1. Build a bot platform with universal NLP and ML capabilities so that a bot developer for a particular enterprise, not being an expert, can populate it with training data;

2. Accumulate a huge set of training dialogue data, feed it to a deep learning network and expect the trained chatbot to automatically learn “how to chat”.

Although these two approaches are reported to imitate some intelligent dialogues, both of them are unsuitable for enterprise chatbots, being unreliable and too brittle.

The latter approach is based on a belief that some learning miracle will happen and a chatbot will start functioning without a thorough feature and domain engineering by an expert and interpretable dialogue management algorithms.

Enterprise high-performance chatbots with extensive domain knowledge require a mix of statistical, inductive, deep machine learning and learning from the web, syntactic, semantic and discourse NLP, ontology-based reasoning and a state machine to control a dialogue. This book will provide a comprehensive source of algorithms and architectures for building chatbots for various domains based on the recent trends in computational linguistics and machine learning. The foci of this book are applications of discourse analysis in text relevance assessment, dialogue management and content generation, which help to overcome the limitations of platform-based and data-driven approaches.

Supplementary material and code are available at https://github.com/bgalitsky/relevance-based-on-parse-trees

Frontmatter

Chapter 1. Introduction

Abstract

This chapter is an Introduction to the book. We analyze a lack of intelligence as a major bottleneck of current dialogue systems, briefly survey current trends, discuss how to demo a chatbot and outline the pathway towards an industrial-strength one.

Boris Galitsky

Chapter 2. Chatbot Components and Architectures

Abstract

In the Introduction, we discussed how chatbot platforms offered by enterprises turned out to be good for simple cases but not really for enterprise-level deployments. In this chapter we make a first step towards industrial-strength chatbots. We will outline the main components of chatbots and show various kinds of architectures employing these components. The descriptions of these components will be the reader's starting points for learning them in depth in the subsequent chapters.

Building a chatbot for commercial use via data-driven methods poses two main challenges. The first is broad coverage: modeling natural conversation in an unrestricted number of topics is still an open problem, as shown by the current concentration of research on dialogues in restricted domains. The second is the difficulty of obtaining clean, systematic, unbiased and comprehensive datasets of open-ended and task-oriented conversations, which hinders chatbot improvement and limits the viability of purely data-driven methods such as neural networks.

We will explore the usability of rule-based and statistical machine learning-based dialogue managers, the central component in a chatbot architecture. We conclude this chapter by illustrating specific learning architectures, based on active and transfer learning.

Boris Galitsky

Chapter 3. Explainable Machine Learning for Chatbots

Abstract

Machine learning (ML) has been successfully applied to a wide variety of fields ranging from information retrieval, data mining, and speech recognition, to computer graphics, visualization, and human-computer interaction. However, most users often treat a machine learning model as a black box because of its incomprehensible functions and unclear working mechanism (Liu et al. 2017). Without a clear understanding of how and why a model works, the development of high performance models for chatbots typically relies on a time-consuming trial-and-error process. As a result, academic and industrial ML chatbot developers are facing challenges that demand more transparent and explainable systems for better understanding and analyzing ML models, especially their inner working mechanisms.

In this chapter we focus on explainability. We first discuss what explainable ML is and which of its features users desire. We then pose an example chatbot-related classification problem and show how it is solved by a transparent rule-based or ML method. After that we present a decision support-enabled chatbot that shares its explanations to back up its decisions and challenges those of a human peer. We conclude this chapter with a learning framework representing a deterministic inductive approach with complete explainability.

Boris Galitsky, Saveli Goldberg

Chapter 4. Developing Conversational Natural Language Interface to a Database

Abstract

In this chapter we focus on the problem of natural language access to a database, a well-known problem whose solution is highly desired. We start with the modern approaches based on deep learning and analyze lessons learned from unusable database access systems. This chapter can serve as a brief introduction to neural networks for learning logic representations. Then a number of hybrid approaches are presented and their strong points are analyzed. Finally, we describe our approach, which relies on parsing, a thesaurus and disambiguation via a chatbot communication mode. The conclusion is that reliable and flexible database access via NL needs to employ a broad spectrum of linguistic, knowledge representation and learning techniques. We conclude this chapter by surveying the general technology trends related to NL2SQL, observing how AI and ML are seeping into virtually everything and represent a major battleground for technology providers.
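The combination of a thesaurus lookup with disambiguation through the chat itself can be sketched in a few lines. This is a toy illustration only, not the chapter's system: the `THESAURUS` contents, the table and column names, and the `interpret` function are all hypothetical, and no parsing is performed beyond splitting the question into words.

```python
from typing import Dict, List, Tuple

# Hypothetical thesaurus: user vocabulary -> candidate "table.column" names.
THESAURUS: Dict[str, List[str]] = {
    "price": ["products.price"],
    "cost": ["products.price"],
    "name": ["customers.name", "products.name"],  # ambiguous on purpose
    "city": ["customers.city"],
}


def interpret(question: str) -> Tuple[str, str]:
    """Return ("sql", query) or ("clarify", question to ask the user)."""
    columns: List[str] = []
    for word in question.lower().replace("?", " ").split():
        candidates = THESAURUS.get(word, [])
        if len(candidates) > 1:
            # Ambiguous term: hand the decision back to the user.
            options = " or ".join(candidates)
            return ("clarify", f"By '{word}', do you mean {options}?")
        if candidates and candidates[0] not in columns:
            columns.append(candidates[0])
    if not columns:
        return ("clarify", "Which field are you interested in?")
    tables = sorted({c.split(".")[0] for c in columns})
    if len(tables) > 1:
        # A real system would plan a join; this sketch just asks.
        return ("clarify", "These fields are in different tables; which one should I use?")
    return ("sql", f"SELECT {', '.join(columns)} FROM {tables[0]}")
```

With this sketch, an unambiguous question is translated directly, while a question that mentions "name" triggers a clarification request instead of a guess.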

Boris Galitsky

Chapter 5. Assuring Chatbot Relevance at Syntactic Level

Abstract

In this chapter we implement a relevance mechanism based on the similarity of parse trees for a number of chatbot components, including search. We extend the mechanism of logical generalization towards syntactic parse trees and attempt to detect weak semantic signals from them. Generalization of syntactic parse trees, as a syntactic similarity measure, is defined as the set of maximum common sub-trees and is performed at the level of paragraphs, sentences, phrases and individual words. We analyze the semantic features of this similarity measure and compare it with the semantics of traditional anti-unification of terms. Nearest neighbor machine learning is then applied to relate a sentence to a semantic class.

Using a syntactic parse tree-based similarity measure instead of bag-of-words and keyword frequency approaches, we expect to detect a weak semantic signal that is otherwise unobservable. The proposed approach is evaluated in four distinct domains where a lack of semantic information makes classification of sentences rather difficult. We describe a toolkit which is part of the Apache Software Foundation project OpenNLP.chatbot, designed to aid search engineers and chatbot designers in tasks requiring text relevance assessment.
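To make the idea of generalization concrete, the sketch below works at the phrase level only and simplifies heavily: each phrase is a flat list of (word, part-of-speech) pairs rather than a parse tree, and the "maximal common part" is found by a weighted alignment. The weights (2 for the same word and tag, 1 for the same tag only) and the names `generalize` and `nearest_class` are assumptions made for illustration; they are not taken from the OpenNLP toolkit.

```python
from typing import List, Tuple

Token = Tuple[str, str]  # (word, part-of-speech tag)


def _pair_score(x: Token, y: Token) -> int:
    """2 if word and tag agree, 1 if only the tag agrees, 0 otherwise."""
    if x[1] != y[1]:
        return 0
    return 2 if x[0].lower() == y[0].lower() else 1


def generalize(a: List[Token], b: List[Token]) -> Tuple[List[Token], int]:
    """Align two tagged phrases and return (common pattern, score).

    Words that differ under the same tag are replaced by '*'.
    """
    n, m = len(a), len(b)
    # best[i][j] = best alignment score of a[i:] against b[j:]
    best = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n - 1, -1, -1):
        for j in range(m - 1, -1, -1):
            s = _pair_score(a[i], b[j])
            match = s + best[i + 1][j + 1] if s else 0
            best[i][j] = max(match, best[i + 1][j], best[i][j + 1])
    # Trace one optimal alignment back through the table.
    pattern: List[Token] = []
    i = j = 0
    while i < n and j < m:
        s = _pair_score(a[i], b[j])
        if s and best[i][j] == s + best[i + 1][j + 1]:
            pattern.append((a[i][0] if s == 2 else "*", a[i][1]))
            i, j = i + 1, j + 1
        elif best[i][j] == best[i + 1][j]:
            i += 1
        else:
            j += 1
    return pattern, best[0][0]


def nearest_class(sentence: List[Token], training: List[Tuple[List[Token], str]]) -> str:
    """Nearest-neighbor rule: label of the training phrase with the best score."""
    return max(training, key=lambda ex: generalize(sentence, ex[0])[1])[1]
```

For example, generalizing "digital camera with zoom" (JJ NN IN NN) and "video camera with display" (NN NN IN NN) under this scheme yields the pattern `camera/NN with/IN */NN`: the shared head noun and preposition survive, while the differing final nouns collapse to a wildcard.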

Boris Galitsky

Chapter 6. Semantic Skeleton Thesauri for Question Answering Bots

Abstract

We build a question-answering (Q/A) chatbot component for answering complex questions in poorly formalized and logically complex domains. Answers are annotated with deductively linked logical expressions (semantic skeletons), which are to be matched with formal representations for questions. We utilize a logic programming approach so that the search for an answer is implemented as determining clauses (associated with this answer) from which the formal representation of a question can be deduced. This Q/A technique has been implemented for the financial and legal domains, which are rather sophisticated on the one hand and require fairly precise answers on the other hand.
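A much-reduced illustration of matching a question's formal representation against answer annotations follows. It stands in for deduction with single-step pattern matching of a query term against ground facts, which is far weaker than the logic programming approach the abstract describes. The term encoding (tuples, variables starting with `?`), the predicate names, and the sample answers are all hypothetical.

```python
from typing import Dict, List, Optional, Tuple

# A term is ("predicate", arg1, arg2, ...); arguments starting with "?" are variables.
Term = Tuple[str, ...]


def unify(query: Term, fact: Term) -> Optional[Dict[str, str]]:
    """Match a query term against a ground fact; return bindings or None."""
    if len(query) != len(fact) or query[0] != fact[0]:
        return None
    binding: Dict[str, str] = {}
    for q, f in zip(query[1:], fact[1:]):
        if q.startswith("?"):
            # A variable must be bound consistently across the term.
            if binding.setdefault(q, f) != f:
                return None
        elif q != f:
            return None
    return binding


def find_answers(query: Term, annotated: List[Tuple[str, List[Term]]]) -> List[str]:
    """Return the answers with at least one skeleton matching the query."""
    return [
        text
        for text, skeletons in annotated
        if any(unify(query, s) is not None for s in skeletons)
    ]


# Hypothetical annotated answers from a financial domain.
ANSWERS = [
    ("Rental property expenses are deductible in the year they are paid.",
     [("deduct", "expense", "rental_property")]),
    ("Contributions to a retirement account reduce taxable income.",
     [("reduce", "taxable_income", "retirement_contribution")]),
]
```

A question such as "What can I deduct for a rental property?" might be represented as `("deduct", "?what", "rental_property")`, which matches only the first answer's skeleton.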

Boris Galitsky

Chapter 7. Learning Discourse-Level Structures for Question Answering

Abstract

Traditional parse trees are combined and enriched with anaphora and rhetorical information to form a unified representation for a paragraph of text. We refer to these representations as parse thickets. They are introduced to support answering complex questions, which include multiple sentences, to tackle as many constraints expressed in this question as possible. The question answering system is designed so that an initial set of answers, which is obtained by a TF*IDF or other keyword search model, is re-ranked. Passage re-ranking is performed by matching the parse thickets of answers with the parse thicket of the question. To do that, a graph representation and matching technique for parse structures for paragraphs of text have been developed. We define the operation of generalization of two parse thickets, as a measure of semantic similarity between paragraphs of text, to be the maximal common sub-thicket of these parse thickets.

Passage re-ranking improvement via parse thickets is evaluated in a variety of chatbot question-answering domains with long questions. Using parse thickets improves search accuracy compared with the bag-of-words, the pairwise matching of parse trees for sentences, and the tree kernel approaches. As a baseline, we use a web search engine API, which provides much more accurate search results than the majority of search benchmarks, such as TREC. A comparative analysis of the impact of various sources of discourse information on the search accuracy is conducted. An open source plug-in for SOLR is developed so that the proposed technology can be easily integrated with industrial search engines.
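The two-stage design (keyword retrieval followed by structure-aware re-ranking) can be sketched as below. The first stage here is plain term overlap rather than TF*IDF, and the second stage uses shared word bigrams as a crude stand-in for parse thicket generalization, so this shows only the shape of the pipeline. All function names are assumptions; the `similarity` parameter marks where a real thicket-matching score would be plugged in.

```python
from typing import Callable, List, Set, Tuple


def tokens(text: str) -> List[str]:
    return text.lower().split()


def keyword_search(query: str, docs: List[str], k: int = 10) -> List[str]:
    """Stage 1: rank by number of shared terms (stand-in for TF*IDF)."""
    q = set(tokens(query))
    return sorted(docs, key=lambda d: -len(q & set(tokens(d))))[:k]


def bigram_overlap(query: str, doc: str) -> int:
    """Stand-in structural score: number of shared adjacent word pairs."""
    def bigrams(text: str) -> Set[Tuple[str, str]]:
        words = tokens(text)
        return set(zip(words, words[1:]))
    return len(bigrams(query) & bigrams(doc))


def rerank(query: str, candidates: List[str],
           similarity: Callable[[str, str], int] = bigram_overlap) -> List[str]:
    """Stage 2: re-order the initial candidates by a structure-aware score."""
    return sorted(candidates, key=lambda d: -similarity(query, d))
```

Because the second stage only re-orders what the first stage returned, its quality is bounded by the recall of the keyword search, which is one reason to keep `k` generous.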

Boris Galitsky

Chapter 8. Building Chatbot Thesaurus

Abstract

We implement a scalable mechanism to build a thesaurus of entities which is intended to improve the relevance of a chatbot. The thesaurus construction process starts from the seed entities and mines available source domains for new entities associated with these seed entities. New entities are formed by applying the machine learning of syntactic parse trees (their generalizations) to the search results for existing entities to form commonalities between them. These commonality expressions then form parameters of existing entities, and are turned into new entities at the next learning iteration. To match natural language expressions between source and target domains, we use syntactic generalization, an operation that finds a set of maximal common sub-trees of the parse trees of these expressions.

Thesaurus and syntactic generalization are applied to relevance improvement in search and text similarity assessment. We conduct an evaluation of the search relevance improvement in vertical and horizontal domains and observe a significant contribution of the learned thesaurus in the former, and a noticeable contribution of a hybrid system in the latter domain. We also perform an industrial evaluation of thesaurus and syntactic generalization-based text relevance assessment and conclude that the proposed algorithm for automated thesaurus learning is suitable for integration into chatbots. The proposed algorithm is implemented as a component of the Apache OpenNLP project.
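The iterative construction described in this abstract (seed entity, search, commonalities between results, new entities, repeat) might look roughly like the loop below. Everything concrete here is an assumption for illustration: search is mocked with a dictionary (`FAKE_WEB`), and the commonality of two search results is reduced to their shared non-stop words instead of a generalization of parse trees.

```python
from itertools import combinations
from typing import Callable, Dict, List, Set, Tuple

STOP = {"a", "for", "your", "the", "to"}

# Hypothetical search results keyed by query; a real system would call a search API.
FAKE_WEB: Dict[str, List[str]] = {
    "tax": ["tax deduction for business", "claim a tax deduction today",
            "tax return filing deadline", "file your tax return"],
    "tax deduction": ["tax deduction for mortgage interest",
                      "mortgage tax deduction rules"],
    "tax return": ["tax return extension form", "extension for a tax return"],
}


def commonalities(snippets: List[str], known: Set[str]) -> Set[str]:
    """Words shared by at least one pair of snippets, minus known and stop words."""
    found: Set[str] = set()
    for a, b in combinations(snippets, 2):
        shared = set(a.lower().split()) & set(b.lower().split())
        found |= shared - known - STOP
    return found


def build_thesaurus(seed: str, search: Callable[[str], List[str]],
                    iterations: int = 2) -> Dict[Tuple[str, ...], List[str]]:
    """Grow thesaurus paths from a seed; each path maps to its learned children."""
    thesaurus: Dict[Tuple[str, ...], List[str]] = {}
    frontier: List[Tuple[str, ...]] = [(seed,)]
    for _ in range(iterations):
        next_frontier: List[Tuple[str, ...]] = []
        for path in frontier:
            snippets = search(" ".join(path))
            children = sorted(commonalities(snippets, set(path)))
            if children:
                thesaurus[path] = children
            next_frontier.extend(path + (child,) for child in children)
        frontier = next_frontier
    return thesaurus
```

Starting from the seed "tax" with the mocked results, the first iteration learns "deduction" and "return", and the second iteration refines them to "mortgage" and "extension" respectively.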

Boris Galitsky

Chapter 9. A Content Management System for Chatbots

Abstract

In this chapter we describe the industrial applications of our linguistic-based relevance technology for processing, classification and delivery of a stream of texts as data sources for chatbots. We present the content pipeline for the eBay entertainment domain that employs this technology, and show that text processing relevance is the main bottleneck for its performance. A number of components of the chatbot content pipeline such as content mining, thesaurus formation, aggregation from multiple sources, validation, de-duplication, opinion mining and integrity enforcement need to rely on domain-independent efficient text classification, entity extraction and relevance assessment operations.

Text relevance assessment is based on the operation of syntactic generalization (SG, Chap. 5), which finds a maximum common sub-tree for a pair of parse trees for sentences. Relevance of two portions of text is then defined as the cardinality of this sub-tree. SG is intended to replace keyword-based analysis with a more accurate assessment of relevance that takes phrase-level and sentence-level information into account. In the special case of SG where the short expressions are commonly used terms such as Facebook likes, SG ascends to the level of categories, and a reasoning technique is required to map these categories in the course of relevance assessment.

A number of content pipeline components employ web mining, which needs SG to compare web search results. We describe how SG works in a number of components in the content pipeline, including personalization and recommendation, and provide the evaluation results for the eBay deployment. Content pipeline support is implemented as an open source contribution, OpenNLP.Similarity.

Boris Galitsky

Chapter 10. Rhetorical Agreement: Maintaining Cohesive Conversations

Abstract

To support a natural flow of a conversation in a chatbot, the rhetorical structure of each message has to be analyzed. We classify a pair of paragraphs of text as appropriate for one to follow another, or inappropriate, based on communicative discourse considerations. To represent a multi-sentence message with respect to how it should follow a previous message in a conversation or dialogue, we build an extension of a discourse tree for it. The extended discourse tree is based on a discourse tree for RST relations with labels for communicative actions, and also additional arcs for anaphora and ontology-based relations for entities. We refer to such trees as Communicative Discourse Trees (CDTs). We explore syntactic and discourse features that are indicative of correct vs incorrect request-response or question-answer pairs. Two learning frameworks are used to recognize such correct pairs: deterministic, nearest-neighbor learning of CDTs as graphs, and tree kernel learning of CDTs, where a feature space of all CDT sub-trees is subject to SVM learning.

We form the positive training set from the correct pairs obtained from Yahoo Answers, social networks, corporate conversations including Enron emails, customer complaints and interviews by journalists. The corresponding negative training set is artificially created by attaching responses for different, inappropriate requests that include relevant keywords. The evaluation showed that it is possible to recognize valid pairs in 70% of cases in the domains of weak request-response agreement and 80% of cases in the domains of strong agreement, which is essential to support automated conversations. These accuracies are comparable with the benchmark task of classification of discourse trees themselves as valid or invalid, and also with classification of multi-sentence answers in factoid question-answering systems. The applicability of the proposed machinery to the problem of chatbots, social chats and programming via NL is demonstrated.

We conclude that learning rhetorical structures in the form of CDTs is the key source of data to support answering complex questions, chatbots and dialogue management.
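The construction of the artificial negative set can be illustrated as follows. This is a guess at one simple way to realize the description above, not the procedure used in the chapter: a request is paired with the response to a different request whenever the two share a keyword, and "keyword" is crudely approximated as any word longer than three letters.

```python
import re
from typing import List, Set, Tuple

Pair = Tuple[str, str]  # (request, response)


def keywords(text: str) -> Set[str]:
    """Crude keyword extraction: alphabetic words longer than three letters."""
    return {w for w in re.findall(r"[a-z]+", text.lower()) if len(w) > 3}


def build_negative_pairs(positive: List[Pair]) -> List[Pair]:
    """Pair each request with the first response to another request that shares a keyword."""
    negatives: List[Pair] = []
    for i, (request, _) in enumerate(positive):
        request_keywords = keywords(request)
        for j, (_, other_response) in enumerate(positive):
            if i != j and request_keywords & keywords(other_response):
                negatives.append((request, other_response))
                break
    return negatives
```

Requiring keyword overlap matters: a negative example that shares no vocabulary with the request would be trivially separable by a bag-of-words model and would teach a discourse-level classifier very little.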

Boris Galitsky

Chapter 11. Discourse-Level Dialogue Management

Abstract

In this chapter we learn how to manage a dialogue relying on the discourse of its utterances. We first explain how to build an invariant discourse tree for a corpus of texts to arrange a chatbot-facilitated navigation through this corpus. We define extended discourse trees, introduce means to manipulate them, and outline scenarios of multi-document navigation. We then show how a dialogue structure can be built from an initial utterance. After that, we introduce the imaginary discourse tree to address the problem of involving background knowledge on demand when answering questions. Finally, an approach to dialogue management based on a lattice walk is described.

Boris Galitsky

Chapter 12. A Social Promotion Chatbot

Abstract

We describe a chatbot performing advertising and social promotion (CASP) to assist in automation of managing friends and other social network contacts. This agent employs a domain-independent natural language relevance technique that filters web mining results to support a conversation with friends and other network members. This technique relies on learning parse trees and parse thickets (sets of parse trees) of paragraphs of text such as Facebook postings. To yield a web mining query from a sequence of previous postings by human agents discussing a topic, we develop a Lattice Querying algorithm which automatically adjusts the optimal level of query generality. We also propose an algorithm for CASP to make a translation into multiple languages plausible as well as a method to merge web mined textual chunks. We evaluate the relevance features, overall robustness and trust of CASP in a number of domains, acting on behalf of the author of this chapter in his Facebook account in 2014–2016. Although some Facebook friends did not like CASP postings and even unfriended the host, overall social promotion results are positive as long as relevance, style and rhetorical appropriateness are properly maintained.
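The abstract does not spell out the Lattice Querying algorithm, so the following is only one plausible reading of "adjusting query generality", offered as a sketch: take the terms shared by all postings as the most specific query, then walk towards more general queries (smaller term subsets) until a search returns enough hits. The stop-word list, `make_search`, and `lattice_query` are hypothetical names, and search is mocked over an in-memory document list.

```python
import re
from itertools import combinations
from typing import Callable, List, Sequence, Tuple

STOP = {"the", "this", "was", "we", "near", "a", "and"}


def words(text: str) -> List[str]:
    return re.findall(r"[a-z]+", text.lower())


def common_terms(postings: Sequence[str]) -> List[str]:
    """Non-stop words that occur in every posting, in alphabetical order."""
    if not postings:
        return []
    sets = [set(words(p)) - STOP for p in postings]
    return sorted(set.intersection(*sets))


def make_search(docs: Sequence[str]) -> Callable[[Tuple[str, ...]], List[str]]:
    """Mock search: documents containing all query terms."""
    def search(terms: Tuple[str, ...]) -> List[str]:
        return [d for d in docs if set(terms) <= set(words(d))]
    return search


def lattice_query(postings: Sequence[str],
                  search: Callable[[Tuple[str, ...]], List[str]],
                  min_hits: int = 1) -> Tuple[str, ...]:
    """Most specific subset of shared terms that still yields enough hits."""
    terms = common_terms(postings)
    for size in range(len(terms), 0, -1):       # specific -> general
        for combo in combinations(terms, size):
            if len(search(combo)) >= min_hits:
                return combo
    return ()
```

Note that enumerating all subsets is exponential in the number of shared terms; this is tolerable for the handful of terms a few short postings have in common, but a practical version would need pruning.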

Boris Galitsky

Chapter 13. Enabling a Bot with Understanding Argumentation and Providing Arguments

Abstract

We make our chatbot capable of exchanging arguments with users. The chatbot needs to tackle various argumentation patterns provided by a user as well as provide adequate argumentation patterns in response. To do that, the system needs to detect certain types of arguments in user utterances to “understand” the user and detect arguments in textual content to reply accordingly. Various patterns of logical and affective argumentation are detected by analyzing the discourse and communicative structure of user utterances and content to be delivered to the user. Unlike most argument-mining systems, the chatbot not only detects arguments but also performs reasoning on them for the purpose of validating the claims. We explore how the chatbot can leverage discourse-level features to assess the quality and validity of arguments as well as overall text truthfulness, integrity, cohesiveness and how emotions and sentiments are communicated. Communicative discourse trees and their extensions for sentiments and noisy user-generated content are employed in these tasks.

We conduct an evaluation of argument detection on a variety of datasets with distinct argumentation patterns, from news articles to reviews and customer complaints, to observe how discourse analysis can support a chatbot operating in these domains. Our conclusion is that domain-independent discourse-level features are a critical source of information to enable the chatbot to reproduce such a complex form of human activity as providing and analyzing arguments.

Boris Galitsky

Chapter 14. Rhetorical Map of an Answer

Abstract

In this chapter we explore the anatomy of an arbitrary text with respect to how it can answer questions. One more opportunity for discourse analysis to assist with topical relevance of an answer is identified. We discover that a discourse tree of an answer sheds light on how an answer is constructed, and how to treat keyword occurrence. There is a simple observation employed by search engines: keywords from a query need to occur in a single answer sentence for this answer to be relevant. Relying on answer anatomy, we substantially extend the notion of how query keywords should occur in answer areas such as its elementary discourse units. We explore how to identify informative and uninformative parts of answers in terms of matching with questions. It turns out that discourse trees contribute a lot to building answer maps, which are fairly important for determining whether this answer is good or not for a given question.
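The baseline heuristic mentioned here, that query keywords should co-occur within one sentence of the answer, is easy to state in code. The sketch below implements only that baseline; the `segment` parameter is an assumed extension point where a discourse parser could supply elementary discourse units instead of sentences, which is not implemented here.

```python
import re
from typing import Callable, Iterable, List


def split_sentences(text: str) -> List[str]:
    """Naive sentence splitter on ., ! and ? followed by whitespace."""
    return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]


def keywords_in_single_unit(query_keywords: Iterable[str], answer: str,
                            segment: Callable[[str], List[str]] = split_sentences) -> bool:
    """True if some unit of the answer contains every query keyword."""
    wanted = {k.lower() for k in query_keywords}
    for unit in segment(answer):
        unit_words = set(re.findall(r"[a-z0-9]+", unit.lower()))
        if wanted <= unit_words:
            return True
    return False
```

For the answer "The camera has a long zoom. Its battery lasts two days.", the keywords camera and zoom pass the check, while camera and battery do not, even though both words appear in the answer.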

Boris Galitsky

Chapter 15. Conclusions

Abstract

We conclude the book with the analysis of why it is so hard to build an industrial-strength chatbot and what the main problems are which need to be solved. We summarize the techniques employed in this book, mention deployment at Oracle and a university course on chatbots.

Boris Galitsky

Title: Developing Enterprise Chatbots
Author: Boris Galitsky
Publisher: Springer International Publishing
Electronic ISBN: 978-3-030-04299-8
Print ISBN: 978-3-030-04298-1
DOI: https://doi.org/10.1007/978-3-030-04299-8
