2013 | OriginalPaper | Chapter
Concept Discovery and Automatic Semantic Annotation for Language Understanding in an Information-Query Dialogue System Using Latent Dirichlet Allocation and Segmental Methods
Authors: Nathalie Camelin, Boris Detienne, Stéphane Huet, Dominique Quadri, Fabrice Lefèvre
Published in: Knowledge Discovery, Knowledge Engineering and Knowledge Management
Publisher: Springer Berlin Heidelberg
Efficient statistical approaches have recently been proposed for natural language understanding in dialogue systems. However, these approaches are trained on data semantically annotated at the segmental level, which increases the production cost of these resources. This kind of semantic annotation requires both determining the concepts in a sentence and linking them to their corresponding word segments. In this paper, we propose a two-step automatic method for semantic annotation. The first step applies latent Dirichlet allocation to discover concepts in a dialogue corpus. This knowledge is then used as a bootstrap to automatically infer a segmentation of a word sequence into concepts, using either integer linear optimisation or stochastic word alignment models (IBM models). The relation between automatically derived and manually defined task-dependent concepts is evaluated on a spoken dialogue task with a reference annotation.
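The second step, turning word-level concept knowledge into a segmentation, can be illustrated with a small sketch. It is not the method of the chapter: it swaps in a simple Viterbi-style dynamic programme in place of integer linear optimisation or IBM alignment models, and the probability table `WORD_GIVEN_CONCEPT`, the smoothing value `FLOOR`, and the `SWITCH_PENALTY` are all hypothetical values chosen for illustration, standing in for the word distributions that latent Dirichlet allocation would estimate from a corpus.

```python
import math

# Hypothetical word-given-concept probabilities (assumption): a stand-in for
# the topic-word distributions an LDA model would estimate from a corpus.
WORD_GIVEN_CONCEPT = {
    "location": {"hotel": 0.3, "in": 0.2, "paris": 0.4, "near": 0.1},
    "price": {"less": 0.3, "than": 0.3, "fifty": 0.2, "euros": 0.2},
}
FLOOR = 1e-3          # probability for words unseen under a concept (assumption)
SWITCH_PENALTY = 1.0  # log-domain cost of opening a new segment (assumption)


def segment(words):
    """Split a word sequence into contiguous (concept, words) segments."""
    if not words:
        return []
    concepts = list(WORD_GIVEN_CONCEPT)

    def score(word, concept):
        return math.log(WORD_GIVEN_CONCEPT[concept].get(word, FLOOR))

    # best[c] = (score of the best labelling ending in concept c, label path)
    best = {c: (score(words[0], c), [c]) for c in concepts}
    for word in words[1:]:
        new_best = {}
        for c in concepts:
            candidates = []
            for prev, (s, path) in best.items():
                penalty = 0.0 if prev == c else SWITCH_PENALTY
                candidates.append((s - penalty, path))
            s, path = max(candidates, key=lambda item: item[0])
            new_best[c] = (s + score(word, c), path + [c])
        best = new_best
    labels = max(best.values(), key=lambda item: item[0])[1]

    # Merge runs of identical labels into segments.
    segments = []
    for word, label in zip(words, labels):
        if segments and segments[-1][0] == label:
            segments[-1][1].append(word)
        else:
            segments.append((label, [word]))
    return segments
```

Calling `segment("hotel in paris less than fifty euros".split())` with the toy table above groups the first three words under `location` and the remaining four under `price`. The switch penalty plays the role of a preference for few, contiguous segments, which an integer linear programme would instead express through explicit constraints.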