2017 | Book

Redescription Mining

About this book

This book provides a gentle introduction to redescription mining, a versatile data mining tool for finding distinct common characterizations of the same objects and, conversely, for identifying sets of objects that admit multiple shared descriptions. It is intended for readers who are familiar with basic data analysis techniques such as clustering, frequent itemset mining, and classification. Redescription mining is defined in a general way, making it applicable to different types of data. The general framework is made more concrete through many practical examples that show the versatility of redescription mining. The book also introduces the main algorithmic ideas for mining redescriptions, together with applications from various domains. The final part of the book contains variations and extensions of the basic redescription mining problem, and discusses some future directions and open questions.

Table of Contents

Frontmatter
Chapter 1. What Is Redescription Mining
Abstract
In scientific investigations, data oftentimes differ in nature; for instance, they might originate from distinct sources or be cast over separate terminologies. In order to gain insight into the phenomenon of interest, an intuitive first task is to identify the correspondences that exist between these different aspects. This is the motivating principle behind redescription mining, a data analysis task that aims at finding distinct common characterizations of the same objects. In this chapter, we provide the basic definitions of redescription mining, including the data model, query languages, similarity measures, p-value calculations, and methods for pruning redundant redescriptions. We will also briefly cover related data analysis methods and provide a short history of redescription mining research.
Esther Galbrun, Pauli Miettinen
Chapter 2. Algorithms for Redescription Mining
Abstract
The aim of redescription mining is to find valid redescriptions for given data, query language, similarity relation, and user-specified constraints. In other words, we need to explore the search space consisting of query pairs from the query language, looking for those pairs that have sufficiently similar support in the data and that satisfy the other constraints. In this chapter, we present the different methods that have been proposed to carry out this exploration efficiently. Existing methods can be arranged into three main categories: (1) mine-and-pair approaches, (2) alternating approaches, and (3) approaches that use atomic updates. We consider each one in turn, explaining its general principles and looking at different algorithms designed on these principles. Next, we compare the different methods and discuss their relative strengths and weaknesses. Finally, we consider how to adapt the algorithms to handle cases where some values are missing from the input data.
Esther Galbrun, Pauli Miettinen
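
To make the search described in the Chapter 2 abstract more concrete, the following is a minimal sketch and not one of the book's algorithms. It assumes two views over the same objects, represents queries as plain Python predicates, and uses the Jaccard coefficient of the supports as the similarity measure (a common choice, but an assumption here, since the abstract does not name one). The data, the query names, and the helpers `support`, `jaccard`, and `pair_queries` are all hypothetical, and the pairing is a naive exhaustive loop rather than any of the three categories of methods listed above.

```python
from itertools import product


def support(rows, query):
    """Return the set of row indices for which the query (a predicate) holds."""
    return {i for i, row in enumerate(rows) if query(row)}


def jaccard(a, b):
    """Jaccard coefficient of two sets; taken to be 0.0 when both are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def pair_queries(left_rows, right_rows, left_queries, right_queries, min_sim):
    """Naively try every query pair and keep those with similar enough support."""
    found = []
    for (lname, lq), (rname, rq) in product(left_queries.items(),
                                            right_queries.items()):
        sim = jaccard(support(left_rows, lq), support(right_rows, rq))
        if sim >= min_sim:
            found.append((lname, rname, sim))
    # Most similar pairs first.
    return sorted(found, key=lambda t: -t[2])


# Made-up toy data: row i in both views describes the same object (a location).
climate = [{"temp": 2}, {"temp": 15}, {"temp": 18}, {"temp": -4}, {"temp": 21}]
fauna = [
    {"moose": 1, "lizard": 0},
    {"moose": 0, "lizard": 1},
    {"moose": 0, "lizard": 1},
    {"moose": 1, "lizard": 0},
    {"moose": 1, "lizard": 1},
]

# Hypothetical candidate queries over each view.
left_queries = {
    "temp < 5": lambda r: r["temp"] < 5,
    "temp >= 10": lambda r: r["temp"] >= 10,
}
right_queries = {
    "moose": lambda r: r["moose"] == 1,
    "lizard": lambda r: r["lizard"] == 1,
}

results = pair_queries(climate, fauna, left_queries, right_queries, min_sim=0.6)
for lname, rname, sim in results:
    print(f"{lname}  <->  {rname}  (Jaccard = {sim:.2f})")
```

Each retained pair is a candidate redescription: two queries over different views that select nearly the same objects. The exhaustive loop is only workable for a handful of candidate queries, which is why efficient exploration of the search space is the subject of the chapter.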
Chapter 3. Applications, Variants, and Extensions of Redescription Mining
Abstract
Redescription mining is a data analysis task that aims at finding distinct common characterizations of the same objects. After defining the core problem and presenting algorithmic techniques to solve this task, we look in this chapter at some of the applications, variants, and extensions of redescription mining. We start by outlining different applications, as examples of how the method can be used in various domains. Next, we present two problem variants, namely, relational redescription mining and storytelling. The former aims at finding alternative descriptions for groups of objects in a relational data set, while the goal in the latter is to build a sequence of related queries in order to establish a connection between two given queries. Finally, we point out extensions of the task that constitute possible directions for future research. In particular, we discuss how redescription mining could be augmented with richer query languages and consider going beyond pairs of queries to multiple descriptions.
Esther Galbrun, Pauli Miettinen
Metadata
Title
Redescription Mining
Authors
Esther Galbrun
Dr. Pauli Miettinen
Copyright year
2017
Electronic ISBN
978-3-319-72889-6
Print ISBN
978-3-319-72888-9
DOI
https://doi.org/10.1007/978-3-319-72889-6