2012 | OriginalPaper | Book chapter
A Framework for Evaluating the Smoothness of Data-Mining Results
Authors: Gaurav Misra, Behzad Golshan, Evimaria Terzi
Published in: Machine Learning and Knowledge Discovery in Databases
Publisher: Springer Berlin Heidelberg
The data-mining literature is rich in problems that are formalized as combinatorial-optimization problems. An indicative example is the entity-selection formulation, which has been used to model the problem of selecting a subset of representative reviews from a review corpus [11,22] or important nodes in a social network [10]. Existing combinatorial algorithms for solving such entity-selection problems identify a set of entities (e.g., reviews or nodes) as important. Here, we consider the following question: how do small or large changes in the input dataset change the value or the structure of the reported solutions?
We answer this question by developing a general framework for evaluating the smoothness (i.e., consistency) of the data-mining results obtained for the input dataset X. We do so by comparing these results with the results obtained for datasets that are within a small or a large distance from X. The algorithms we design allow us to perform such comparisons effectively and thus approximate the results' smoothness efficiently. Our experimental evaluation on real datasets demonstrates the efficacy and the practical utility of our framework in a wide range of applications.
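To make the idea of comparing results on X with results on nearby datasets concrete, here is a minimal sketch. It is not the authors' algorithm. It assumes, purely for illustration, that X is a binary entity-by-attribute matrix, that entity selection is solved by a greedy max-coverage heuristic, that distance between datasets is Hamming distance, and that smoothness is summarized by the average change in objective value and the average Jaccard overlap of the selected sets. The names `greedy_select`, `perturb`, and `smoothness_estimate` are hypothetical.

```python
import random


def greedy_select(X, k):
    """Greedy max-coverage: pick k rows (entities) that cover the most columns.

    Assumes 1 <= k <= len(X). Returns (selected row indices, number of covered columns).
    """
    chosen, covered = [], set()
    for _ in range(k):
        best, best_gain = None, -1
        for i, row in enumerate(X):
            if i in chosen:
                continue
            # Marginal gain: columns this row covers that are not covered yet.
            gain = sum(1 for j, v in enumerate(row) if v and j not in covered)
            if gain > best_gain:
                best, best_gain = i, gain
        chosen.append(best)
        covered.update(j for j, v in enumerate(X[best]) if v)
    return set(chosen), len(covered)


def perturb(X, d, rng):
    """Return a copy of X with exactly d distinct cells flipped (Hamming distance d)."""
    n, m = len(X), len(X[0])
    Y = [row[:] for row in X]
    for idx in rng.sample(range(n * m), d):
        i, j = divmod(idx, m)
        Y[i][j] = 1 - Y[i][j]
    return Y


def smoothness_estimate(X, k, d, samples=100, seed=0):
    """Sample datasets at distance d from X and compare their solutions with X's.

    Returns (mean absolute change in objective value, mean Jaccard overlap of selections).
    """
    rng = random.Random(seed)
    base_set, base_val = greedy_select(X, k)
    val_diff, jaccard = 0.0, 0.0
    for _ in range(samples):
        sel, val = greedy_select(perturb(X, d, rng), k)
        val_diff += abs(val - base_val)
        jaccard += len(sel & base_set) / len(sel | base_set)
    return val_diff / samples, jaccard / samples
```

Calling `smoothness_estimate(X, k, d)` for a small and then a large `d` gives a rough picture of how the value and the structure of the solution respond to small versus large changes in the input. Uniform sampling of neighbors is only one possible choice and says nothing about how the paper approximates smoothness.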