2010 | OriginalPaper | Buchkapitel
Permutation Testing Improves Bayesian Network Learning
verfasst von : Ioannis Tsamardinos, Giorgos Borboudakis
Erschienen in: Machine Learning and Knowledge Discovery in Databases
Verlag: Springer Berlin Heidelberg
Aktivieren Sie unsere intelligente Suche, um passende Fachinhalte oder Patente zu finden.
Wählen Sie Textabschnitte aus um mit Künstlicher Intelligenz passenden Patente zu finden. powered by
Markieren Sie Textabschnitte, um KI-gestützt weitere passende Inhalte zu finden. powered by
We are taking a peek “under the hood” of constraint-based learning of graphical models such as Bayesian Networks. This mainstream approach to learning is founded on performing statistical tests of conditional independence. In all prior work however, the tests employed for categorical data are only asymptotically-correct, i.e., they converge to the exact
p
-value in the sample limit. In this paper we present, evaluate, and compare exact tests, based on standard, adjustable, and semi-parametric Monte-Carlo permutation testing procedures appropriate for small sample sizes. It is demonstrated that (a) permutation testing is calibrated, i.e, the actual Type I error matches the significance level
α
set by the user; this is not the case with asymptotic tests, (b) permutation testing leads to more robust structural learning, and (c) permutation testing allows learning networks from multiple datasets sharing a common underlying structure but different distribution functions (e.g. continuous vs. discrete); we name this problem the
Bayesian Network Meta-Analysis
problem. In contrast, asymptotic tests may lead to erratic learning behavior in this task (error increasing with total sample-size). The semi-parametric permutation procedure we propose is a reasonable approximation of the basic procedure using 5000 permutations, while being only 10-20 times slower than the asymptotic tests for small sample sizes. Thus, this test should be practical in most graphical learning problems and could substitute asymptotic tests. The conclusions of our studies have ramifications for learning not only Bayesian Networks but other graphical models too and for related causal-based variable selection algorithms, such as HITON. The code is available at mensxmachina.org.