2015 | OriginalPaper | Chapter
Tracing Errors in Probabilistic Databases Based on the Bayesian Network
Authors: Liang Duan, Kun Yue, Cheqing Jin, Wenlin Xu, Weiyi Liu
Published in: Database Systems for Advanced Applications
Publisher: Springer International Publishing
Data in probabilistic databases may not be absolutely correct and, worse, may be erroneous. Many existing data cleaning methods can detect errors in traditional databases, but they offer little guidance for finding errors in probabilistic databases, especially those with complex correlations among data. In this paper, we propose a method for tracing errors in probabilistic databases that adopts the Bayesian network (BN) as the framework for representing correlations among data. We first develop techniques to construct an augmented Bayesian network (ABN) for an anomalous query, which represents the correlations among input data, intermediate data and output data in the query execution. Inspired by the notion of blame in causal models, we then define a notion of blame for ranking candidate errors. Next, we provide an efficient method for computing the degree of blame of each candidate error based on probabilistic inference over the ABN. Experimental results show the effectiveness and efficiency of our method.
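To make the idea of ranking candidate errors by inference more concrete, the following is a minimal sketch on a toy network. It is not the ABN construction or the degree of blame defined in the chapter. The tuple names, their probabilities, the lineage formula, and the score (the shift from prior to posterior belief in a tuple once the suspicious answer is observed) are all assumptions chosen purely for illustration, and inference is done by brute-force enumeration rather than by an efficient algorithm.

```python
from itertools import product

# Hypothetical input tuples with assumed marginal existence probabilities.
PRIORS = {"t1": 0.9, "t2": 0.5, "t3": 0.2}


def output_present(world):
    """Assumed lineage of the anomalous answer: (t1 AND t2) OR t3."""
    return (world["t1"] and world["t2"]) or world["t3"]


def world_probability(world):
    """Joint probability of one possible world, assuming independent inputs."""
    p = 1.0
    for name, present in world.items():
        p *= PRIORS[name] if present else 1.0 - PRIORS[name]
    return p


def enumerate_worlds():
    """Yield every truth assignment over the input tuples."""
    names = list(PRIORS)
    for values in product([True, False], repeat=len(names)):
        yield dict(zip(names, values))


def posterior_given_output():
    """Return P(output) and P(t_i present | output present) by enumeration."""
    p_out = 0.0
    joint = {name: 0.0 for name in PRIORS}
    for world in enumerate_worlds():
        if output_present(world):
            p = world_probability(world)
            p_out += p
            for name, present in world.items():
                if present:
                    joint[name] += p
    return p_out, {name: joint[name] / p_out for name in PRIORS}


def rank_candidates():
    """Rank inputs by an illustrative score: posterior minus prior belief.

    This score is a stand-in, not the chapter's degree of blame.
    """
    _, post = posterior_given_output()
    scores = {name: post[name] - PRIORS[name] for name in PRIORS}
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


if __name__ == "__main__":
    for name, score in rank_candidates():
        print(f"{name}: {score:.4f}")
```

Enumeration is exponential in the number of input tuples, so it only serves to show the shape of the computation; a realistic setting would need the kind of efficient inference the abstract refers to.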