30-09-2020 | Special Issue Paper
DIFF: a relational interface for large-scale data explanation
Published in: The VLDB Journal | Issue 1/2021
Abstract
DIFF operator, a relational aggregation operator that unifies the core functionality of these engines with declarative relational query processing. We implement both single-node and distributed versions of the DIFF operator in MB SQL, an extension of MacroBase, and demonstrate how DIFF can provide the same semantics as existing explanation engines while capturing a broad set of production use cases in industry, including at Microsoft and Facebook. Additionally, we illustrate how this declarative approach to data explanation enables new logical and physical query optimizations. We evaluate these optimizations on several real-world production applications and find that DIFF in MB SQL can outperform state-of-the-art engines by up to an order of magnitude.
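
To make the idea of a relational aggregation operator for data explanation more concrete, the following is a minimal Python sketch of a DIFF-style computation over two relations. It is not the MB SQL implementation: the function name `diff`, its parameters (`min_support`, `min_risk_ratio`, `max_order`), and the choice of support and risk ratio as comparison metrics are assumptions made for illustration, since the abstract does not specify them.

```python
from collections import Counter
from itertools import combinations


def diff(test_rows, control_rows, attrs,
         min_support=0.05, min_risk_ratio=2.0, max_order=2):
    """Toy DIFF-style aggregation (hypothetical, for illustration only).

    Returns attribute-value combinations that are over-represented in
    test_rows relative to control_rows, ranked by risk ratio.
    """
    def count(rows):
        # Count every combination of up to max_order attribute values.
        counts = Counter()
        for row in rows:
            for k in range(1, max_order + 1):
                for combo in combinations(attrs, k):
                    counts[tuple((a, row[a]) for a in combo)] += 1
        return counts

    test_counts = count(test_rows)
    control_counts = count(control_rows)
    n_test, n_control = len(test_rows), len(control_rows)

    results = []
    for combo, in_test in test_counts.items():
        in_control = control_counts.get(combo, 0)
        # Support: fraction of test rows that match the combination.
        support = in_test / n_test
        # Risk ratio: P(test | combo) / P(test | not combo).
        exposed = in_test + in_control
        unexposed = (n_test - in_test) + (n_control - in_control)
        risk_exposed = in_test / exposed
        risk_unexposed = (n_test - in_test) / unexposed if unexposed else 0.0
        if risk_unexposed == 0:
            risk_ratio = float("inf")
        else:
            risk_ratio = risk_exposed / risk_unexposed
        if support >= min_support and risk_ratio >= min_risk_ratio:
            results.append(
                {"combo": combo, "support": support, "risk_ratio": risk_ratio}
            )
    return sorted(results, key=lambda r: r["risk_ratio"], reverse=True)
```

In this sketch, `test_rows` and `control_rows` are lists of dictionaries standing in for two relations (for example, crashing and non-crashing sessions), and `attrs` names the columns to explain over. A declarative engine would express the same comparison as a query and could then optimize it, which this brute-force enumeration does not attempt.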