2012 | OriginalPaper | Chapter
Discovering Descriptive Tile Trees
By Mining Optimal Geometric Subtiles
Authors : Nikolaj Tatti, Jilles Vreeken
Published in: Machine Learning and Knowledge Discovery in Databases
Publisher: Springer Berlin Heidelberg
When analysing binary data, the ease with which one can interpret results is very important. Many existing methods, however, either discover models that are difficult to read or return so many results that interpretation becomes impossible. Here, we study a fully automated approach for mining easily interpretable models for binary data. We model data hierarchically with noisy tiles, rectangles whose density differs significantly from that of their parent tile. To identify good trees, we employ the Minimum Description Length principle.
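To make the idea of a tile with "significantly different density" concrete, the following is a minimal sketch of how a density-based code length might be compared before and after carving out a tile. It is an illustration only, not the encoding used in the paper: the function names `bernoulli_bits` and `split_gain` are hypothetical, the code uses a plain maximum-likelihood Bernoulli code, and it ignores the cost of describing the tile itself (its coordinates and density), which a full MDL score would have to include.

```python
import math


def bernoulli_bits(ones: int, cells: int) -> float:
    """Bits to encode `cells` binary values, `ones` of them 1,
    using a Bernoulli code with the maximum-likelihood density."""
    if cells == 0 or ones == 0 or ones == cells:
        return 0.0  # a constant (or empty) region costs nothing under this code
    p = ones / cells
    return -(ones * math.log2(p) + (cells - ones) * math.log2(1 - p))


def split_gain(data, r0, r1, c0, c1):
    """Bits saved by giving rows r0..r1-1, columns c0..c1-1 their own
    density instead of sharing the density of the whole matrix.
    Assumption: the parent tile is the full matrix; model cost is ignored."""
    total_cells = len(data) * len(data[0])
    total_ones = sum(map(sum, data))
    tile_cells = (r1 - r0) * (c1 - c0)
    tile_ones = sum(sum(row[c0:c1]) for row in data[r0:r1])
    before = bernoulli_bits(total_ones, total_cells)
    after = (bernoulli_bits(tile_ones, tile_cells)
             + bernoulli_bits(total_ones - tile_ones, total_cells - tile_cells))
    return before - after


# A 4x4 matrix whose top-left 2x2 block is dense and the rest is empty.
data = [
    [1, 1, 0, 0],
    [1, 1, 0, 0],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
]
gain = split_gain(data, 0, 2, 0, 2)
```

Under these assumptions, a tile whose density matches its parent yields a gain of about zero, while a tile that isolates a dense or sparse region yields a large positive gain.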
We propose Stijl, a greedy any-time algorithm for mining good tile trees from binary data. Iteratively, it finds the locally optimal addition to the current tree, allowing overlap with tiles of the same parent. A major result of this paper is that we find the optimal tile in only Θ(NM min(N, M)) time. Stijl can either be employed as a top-k miner, or by MDL we can identify the tree that describes the data best.
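One way to see how a cost of order NM min(N, M) can arise for a rectangle search is the classic maximum-sum subrectangle scan: fix every pair of row bounds, then sweep the columns once per pair. The sketch below uses that swapped-in technique (Kadane's algorithm with a simple +1/-1 cell weighting) rather than the paper's own score or search procedure; the function name `best_tile` and the weights are assumptions made for illustration.

```python
def best_tile(data, weight_one=1.0, weight_zero=-1.0):
    """Return (score, (r0, r1, c0, c1)) for the contiguous rectangle
    rows r0..r1-1, columns c0..c1-1 with the largest weighted sum.
    Assumption: a linear +1/-1 score stands in for a real MDL gain."""
    n, m = len(data), len(data[0])
    best = (float("-inf"), None)
    for top in range(n):
        col_sum = [0.0] * m  # per-column weight of rows top..bottom
        for bottom in range(top, n):
            for j in range(m):
                col_sum[j] += weight_one if data[bottom][j] else weight_zero
            # Kadane's scan: best contiguous column interval for this row pair.
            run, start = 0.0, 0
            for j in range(m):
                if run <= 0:
                    run, start = col_sum[j], j
                else:
                    run += col_sum[j]
                if run > best[0]:
                    best = (run, (top, bottom + 1, start, j + 1))
    return best


# A 4x5 matrix with a dense 2x3 block in the middle.
data = [
    [0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 0, 0, 0],
]
score, tile = best_tile(data)
```

This scan takes O(N²M) time for an N-by-M matrix; running it on the transposed matrix when N > M brings the cost to O(NM min(N, M)), the same order as the bound quoted above.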
Experiments show that we find succinct models that accurately summarise the data and that, owing to their hierarchical structure, are easily interpretable.