2001 | OriginalPaper | Buchkapitel
Search Space Partition-Based Rule Mining
verfasst von : Jean-Marc Adamo
Erschienen in: Data Mining for Association Rules and Sequential Patterns
Verlag: Springer New York
Enthalten in: Professional Book Archive
Aktivieren Sie unsere intelligente Suche um passende Fachinhalte oder Patente zu finden.
Wählen Sie Textabschnitte aus um mit Künstlicher Intelligenz passenden Patente zu finden. powered by
Markieren Sie Textabschnitte, um KI-gestützt weitere passende Inhalte zu finden. powered by
In this chapter, we define the Boolean association rules and give a formulation of the problem of mining for such rules. Next, we devise sequential and parallel algorithms, based on search space partitioning, that solve this problem. In Section 2.1, we describe the basic objects involved in the problem of mining for association rules. We briefly discuss the relationship between the support/confidence measures and probability. We also discuss the relationship between the association rules and the logical implication. The procedure proposed in [S96] is considered to solve the mining problem. It consists of the following steps: (1) enumerate all frequent attribute sets, and (2) derive all association rules from this set. In Section 2.2, following and extending ideas proposed in [Z98], we identify the search space attached to the mining problem to the powerset 2A, where A is the set of attributes involved. We define a family of equivalence relations over 2A that are used in Section 2.3 to derive a partitioning procedure. The procedure is based on recursive bisection and consists of progressively splitting the powerset into finer and finer equivalence classes. Section 2.4 shows that the classes can be processed independently of one another. A join operator, stable with respect to the class partitioning, is proposed for this purpose. The join operator is next used within two partition-based algorithms for frequent attribute set enumeration: a sequential algorithm (Section 2.5) and a parallel one (Section 2.6). The latter requires both initial and dynamic load balancing. These issues are discussed in Section 2.6 where load balancing algorithms are given. Once the set of frequent attribute sets has been found, this set is processed in order to discover all association rules. This is the topic of Section 2.7 where two algorithms (a sequential algorithm and a parallel one) are described.