1999 | OriginalPaper | Chapter
ZigZag, a New Clustering Algorithm to Analyze Categorical Variable Cross-Classification Tables
Author: Stéphane Lallich
Published in: Principles of Data Mining and Knowledge Discovery
Publisher: Springer Berlin Heidelberg
Included in: Professional Book Archive
This paper proposes ZigZag, a new clustering algorithm that works on cross-classification tables of categorical variables. ZigZag simultaneously creates two partitions, one of the row categories and one of the column categories, according to the equivalence relation "to have the same conditional mode". The classes of these two partitions are put in one-to-one correspondence, which yields row-column clusters. The result is an efficient KDD tool that can be applied to any database. Moreover, ZigZag visualizes predictive association for nominal data in the sense of Guttman, Goodman and Kruskal. In that framework, the rule for predicting a nominal variable Y from another nominal variable X consists in choosing the conditionally most probable category of Y given X, and the power of this rule is evaluated by the mean proportional reduction in error, denoted λ Y/X . The mapping produced by ZigZag thus appears to play for nominal data the same role that the scatter diagram, together with the curves of conditional means or the regression line, plays for quantitative data: the former is supplemented by the values of λ Y/X and λ X/Y , the latter by the correlation ratio or R².
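The two quantities the abstract relies on, the conditional mode and the Goodman-Kruskal λ, can be illustrated with a minimal Python sketch. This is not the ZigZag algorithm itself, whose steps the abstract does not spell out; it only computes the conditional mode of each row and column, groups categories that share a mode, and evaluates λ Y/X with the standard formula (Σ_i max_j n_ij − max_j n_.j) / (n − max_j n_.j). The function names and the sample counts in `table` are hypothetical and chosen purely for illustration; rows are taken as categories of X and columns as categories of Y.

```python
from collections import defaultdict


def conditional_modes(table):
    """Return, for each row, the index of its largest cell (ties go to the first)."""
    return [max(range(len(row)), key=row.__getitem__) for row in table]


def transpose(table):
    """Swap rows and columns of a rectangular table."""
    return [list(col) for col in zip(*table)]


def group_by_mode(modes):
    """Group category indices that share the same conditional mode."""
    groups = defaultdict(list)
    for index, mode in enumerate(modes):
        groups[mode].append(index)
    return dict(groups)


def goodman_kruskal_lambda(table):
    """Lambda for predicting the column variable from the row variable."""
    n = sum(sum(row) for row in table)
    best_marginal = max(sum(col) for col in zip(*table))  # max_j n_.j
    best_conditional = sum(max(row) for row in table)     # sum_i max_j n_ij
    if n == best_marginal:
        raise ValueError("lambda is undefined: one column holds every observation")
    return (best_conditional - best_marginal) / (n - best_marginal)


# Hypothetical counts: 4 categories of X (rows), 3 categories of Y (columns).
table = [
    [30, 5, 5],
    [20, 4, 6],
    [3, 25, 2],
    [2, 6, 22],
]

row_modes = conditional_modes(table)             # modal Y category for each X category
col_modes = conditional_modes(transpose(table))  # modal X category for each Y category
row_classes = group_by_mode(row_modes)           # X categories sharing a modal Y

lambda_y_given_x = goodman_kruskal_lambda(table)
lambda_x_given_y = goodman_kruskal_lambda(transpose(table))

print(row_modes, col_modes, row_classes)
print(round(lambda_y_given_x, 3), round(lambda_x_given_y, 3))
```

In this toy table the first two row categories share the same modal column, so they fall in the same class of the row partition, which is the kind of grouping the equivalence relation describes. How ZigZag pairs row classes with column classes is left to the chapter itself.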