2003 | OriginalPaper | Book chapter

Comparing Clusterings by the Variation of Information

Author: Marina Meilă

Published in: Learning Theory and Kernel Machines

Publisher: Springer Berlin Heidelberg


This paper proposes an information theoretic criterion for comparing two partitions, or clusterings, of the same data set. The criterion, called variation of information (VI), measures the amount of information lost and gained in changing from clustering ${\cal C}$ to clustering ${\cal C}'$. The criterion makes no assumptions about how the clusterings were generated and applies to both soft and hard clusterings. The basic properties of VI are presented and discussed from the point of view of comparing clusterings. In particular, the VI is positive, symmetric and obeys the triangle inequality. Thus, surprisingly enough, it is a true metric on the space of clusterings.
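As a rough illustration of the criterion described in the abstract, the following minimal sketch computes VI for two hard clusterings using the standard definition $VI({\cal C},{\cal C}') = H({\cal C}|{\cal C}') + H({\cal C}'|{\cal C})$, that is, the information lost plus the information gained. The function name `variation_of_information`, the label-list input format, and the choice of natural logarithms are assumptions made for this example and are not taken from the paper; soft clusterings are not covered.

```python
from collections import Counter
from math import log


def variation_of_information(labels_a, labels_b):
    """VI (in nats) between two hard clusterings of the same data set.

    Hypothetical helper for illustration: each argument is a sequence in
    which position i holds the cluster label of data point i.
    """
    if len(labels_a) != len(labels_b):
        raise ValueError("both clusterings must label the same data points")
    n = len(labels_a)
    if n == 0:
        raise ValueError("need at least one data point")

    count_a = Counter(labels_a)                  # cluster sizes in the first clustering
    count_b = Counter(labels_b)                  # cluster sizes in the second clustering
    count_ab = Counter(zip(labels_a, labels_b))  # sizes of pairwise intersections

    vi = 0.0
    for (a, b), n_ab in count_ab.items():
        p_ab = n_ab / n
        # Conditional entropies: H(A|B) + H(B|A)
        #   = -sum_ab p_ab * [log(p_ab / p_b) + log(p_ab / p_a)]
        vi -= p_ab * (log(n_ab / count_b[b]) + log(n_ab / count_a[a]))
    return vi
```

Under these assumptions, two identical clusterings give a VI of zero, swapping the arguments leaves the value unchanged, and two statistically independent clusterings give the sum of their entropies.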

Metadata
Title
Comparing Clusterings by the Variation of Information
Author
Marina Meilă
Copyright year
2003
Publisher
Springer Berlin Heidelberg
DOI
https://doi.org/10.1007/978-3-540-45167-9_14
