A principled and flexible framework for finding alternative clusterings

Authors:
ZiJie Qi, University of California, Davis, Davis, CA, USA
Ian Davidson, University of California, Davis, Davis, CA, USA

KDD '09: Proceedings of the 15th ACM SIGKDD international conference on Knowledge discovery and data mining, June 2009, Pages 717–726
https://doi.org/10.1145/1557019.1557099

Published: 28 June 2009

ABSTRACT

The aim of data mining is to find novel and actionable insights in data. However, most algorithms typically just find a single (possibly non-novel/actionable) interpretation of the data even though alternatives could exist. The problem of finding an alternative to a given original clustering has received little attention in the literature. Current techniques (including our previous work) are unfocused/unrefined in that they broadly attempt to find an alternative clustering but do not specify which properties of the original clustering should or should not be retained. In this work, we explore a principled and flexible framework in order to find alternative clusterings of the data. The approach is principled since it poses a constrained optimization problem, so its exact behavior is understood. It is flexible since the user can formally specify positive and negative feedback based on the existing clustering, which ranges from which clusters to keep (or not) to making a trade-off between alternativeness and clustering quality.
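
The trade-off between alternativeness and clustering quality mentioned in the abstract can be illustrated with a small sketch. This is not the paper's formulation, which the abstract does not give. The Rand index as the similarity measure, the sum of squared errors as the quality measure, the `score` function, and the `weight` parameter are all assumptions chosen only for illustration.

```python
from itertools import combinations


def rand_index(labels_a, labels_b):
    """Fraction of point pairs that both clusterings treat the same way
    (both put the pair together, or both keep it apart)."""
    pairs = list(combinations(range(len(labels_a)), 2))
    agree = sum(
        (labels_a[i] == labels_a[j]) == (labels_b[i] == labels_b[j])
        for i, j in pairs
    )
    return agree / len(pairs)


def sse(points, labels):
    """Sum of squared distances from each point to its cluster centroid.
    Lower values mean tighter clusters."""
    total = 0.0
    for cluster in set(labels):
        members = [p for p, l in zip(points, labels) if l == cluster]
        centroid = [sum(xs) / len(members) for xs in zip(*members)]
        total += sum(
            sum((x - m) ** 2 for x, m in zip(p, centroid)) for p in members
        )
    return total


def score(points, original, candidate, weight=1.0):
    """Hypothetical objective (an assumption, not taken from the paper):
    clustering cost minus a weighted reward for differing from the original.
    Lower is better."""
    dissimilarity = 1.0 - rand_index(original, candidate)
    return sse(points, candidate) - weight * dissimilarity


# Four corners of a 4-by-1 rectangle, which can be split in two natural ways.
points = [(0.0, 0.0), (0.0, 1.0), (4.0, 0.0), (4.0, 1.0)]
original = [0, 0, 1, 1]     # left half versus right half
alternative = [0, 1, 0, 1]  # bottom row versus top row
```

With a small `weight`, the tighter original split scores better. With a large enough `weight`, the looser but very different alternative wins, which is the kind of user-controlled trade-off the abstract describes.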

Index Terms

Information systems > Information systems applications > Data mining

Published in
KDD '09: Proceedings of the 15th ACM SIGKDD international conference on Knowledge discovery and data mining
June 2009, 1426 pages
ISBN: 9781605584959
DOI: 10.1145/1557019
General Chairs: John Elder (Elder Research, Inc., USA), Françoise Soulié Fogelman (KXEN, France)
Program Chairs: Peter Flach (University of Bristol, UK), Mohammed Zaki (RPI, USA)
Copyright © 2009 ACM
Publisher: Association for Computing Machinery, New York, NY, United States
Author Tags: clustering
Qualifiers: research-article
Acceptance Rates
Overall acceptance rate: 1,133 of 8,635 submissions (13%)