DOI: 10.1145/1557019.1557099
research-article

A principled and flexible framework for finding alternative clusterings

Published: 28 June 2009

ABSTRACT

The aim of data mining is to find novel and actionable insights in data. However, most algorithms typically just find a single (possibly non-novel/actionable) interpretation of the data even though alternatives could exist. The problem of finding an alternative to a given original clustering has received little attention in the literature. Current techniques (including our previous work) are unfocused/unrefined in that they broadly attempt to find an alternative clustering but do not specify which properties of the original clustering should or should not be retained. In this work, we explore a principled and flexible framework in order to find alternative clusterings of the data. The approach is principled since it poses a constrained optimization problem, so its exact behavior is understood. It is flexible since the user can formally specify positive and negative feedback based on the existing clustering, which ranges from which clusters to keep (or not) to making a trade-off between alternativeness and clustering quality.

Supplemental Material

p717-davidson.mp4 (MP4, 77 MB)

Published in

KDD '09: Proceedings of the 15th ACM SIGKDD international conference on Knowledge discovery and data mining
June 2009
1426 pages
ISBN: 9781605584959
DOI: 10.1145/1557019

Copyright © 2009 ACM

Publisher: Association for Computing Machinery, New York, NY, United States

Acceptance Rates

Overall Acceptance Rate: 1,133 of 8,635 submissions, 13%
