Article

A human-computer cooperative system for effective high dimensional clustering

Author:
Charu C. Aggarwal

IBM T. J. Watson Research Center, Yorktown Heights, NY

KDD '01: Proceedings of the seventh ACM SIGKDD international conference on Knowledge discovery and data mining, August 2001, Pages 221–226. https://doi.org/10.1145/502512.502542

Published: 26 August 2001

ABSTRACT

High dimensional data has always been a challenge for clustering algorithms because of the inherent sparsity of the points. Therefore, techniques have recently been proposed to find clusters in hidden subspaces of the data. However, since the behavior of the data may vary considerably in different subspaces, it is often difficult to define the notion of a cluster with the use of simple mathematical formalizations. In fact, the meaningfulness and definition of a cluster are best characterized with the use of human intuition. In this paper, we propose a system which performs high dimensional clustering by effective cooperation between the human and the computer. The complex task of cluster creation is accomplished by a combination of human intuition and the computational support provided by the computer. The result is a system which leverages the best abilities of both the human and the computer in order to create very meaningful sets of clusters in high dimensionality.
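The sparsity argument in the abstract can be illustrated with a small sketch. This is not the system proposed in the paper; it is a synthetic demonstration under stated assumptions: 20 dimensions, a cluster that is tight only in dimensions 0 and 1, and uniform background noise. All function names and parameters (`make_data`, `contrast`, the standard deviation of 0.02) are hypothetical choices made for illustration. The sketch compares the mean distance between cluster members to the mean distance between all points, first in the full space and then in the two-dimensional subspace.

```python
import math
import random


def mean_pairwise_distance(points, dims):
    """Mean Euclidean distance over all pairs, using only the given dimensions."""
    total, count = 0.0, 0
    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            total += math.sqrt(
                sum((points[i][d] - points[j][d]) ** 2 for d in dims)
            )
            count += 1
    return total / count


def make_data(n_cluster=60, n_noise=60, n_dims=20, cluster_dims=(0, 1), seed=0):
    """Synthetic data (an assumption of this sketch, not data from the paper).

    Cluster points are tight around 0.5 in `cluster_dims` and uniform elsewhere;
    noise points are uniform in every dimension.
    """
    rng = random.Random(seed)
    cluster = []
    for _ in range(n_cluster):
        point = [rng.random() for _ in range(n_dims)]
        for d in cluster_dims:
            point[d] = rng.gauss(0.5, 0.02)
        cluster.append(point)
    noise = [[rng.random() for _ in range(n_dims)] for _ in range(n_noise)]
    return cluster, noise


def contrast(cluster, noise, dims):
    """Ratio of intra-cluster distance to overall distance (lower = more visible)."""
    return mean_pairwise_distance(cluster, dims) / mean_pairwise_distance(
        cluster + noise, dims
    )


cluster, noise = make_data()
full_space = contrast(cluster, noise, range(20))
subspace = contrast(cluster, noise, (0, 1))
print(f"full space: {full_space:.2f}, subspace (0, 1): {subspace:.2f}")
```

A ratio close to 1 means cluster members are about as far from each other as from everything else, so the cluster is effectively hidden. In this setup the full-space ratio stays near 1 while the subspace ratio is much smaller, which is the motivation for searching in subspaces.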

Index Terms

Information systems

Published in
KDD '01: Proceedings of the seventh ACM SIGKDD international conference on Knowledge discovery and data mining
August 2001
493 pages
ISBN: 158113391X
DOI: 10.1145/502512
Conference Chair: Doheon Lee (Chonnam National University, Korea)
General Chair: Mario Schkolnick (SGI)
Program Chairs: Foster Provost (New York University), Ramakrishnan Srikant (IBM Almaden Research Center)
Copyright © 2001 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]
Publisher: Association for Computing Machinery, New York, NY, United States
Acceptance Rates
KDD '01 paper acceptance rate: 31 of 237 submissions, 13%. Overall acceptance rate: 1,133 of 8,635 submissions, 13%.