Article

Approximate clustering via core-sets

Authors:
Mihai Bādoiu

MIT Laboratory for Comp. Sci., Cambridge, MA

MIT Laboratory for Comp. Sci., Cambridge, MA
View Profile

,
Sariel Har-Peled

University of Illinois, Urbana, IL

University of Illinois, Urbana, IL
View Profile

,
Piotr Indyk

MIT Laboratory for Comp. Sci., Cambridge, MA

MIT Laboratory for Comp. Sci., Cambridge, MA
View Profile

STOC '02: Proceedings of the thiry-fourth annual ACM symposium on Theory of computingMay 2002Pages 250–257https://doi.org/10.1145/509907.509947

Published:19 May 2002Publication History

STOC '02: Proceedings of the thiry-fourth annual ACM symposium on Theory of computing

Pages 250–257

ABSTRACT

In this paper, we show that for several clustering problems one can extract a small set of points, so that using those core-sets enable us to perform approximate clustering efficiently. The surprising property of those core-sets is that their size is independent of the dimension.Using those, we present a (1+ ε)-approximation algorithms for the k-center clustering and k-median clustering problems in Euclidean space. The running time of the new algorithms has linear or near linear dependency on the number of points and the dimension, and exponential dependency on 1/ε and k. As such, our results are a substantial improvement over what was previously known.We also present some other clustering results including (1+ ε)-approximate 1-cylinder clustering, and k-center clustering with outliers.

References

P. Agarwal and C. Procopiuc. Exact and approximation algorithms for clustering. In Proc. 9th ACM-SIAM Sympos. Discrete Algorithms, pages 658--667, 1998. Google ScholarDigital Library
N. Alon, S. Dar, M. Parnas, and D. Ron. Testing of clustering. In Proc. 41th Annu. IEEE Sympos. Found. Comput. Sci., 2000. Google ScholarDigital Library
S. Arora, P. Raghavan, and S. Rao. Approximation schemes for Euclidean k-median and related problems. In Proc. 30th Annu. ACM Sympos. Theory Comput., pages 106--113, 1998. Google ScholarDigital Library
M. Bern and D. Eppstein. Approximation algorithms for geometric problems. In D. Hochbaum, editor, Approximationg algorithms for NP-Hard problems. PWS Publishing Company, 1997. Google ScholarDigital Library
R. Duda, P. Hart, and D. Stork. Pattern Classification. Wiley-Interscience, New York, 2nd edition, 2001. Google ScholarDigital Library
L. Engebretsen, P. Indyk, and R. O'Donnell. Derandomized dimensionality reduction with applications. In Proc. 13th ACM-SIAM Sympos. Discrete Algorithms, 2002. to appear. Google ScholarDigital Library
A. Goel, P. Indyk, and K. Varadarajan. Reductions among high dimensional proximity problems. In Proc. 12th ACM-SIAM Sympos. Discrete Algorithms, pages 769--778, 2001. Google ScholarDigital Library
T. Gonzalez. Clustering to minimize the maximum intercluster distance. Theoret. Comput. Sci., 38:293--306, 1985.Google ScholarCross Ref
M. Grötschel, L. Lovász, and A. Schrijver. Geometric Algorithms and Combinatorial Optimization, volume 2 of Algorithms and Combinatorics. Springer-Verlag, Berlin Heidelberg, 2nd edition, 1988. 2nd edition 1994.Google Scholar
S. Har-Peled. Clustering motion. In Proc. 42nd Annu. IEEE Sympos. Found. Comput. Sci., pages 84--93, 2001. Google ScholarDigital Library
S. Har-Peled and K. Varadarajan. Approximate shape fitting via linearization. In Proc. 42nd Annu. IEEE Sympos. Found. Comput. Sci., pages 66--73, 2001. Google ScholarDigital Library
P. Indyk. Algorithmic applications of low-distortion geometric embeddings. In Proc. 42nd Annu. IEEE Sympos. Found. Comput. Sci., pages 10--31, 2001. Tutorial. Google ScholarDigital Library
P. Indyk and M. Thorup. Approximate 1-median. manuscript, 2000.Google Scholar
N. Mishra, D. Oblinger, and L. Pitt. Sublinear time approximate clustering. In SODA 12, 2001. Google ScholarDigital Library
R. Ostrovsky and Y. Rabani. Polynomial time approximation schemes for geometric k-clustering. In Proc. 41st Symp. Foundations of Computer Science, pages 349--358. IEEE, Nov 2000. Google ScholarDigital Library

Index Terms

Approximate clustering via core-sets

Recommendations

Approximate Kernel Clustering
FOCS '08: Proceedings of the 2008 49th Annual IEEE Symposium on Foundations of Computer Science

In the kernel clustering problem we are given a large $n\times n$positive semi-definite matrix $A=(a_{ij})$ with$\sum_{i,j=1}^na_{ij}=0$ and a small $k\times k$ positivesemi-definite matrix $B=(b_{ij})$. The goal is to find a partition$S_1,\ldots,S_k$ ...
Read More
Robust Clustering Based on Dominant Sets
ICPR '14: Proceedings of the 2014 22nd International Conference on Pattern Recognition

Clustering is an important unsupervised learning approach and widely used in pattern recognition, data mining and image processing, etc. Different from existing clustering algorithms based on partitioning within data, dominant sets clustering extracts ...
Read More
Approximate clustering without the approximation
SODA '09: Proceedings of the twentieth annual ACM-SIAM symposium on Discrete algorithms

Approximation algorithms for clustering points in metric spaces is a flourishing area of research, with much research effort spent on getting a better understanding of the approximation guarantees possible for many objective functions such as k-median, ...
Read More

Comments

Login options

Check if you have access through your login credentials or your institution to get full access on this article.

Full Access

Get this Publication

Published in
STOC '02: Proceedings of the thiry-fourth annual ACM symposium on Theory of computing
May 2002
840 pages
ISBN:1581134959
DOI:10.1145/509907
Conference Chair:
John Reif
Duke University
Copyright © 2002 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]
Sponsors
In-Cooperation
Publisher
Association for Computing Machinery
New York, NY, United States
Publication History
- Published: 19 May 2002
Permissions
Request permissions about this article.
Request Permissions

Check for updates
Qualifiers
- Article
Conference

Acceptance Rates
STOC '02 Paper Acceptance Rate91of287submissions,32%Overall Acceptance Rate1,469of4,586submissions,32%
More
Upcoming Conference
STOC '24

Sponsor:

sigact

56th Annual ACM Symposium on Theory of Computing (STOC 2024)

June 24 - 28, 2024

Vancouver , BC , Canada
Funding Sources
Other Metrics
View Article Metrics

Article Metrics
- 219
  Total Citations
  View Citations
- 470
  Total Downloads
- Downloads (Last 12 months)128
- Downloads (Last 6 weeks)14
Other Metrics
View Author Metrics
Cited By
View all

PDF Format

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader

Approximate clustering via core-sets

STOC '02: Proceedings of the thiry-fourth annual ACM symposium on Theory of computing

ABSTRACT

References

Cited By

Index Terms

Recommendations

Approximate Kernel Clustering

Robust Clustering Based on Dominant Sets

Approximate clustering without the approximation

Comments

Login options

Full Access

Published in

Sponsors

In-Cooperation

Publisher

Publication History

Permissions

Check for updates

Qualifiers

Conference

Acceptance Rates

Upcoming Conference

Funding Sources

Other Metrics

Article Metrics

Other Metrics

Cited By

PDF Format

eReader

Digital Edition

Caption

Approximate clustering via core-sets

STOC '02: Proceedings of the thiry-fourth annual ACM symposium on Theory of computing

ABSTRACT

References

Cited By

Index Terms

Recommendations

Approximate Kernel Clustering

Robust Clustering Based on Dominant Sets

Approximate clustering without the approximation

Comments

Login options

Full Access

Published in

Sponsors

In-Cooperation

Publisher

Publication History

Permissions

Check for updates

Qualifiers

Conference

Acceptance Rates

Upcoming Conference

Funding Sources

Article Metrics

Other Metrics

PDF Format

eReader

Digital Edition

Share this Publication link

Share on Social Media