skip to main content
10.1145/509907.509947acmconferencesArticle/Chapter ViewAbstractPublication PagesstocConference Proceedingsconference-collections
Article

Approximate clustering via core-sets

Published:19 May 2002Publication History

ABSTRACT

In this paper, we show that for several clustering problems one can extract a small set of points, so that using those core-sets enable us to perform approximate clustering efficiently. The surprising property of those core-sets is that their size is independent of the dimension.Using those, we present a (1+ ε)-approximation algorithms for the k-center clustering and k-median clustering problems in Euclidean space. The running time of the new algorithms has linear or near linear dependency on the number of points and the dimension, and exponential dependency on 1/ε and k. As such, our results are a substantial improvement over what was previously known.We also present some other clustering results including (1+ ε)-approximate 1-cylinder clustering, and k-center clustering with outliers.

References

  1. P. Agarwal and C. Procopiuc. Exact and approximation algorithms for clustering. In Proc. 9th ACM-SIAM Sympos. Discrete Algorithms, pages 658--667, 1998. Google ScholarGoogle ScholarDigital LibraryDigital Library
  2. N. Alon, S. Dar, M. Parnas, and D. Ron. Testing of clustering. In Proc. 41th Annu. IEEE Sympos. Found. Comput. Sci., 2000. Google ScholarGoogle ScholarDigital LibraryDigital Library
  3. S. Arora, P. Raghavan, and S. Rao. Approximation schemes for Euclidean k-median and related problems. In Proc. 30th Annu. ACM Sympos. Theory Comput., pages 106--113, 1998. Google ScholarGoogle ScholarDigital LibraryDigital Library
  4. M. Bern and D. Eppstein. Approximation algorithms for geometric problems. In D. Hochbaum, editor, Approximationg algorithms for NP-Hard problems. PWS Publishing Company, 1997. Google ScholarGoogle ScholarDigital LibraryDigital Library
  5. R. Duda, P. Hart, and D. Stork. Pattern Classification. Wiley-Interscience, New York, 2nd edition, 2001. Google ScholarGoogle ScholarDigital LibraryDigital Library
  6. L. Engebretsen, P. Indyk, and R. O'Donnell. Derandomized dimensionality reduction with applications. In Proc. 13th ACM-SIAM Sympos. Discrete Algorithms, 2002. to appear. Google ScholarGoogle ScholarDigital LibraryDigital Library
  7. A. Goel, P. Indyk, and K. Varadarajan. Reductions among high dimensional proximity problems. In Proc. 12th ACM-SIAM Sympos. Discrete Algorithms, pages 769--778, 2001. Google ScholarGoogle ScholarDigital LibraryDigital Library
  8. T. Gonzalez. Clustering to minimize the maximum intercluster distance. Theoret. Comput. Sci., 38:293--306, 1985.Google ScholarGoogle ScholarCross RefCross Ref
  9. M. Grötschel, L. Lovász, and A. Schrijver. Geometric Algorithms and Combinatorial Optimization, volume 2 of Algorithms and Combinatorics. Springer-Verlag, Berlin Heidelberg, 2nd edition, 1988. 2nd edition 1994.Google ScholarGoogle Scholar
  10. S. Har-Peled. Clustering motion. In Proc. 42nd Annu. IEEE Sympos. Found. Comput. Sci., pages 84--93, 2001. Google ScholarGoogle ScholarDigital LibraryDigital Library
  11. S. Har-Peled and K. Varadarajan. Approximate shape fitting via linearization. In Proc. 42nd Annu. IEEE Sympos. Found. Comput. Sci., pages 66--73, 2001. Google ScholarGoogle ScholarDigital LibraryDigital Library
  12. P. Indyk. Algorithmic applications of low-distortion geometric embeddings. In Proc. 42nd Annu. IEEE Sympos. Found. Comput. Sci., pages 10--31, 2001. Tutorial. Google ScholarGoogle ScholarDigital LibraryDigital Library
  13. P. Indyk and M. Thorup. Approximate 1-median. manuscript, 2000.Google ScholarGoogle Scholar
  14. N. Mishra, D. Oblinger, and L. Pitt. Sublinear time approximate clustering. In SODA 12, 2001. Google ScholarGoogle ScholarDigital LibraryDigital Library
  15. R. Ostrovsky and Y. Rabani. Polynomial time approximation schemes for geometric k-clustering. In Proc. 41st Symp. Foundations of Computer Science, pages 349--358. IEEE, Nov 2000. Google ScholarGoogle ScholarDigital LibraryDigital Library

Index Terms

  1. Approximate clustering via core-sets

          Recommendations

          Comments

          Login options

          Check if you have access through your login credentials or your institution to get full access on this article.

          Sign in
          • Published in

            cover image ACM Conferences
            STOC '02: Proceedings of the thiry-fourth annual ACM symposium on Theory of computing
            May 2002
            840 pages
            ISBN:1581134959
            DOI:10.1145/509907

            Copyright © 2002 ACM

            Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

            Publisher

            Association for Computing Machinery

            New York, NY, United States

            Publication History

            • Published: 19 May 2002

            Permissions

            Request permissions about this article.

            Request Permissions

            Check for updates

            Qualifiers

            • Article

            Acceptance Rates

            STOC '02 Paper Acceptance Rate91of287submissions,32%Overall Acceptance Rate1,469of4,586submissions,32%

            Upcoming Conference

            STOC '24
            56th Annual ACM Symposium on Theory of Computing (STOC 2024)
            June 24 - 28, 2024
            Vancouver , BC , Canada

          PDF Format

          View or Download as a PDF file.

          PDF

          eReader

          View online with eReader.

          eReader