skip to main content
10.1145/1137856.1137880acmconferencesArticle/Chapter ViewAbstractPublication PagessocgConference Proceedingsconference-collections
Article

How slow is the k-means method?

Published:05 June 2006Publication History

ABSTRACT

The k-means method is an old but popular clustering algorithm known for its observed speed and its simplicity. Until recently, however, no meaningful theoretical bounds were known on its running time. In this paper, we demonstrate that the worst-case running time of k-means is superpolynomial by improving the best known lower bound from Ω(n) iterations to 2Ω(√n).

References

  1. Pankaj K. Agarwal and Nabil H. Mustafa. k-means projective clustering. In PODS '04: Proceedings of the twenty-third ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems, pages 155--165, New York, NY, USA, 2004. ACM Press. Google ScholarGoogle ScholarDigital LibraryDigital Library
  2. David Arthur and Sergei Vassilvitskii. Improved smoothed analysis for the k-means method. Manuscript, 2006.Google ScholarGoogle Scholar
  3. David Arthur and Sergei Vassilvitskii. k-means lower bound implementation. http://www.stanford.edu/~darthur/kMeansLbTest.zip, 2006.Google ScholarGoogle Scholar
  4. Sanjoy Dasgupta. How fast is k-means? In COLT Computational Learning Theory, volume 2777, page 735, 2003.Google ScholarGoogle ScholarCross RefCross Ref
  5. R. O. Duda, P. E. Hart, and D. G. Stork. Pattern Classification. Wiley-Interscience Publication, 2000. Google ScholarGoogle ScholarDigital LibraryDigital Library
  6. Frédéric Gibou and Ronald Fedkiw. A fast hybrid k-means level set algorithm for segmentation. In 4th Annual Hawaii International Conference on Statistics and Mathematics, pages 281--291, 2005.Google ScholarGoogle Scholar
  7. Sariel Har-Peled and Bardia Sadri. How fast is the k-means method? Algorithmica, 41(3):185--202, 2005. Google ScholarGoogle ScholarDigital LibraryDigital Library
  8. R. Herwig, A.J. Poustka, C. Muller, C. Bull, H. Lehrach, and J O'Brien. Large-scale clustering of cdna-fingerprinting data. Genome Research, 9:1093--1105, 1999.Google ScholarGoogle ScholarCross RefCross Ref
  9. Mary Inaba, Naoki Katoh, and Hiroshi Imai. Applications of weighted voronoi diagrams and randomization to variance-based k-clustering: extended abstract. In SCG '94: Proceedings of the tenth annual symposium on Computational geometry, pages 332--339, New York, NY, USA, 1994. ACM Press. Google ScholarGoogle ScholarDigital LibraryDigital Library
  10. W. Johnson and J. Lindenstrauss. Extensions of lipschitz maps into a hilbert space. Contemporary Mathematics, 26:189--206, 1984.Google ScholarGoogle ScholarCross RefCross Ref
  11. Tapas Kanungo, David M. Mount, Nathan S. Netanyahu, Christine D. Piatko, Ruth Silverman, and Angela Y. Wu. A local search approximation algorithm for k-means clustering. In SCG '02: Proceedings of the eighteenth annual symposium on Computational geometry, pages 10--18, New York, NY, USA, 2002. ACM Press. Google ScholarGoogle ScholarDigital LibraryDigital Library
  12. Stuart P. Lloyd. Least squares quantization in pcm. IEEE Transactions on Information Theory, 28(2):129--136, 1982.Google ScholarGoogle ScholarDigital LibraryDigital Library
  13. Daniel A. Spielman and Shang-Hua Teng. Smoothed analysis of algorithms: Why the simplex algorithm usually takes polynomial time. J. ACM, 51(3):385--463, 2004. Google ScholarGoogle ScholarDigital LibraryDigital Library

Index Terms

  1. How slow is the k-means method?

    Recommendations

    Comments

    Login options

    Check if you have access through your login credentials or your institution to get full access on this article.

    Sign in
    • Published in

      cover image ACM Conferences
      SCG '06: Proceedings of the twenty-second annual symposium on Computational geometry
      June 2006
      500 pages
      ISBN:1595933409
      DOI:10.1145/1137856
      • Program Chairs:
      • Nina Amenta,
      • Otfried Cheong

      Copyright © 2006 ACM

      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Publication History

      • Published: 5 June 2006

      Permissions

      Request permissions about this article.

      Request Permissions

      Check for updates

      Qualifiers

      • Article

      Acceptance Rates

      Overall Acceptance Rate625of1,685submissions,37%

    PDF Format

    View or Download as a PDF file.

    PDF

    eReader

    View online with eReader.

    eReader