Skip to main content
Top
Published in: Journal of Combinatorial Optimization 3/2022

12-02-2020

The bi-criteria seeding algorithms for two variants of k-means problem

Author: Min Li

Published in: Journal of Combinatorial Optimization | Issue 3/2022

Log in

Activate our intelligent search to find suitable subject content or patents.

search-config
loading …

Abstract

The k-means problem is very classic and important in computer science and machine learning, so there are many variants presented depending on different backgrounds, such as the k-means problem with penalties, the spherical k-means clustering, and so on. Since the k-means problem is NP-hard, the research of its approximation algorithm is very hot. In this paper, we apply a bi-criteria seeding algorithm to both k-means problem with penalties and spherical k-means problem, and improve (upon) the performance guarantees given by the k-means++ algorithm for these two problems.

Dont have a licence yet? Then find out more about our products and how to get one now:

Springer Professional "Wirtschaft+Technik"

Online-Abonnement

Mit Springer Professional "Wirtschaft+Technik" erhalten Sie Zugriff auf:

  • über 102.000 Bücher
  • über 537 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Maschinenbau + Werkstoffe
  • Versicherung + Risiko

Jetzt Wissensvorsprung sichern!

Springer Professional "Wirtschaft"

Online-Abonnement

Mit Springer Professional "Wirtschaft" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 340 Zeitschriften

aus folgenden Fachgebieten:

  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Versicherung + Risiko




Jetzt Wissensvorsprung sichern!

Springer Professional "Technik"

Online-Abonnement

Mit Springer Professional "Technik" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 390 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Maschinenbau + Werkstoffe




 

Jetzt Wissensvorsprung sichern!

Literature
go back to reference Aggarwal A, Deshpande A, Kannan R (2009) Adaptive sampling for \(k\)-means clustering. In: Proceedings of APPROX and RANDOM, pp 15–28 Aggarwal A, Deshpande A, Kannan R (2009) Adaptive sampling for \(k\)-means clustering. In: Proceedings of APPROX and RANDOM, pp 15–28
go back to reference Ahmadian S, Norouzi-Fard A, Svensson O, Ward J (2017) Better guarantees for \(k\)-means and Euclidean \(k\)-median by primal-dual algorithms. In: Proceedings of FOCS, pp 61–72 Ahmadian S, Norouzi-Fard A, Svensson O, Ward J (2017) Better guarantees for \(k\)-means and Euclidean \(k\)-median by primal-dual algorithms. In: Proceedings of FOCS, pp 61–72
go back to reference Aloise D, Deshpande A, Hansen P, Popat P (2009) NP-hardness of Euclidean sum-of-squares clustering. Mach Learn 75:245–248CrossRef Aloise D, Deshpande A, Hansen P, Popat P (2009) NP-hardness of Euclidean sum-of-squares clustering. Mach Learn 75:245–248CrossRef
go back to reference Arthur D, Vassilvitskii S (2007) \(k\)-means++: the advantages of careful seeding. In: Proceedings of SODA, pp 1027–1035 Arthur D, Vassilvitskii S (2007) \(k\)-means++: the advantages of careful seeding. In: Proceedings of SODA, pp 1027–1035
go back to reference Awasthi P, Charikar M, Krishnaswamy R, Sinop AK (2015) The hardness of approximation of Euclidean \(k\)-means. In: Proceedings of SoCG, pp 754–767 Awasthi P, Charikar M, Krishnaswamy R, Sinop AK (2015) The hardness of approximation of Euclidean \(k\)-means. In: Proceedings of SoCG, pp 754–767
go back to reference Bachem O, Lucic M, Hassani SH, Krause A (2016) Approximate \(k\)-means++ in sublinear time. In: Proceedings of AAAI, pp 1459–1467 Bachem O, Lucic M, Hassani SH, Krause A (2016) Approximate \(k\)-means++ in sublinear time. In: Proceedings of AAAI, pp 1459–1467
go back to reference Bachem O, Lucic M, Krause A (2017) Distributed and provably good seedings for \(k\)-means in constant rounds. In: Proceedings of ICML, pp 292–300 Bachem O, Lucic M, Krause A (2017) Distributed and provably good seedings for \(k\)-means in constant rounds. In: Proceedings of ICML, pp 292–300
go back to reference Blömer J, Lammersen C, Schmidt M, Sohler C (2016) Theoretical analysis of the \(k\)-means algorithm—a survey. In: Algorithm engineering. Springer, pp 81–116 Blömer J, Lammersen C, Schmidt M, Sohler C (2016) Theoretical analysis of the \(k\)-means algorithm—a survey. In: Algorithm engineering. Springer, pp 81–116
go back to reference Dhillon IS, Modha DS (2001) Concept decompositions for large sparse text data using clustering. Mach Learn 42:143–175CrossRef Dhillon IS, Modha DS (2001) Concept decompositions for large sparse text data using clustering. Mach Learn 42:143–175CrossRef
go back to reference Drineas P, Frieze A, Kannan R, Vempala S, Vinay V (2004) Clustering large graphs via the singular value decomposition. Mach Learn 56:9–33CrossRef Drineas P, Frieze A, Kannan R, Vempala S, Vinay V (2004) Clustering large graphs via the singular value decomposition. Mach Learn 56:9–33CrossRef
go back to reference Endo Y, Miyamoto S (2015) Spherical \(k\)-means++ clustering. In: Proceedings of MDAI, pp 103–114 Endo Y, Miyamoto S (2015) Spherical \(k\)-means++ clustering. In: Proceedings of MDAI, pp 103–114
go back to reference Feng Q, Zhang Z, Shi F, Wang J (2019) An improved approximation algorithm for the \(k\)-means problem with penalties. In: Proceedings of FAW, pp 170–181 Feng Q, Zhang Z, Shi F, Wang J (2019) An improved approximation algorithm for the \(k\)-means problem with penalties. In: Proceedings of FAW, pp 170–181
go back to reference Lee E, Schmidt M, Wright J (2017) Improved and simplified inapproximability for \(k\)-means. Inf Process Lett 120:40–43MathSciNetCrossRef Lee E, Schmidt M, Wright J (2017) Improved and simplified inapproximability for \(k\)-means. Inf Process Lett 120:40–43MathSciNetCrossRef
go back to reference Li M, Xu D, Yue J, Zhang D, Zhang P (2020) The seeding algorithm for \(k\)-means problem with penalties. J Comb Optim 39:15–32MathSciNetCrossRef Li M, Xu D, Yue J, Zhang D, Zhang P (2020) The seeding algorithm for \(k\)-means problem with penalties. J Comb Optim 39:15–32MathSciNetCrossRef
go back to reference Makarychev K, Makarychev Y, Sviridenko M, Ward J (2016) A bi-criteria approximation algorithm for \(k\)-means. In: Proceedings of APPROX/RONDOM, pp 14:1–14:20 Makarychev K, Makarychev Y, Sviridenko M, Ward J (2016) A bi-criteria approximation algorithm for \(k\)-means. In: Proceedings of APPROX/RONDOM, pp 14:1–14:20
go back to reference Ostrovsky R, Rabani Y, Schulman L, Swamy C (2012) The effectiveness of Lloyd-type methods for the \(k\)-means problem. J ACM 59:28:1–28:22MathSciNetCrossRef Ostrovsky R, Rabani Y, Schulman L, Swamy C (2012) The effectiveness of Lloyd-type methods for the \(k\)-means problem. J ACM 59:28:1–28:22MathSciNetCrossRef
go back to reference Tseng GC (2007) Penalized and weighted \(k\)-means for clustering with scattered objects and prior information in high-throughput biological data. Bioinformatics 23:2247–2255CrossRef Tseng GC (2007) Penalized and weighted \(k\)-means for clustering with scattered objects and prior information in high-throughput biological data. Bioinformatics 23:2247–2255CrossRef
go back to reference Wei D (2016) A constant-factor bi-criteria approximation guarantee for \(k\)-means++. In: Proceedings of NIPS, pp 604–612 Wei D (2016) A constant-factor bi-criteria approximation guarantee for \(k\)-means++. In: Proceedings of NIPS, pp 604–612
go back to reference Wu X, Kumar V, Quinlan J, Ross Ghosh J, Yang Q, Motoda H, McLachlan GJ, Ng A, Liu B, Yu PS, Zhou Z, Steinbach M, Hand DJ, Steinberg D (2008) Top 10 algorithms in data mining. Knowl Inf Syst 14:1–37CrossRef Wu X, Kumar V, Quinlan J, Ross Ghosh J, Yang Q, Motoda H, McLachlan GJ, Ng A, Liu B, Yu PS, Zhou Z, Steinbach M, Hand DJ, Steinberg D (2008) Top 10 algorithms in data mining. Knowl Inf Syst 14:1–37CrossRef
go back to reference Xu X, Ding S, Shi Z (2018) An improved density peaks clustering algorithm with fast finding cluster centers. Knowl Based Syst 158:65–74CrossRef Xu X, Ding S, Shi Z (2018) An improved density peaks clustering algorithm with fast finding cluster centers. Knowl Based Syst 158:65–74CrossRef
go back to reference Xu D, Xu Y, Zhang D (2017) A survey on algorithm for \(k\)-means problem and its variants. Oper Res Trans 21:101–109MATH Xu D, Xu Y, Zhang D (2017) A survey on algorithm for \(k\)-means problem and its variants. Oper Res Trans 21:101–109MATH
go back to reference Xu D, Xu Y, Zhang D (2018) A survey on the initialization methods for the \(k\)-means algorithm. Oper Res Trans 22:31–40MathSciNetMATH Xu D, Xu Y, Zhang D (2018) A survey on the initialization methods for the \(k\)-means algorithm. Oper Res Trans 22:31–40MathSciNetMATH
go back to reference Zhang D, Cheng Y, Li M, Wang Y, Xu D (2019) Local search approximation algorithms for the spherical \(k\)-means problem. In: Proceedings of AAIM, pp 341–351 Zhang D, Cheng Y, Li M, Wang Y, Xu D (2019) Local search approximation algorithms for the spherical \(k\)-means problem. In: Proceedings of AAIM, pp 341–351
go back to reference Zhang D, Hao C, Wu C, Xu D, Zhang Z (2018) A local search approximation algorithm for the \(k\)-means problem with penalties. J Comb Optim 37:439–453MathSciNetCrossRef Zhang D, Hao C, Wu C, Xu D, Zhang Z (2018) A local search approximation algorithm for the \(k\)-means problem with penalties. J Comb Optim 37:439–453MathSciNetCrossRef
go back to reference Zhao Y, Karypis G (2001) Criterion functions for document clustering: experiments and analysis. Technical report \(\sharp \)01-40, Department of Computer Science, University of Minnesota Zhao Y, Karypis G (2001) Criterion functions for document clustering: experiments and analysis. Technical report \(\sharp \)01-40, Department of Computer Science, University of Minnesota
Metadata
Title
The bi-criteria seeding algorithms for two variants of k-means problem
Author
Min Li
Publication date
12-02-2020
Publisher
Springer US
Published in
Journal of Combinatorial Optimization / Issue 3/2022
Print ISSN: 1382-6905
Electronic ISSN: 1573-2886
DOI
https://doi.org/10.1007/s10878-020-00537-9

Other articles of this Issue 3/2022

Journal of Combinatorial Optimization 3/2022 Go to the issue

Premium Partner