Skip to main content
Erschienen in: Journal of Combinatorial Optimization 3/2022

12.02.2020

The bi-criteria seeding algorithms for two variants of k-means problem

verfasst von: Min Li

Erschienen in: Journal of Combinatorial Optimization | Ausgabe 3/2022

Einloggen

Aktivieren Sie unsere intelligente Suche, um passende Fachinhalte oder Patente zu finden.

search-config
loading …

Abstract

The k-means problem is very classic and important in computer science and machine learning, so there are many variants presented depending on different backgrounds, such as the k-means problem with penalties, the spherical k-means clustering, and so on. Since the k-means problem is NP-hard, the research of its approximation algorithm is very hot. In this paper, we apply a bi-criteria seeding algorithm to both k-means problem with penalties and spherical k-means problem, and improve (upon) the performance guarantees given by the k-means++ algorithm for these two problems.

Sie haben noch keine Lizenz? Dann Informieren Sie sich jetzt über unsere Produkte:

Springer Professional "Wirtschaft+Technik"

Online-Abonnement

Mit Springer Professional "Wirtschaft+Technik" erhalten Sie Zugriff auf:

  • über 102.000 Bücher
  • über 537 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Maschinenbau + Werkstoffe
  • Versicherung + Risiko

Jetzt Wissensvorsprung sichern!

Springer Professional "Wirtschaft"

Online-Abonnement

Mit Springer Professional "Wirtschaft" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 340 Zeitschriften

aus folgenden Fachgebieten:

  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Versicherung + Risiko




Jetzt Wissensvorsprung sichern!

Springer Professional "Technik"

Online-Abonnement

Mit Springer Professional "Technik" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 390 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Maschinenbau + Werkstoffe




 

Jetzt Wissensvorsprung sichern!

Literatur
Zurück zum Zitat Aggarwal A, Deshpande A, Kannan R (2009) Adaptive sampling for \(k\)-means clustering. In: Proceedings of APPROX and RANDOM, pp 15–28 Aggarwal A, Deshpande A, Kannan R (2009) Adaptive sampling for \(k\)-means clustering. In: Proceedings of APPROX and RANDOM, pp 15–28
Zurück zum Zitat Ahmadian S, Norouzi-Fard A, Svensson O, Ward J (2017) Better guarantees for \(k\)-means and Euclidean \(k\)-median by primal-dual algorithms. In: Proceedings of FOCS, pp 61–72 Ahmadian S, Norouzi-Fard A, Svensson O, Ward J (2017) Better guarantees for \(k\)-means and Euclidean \(k\)-median by primal-dual algorithms. In: Proceedings of FOCS, pp 61–72
Zurück zum Zitat Aloise D, Deshpande A, Hansen P, Popat P (2009) NP-hardness of Euclidean sum-of-squares clustering. Mach Learn 75:245–248CrossRef Aloise D, Deshpande A, Hansen P, Popat P (2009) NP-hardness of Euclidean sum-of-squares clustering. Mach Learn 75:245–248CrossRef
Zurück zum Zitat Arthur D, Vassilvitskii S (2007) \(k\)-means++: the advantages of careful seeding. In: Proceedings of SODA, pp 1027–1035 Arthur D, Vassilvitskii S (2007) \(k\)-means++: the advantages of careful seeding. In: Proceedings of SODA, pp 1027–1035
Zurück zum Zitat Awasthi P, Charikar M, Krishnaswamy R, Sinop AK (2015) The hardness of approximation of Euclidean \(k\)-means. In: Proceedings of SoCG, pp 754–767 Awasthi P, Charikar M, Krishnaswamy R, Sinop AK (2015) The hardness of approximation of Euclidean \(k\)-means. In: Proceedings of SoCG, pp 754–767
Zurück zum Zitat Bachem O, Lucic M, Hassani SH, Krause A (2016) Approximate \(k\)-means++ in sublinear time. In: Proceedings of AAAI, pp 1459–1467 Bachem O, Lucic M, Hassani SH, Krause A (2016) Approximate \(k\)-means++ in sublinear time. In: Proceedings of AAAI, pp 1459–1467
Zurück zum Zitat Bachem O, Lucic M, Krause A (2017) Distributed and provably good seedings for \(k\)-means in constant rounds. In: Proceedings of ICML, pp 292–300 Bachem O, Lucic M, Krause A (2017) Distributed and provably good seedings for \(k\)-means in constant rounds. In: Proceedings of ICML, pp 292–300
Zurück zum Zitat Blömer J, Lammersen C, Schmidt M, Sohler C (2016) Theoretical analysis of the \(k\)-means algorithm—a survey. In: Algorithm engineering. Springer, pp 81–116 Blömer J, Lammersen C, Schmidt M, Sohler C (2016) Theoretical analysis of the \(k\)-means algorithm—a survey. In: Algorithm engineering. Springer, pp 81–116
Zurück zum Zitat Dhillon IS, Modha DS (2001) Concept decompositions for large sparse text data using clustering. Mach Learn 42:143–175CrossRef Dhillon IS, Modha DS (2001) Concept decompositions for large sparse text data using clustering. Mach Learn 42:143–175CrossRef
Zurück zum Zitat Drineas P, Frieze A, Kannan R, Vempala S, Vinay V (2004) Clustering large graphs via the singular value decomposition. Mach Learn 56:9–33CrossRef Drineas P, Frieze A, Kannan R, Vempala S, Vinay V (2004) Clustering large graphs via the singular value decomposition. Mach Learn 56:9–33CrossRef
Zurück zum Zitat Endo Y, Miyamoto S (2015) Spherical \(k\)-means++ clustering. In: Proceedings of MDAI, pp 103–114 Endo Y, Miyamoto S (2015) Spherical \(k\)-means++ clustering. In: Proceedings of MDAI, pp 103–114
Zurück zum Zitat Feng Q, Zhang Z, Shi F, Wang J (2019) An improved approximation algorithm for the \(k\)-means problem with penalties. In: Proceedings of FAW, pp 170–181 Feng Q, Zhang Z, Shi F, Wang J (2019) An improved approximation algorithm for the \(k\)-means problem with penalties. In: Proceedings of FAW, pp 170–181
Zurück zum Zitat Lee E, Schmidt M, Wright J (2017) Improved and simplified inapproximability for \(k\)-means. Inf Process Lett 120:40–43MathSciNetCrossRef Lee E, Schmidt M, Wright J (2017) Improved and simplified inapproximability for \(k\)-means. Inf Process Lett 120:40–43MathSciNetCrossRef
Zurück zum Zitat Li M, Xu D, Yue J, Zhang D, Zhang P (2020) The seeding algorithm for \(k\)-means problem with penalties. J Comb Optim 39:15–32MathSciNetCrossRef Li M, Xu D, Yue J, Zhang D, Zhang P (2020) The seeding algorithm for \(k\)-means problem with penalties. J Comb Optim 39:15–32MathSciNetCrossRef
Zurück zum Zitat Makarychev K, Makarychev Y, Sviridenko M, Ward J (2016) A bi-criteria approximation algorithm for \(k\)-means. In: Proceedings of APPROX/RONDOM, pp 14:1–14:20 Makarychev K, Makarychev Y, Sviridenko M, Ward J (2016) A bi-criteria approximation algorithm for \(k\)-means. In: Proceedings of APPROX/RONDOM, pp 14:1–14:20
Zurück zum Zitat Ostrovsky R, Rabani Y, Schulman L, Swamy C (2012) The effectiveness of Lloyd-type methods for the \(k\)-means problem. J ACM 59:28:1–28:22MathSciNetCrossRef Ostrovsky R, Rabani Y, Schulman L, Swamy C (2012) The effectiveness of Lloyd-type methods for the \(k\)-means problem. J ACM 59:28:1–28:22MathSciNetCrossRef
Zurück zum Zitat Tseng GC (2007) Penalized and weighted \(k\)-means for clustering with scattered objects and prior information in high-throughput biological data. Bioinformatics 23:2247–2255CrossRef Tseng GC (2007) Penalized and weighted \(k\)-means for clustering with scattered objects and prior information in high-throughput biological data. Bioinformatics 23:2247–2255CrossRef
Zurück zum Zitat Wei D (2016) A constant-factor bi-criteria approximation guarantee for \(k\)-means++. In: Proceedings of NIPS, pp 604–612 Wei D (2016) A constant-factor bi-criteria approximation guarantee for \(k\)-means++. In: Proceedings of NIPS, pp 604–612
Zurück zum Zitat Wu X, Kumar V, Quinlan J, Ross Ghosh J, Yang Q, Motoda H, McLachlan GJ, Ng A, Liu B, Yu PS, Zhou Z, Steinbach M, Hand DJ, Steinberg D (2008) Top 10 algorithms in data mining. Knowl Inf Syst 14:1–37CrossRef Wu X, Kumar V, Quinlan J, Ross Ghosh J, Yang Q, Motoda H, McLachlan GJ, Ng A, Liu B, Yu PS, Zhou Z, Steinbach M, Hand DJ, Steinberg D (2008) Top 10 algorithms in data mining. Knowl Inf Syst 14:1–37CrossRef
Zurück zum Zitat Xu X, Ding S, Shi Z (2018) An improved density peaks clustering algorithm with fast finding cluster centers. Knowl Based Syst 158:65–74CrossRef Xu X, Ding S, Shi Z (2018) An improved density peaks clustering algorithm with fast finding cluster centers. Knowl Based Syst 158:65–74CrossRef
Zurück zum Zitat Xu D, Xu Y, Zhang D (2017) A survey on algorithm for \(k\)-means problem and its variants. Oper Res Trans 21:101–109MATH Xu D, Xu Y, Zhang D (2017) A survey on algorithm for \(k\)-means problem and its variants. Oper Res Trans 21:101–109MATH
Zurück zum Zitat Xu D, Xu Y, Zhang D (2018) A survey on the initialization methods for the \(k\)-means algorithm. Oper Res Trans 22:31–40MathSciNetMATH Xu D, Xu Y, Zhang D (2018) A survey on the initialization methods for the \(k\)-means algorithm. Oper Res Trans 22:31–40MathSciNetMATH
Zurück zum Zitat Zhang D, Cheng Y, Li M, Wang Y, Xu D (2019) Local search approximation algorithms for the spherical \(k\)-means problem. In: Proceedings of AAIM, pp 341–351 Zhang D, Cheng Y, Li M, Wang Y, Xu D (2019) Local search approximation algorithms for the spherical \(k\)-means problem. In: Proceedings of AAIM, pp 341–351
Zurück zum Zitat Zhang D, Hao C, Wu C, Xu D, Zhang Z (2018) A local search approximation algorithm for the \(k\)-means problem with penalties. J Comb Optim 37:439–453MathSciNetCrossRef Zhang D, Hao C, Wu C, Xu D, Zhang Z (2018) A local search approximation algorithm for the \(k\)-means problem with penalties. J Comb Optim 37:439–453MathSciNetCrossRef
Zurück zum Zitat Zhao Y, Karypis G (2001) Criterion functions for document clustering: experiments and analysis. Technical report \(\sharp \)01-40, Department of Computer Science, University of Minnesota Zhao Y, Karypis G (2001) Criterion functions for document clustering: experiments and analysis. Technical report \(\sharp \)01-40, Department of Computer Science, University of Minnesota
Metadaten
Titel
The bi-criteria seeding algorithms for two variants of k-means problem
verfasst von
Min Li
Publikationsdatum
12.02.2020
Verlag
Springer US
Erschienen in
Journal of Combinatorial Optimization / Ausgabe 3/2022
Print ISSN: 1382-6905
Elektronische ISSN: 1573-2886
DOI
https://doi.org/10.1007/s10878-020-00537-9

Weitere Artikel der Ausgabe 3/2022

Journal of Combinatorial Optimization 3/2022 Zur Ausgabe

Premium Partner