nach oben

Journal of Combinatorial Optimization

Erschienen in:

26.09.2019

The seeding algorithm for k-means problem with penalties

verfasst von: Min Li, Dachuan Xu, Jun Yue, Dongmei Zhang, Peng Zhang

Erschienen in: Journal of Combinatorial Optimization | Ausgabe 1/2020

Einloggen

Aktivieren Sie unsere intelligente Suche, um passende Fachinhalte oder Patente zu finden.

search-config

KI-gestützte Suche

Aus

Abstract

The k-means problem is a classic NP-hard problem in machine learning and computational geometry. And its goal is to separate the given set into k clusters according to the minimal squared distance. The k-means problem with penalties, as one generalization of k-means problem, allows that some point need not be clustered instead of being paid some penalty. In this paper, we study the k-means problem with penalties by using the seeding algorithm. We propose that the accuracy only involves the ratio of the maximal penalty value to the minimal one. When the penalty is uniform, the approximation factor reduces to the same one for the k-means problem. Moreover, our result generalizes the k-means++ for k-means problem to the penalty version. Numerical experiments show that our seeding algorithm is more effective than the one without using seeding.

Vorheriger Artikel Approximation algorithms for two-stage flexible flow shop scheduling

Nächster Artikel Less is more approach for competing groups scheduling with different learning effects

Sie haben noch keine Lizenz? Dann Informieren Sie sich jetzt über unsere Produkte:

Springer Professional "Wirtschaft+Technik"

Online-Abonnement

Mit Springer Professional "Wirtschaft+Technik" erhalten Sie Zugriff auf:

über 102.000 Bücher
über 537 Zeitschriften

aus folgenden Fachgebieten:

Automobil + Motoren
Bauwesen + Immobilien
Business IT + Informatik
Elektrotechnik + Elektronik
Energie + Nachhaltigkeit
Finance + Banking
Management + Führung
Marketing + Vertrieb
Maschinenbau + Werkstoffe
Versicherung + Risiko

Jetzt Wissensvorsprung sichern!

Jetzt informieren

Springer Professional "Wirtschaft"

Online-Abonnement

Mit Springer Professional "Wirtschaft" erhalten Sie Zugriff auf:

über 67.000 Bücher
über 340 Zeitschriften

aus folgenden Fachgebieten:

Bauwesen + Immobilien
Business IT + Informatik
Finance + Banking
Management + Führung
Marketing + Vertrieb
Versicherung + Risiko

Jetzt Wissensvorsprung sichern!

Jetzt informieren

Springer Professional "Technik"

Online-Abonnement

Mit Springer Professional "Technik" erhalten Sie Zugriff auf:

über 67.000 Bücher
über 390 Zeitschriften

aus folgenden Fachgebieten:

Automobil + Motoren
Bauwesen + Immobilien
Business IT + Informatik
Elektrotechnik + Elektronik
Energie + Nachhaltigkeit
Maschinenbau + Werkstoffe

Jetzt Wissensvorsprung sichern!

Jetzt informieren

Aggarwal A, Deshpande A, Kannan R (2009) Adaptive sampling for \(k\)-means clustering. In: Proceedings of APPROX and RANDOM, pp. 15–28

Ahmadian S, Norouzi-Fard A, Svensson O, Ward J (2017) Better guarantees for \(k\)-means and Euclidean \(k\)-median by primal–dual algorithms. In: Proceedings of FOCS, pp. 61–72

Aloise D, Deshpande A, Hansen P, Popat P (2009) NP-hardness of Euclidean sum-of-squares clustering. Mach Learn 75:245–248CrossRef

Arthur D, Vassilvitskii S (2007) \(k\)-means++: The advantages of careful seeding. In: Proceedings of SODA, pp. 1027–1035

Awasthi P, Charikar M, Krishnaswamy R, Sinop AK (2015) The hardness of approximation of Euclidean \(k\)-means. In: Proceedings of SoCG, pp. 754–767

Bachem O, Lucic M, Hassani SH, Krause A (2016a) Approximate \(k\)-means++ in sublinear time. In: Proceedings of AAAI, pp. 1459–1467

Bachem O, Lucic M, Hassani SH, Krause A (2016b) Fast and provably good seedings for \(k\)-means. In: Proceedings of NIPS, pp. 55–63

Bachem O, Lucic M, Krause A (2017) Distributed and provably good seedings for \(k\)-means in constant rounds. In: Proceedings of ICML, pp. 292–300

Bahmani B, Moseley B, Vattani A, Kumar R, Vassilvitskii S (2012) Scalable \(k\)-means++. In: Proceedings of the VLDB endowment, pp. 622–633

Blömer J, Lammersen C, Schmidt M, Sohler C (2016) Theoretical analysis of the \(k\)-means algorithm–a survey. In: Kliemann L, Sanders P (eds) Algorithm engineering, Springer, New York, pp. 81–116

Chang XY, Wang Y, Li R, Xu Z (2014) Sparse \(k\)-means with \(l_\infty \)/\(l_0\) penalty for high-dimensional data clustering. arXiv:1403.7890v1

Drineas P, Frieze A, Kannan R, Vempala S, Vinay V (2004) Clustering large graphs via the singular value decomposition. Mach Learn 56:9–33CrossRef

Har-Peled S, Sadri B (2005) How fast is the \(k\)-means method? In: Proceedings of SODA, pp. 332–229

Kanungo T, Mount DM, Netanyahu NS, Piatko CD, Silverma R, Wu AY (2004) A local search approximation algorithm for \(k\)-means clustering. Comput Geom Theory Appl 28:89–112MathSciNetCrossRef

Lee E, Schmidt M, Wright J (2017) Improved and simplified inapproximability for \(k\)-means. Inf Process Lett 120:40–43MathSciNetCrossRef

Lloyd S (1982) Least squares quantization in PCM. IEEE Trans Inf Theory 28:21–33MathSciNetCrossRef

Ostrovsky R, Rabani Y, Schulman L, Swamy C (2012) The effectiveness of Lloyd-type methods for the \(k\)-means problem. J ACM 59:28:1–28:22MathSciNetCrossRef

Rezaei M, Färnti P (2016) Set matching measures for external cluster validity. IEEE Trans Knowl Data Eng 28:2173–2186CrossRef

Tseng GC (2007) Penalized and weighted \(k\)-means for clustering with scattered objects and prior information in high-throughput biological data. Bioinformatics 23:2247–2255CrossRef

Xu D, Xu Y, Zhang D (2017) A survey on algorithm for \(k\)-means problem and its variants. Oper Res Trans 21:101–109CrossRef

Zhang D, Hao C, Wu C, Xu D, Zhang Z (2017) A local search approximation algorithm for the \(k\)-means problem with penalties. In: Proceedings of COCOON, pp. 568–574

Titel: The seeding algorithm for k-means problem with penalties
verfasst von: Min Li
Dachuan Xu
Jun Yue
Dongmei Zhang
Peng Zhang
Publikationsdatum: 26.09.2019
Verlag: Springer US
Erschienen in: Journal of Combinatorial Optimization / Ausgabe 1/2020
Print ISSN: 1382-6905
Elektronische ISSN: 1573-2886
DOI: https://doi.org/10.1007/s10878-019-00450-w

Springer Professional

Abstract

Bitte loggen Sie sich ein, um Zugang zu Ihrer Lizenz zu erhalten.

Sie haben noch keine Lizenz? Dann Informieren Sie sich jetzt über unsere Produkte:

Springer Professional "Wirtschaft+Technik"

Springer Professional "Wirtschaft"

Springer Professional "Technik"

Weitere Artikel der Ausgabe 1/2020

Tree-coloring problems of bounded treewidth graphs

Dynamic three-stage operating room scheduling considering patient waiting time and surgical overtime costs

Efficient algorithms for measuring the funnel-likeness of DAGs

Less is more approach for competing groups scheduling with different learning effects

Approximation algorithm for a generalized Roman domination problem in unit ball graphs

On the extremal graphs with respect to the total reciprocal edge-eccentricity

Premium Partner