
28.06.2021

Improved local search algorithms for Bregman k-means and its variants

Authors: Xiaoyun Tian, Dachuan Xu, Longkun Guo, Dan Wu

Published in: Journal of Combinatorial Optimization | Issue 4/2022


Abstract

In this paper, we consider the Bregman k-means problem (BKM), a variant of the classical k-means problem. Given an n-point set \({\mathcal {S}}\) and \(k \le n\), with respect to a \(\mu \)-similar Bregman divergence, the BKM problem asks first to find a center subset \(C \subseteq {\mathcal {S}}\) with \(|C| = k\) and then to partition \({\mathcal {S}}\) into k clusters according to C, such that the sum of the \(\mu \)-similar Bregman divergences from each point in \({\mathcal {S}}\) to its nearest center is minimized. We propose a \(\mu \)-similar BregMeans++ algorithm that employs a local search scheme, and prove that it achieves a constant approximation guarantee. Moreover, we extend our algorithm to a variant of BKM called noisy \(\mu \)-similar Bregman k-means++ (noisy \(\mu \)-BKM++), which is BKM in a noisy scenario. For the same instance and objective as BKM, we consider the case in which each point is sampled with an imprecise probability, perturbed by a factor between \(1-\varepsilon _1\) and \(1+ \varepsilon _2\) for \(\varepsilon _1 \in [0,1)\) and \(\varepsilon _2 \ge 0\), and obtain an approximation ratio of \(O(\log ^2 k)\) in expectation.
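To make the objective and the seeding-plus-local-search idea concrete, the sketch below illustrates (it is not the authors' algorithm) a k-means++-style D²-seeding under a Bregman divergence, with sampling probabilities perturbed by a factor in \([1-\varepsilon _1, 1+\varepsilon _2]\) to mimic the noisy setting, followed by a simple swap-based local-search step. It uses the squared Euclidean distance, which is the Bregman divergence generated by \(\phi (x) = \Vert x\Vert ^2\); all function and variable names are hypothetical.

```python
# Minimal illustrative sketch of Bregman D^2-seeding with noisy sampling
# probabilities and one local-search swap step (assumptions: squared Euclidean
# distance as the Bregman divergence; names below are hypothetical).
import numpy as np

def bregman_sq_euclidean(x, y):
    """Bregman divergence for phi(x) = ||x||^2, i.e. squared Euclidean distance."""
    d = x - y
    return float(np.dot(d, d))

def cost(points, centers):
    """BKM objective: sum of divergences from each point to its nearest center."""
    return sum(min(bregman_sq_euclidean(p, c) for c in centers) for p in points)

def dsquared_seeding(points, k, rng, eps1=0.0, eps2=0.0):
    """k-means++-style seeding; each sampling probability is perturbed by a
    factor in [1 - eps1, 1 + eps2] to mimic the noisy scenario."""
    centers = [points[rng.integers(len(points))]]
    while len(centers) < k:
        d = np.array([min(bregman_sq_euclidean(p, c) for c in centers) for p in points])
        noise = rng.uniform(1.0 - eps1, 1.0 + eps2, size=len(points))
        w = d * noise
        centers.append(points[rng.choice(len(points), p=w / w.sum())])
    return centers

def local_search_step(points, centers, rng):
    """Sample a candidate by the same D^2 rule and swap it with the center whose
    removal most reduces the cost (keep the old solution if no swap improves it)."""
    d = np.array([min(bregman_sq_euclidean(p, c) for c in centers) for p in points])
    cand = points[rng.choice(len(points), p=d / d.sum())]
    best, best_cost = centers, cost(points, centers)
    for i in range(len(centers)):
        trial = centers[:i] + centers[i + 1:] + [cand]
        c = cost(points, trial)
        if c < best_cost:
            best, best_cost = trial, c
    return best

# Usage example on synthetic 2-D data.
rng = np.random.default_rng(0)
pts = [rng.normal(size=2) for _ in range(200)]
C = dsquared_seeding(pts, k=5, rng=rng, eps1=0.1, eps2=0.1)
for _ in range(10):  # a few local-search swaps
    C = local_search_step(pts, C, rng)
print(round(cost(pts, C), 3))
```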


Metadata
Title
Improved local search algorithms for Bregman k-means and its variants
Authors
Xiaoyun Tian
Dachuan Xu
Longkun Guo
Dan Wu
Publication date
28.06.2021
Publisher
Springer US
Published in
Journal of Combinatorial Optimization / Issue 4/2022
Print ISSN: 1382-6905
Electronic ISSN: 1573-2886
DOI
https://doi.org/10.1007/s10878-021-00771-9
