Published in: Journal of Scientific Computing 2/2018

31-05-2018

Stochastic Backward Euler: An Implicit Gradient Descent Algorithm for k-Means Clustering

Authors: Penghang Yin, Minh Pham, Adam Oberman, Stanley Osher

Abstract

In this paper, we propose an implicit gradient descent algorithm for the classic k-means problem. The implicit gradient step, or backward Euler step, is solved via stochastic fixed-point iteration, in which we randomly sample a mini-batch gradient in every iteration. It is the average of the fixed-point trajectory that is carried over to the next gradient step. We draw connections between the proposed stochastic backward Euler and the recent entropy stochastic gradient descent for improving the training of deep neural networks. Numerical experiments on various synthetic and real datasets show that the proposed algorithm provides better clustering results compared to standard k-means algorithms, in the sense that it decreases the objective function (the cluster energy) and is much more robust to initialization.
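The scheme the abstract describes — an implicit (backward Euler) centroid update solved by a mini-batch fixed-point iteration whose trajectory average becomes the next iterate — can be sketched as below. This is a minimal illustration, not the authors' implementation; the function names and the parameters `step`, `outer`, `inner`, and `batch` are assumptions chosen for the sketch.

```python
import numpy as np

def kmeans_objective(X, C):
    # Cluster energy: sum of squared distances of each point to its nearest centroid.
    d = ((X[:, None, :] - C[None, :, :]) ** 2).sum(axis=-1)
    return d.min(axis=1).sum()

def minibatch_grad(X_batch, C):
    # Gradient of the k-means energy w.r.t. the centroids, estimated on a mini-batch.
    d = ((X_batch[:, None, :] - C[None, :, :]) ** 2).sum(axis=-1)
    labels = d.argmin(axis=1)
    G = np.zeros_like(C)
    for j in range(C.shape[0]):
        pts = X_batch[labels == j]
        if len(pts):
            # Each assigned point contributes (c_j - x_i); normalize by batch size.
            G[j] = (C[j] - pts.mean(axis=0)) * (len(pts) / len(X_batch))
    return G

def stochastic_backward_euler(X, C0, step=1.0, outer=50, inner=20, batch=64, rng=None):
    """Implicit gradient descent for k-means via stochastic fixed-point iteration."""
    rng = np.random.default_rng(rng)
    C = C0.copy()
    for _ in range(outer):
        y = C.copy()
        y_avg = np.zeros_like(C)
        for _ in range(inner):
            idx = rng.choice(len(X), size=min(batch, len(X)), replace=False)
            # Fixed-point map for the backward Euler step  C_next = C - step * grad f(C_next):
            y = C - step * minibatch_grad(X[idx], y)
            y_avg += y
        # The average of the fixed-point trajectory is carried over as the next iterate.
        C = y_avg / inner
    return C
```

On a toy two-cluster dataset, the returned centroids yield a lower cluster energy than the initialization, consistent with the descent behavior the abstract reports.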


Metadata
Title
Stochastic Backward Euler: An Implicit Gradient Descent Algorithm for k-Means Clustering
Authors
Penghang Yin
Minh Pham
Adam Oberman
Stanley Osher
Publication date
31-05-2018
Publisher
Springer US
Published in
Journal of Scientific Computing / Issue 2/2018
Print ISSN: 0885-7474
Electronic ISSN: 1573-7691
DOI
https://doi.org/10.1007/s10915-018-0744-4
