Skip to main content

2019 | OriginalPaper | Buchkapitel

A Binary Optimization Approach for Constrained K-Means Clustering

verfasst von : Huu M. Le, Anders Eriksson, Thanh-Toan Do, Michael Milford

Erschienen in: Computer Vision – ACCV 2018

Verlag: Springer International Publishing

Aktivieren Sie unsere intelligente Suche, um passende Fachinhalte oder Patente zu finden.

search-config
loading …

Abstract

K-Means clustering still plays an important role in many computer vision problems. While the conventional Lloyd method, which alternates between centroid update and cluster assignment, is primarily used in practice, it may converge to solutions with empty clusters. Furthermore, some applications may require the clusters to satisfy a specific set of constraints, e.g., cluster sizes, must-link/cannot-link. Several methods have been introduced to solve constrained K-Means clustering. Due to the non-convex nature of K-Means, however, existing approaches may result in sub-optimal solutions that poorly approximate the true clusters. In this work, we provide a new perspective to tackle this problem by considering constrained K-Means as a special instance of Binary Optimization. We then propose a novel optimization scheme to search for feasible solutions in the binary domain. This approach allows us to solve constrained K-Means clustering in such a way that multiple types of constraints can be simultaneously enforced. Experimental results on synthetic and real datasets show that our method provides better clustering accuracy with faster run time compared to several existing techniques.

Sie haben noch keine Lizenz? Dann Informieren Sie sich jetzt über unsere Produkte:

Springer Professional "Wirtschaft+Technik"

Online-Abonnement

Mit Springer Professional "Wirtschaft+Technik" erhalten Sie Zugriff auf:

  • über 102.000 Bücher
  • über 537 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Maschinenbau + Werkstoffe
  • Versicherung + Risiko

Jetzt Wissensvorsprung sichern!

Springer Professional "Technik"

Online-Abonnement

Mit Springer Professional "Technik" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 390 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Maschinenbau + Werkstoffe




 

Jetzt Wissensvorsprung sichern!

Springer Professional "Wirtschaft"

Online-Abonnement

Mit Springer Professional "Wirtschaft" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 340 Zeitschriften

aus folgenden Fachgebieten:

  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Versicherung + Risiko




Jetzt Wissensvorsprung sichern!

Literatur
1.
Zurück zum Zitat Althoff, T., Ulges, A., Dengel, A.: Balanced clustering for content-based image browsing. Series of the Gesellschaft fur Informatik, pp. 27–30 (2011) Althoff, T., Ulges, A., Dengel, A.: Balanced clustering for content-based image browsing. Series of the Gesellschaft fur Informatik, pp. 27–30 (2011)
2.
Zurück zum Zitat Bertacco, L., Fischetti, M., Lodi, A.: A feasibility pump heuristic for general mixed-integer problems. Discret. Optim. 4(1), 63–76 (2007)MathSciNetCrossRef Bertacco, L., Fischetti, M., Lodi, A.: A feasibility pump heuristic for general mixed-integer problems. Discret. Optim. 4(1), 63–76 (2007)MathSciNetCrossRef
3.
Zurück zum Zitat Bradley, P., Bennett, K., Demiriz, A.: Constrained K-Means Clustering, pp. 1–8. Microsoft Research, Redmond (2000) Bradley, P., Bennett, K., Demiriz, A.: Constrained K-Means Clustering, pp. 1–8. Microsoft Research, Redmond (2000)
4.
Zurück zum Zitat Fard, M.M., Thonet, T., Gaussier, E.: Deep \(k\)-means: jointly clustering with \(k\)-means and learning representations. arXiv preprint arXiv:1806.10069 (2018) Fard, M.M., Thonet, T., Gaussier, E.: Deep \(k\)-means: jointly clustering with \(k\)-means and learning representations. arXiv preprint arXiv:​1806.​10069 (2018)
6.
Zurück zum Zitat Ge, T., He, K., Ke, Q., Sun, J.: Optimized product quantization for approximate nearest neighbor search. In: 2013 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 2946–2953. IEEE (2013) Ge, T., He, K., Ke, Q., Sun, J.: Optimized product quantization for approximate nearest neighbor search. In: 2013 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 2946–2953. IEEE (2013)
7.
Zurück zum Zitat Geißler, B., Morsi, A., Schewe, L., Schmidt, M.: Penalty alternating direction methods for mixed-integer optimization: a new view on feasibility pumps. SIAM J. Optim. 27(3), 1611–1636 (2017)MathSciNetCrossRef Geißler, B., Morsi, A., Schewe, L., Schmidt, M.: Penalty alternating direction methods for mixed-integer optimization: a new view on feasibility pumps. SIAM J. Optim. 27(3), 1611–1636 (2017)MathSciNetCrossRef
9.
Zurück zum Zitat Jegou, H., Douze, M., Schmid, C.: Product quantization for nearest neighbor search. IEEE Trans. Pattern Anal. Mach. Intell. 33(1), 117–128 (2011)CrossRef Jegou, H., Douze, M., Schmid, C.: Product quantization for nearest neighbor search. IEEE Trans. Pattern Anal. Mach. Intell. 33(1), 117–128 (2011)CrossRef
10.
Zurück zum Zitat Johnson, S.C.: Hierarchical clustering schemes. Psychometrika 32(3), 241–254 (1967)CrossRef Johnson, S.C.: Hierarchical clustering schemes. Psychometrika 32(3), 241–254 (1967)CrossRef
11.
Zurück zum Zitat Kalantidis, Y., Avrithis, Y.: Locally optimized product quantization for approximate nearest neighbor search. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 2321–2328 (2014) Kalantidis, Y., Avrithis, Y.: Locally optimized product quantization for approximate nearest neighbor search. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 2321–2328 (2014)
12.
Zurück zum Zitat Khan, S.S., Ahmad, A.: Cluster center initialization algorithm for k-means clustering. Pattern Recogn. Lett. 25(11), 1293–1302 (2004)CrossRef Khan, S.S., Ahmad, A.: Cluster center initialization algorithm for k-means clustering. Pattern Recogn. Lett. 25(11), 1293–1302 (2004)CrossRef
13.
Zurück zum Zitat Le Tan, D.K., Le, H., Hoang, T., Do, T.T., Cheung, N.M.: DeepVQ: a deep network architecture for vector quantization. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops, pp. 2579–2582 (2018) Le Tan, D.K., Le, H., Hoang, T., Do, T.T., Cheung, N.M.: DeepVQ: a deep network architecture for vector quantization. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops, pp. 2579–2582 (2018)
14.
Zurück zum Zitat LeCun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proc. IEEE 86(11), 2278–2324 (1998)CrossRef LeCun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proc. IEEE 86(11), 2278–2324 (1998)CrossRef
15.
Zurück zum Zitat Li, Z., Liu, J.: Constrained clustering by spectral kernel learning. In: 2009 IEEE 12th International Conference on Computer Vision, pp. 421–427. IEEE (2009) Li, Z., Liu, J.: Constrained clustering by spectral kernel learning. In: 2009 IEEE 12th International Conference on Computer Vision, pp. 421–427. IEEE (2009)
16.
Zurück zum Zitat Liu, H., Han, J., Nie, F., Li, X.: Balanced clustering with least square regression (2017) Liu, H., Han, J., Nie, F., Li, X.: Balanced clustering with least square regression (2017)
17.
Zurück zum Zitat MacQueen, J., et al.: Some methods for classification and analysis of multivariate observations. In: Proceedings of the Fifth Berkeley Symposium on Mathematical Statistics and Probability, Oakland, CA, USA, vol. 1, pp. 281–297 (1967) MacQueen, J., et al.: Some methods for classification and analysis of multivariate observations. In: Proceedings of the Fifth Berkeley Symposium on Mathematical Statistics and Probability, Oakland, CA, USA, vol. 1, pp. 281–297 (1967)
18.
Zurück zum Zitat Mahajan, M., Nimbhorkar, P., Varadarajan, K.: The planar k-means problem is NP-hard. Theor. Comput. Sci. 442, 13–21 (2012)MathSciNetCrossRef Mahajan, M., Nimbhorkar, P., Varadarajan, K.: The planar k-means problem is NP-hard. Theor. Comput. Sci. 442, 13–21 (2012)MathSciNetCrossRef
20.
Zurück zum Zitat Pedregosa, F., et al.: Scikit-learn: machine learning in python. J. Mach. Learn. Res. 12(Oct), 2825–2830 (2011)MathSciNetMATH Pedregosa, F., et al.: Scikit-learn: machine learning in python. J. Mach. Learn. Res. 12(Oct), 2825–2830 (2011)MathSciNetMATH
21.
Zurück zum Zitat Pena, J.M., Lozano, J.A., Larranaga, P.: An empirical comparison of four initialization methods for the k-means algorithm. Pattern Recogn. Lett. 20(10), 1027–1040 (1999)CrossRef Pena, J.M., Lozano, J.A., Larranaga, P.: An empirical comparison of four initialization methods for the k-means algorithm. Pattern Recogn. Lett. 20(10), 1027–1040 (1999)CrossRef
22.
Zurück zum Zitat Rifkin, R.M., Lippert, R.A.: Notes on regularized least squares (2007) Rifkin, R.M., Lippert, R.A.: Notes on regularized least squares (2007)
23.
Zurück zum Zitat Wagstaff, K., Cardie, C., Rogers, S., Schroedl, S.: Constrained k-means clustering with background knowledge. In: Proceedings of the Eighteenth International Conference on Machine Learning, pp. 577–584. Citeseer (2001) Wagstaff, K., Cardie, C., Rogers, S., Schroedl, S.: Constrained k-means clustering with background knowledge. In: Proceedings of the Eighteenth International Conference on Machine Learning, pp. 577–584. Citeseer (2001)
24.
Zurück zum Zitat Wright, S., Nocedal, J.: Numerical optimization. Springer Sci. 35(67–68), 7 (1999)MATH Wright, S., Nocedal, J.: Numerical optimization. Springer Sci. 35(67–68), 7 (1999)MATH
25.
Zurück zum Zitat Yang, B., Fu, X., Sidiropoulos, N.D., Hong, M.: Towards k-means-friendly spaces: simultaneous deep learning and clustering. arXiv preprint arXiv:1610.04794 (2016) Yang, B., Fu, X., Sidiropoulos, N.D., Hong, M.: Towards k-means-friendly spaces: simultaneous deep learning and clustering. arXiv preprint arXiv:​1610.​04794 (2016)
26.
Zurück zum Zitat Zhu, S., Wang, D., Li, T.: Data clustering with size constraints. Knowl.-Based Syst. 23(8), 883–889 (2010)CrossRef Zhu, S., Wang, D., Li, T.: Data clustering with size constraints. Knowl.-Based Syst. 23(8), 883–889 (2010)CrossRef
Metadaten
Titel
A Binary Optimization Approach for Constrained K-Means Clustering
verfasst von
Huu M. Le
Anders Eriksson
Thanh-Toan Do
Michael Milford
Copyright-Jahr
2019
DOI
https://doi.org/10.1007/978-3-030-20870-7_24

Premium Partner