Skip to main content
Erschienen in: Datenbank-Spektrum 3/2018

06.09.2018 | Schwerpunktbeitrag

Efficient and Scalable k‑Means on GPUs

verfasst von: Clemens Lutz, Sebastian Breß, Tilmann Rabl, Steffen Zeuch, Volker Markl

Erschienen in: Datenbank-Spektrum | Ausgabe 3/2018

Einloggen

Aktivieren Sie unsere intelligente Suche, um passende Fachinhalte oder Patente zu finden.

search-config
loading …

Abstract

k-Means is a versatile clustering algorithm widely used in practice. To cluster large data sets, state-of-the-art implementations use GPUs to shorten the data to knowledge time. These implementations commonly assign points on a GPU and update centroids on a CPU.
We identify two main shortcomings of this approach. First, it requires expensive data exchange between processors when switching between the two processing steps point assignment and centroid update. Second, even when processing both steps of k-means on the same processor, points still need to be read two times within an iteration, leading to inefficient use of memory bandwidth.
In this paper, we present a novel approach for centroid update that allows us to efficiently process both phases of k-means on GPUs. We fuse point assignment and centroid update to execute one iteration with a single pass over the points. Our evaluation shows that our k-means approach scales to very large data sets. Overall, we achieve up to 20 × higher throughput compared to the state-of-the-art approach.

Sie haben noch keine Lizenz? Dann Informieren Sie sich jetzt über unsere Produkte:

Springer Professional "Wirtschaft+Technik"

Online-Abonnement

Mit Springer Professional "Wirtschaft+Technik" erhalten Sie Zugriff auf:

  • über 102.000 Bücher
  • über 537 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Maschinenbau + Werkstoffe
  • Versicherung + Risiko

Jetzt Wissensvorsprung sichern!

Springer Professional "Wirtschaft"

Online-Abonnement

Mit Springer Professional "Wirtschaft" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 340 Zeitschriften

aus folgenden Fachgebieten:

  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Versicherung + Risiko




Jetzt Wissensvorsprung sichern!

Springer Professional "Technik"

Online-Abonnement

Mit Springer Professional "Technik" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 390 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Maschinenbau + Werkstoffe




 

Jetzt Wissensvorsprung sichern!

Weitere Produktempfehlungen anzeigen
Fußnoten
1
We previously sketched our work as a two-page short paper [25].
 
2
Note that the Cross-Processing strategy uses the GPU for point assignment, whereas Single-Pass and Multi-Pass are executed on CPU only. Therefore we include the Cross-Processing strategy in both plots.
 
Literatur
2.
Zurück zum Zitat Arthur D, Vassilvitskii S (2007) k‑means++: The advantages of careful seeding. In: ACM-SIAM, pp 1027–1035 Arthur D, Vassilvitskii S (2007) k‑means++: The advantages of careful seeding. In: ACM-SIAM, pp 1027–1035
3.
Zurück zum Zitat Bai H et al (2009) k‑means on commodity GPUs with CUDA. In: WRI CSIE, pp 651–655 Bai H et al (2009) k‑means on commodity GPUs with CUDA. In: WRI CSIE, pp 651–655
4.
Zurück zum Zitat Breß S, Funke H, Teubner J (2016) Robust query processing in co-processor-accelerated databases. In: SIGMOD, pp 1891–1906CrossRef Breß S, Funke H, Teubner J (2016) Robust query processing in co-processor-accelerated databases. In: SIGMOD, pp 1891–1906CrossRef
5.
Zurück zum Zitat Breß S et al (2017) Generating custom code for efficient query execution on heterogeneous processors. CoRR abs/1709.00700 Breß S et al (2017) Generating custom code for efficient query execution on heterogeneous processors. CoRR abs/1709.00700
6.
Zurück zum Zitat Cao F, Tung AKH, Zhou A (2006) Scalable clustering using graphics processors. In: WAIM, pp 372–384 Cao F, Tung AKH, Zhou A (2006) Scalable clustering using graphics processors. In: WAIM, pp 372–384
7.
Zurück zum Zitat Cassou C (2008) Intraseasonal interaction between the madden–julian oscillation and the north atlantic oscillation. Nature 455(7212):523–527CrossRef Cassou C (2008) Intraseasonal interaction between the madden–julian oscillation and the north atlantic oscillation. Nature 455(7212):523–527CrossRef
8.
Zurück zum Zitat Che S et al (2009) Rodinia: a benchmark suite for heterogeneous computing. In: IISWC, pp 44–54 Che S et al (2009) Rodinia: a benchmark suite for heterogeneous computing. In: IISWC, pp 44–54
9.
Zurück zum Zitat Dall M et al (2017) Arctic sea ice melt leads to atmospheric new particle formation. Sci Rep 7(1):3318CrossRef Dall M et al (2017) Arctic sea ice melt leads to atmospheric new particle formation. Sci Rep 7(1):3318CrossRef
10.
Zurück zum Zitat Elkan C (2003) Using the triangle inequality to accelerate k‑means. In: ICML, pp 147–153 Elkan C (2003) Using the triangle inequality to accelerate k‑means. In: ICML, pp 147–153
11.
Zurück zum Zitat Fang W et al (2008) Parallel data mining on graphics processors. Tech. Rep. HKUST-CS08-07, HKUST Fang W et al (2008) Parallel data mining on graphics processors. Tech. Rep. HKUST-CS08-07, HKUST
12.
Zurück zum Zitat Farivar R et al (2008) A parallel implementation of k‑means clustering on GPUs. In: PDPTA, pp 340–345 Farivar R et al (2008) A parallel implementation of k‑means clustering on GPUs. In: PDPTA, pp 340–345
13.
Zurück zum Zitat Fernando R (2004) GPU gems: programming techniques, tips and tricks for real-time graphics. In: Pearson higher education (chap 37.2) Fernando R (2004) GPU gems: programming techniques, tips and tricks for real-time graphics. In: Pearson higher education (chap 37.2)
14.
Zurück zum Zitat Funke H et al (2018) Pipelined query processing in coprocessor environments. In: SIGMOD, ACM Funke H et al (2018) Pipelined query processing in coprocessor environments. In: SIGMOD, ACM
15.
Zurück zum Zitat Hall J, Hart J (2004) GPU acceleration of iterative clustering. In: GPGPU, pp 45–52 Hall J, Hart J (2004) GPU acceleration of iterative clustering. In: GPGPU, pp 45–52
17.
Zurück zum Zitat Heimel M et al (2013) Hardware-oblivious parallelism for in-memory column-stores. Proceedings VLDB Endowment 6(9):709–720CrossRef Heimel M et al (2013) Hardware-oblivious parallelism for in-memory column-stores. Proceedings VLDB Endowment 6(9):709–720CrossRef
18.
Zurück zum Zitat Heintzman ND et al (2007) Distinct and predictive chromatin signatures of transcriptional promoters and enhancers in the human genome. Nat Genet 39(3):311CrossRef Heintzman ND et al (2007) Distinct and predictive chromatin signatures of transcriptional promoters and enhancers in the human genome. Nat Genet 39(3):311CrossRef
19.
Zurück zum Zitat Hellerstein J et al (2012) The MADlib analytics library or MAD skills, the SQL. Proceedings VLDB Endowment 5(12):1700–1711CrossRef Hellerstein J et al (2012) The MADlib analytics library or MAD skills, the SQL. Proceedings VLDB Endowment 5(12):1700–1711CrossRef
20.
Zurück zum Zitat Karnagel T, Müller R, Lohman GM (2015) Optimizing GPU-accelerated group-by and aggregation. In: ADMS, pp 13–24 Karnagel T, Müller R, Lohman GM (2015) Optimizing GPU-accelerated group-by and aggregation. In: ADMS, pp 13–24
21.
Zurück zum Zitat Kleisner KM et al (2016) The effects of sub-regional climate velocity on the distribution and spatial extent of marine species assemblages. PLoS ONE 11:1–21CrossRef Kleisner KM et al (2016) The effects of sub-regional climate velocity on the distribution and spatial extent of marine species assemblages. PLoS ONE 11:1–21CrossRef
22.
Zurück zum Zitat Lee S et al (2016) Evaluation of k‑means data clustering algorithm on intel xeon phi. In: BigData, pp 2251–2260 Lee S et al (2016) Evaluation of k‑means data clustering algorithm on intel xeon phi. In: BigData, pp 2251–2260
23.
Zurück zum Zitat Li Y et al (2010) Speeding up k‑means algorithm by GPUs. In: IEEE CIT, pp 115–122 Li Y et al (2010) Speeding up k‑means algorithm by GPUs. In: IEEE CIT, pp 115–122
26.
Zurück zum Zitat MacQueen J et al (1967) Some methods for classification and analysis of multivariate observations. In: Proc. Fifth Berkeley Symp. on Math. Statist. and Prob., vol 1, pp 281–297 MacQueen J et al (1967) Some methods for classification and analysis of multivariate observations. In: Proc. Fifth Berkeley Symp. on Math. Statist. and Prob., vol 1, pp 281–297
27.
Zurück zum Zitat Mhembere D et al (2017) knor: A NUMA-optimized in-memory, distributed and semi-external-memory k‑means library. In: HPDC Mhembere D et al (2017) knor: A NUMA-optimized in-memory, distributed and semi-external-memory k‑means library. In: HPDC
28.
Zurück zum Zitat Müller I et al (2015) Cache-efficient aggregation: hashing is sorting. In: SIGMOD, pp 1123–1136 Müller I et al (2015) Cache-efficient aggregation: hashing is sorting. In: SIGMOD, pp 1123–1136
29.
Zurück zum Zitat Nugteren C et al (2011) High performance predictable histogramming on GPUs: exploring and evaluating algorithm trade-offs. In: GPGPU, p 1 Nugteren C et al (2011) High performance predictable histogramming on GPUs: exploring and evaluating algorithm trade-offs. In: GPGPU, p 1
32.
Zurück zum Zitat Passing L et al (2017) SQL- and operator-centric data analytics in relational main-memory databases. In: EDBT, pp 84–95 Passing L et al (2017) SQL- and operator-centric data analytics in relational main-memory databases. In: EDBT, pp 84–95
33.
Zurück zum Zitat Pirk H, Manegold S, Kersten ML (2014) Waste not…efficient co-processing of relational data. In: ICDE, pp 508–519 Pirk H, Manegold S, Kersten ML (2014) Waste not…efficient co-processing of relational data. In: ICDE, pp 508–519
34.
Zurück zum Zitat Pirk H et al (2016) Voodoo – A vector algebra for portable database performance on modern hardware. Proceedings VLDB Endowment 9(14):1707–1718CrossRef Pirk H et al (2016) Voodoo – A vector algebra for portable database performance on modern hardware. Proceedings VLDB Endowment 9(14):1707–1718CrossRef
36.
Zurück zum Zitat Shalom A, Dash M, Tue M (2008) Efficient k‑means clustering using accelerated graphics processors. In: DaWaK, pp 166–175 Shalom A, Dash M, Tue M (2008) Efficient k‑means clustering using accelerated graphics processors. In: DaWaK, pp 166–175
37.
Zurück zum Zitat Shindler M, Wong A, Meyerson AW (2011) Fast and accurate k‑means for large datasets. In: NIPS, pp 2375–2383 Shindler M, Wong A, Meyerson AW (2011) Fast and accurate k‑means for large datasets. In: NIPS, pp 2375–2383
38.
Zurück zum Zitat Sitaridi EA, Ross KA (2013) Optimizing select conditions on gpus. In: DaMoN, p 4 Sitaridi EA, Ross KA (2013) Optimizing select conditions on gpus. In: DaMoN, p 4
39.
Zurück zum Zitat Stehle E, Jacobsen H (2017) A memory bandwidth-efficient hybrid radix sort on GPUs. In: SIGMOD, pp 417–432 Stehle E, Jacobsen H (2017) A memory bandwidth-efficient hybrid radix sort on GPUs. In: SIGMOD, pp 417–432
41.
Zurück zum Zitat Vitak SA et al (2017) Sequencing thousands of single-cell genomes with combinatorial indexing. Nat Methods 14(3):302CrossRef Vitak SA et al (2017) Sequencing thousands of single-cell genomes with combinatorial indexing. Nat Methods 14(3):302CrossRef
42.
Zurück zum Zitat Wu F et al (2013) A vectorized k‑means algorithm for intel many integrated core architecture. In: APPT, pp 277–294 Wu F et al (2013) A vectorized k‑means algorithm for intel many integrated core architecture. In: APPT, pp 277–294
43.
Zurück zum Zitat Zang C et al (2016) High-dimensional genomic data bias correction and data integration using mancie. Nat Commun 7:11305CrossRef Zang C et al (2016) High-dimensional genomic data bias correction and data integration using mancie. Nat Commun 7:11305CrossRef
44.
Zurück zum Zitat Zhang T, Ramakrishnan R, Livny M (1996) Birch: an efficient data clustering method for very large databases. In: SIGMOD, pp 103–114CrossRef Zhang T, Ramakrishnan R, Livny M (1996) Birch: an efficient data clustering method for very large databases. In: SIGMOD, pp 103–114CrossRef
Metadaten
Titel
Efficient and Scalable k‑Means on GPUs
verfasst von
Clemens Lutz
Sebastian Breß
Tilmann Rabl
Steffen Zeuch
Volker Markl
Publikationsdatum
06.09.2018
Verlag
Springer Berlin Heidelberg
Erschienen in
Datenbank-Spektrum / Ausgabe 3/2018
Print ISSN: 1618-2162
Elektronische ISSN: 1610-1995
DOI
https://doi.org/10.1007/s13222-018-0293-x

Weitere Artikel der Ausgabe 3/2018

Datenbank-Spektrum 3/2018 Zur Ausgabe

Editorial

Editorial

Dissertationen

Dissertationen

Premium Partner