Datenbank-Spektrum

6 September 2018 | Focus article

Efficient and Scalable k‑Means on GPUs

Authors: Clemens Lutz, Sebastian Breß, Tilmann Rabl, Steffen Zeuch, Volker Markl

Published in: Datenbank-Spektrum | Issue 3/2018

Abstract

k-Means is a versatile clustering algorithm widely used in practice. To cluster large data sets, state-of-the-art implementations use GPUs to shorten the data-to-knowledge time. These implementations commonly assign points on a GPU and update centroids on a CPU.

We identify two main shortcomings of this approach. First, it requires expensive data exchange between processors when switching between the two processing steps, point assignment and centroid update. Second, even when processing both steps of k-means on the same processor, points still need to be read twice within an iteration, leading to inefficient use of memory bandwidth.

In this paper, we present a novel approach for centroid update that allows us to efficiently process both phases of k-means on GPUs. We fuse point assignment and centroid update to execute one iteration with a single pass over the points. Our evaluation shows that our k-means approach scales to very large data sets. Overall, we achieve up to 20× higher throughput compared to the state-of-the-art approach.
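The idea of fusing point assignment and centroid update can be illustrated with a minimal CPU-side sketch in plain Python. This is not the authors' GPU kernel, and the abstract gives no implementation details; the function names `squared_distance` and `fused_kmeans_iteration` are hypothetical. The sketch only shows that per-cluster sums and counts can be accumulated in the same loop that assigns each point, so every point is read once per iteration.

```python
from typing import List, Sequence, Tuple

Point = Sequence[float]


def squared_distance(a: Point, b: Point) -> float:
    """Squared Euclidean distance between two points of equal dimension."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def fused_kmeans_iteration(
    points: Sequence[Point], centroids: Sequence[Point]
) -> Tuple[List[List[float]], List[int]]:
    """Run one k-means iteration while reading each point exactly once.

    Hypothetical helper: assignment and the partial centroid update
    (per-cluster sum and count) happen in the same loop body.
    """
    k = len(centroids)
    dim = len(centroids[0])
    sums = [[0.0] * dim for _ in range(k)]
    counts = [0] * k

    for p in points:  # the single pass over the data
        # Point assignment: index of the nearest current centroid.
        nearest = min(range(k), key=lambda c: squared_distance(p, centroids[c]))
        # Centroid update, fused: accumulate while p is still at hand.
        counts[nearest] += 1
        for d in range(dim):
            sums[nearest][d] += p[d]

    # Finalize: mean per cluster; an empty cluster keeps its old centroid.
    new_centroids = [
        [s / counts[c] for s in sums[c]] if counts[c] else list(centroids[c])
        for c in range(k)
    ]
    return new_centroids, counts
```

A two-pass variant would first store an assignment per point and then re-read all points to sum them per cluster; the fused loop avoids that second read. On a GPU the accumulators would be shared among many threads, so a real kernel needs some form of synchronized or partial aggregation; the abstract does not say which technique the authors use.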

We previously sketched our work as a two-page short paper [25].

Note that the Cross-Processing strategy uses the GPU for point assignment, whereas Single-Pass and Multi-Pass are executed on the CPU only. Therefore, we include the Cross-Processing strategy in both plots.

1. Amazon EC (2018) Amazon EC2 pricing. https://aws.amazon.com/ec2/pricing/on-demand. Accessed: 25 May 2018

2. Arthur D, Vassilvitskii S (2007) k‑means++: The advantages of careful seeding. In: ACM-SIAM, pp 1027–1035

3. Bai H et al (2009) k‑means on commodity GPUs with CUDA. In: WRI CSIE, pp 651–655

4. Breß S, Funke H, Teubner J (2016) Robust query processing in co-processor-accelerated databases. In: SIGMOD, pp 1891–1906

5. Breß S et al (2017) Generating custom code for efficient query execution on heterogeneous processors. CoRR abs/1709.00700

6. Cao F, Tung AKH, Zhou A (2006) Scalable clustering using graphics processors. In: WAIM, pp 372–384

7. Cassou C (2008) Intraseasonal interaction between the Madden–Julian Oscillation and the North Atlantic Oscillation. Nature 455(7212):523–527

8. Che S et al (2009) Rodinia: a benchmark suite for heterogeneous computing. In: IISWC, pp 44–54

9. Dall M et al (2017) Arctic sea ice melt leads to atmospheric new particle formation. Sci Rep 7(1):3318

10. Elkan C (2003) Using the triangle inequality to accelerate k‑means. In: ICML, pp 147–153

11. Fang W et al (2008) Parallel data mining on graphics processors. Tech. Rep. HKUST-CS08-07, HKUST

12. Farivar R et al (2008) A parallel implementation of k‑means clustering on GPUs. In: PDPTA, pp 340–345

13. Fernando R (2004) GPU gems: programming techniques, tips and tricks for real-time graphics. In: Pearson higher education (chap 37.2)

14. Funke H et al (2018) Pipelined query processing in coprocessor environments. In: SIGMOD, ACM

15. Hall J, Hart J (2004) GPU acceleration of iterative clustering. In: GPGPU, pp 45–52

16. He B et al (2009) Relational query coprocessing on graphics processors. ACM Trans Database Syst. https://doi.org/10.1145/1620585.1620588

17. Heimel M et al (2013) Hardware-oblivious parallelism for in-memory column-stores. Proceedings VLDB Endowment 6(9):709–720

18. Heintzman ND et al (2007) Distinct and predictive chromatin signatures of transcriptional promoters and enhancers in the human genome. Nat Genet 39(3):311

19. Hellerstein J et al (2012) The MADlib analytics library or MAD skills, the SQL. Proceedings VLDB Endowment 5(12):1700–1711

20. Karnagel T, Müller R, Lohman GM (2015) Optimizing GPU-accelerated group-by and aggregation. In: ADMS, pp 13–24

21. Kleisner KM et al (2016) The effects of sub-regional climate velocity on the distribution and spatial extent of marine species assemblages. PLoS ONE 11:1–21

22. Lee S et al (2016) Evaluation of k‑means data clustering algorithm on Intel Xeon Phi. In: BigData, pp 2251–2260

23. Li Y et al (2010) Speeding up k‑means algorithm by GPUs. In: IEEE CIT, pp 115–122

24. Lloyd S (1982) Least squares quantization in PCM. IEEE Trans Inf Theory 28(2):129–136

25. Lutz C et al (2018) Efficient k‑means on GPUs. In: DaMoN. https://doi.org/10.1145/3211922.3211925

26. MacQueen J et al (1967) Some methods for classification and analysis of multivariate observations. In: Proc. Fifth Berkeley Symp. on Math. Statist. and Prob., vol 1, pp 281–297

27. Mhembere D et al (2017) knor: A NUMA-optimized in-memory, distributed and semi-external-memory k‑means library. In: HPDC

28. Müller I et al (2015) Cache-efficient aggregation: hashing is sorting. In: SIGMOD, pp 1123–1136

29. Nugteren C et al (2011) High performance predictable histogramming on GPUs: exploring and evaluating algorithm trade-offs. In: GPGPU, p 1

30. Nvidia (2017a) CUDA C programming guide. Tech. Rep. PG-02829-001_v8.0. http://docs.nvidia.com/pdf/CUDA_C_Programming_Guide.pdf. Accessed: 20 Jan 2017

31. Nvidia (2017b) Tuning CUDA applications for Maxwell. Tech. Rep. DA-07173-001_v9.0. http://docs.nvidia.com/cuda/pdf/Maxwell_Tuning_Guide.pdf. Accessed: 20 Jan 2017

32. Passing L et al (2017) SQL- and operator-centric data analytics in relational main-memory databases. In: EDBT, pp 84–95

33. Pirk H, Manegold S, Kersten ML (2014) Waste not…efficient co-processing of relational data. In: ICDE, pp 508–519

34. Pirk H et al (2016) Voodoo – A vector algebra for portable database performance on modern hardware. Proceedings VLDB Endowment 9(14):1707–1718

35. Sanderson C, Curtin R (2016) Armadillo: a template-based C++ library for linear algebra. J Open Source Softw. https://doi.org/10.21105/joss.00026

36. Shalom A, Dash M, Tue M (2008) Efficient k‑means clustering using accelerated graphics processors. In: DaWaK, pp 166–175

37. Shindler M, Wong A, Meyerson AW (2011) Fast and accurate k‑means for large datasets. In: NIPS, pp 2375–2383

38. Sitaridi EA, Ross KA (2013) Optimizing select conditions on GPUs. In: DaMoN, p 4

39. Stehle E, Jacobsen H (2017) A memory bandwidth-efficient hybrid radix sort on GPUs. In: SIGMOD, pp 417–432

40. TPC-H (2017) Transaction processing performance council. http://www.tpc.org/tpch. Accessed: 29 Sep 2017

41. Vitak SA et al (2017) Sequencing thousands of single-cell genomes with combinatorial indexing. Nat Methods 14(3):302

42. Wu F et al (2013) A vectorized k‑means algorithm for Intel Many Integrated Core architecture. In: APPT, pp 277–294

43. Zang C et al (2016) High-dimensional genomic data bias correction and data integration using MANCIE. Nat Commun 7:11305

44. Zhang T, Ramakrishnan R, Livny M (1996) BIRCH: an efficient data clustering method for very large databases. In: SIGMOD, pp 103–114

Title: Efficient and Scalable k‑Means on GPUs
Authors: Clemens Lutz, Sebastian Breß, Tilmann Rabl, Steffen Zeuch, Volker Markl
Publication date: 6 September 2018
Publisher: Springer Berlin Heidelberg
Published in: Datenbank-Spektrum / Issue 3/2018
Print ISSN: 1618-2162
Electronic ISSN: 1610-1995
DOI: https://doi.org/10.1007/s13222-018-0293-x
