nach oben

The Journal of Supercomputing

Erschienen in:

01.07.2014

Scalable CAIM discretization on multiple GPUs using concurrent kernels

verfasst von: Alberto Cano, Sebastián Ventura, Krzysztof J. Cios

Erschienen in: The Journal of Supercomputing | Ausgabe 1/2014

Einloggen

Aktivieren Sie unsere intelligente Suche, um passende Fachinhalte oder Patente zu finden.

search-config

KI-gestützte Suche

Aus

Abstract

Class-attribute interdependence maximization (CAIM) is one of the state-of-the-art algorithms for discretizing data for which classes are known. However, it may take a long time when run on high-dimensional large-scale data, with large number of attributes and/or instances. This paper presents a solution to this problem by introducing a graphic processing unit (GPU)-based implementation of the CAIM algorithm that significantly speeds up the discretization process on big complex data sets. The GPU-based implementation is scalable to multiple GPU devices and enables the use of concurrent kernels execution capabilities of modern GPUs. The CAIM GPU-based model is evaluated and compared with the original CAIM using single and multi-threaded parallel configurations on 40 data sets with different characteristics. The results show great speedup, up to 139 times faster using four GPUs, which makes discretization of big data efficient and manageable. For example, discretization time of one big data set is reduced from 2 h to \(<\)2 min.

Vorheriger Artikel Evaluating the SAT problem on P systems for different high-performance architectures

Nächster Artikel The decycling problem in hierarchical cubic networks

Sie haben noch keine Lizenz? Dann Informieren Sie sich jetzt über unsere Produkte:

Springer Professional "Wirtschaft"

Online-Abonnement

Mit Springer Professional "Wirtschaft" erhalten Sie Zugriff auf:

über 67.000 Bücher
über 340 Zeitschriften

aus folgenden Fachgebieten:

Bauwesen + Immobilien
Business IT + Informatik
Finance + Banking
Management + Führung
Marketing + Vertrieb
Versicherung + Risiko

Jetzt Wissensvorsprung sichern!

Jetzt informieren

Springer Professional "Technik"

Online-Abonnement

Mit Springer Professional "Technik" erhalten Sie Zugriff auf:

über 67.000 Bücher
über 390 Zeitschriften

aus folgenden Fachgebieten:

Automobil + Motoren
Bauwesen + Immobilien
Business IT + Informatik
Elektrotechnik + Elektronik
Energie + Nachhaltigkeit
Maschinenbau + Werkstoffe

Jetzt Wissensvorsprung sichern!

Jetzt informieren

Springer Professional "Wirtschaft+Technik"

Online-Abonnement

Mit Springer Professional "Wirtschaft+Technik" erhalten Sie Zugriff auf:

über 102.000 Bücher
über 537 Zeitschriften

aus folgenden Fachgebieten:

Automobil + Motoren
Bauwesen + Immobilien
Business IT + Informatik
Elektrotechnik + Elektronik
Energie + Nachhaltigkeit
Finance + Banking
Management + Führung
Marketing + Vertrieb
Maschinenbau + Werkstoffe
Versicherung + Risiko

Jetzt Wissensvorsprung sichern!

Jetzt informieren

The data sets description, the algorithm’s source code, the experimental settings and results are fully described and publicly available to facilitate the replicability of the experiments and future comparisons at the website: http://www.uco.es/grupos/kdis/kdiswiki/CAIM-GPU.

Akl SG (1990) Parallel sorting algorithms., Notes and reports in computer science and applied mathematicsAcademic Press, Orlando

Alcalá-Fdez J, Fernandez A, Luengo J, Derrac J, García S, Sánchez L, Herrera F (2011) KEEL data-mining software tool: data set repository, integration of algorithms and experimental analysis framework. Analysis framework. J Mult Valued Logic Soft Comput 17:255–287

Angiulli F, Pizzuti C (2005) Outlier mining in large high-dimensional data sets. IEEE Trans Knowl Data Eng 17(2):203–215CrossRefMathSciNet

Bentley JL, McIlroy MD (1993) Engineering a sort function. Softw Pract Exp 23(11):1249–1265CrossRef

Bernaschi M, Bisson M, Fatica M, Phillips E (2012) An introduction to multi-GPU programming for physicists. Eur Phys J Special Top 210:17–31CrossRef

Brodtkorb AR, Hagen TR, Stra ML (2013) Graphics processing unit (GPU) programming strategies and trends in GPU computing. J Parallel Distrib Comput 73(1):4–13CrossRef

Cannataro M, Talia D, Srimani P (2002) Parallel data intensive computing in scientific and commercial applications. Parallel Comput 28(5):673–704CrossRef

Cano A, Luna JM, Ventura S (2013) High performance evaluation of evolutionary-mined association rules on GPUs. J Supercomput 66(3):1438–1461CrossRef

Cano A, Zafra A, Ventura S (2012) Speeding up the evaluation phase of GP classification algorithms on GPUs. Soft Comput 16(2):187–202CrossRef

10.

Cederman D, Tsigas P (2010) GPU-quicksort: a practical quicksort algorithm for graphics processors. J Exp Algorithm 14:4–24

11.

Cerquides J, Mantaras RLD (1997) Proposal and empirical comparison of a parallelizable distance-based discretization method. In: Proceedings of the international conference on knowledge discovery and data mining, pp 139–142

12.

Che S, Boyer M, Meng J, Tarjan D, Sheaffer JW, Skadron K (2008) A performance study of general-purpose applications on graphics processors using CUDA. J Parallel Distrib Comput 68(10):1370–1380CrossRef

13.

Cios KJ, Pedrycz W, Swiniarski RW, Kurgan LA (2007) Data mining: a knowledge discovery approach. Springer

14.

Cormen TH, Stein C, Rivest RL, Leiserson CE (2001) Introduction to Algorithms. In: 2nd edn. McGraw-Hill

15.

Davidson A, Tarjan D, Garland M, Owens JD (2012) Efficient parallel merge sort for fixed and variable length keys. In: Proceedings of international conference on innovative parallel computing, pp 1–9

16.

Frank A, Asuncion A (2010) UCI machine learning repository

17.

Freitas AA, Lavington SH (1998) Mining very large databases with parallel processing. In: Kluwer international series on advances in database systems, vol 8. Kluwer

18.

García S, Luengo J, Saez J, Lopez V, Herrera F (2013) A survey of discretization techniques: taxonomy and empirical analysis in supervised learning. IEEE Trans Knowl Data Eng 25(4):734–750CrossRef

19.

Garland M, Le Grand S, Nickolls J, Anderson J, Hardwick J, Morton S, Phillips E, Zhang Y, Volkov V (2008) Parallel computing experiences with CUDA. IEEE Micro 28(4):13–27CrossRef

20.

Gómez-Luna J, González-Linares J, Benavides J, Guil N (2012) Performance models for asynchronous data transfers on consumer graphics processing units. J Parallel Distrib Comput 72(9):1117–1126CrossRef

21.

Green I, Robert C, Wang L, Alam M, Formato RA (2012) Central force optimization on a GPU: a case study in high performance metaheuristics. J Supercomput 62(1):378–398CrossRef

22.

Hoare CAR (1962) Quicksort. Comput J 5(1):10–16CrossRefMATHMathSciNet

23.

Hoberock J, Bell N (2011) Thrust: a productivity-oriented library for CUDA. In: Chapter 26, Morgan Kaufmann, pp 359–372

24.

Jian L, Wang C, Liu Y, Liang S, Yi W, Shi Y (2013) Parallel data mining techniques on graphics processing unit with compute unified device architecture (CUDA). J Supercomput 64(3):942–967

25.

Khan FG, Khan OU, Montrucchio B, Giaccone P (2011) Analysis of fast parallel sorting algorithms for GPU architectures. In: Proceedings of the international conference on frontiers of information technology, pp 173–178

26.

Kirk DB, Hwu W-MW (2010) Programming massively parallel processors: a hands-on approach. Morgan Kaufmann

27.

Knuth DE (1998) The art of computer programming. In: Sorting and searching, vol 3. 2nd edn. Addison Wesley

28.

Kurgan LA, Cios KJ (2004) CAIM discretization algorithm. IEEE Trans Knowl Data Eng 16(2):145–153CrossRef

29.

Li M, Deng S, Feng S, Fan J (2011) An effective discretization based on class-attribute coherence maximization. Pattern Recognit Lett 32(15):1962–1973CrossRef

30.

Merrill D, Grimshaw A (2010) Revisiting sorting for GPGPU stream architectures. In: Proceedings of the international conference on parallel architectures and compilation techniques, pp 545–546

31.

Merrill D, Grimshaw A (2010) Revisiting sorting for GPGPU stream architectures. In: Technical report CS2010-03, University of Virginia, Department of Computer Science, Charlottesville

32.

Merrill D, Grimshaw A (2011) High performance and scalable radix sorting: a case study of implementing dynamic parallelism for GPU computing. Parallel Process Lett 21(2):245–272CrossRefMathSciNet

33.

Navarro CA, Hitschfeld-Kahler N, Mateu L (2014) A survey on parallel computing and its applications in data-parallel problems using GPU architectures. Commun Comput Phys 15(2):285–329MathSciNet

34.

NVIDIA Corporation (2013) NVIDIA CUDA programming and best practices guide. http://www.nvidia.com/cuda

35.

Owens JD, Luebke D, Govindaraju N, Harris M, Krüger J, Lefohn AE, Purcell TJ (2007) A survey of general-purpose computation on graphics hardware. Comput Graph Forum 26(1):80–113CrossRef

36.

Parthasarathy S, Ramakrishnan A (2002) Parallel incremental 2d-discretization on dynamic datasets. In: Proceedings of the international conference on parallel and distributed processing systems, pp 247–254

37.

Peters H, Schulz-Hildebrandt O, Luttenberger N (2011) Fast in-place, comparison-based sorting with CUDA: a study with bitonic sort. Concur Comput Pract Exp 23(7):681–693CrossRef

38.

Rajaraman A, Ullman JD (2011) Mining of massive datasets. In: Cambridge University Press

39.

Satish N, Harris M, Garland M (2009) Designing efficient sorting algorithms for manycore GPUs. In: Proceedings of the IEEE international symposium on parallel & distributed processing, pp 1–10

40.

Schellmann M, Gorlatch S, Meilnder D, Ksters T, Schfers K, Wbbeling F, Burger M (2011) Parallel medical image reconstruction: from graphics processing units (GPU) to grids. J Supercomput 57(2):151–160CrossRef

41.

Shams R, Sadeghi P, Kennedy R, Hartley R (2010) A survey of medical image registration on multicore and the GPU. IEEE Signal Process Mag 27(2):50–60CrossRef

42.

Sintorn E, Assarsson U (2008) Fast parallel GPU-sorting using a hybrid algorithm. J Parallel Distrib Comput 68(10):1381–1388CrossRefMATH

43.

Sriwanna K, Puntumapon K, Waiyamai K (2012) An enhanced class-attribute interdependence maximization discretization algorithm. In: Proceedings of the 8th international conference on advanced data mining and applications, vol 7713. LNAI, pp 465–476

44.

Tatarchuk N, Shopf J, DeCoro C (2008) Advanced interactive medical visualization on the GPU. J Parallel Distrib Comput 68(10):1319–1328CrossRef

45.

Upadhyaya SR (2013) Parallel approaches to machine learning-a comprehensive survey. J Parallel Distrib Comput 73(3):284–292CrossRef

46.

Wittek P, Darnyi S (2013) Accelerating text mining workloads in a mapreduce-based distributed GPU environment. J Parallel Distrib Comput 73(2):198–206CrossRef

47.

Yang Y, Webb GI, Wu X (2010) Discretization methods. In: Data mining and knowledge discovery handbook, pp 101–116

48.

Yulong X, Xiaopeng W, Dawei X (2012) A two step parallel discretization algorithm based on dynamic clustering. In: Proceedings of the international conference on computer science and electronics engineering, vol 3. pp 192–196

49.

Zaki MJ, Ho CT (2000) Large-scale parallel data mining. In: State of the art survey. Lecture notes in artificial intelligence, Springer

50.

Zhang Y, Mueller F, Cui X, Potok T (2011) Data-intensive document clustering on graphics processing unit (GPU) clusters. J Parallel Distrib Comput 71(2):211–224CrossRef

51.

Zhao Y, Niu Z, Peng X, Dai L (2011) A discretization algorithm of numerical attributes for digital library evaluation based on data mining technology. In: Digital libraries: for cultural heritage, knowledge dissemination, and future creation, vol 7008. LNCS, pp 70–76

Titel: Scalable CAIM discretization on multiple GPUs using concurrent kernels
verfasst von: Alberto Cano
Sebastián Ventura
Krzysztof J. Cios
Publikationsdatum: 01.07.2014
Verlag: Springer US
Erschienen in: The Journal of Supercomputing / Ausgabe 1/2014
Print ISSN: 0920-8542
Elektronische ISSN: 1573-0484
DOI: https://doi.org/10.1007/s11227-014-1151-8

Springer Professional

Abstract

Bitte loggen Sie sich ein, um Zugang zu Ihrer Lizenz zu erhalten.

Sie haben noch keine Lizenz? Dann Informieren Sie sich jetzt über unsere Produkte:

Springer Professional "Wirtschaft"

Springer Professional "Technik"

Springer Professional "Wirtschaft+Technik"

Weitere Artikel der Ausgabe 1/2014

Concurrent warp execution: improving performance of GPU-likely SIMD architecture by increasing resource utilization

A localization algorithm for large scale mobile wireless sensor networks: a learning approach

The decycling problem in hierarchical cubic networks

Evaluating the SAT problem on P systems for different high-performance architectures

All-to-all communication with cellular automata agents in 2 grids: topologies, streets and performances

Improved extra group network: a new fault-tolerant multistage interconnection network