
Published in: The Journal of Supercomputing

Publication date: 04-01-2021

Mille Cheval: a GPU-based in-memory high-performance computing framework for accelerated processing of big-data streams

Authors: Vivek Kumar, Dilip Kumar Sharma, Vinay Kumar Mishra

Published in: The Journal of Supercomputing | Issue 7/2021


Abstract

Data streams are temporally ordered, rapidly changing, large in volume, and potentially unbounded. Storing an entire stream is nearly impossible because of its volume and velocity. In this work, parallelism is employed to accelerate stream data computing. A GPU-based high-performance computing (HPC) framework is proposed for accelerated processing of big-data streams using an in-memory data structure. We have implemented three parallel algorithms to demonstrate the viability of the framework. The contributions of Mille Cheval are: (1) demonstrating the viability of streaming on accelerators to increase throughput, (2) carefully chosen hash algorithms that achieve a low collision rate and high randomness, and (3) memory sketches for approximation. The objective is to leverage the power of a single node using in-memory computing and hybrid computing. HPC does not always require high-end hardware, but it does require well-designed algorithms. The achievements of Mille Cheval are: (1) a relative error of 1.32 when the error rate and overestimate rate are both set to 0.001 and (2) a host memory requirement of just 63 MB for 1 terabyte of data. The proposed algorithms are pragmatic. Experimental results show that the framework achieves a 10X speed-up over CPU implementations and a 3X speed-up over GPU implementations.
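To make the idea of a memory sketch concrete, here is a minimal CPU-only illustration in Python. It assumes the sketch is a count-min sketch and that the abstract's "error rate" and "overestimate rate" correspond to the usual parameters epsilon and delta; the abstract confirms neither. The class name `CountMinSketch`, the BLAKE2b row hash, and the sizing rule (width = ceil(e / epsilon), depth = ceil(ln(1 / delta))) are illustrative choices, not the framework's GPU implementation or its chosen hash algorithms.

```python
import hashlib
import math


class CountMinSketch:
    """Illustrative count-min sketch (assumed technique, not the paper's code)."""

    def __init__(self, epsilon: float, delta: float) -> None:
        # Textbook sizing: width = ceil(e / epsilon), depth = ceil(ln(1 / delta)).
        self.width = math.ceil(math.e / epsilon)
        self.depth = math.ceil(math.log(1.0 / delta))
        self.table = [[0] * self.width for _ in range(self.depth)]
        self.total = 0  # number of items seen so far

    def _index(self, item: str, row: int) -> int:
        # One salted hash per row. BLAKE2b is a stand-in hash here;
        # the abstract does not say which hash functions the framework uses.
        digest = hashlib.blake2b(
            item.encode("utf-8"),
            digest_size=8,
            salt=row.to_bytes(8, "little"),
        ).digest()
        return int.from_bytes(digest, "little") % self.width

    def add(self, item: str, count: int = 1) -> None:
        # Increment one counter in every row.
        for row in range(self.depth):
            self.table[row][self._index(item, row)] += count
        self.total += count

    def estimate(self, item: str) -> int:
        # The minimum over rows never underestimates the true count,
        # because collisions can only add to a counter.
        return min(
            self.table[row][self._index(item, row)]
            for row in range(self.depth)
        )


sketch = CountMinSketch(epsilon=0.001, delta=0.001)
for word in ["gpu", "stream", "gpu", "sketch", "gpu"]:
    sketch.add(word)

print(sketch.width, sketch.depth)  # 2719 7
print(sketch.estimate("gpu"))      # at least 3; the estimate never undercounts
```

The table size depends only on epsilon and delta, not on the length of the stream, which is why a sketch can summarize a very large stream in a small, fixed amount of memory. This toy sizing is not intended to reproduce the 63 MB figure reported in the abstract.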




Title: Mille Cheval: a GPU-based in-memory high-performance computing framework for accelerated processing of big-data streams
Authors: Vivek Kumar, Dilip Kumar Sharma, Vinay Kumar Mishra
Publication date: 04-01-2021
Publisher: Springer US
Published in: The Journal of Supercomputing / Issue 7/2021
Print ISSN: 0920-8542
Electronic ISSN: 1573-0484
DOI: https://doi.org/10.1007/s11227-020-03508-3
