Published in: The Journal of Supercomputing 7/2021

04-01-2021

Mille Cheval: a GPU-based in-memory high-performance computing framework for accelerated processing of big-data streams

Authors: Vivek Kumar, Dilip Kumar Sharma, Vinay Kumar Mishra



Abstract

Streams are temporally ordered, rapidly changing, high in volume, and potentially unbounded. Storing an entire data stream is nearly impossible because of its volume and velocity. In this work, parallelism is employed to accelerate stream data computing. A GPU-based high-performance computing (HPC) framework is proposed for accelerated processing of big-data streams using an in-memory data structure. We have implemented three parallel algorithms to demonstrate the viability of the framework. The contributions of Mille Cheval are: (1) a demonstration that streaming on accelerators is a viable way to increase throughput, (2) carefully chosen hash algorithms that achieve a low collision rate and high randomness, and (3) memory sketches for approximation. The objective is to leverage the power of a single node using in-memory computing and hybrid computing. HPC does not always require high-end hardware; well-designed algorithms can suffice. The achievements of Mille Cheval are: (1) a relative error of 1.32 when the error rate and overestimate rate are both set to 0.001 and (2) a host memory requirement of just 63 MB for 1 terabyte of data. The proposed algorithms are pragmatic. Experimental results show that the framework achieves a 10X speed-up over CPU implementations and a 3X speed-up over GPU implementations.
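The abstract does not say which memory sketch the framework uses. As an illustration only, the following minimal CPU-side sketch assumes a count-min sketch, a common choice for approximate frequency counting over streams, and shows how an error rate and an overestimate probability translate into a small fixed-size table. The class name `CountMinSketch`, the parameters `epsilon` and `delta`, and the use of salted BLAKE2b as the hash family are assumptions made for this example; they are not the authors' implementation or hash functions, and no GPU code is shown.

```python
import hashlib
import math


class CountMinSketch:
    """Illustrative count-min sketch (hypothetical, not the paper's code)."""

    def __init__(self, epsilon, delta):
        # Standard sizing: width = ceil(e / epsilon), depth = ceil(ln(1 / delta)).
        self.width = math.ceil(math.e / epsilon)
        self.depth = math.ceil(math.log(1.0 / delta))
        # One row of counters per hash function.
        self.rows = [[0] * self.width for _ in range(self.depth)]

    def _index(self, item, row):
        # Assumption: a salted BLAKE2b digest stands in for the hash family;
        # the row number is used as the salt so each row hashes differently.
        digest = hashlib.blake2b(
            item.encode("utf-8"),
            digest_size=8,
            salt=row.to_bytes(8, "little"),
        ).digest()
        return int.from_bytes(digest, "little") % self.width

    def add(self, item, count=1):
        # Increment one counter in every row.
        for row in range(self.depth):
            self.rows[row][self._index(item, row)] += count

    def estimate(self, item):
        # The minimum over rows never underestimates the true count.
        return min(
            self.rows[row][self._index(item, row)]
            for row in range(self.depth)
        )


sketch = CountMinSketch(epsilon=0.001, delta=0.001)
for token in ["a", "b", "a", "c", "a"]:
    sketch.add(token)
print(sketch.width, sketch.depth, sketch.estimate("a"))
```

The table size depends only on `epsilon` and `delta`, not on the length of the stream, which is the general property that lets a sketch summarize a very large input in bounded memory. In a GPU setting each row update would typically be an atomic increment, but that detail is outside what the abstract states.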


Metadata
Title
Mille Cheval: a GPU-based in-memory high-performance computing framework for accelerated processing of big-data streams
Authors
Vivek Kumar
Dilip Kumar Sharma
Vinay Kumar Mishra
Publication date
04-01-2021
Publisher
Springer US
Published in
The Journal of Supercomputing / Issue 7/2021
Print ISSN: 0920-8542
Electronic ISSN: 1573-0484
DOI
https://doi.org/10.1007/s11227-020-03508-3
