2019 | OriginalPaper | Chapter

HiPower: A High-Performance RDMA Acceleration Solution for Distributed Transaction Processing

Authors : Runhua Zhang, Yang Cheng, Jinkun Geng, Shuai Wang, Kaihui Gao, Guowei Shen

Published in: Network and Parallel Computing

Publisher: Springer International Publishing

Abstract

Increasingly complex tasks and growing data sizes have necessitated distributed transaction processing (DTP), which distributes tasks and data across multiple nodes for joint processing. However, compared with the rapid growth of computation power, network capability has lagged behind, making communication an increasingly prominent bottleneck. This paper focuses on the recently emerging RDMA technology, which can greatly improve communication performance but in many cases is not well exploited because of an improper interaction design between the requester and the responder. Our research finds that the typical implementation, confirming per work request (CPWR), triggers considerable CPU involvement, which further degrades the overall performance of RDMA communication. To address this, we propose HiPower, which leverages a batched confirmation scheme with lower CPU utilization to improve the efficiency of high-frequency communication. Our experiments show that, compared with CPWR, HiPower improves communication efficiency by up to 75% and reduces CPU cost by up to 79%, which speeds up the overall flow completion time (FCT) by up to 14% on a real workflow (ResNet-152).
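The difference between confirming every work request and confirming in batches can be illustrated with a toy counting model. This is only a sketch under stated assumptions, not HiPower's implementation: the function names, the batch size of 32, and the per-confirmation cost of 2 µs are hypothetical values chosen for illustration, and the model ignores everything except the number of confirmations the CPU has to handle.

```python
import math


def confirmations(num_requests: int, batch_size: int) -> int:
    """Count confirmations handled for num_requests work requests.

    batch_size == 1 models confirming per work request (CPWR);
    batch_size > 1 models one confirmation per batch (hypothetical model).
    """
    if num_requests < 0 or batch_size < 1:
        raise ValueError("num_requests must be >= 0 and batch_size >= 1")
    return math.ceil(num_requests / batch_size)


def cpu_time_us(num_requests: int, batch_size: int, cost_us: float) -> float:
    """CPU time spent on confirmations, assuming a fixed cost per confirmation."""
    return confirmations(num_requests, batch_size) * cost_us


if __name__ == "__main__":
    n = 1000          # work requests (illustrative)
    cost = 2.0        # assumed CPU cost per confirmation, in microseconds
    print("CPWR   :", confirmations(n, 1), "confirmations,", cpu_time_us(n, 1, cost), "us")
    print("batched:", confirmations(n, 32), "confirmations,", cpu_time_us(n, 32, cost), "us")
```

In this model the confirmation count falls roughly in proportion to the batch size. The savings reported in the abstract come from the authors' measurements, not from a model of this kind.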

The cost includes the time to register memory regions (MRs) and to exchange the necessary MR information.

C is a constant denoting the number of pre-registered MRs; n denotes the number of RDMA_READ executions.

PFC (Priority-based Flow Control): a Pause frame is sent to the upstream device to tell it to suspend delivery, which prevents the receive buffer from overflowing.
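The pause mechanism described in this note can be sketched as a small tick-based simulation. This is a simplified illustration, not a model of any real switch: the thresholds `xoff` and `xon`, the arrival rate of two frames per tick, and the drain rate of one frame per tick are all assumptions, and a single queue stands in for one priority class.

```python
def simulate(ticks, capacity, xoff, xon, pfc_enabled):
    """Return (dropped_frames, pause_frames_sent) for one receive queue.

    Assumed toy model: the upstream device sends 2 frames per tick unless
    paused, and the receiver drains 1 frame per tick.
    """
    depth = 0          # frames currently buffered
    paused = False     # whether the upstream device is currently paused
    dropped = 0
    pause_frames = 0
    for _ in range(ticks):
        if not paused:
            for _ in range(2):
                if depth < capacity:
                    depth += 1
                else:
                    dropped += 1   # buffer overflow
        if depth > 0:
            depth -= 1             # receiver drains one frame
        if pfc_enabled:
            if not paused and depth >= xoff:
                paused = True      # send a Pause frame upstream
                pause_frames += 1
            elif paused and depth <= xon:
                paused = False     # let the upstream device resume
    return dropped, pause_frames


if __name__ == "__main__":
    print("without PFC:", simulate(20, capacity=10, xoff=6, xon=3, pfc_enabled=False))
    print("with PFC   :", simulate(20, capacity=10, xoff=6, xon=3, pfc_enabled=True))
```

With pausing enabled the queue depth stays below the buffer capacity, so no frames are dropped; without it the buffer eventually overflows because arrivals outpace the drain rate.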

It is derived from NVcaffe rc 0.17 with GPU support. The input size is 224 * 224 * 224 * 3 and the batch size is 16. It is trained on a P100. The forward stage takes 0.1 s, and the backward stage takes about 0.15 s.

Title: HiPower: A High-Performance RDMA Acceleration Solution for Distributed Transaction Processing
Authors: Runhua Zhang, Yang Cheng, Jinkun Geng, Shuai Wang, Kaihui Gao, Guowei Shen
Publisher: Springer International Publishing
Book: Network and Parallel Computing
Print ISBN: 978-3-030-30708-0

Electronic ISBN: 978-3-030-30709-7

Copyright Year: 2019
DOI: https://doi.org/10.1007/978-3-030-30709-7_17
