research-article

A Hybrid I/O Virtualization Framework for RDMA-capable Network Interfaces

Published: 14 March 2015

Abstract

RDMA-capable interconnects, providing ultra-low latency and high bandwidth, are increasingly being used in the context of distributed storage and data processing systems. However, the deployment of such systems in virtualized data centers is currently inhibited by the lack of a flexible and high-performance virtualization solution for RDMA network interfaces.

In this work, we present a hybrid virtualization architecture which builds upon the concept of separation of paths for control and data operations available in RDMA. With hybrid virtualization, RDMA control operations are virtualized using hypervisor involvement, while data operations are set up to bypass the hypervisor completely. We describe HyV (Hybrid Virtualization), a virtualization framework for RDMA devices implementing such a hybrid architecture. In the paper, we provide a detailed evaluation of HyV for different RDMA technologies and operations. We further demonstrate the advantages of HyV in the context of a real distributed system by running RAMCloud on a set of HyV-enabled virtual machines deployed across a 6-node RDMA cluster. All of the performance results we obtained illustrate that hybrid virtualization enables bare-metal RDMA performance inside virtual machines while retaining the flexibility typically associated with paravirtualization.
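The split between control and data paths described above can be pictured with a small toy model. This is only an illustrative sketch, not HyV's implementation: the class and function names (`Hypervisor`, `GuestQueuePair`, `run_guest`) are hypothetical, and a plain Python list stands in for a device queue that would be mapped into guest memory. It assumes that control operations each cost one hypervisor transition, while posting a send touches only the guest-visible queue.

```python
from dataclasses import dataclass, field


@dataclass
class Hypervisor:
    """Toy hypervisor that mediates control operations only (hypothetical)."""
    traps: int = 0        # number of guest-to-hypervisor transitions
    next_handle: int = 1  # next resource handle to hand out

    def control_op(self, name: str) -> int:
        # Control path: the guest request goes through the hypervisor, which
        # would validate it and forward it to the host RDMA driver.
        self.traps += 1
        handle = self.next_handle
        self.next_handle += 1
        return handle


@dataclass
class GuestQueuePair:
    """Toy queue pair whose send queue is directly accessible to the guest."""
    handle: int
    send_queue: list = field(default_factory=list)

    def post_send(self, payload: bytes) -> None:
        # Data path: the guest places the work request straight into the
        # queue; the hypervisor is not involved.
        self.send_queue.append(payload)


def run_guest(hv: Hypervisor, messages: int) -> GuestQueuePair:
    # Two control operations, each mediated by the hypervisor.
    hv.control_op("register_memory")
    qp = GuestQueuePair(handle=hv.control_op("create_queue_pair"))
    # Any number of data operations, none of which reach the hypervisor.
    for i in range(messages):
        qp.post_send(f"msg-{i}".encode())
    return qp


hv = Hypervisor()
qp = run_guest(hv, messages=1000)
print(hv.traps, len(qp.send_queue))
```

In this model the number of hypervisor transitions depends only on how many resources are set up, not on how many messages are sent, which is the property the hybrid architecture relies on.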

    • Published in

      ACM SIGPLAN Notices, Volume 50, Issue 7
      VEE '15
      July 2015
      221 pages
      ISSN: 0362-1340
      EISSN: 1558-1160
      DOI: 10.1145/2817817
      • Editor: Andy Gill

      VEE '15: Proceedings of the 11th ACM SIGPLAN/SIGOPS International Conference on Virtual Execution Environments
      March 2015
      238 pages
      ISBN: 9781450334501
      DOI: 10.1145/2731186

      Copyright © 2015 ACM

      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

      Publisher

      Association for Computing Machinery

      New York, NY, United States
