ABSTRACT
With the tremendous popularity of container technology, many applications are being containerized: split into numerous containers connected by networks. However, current container networking solutions offer either poor performance or poor portability, undermining the advantages of containerization. In this paper, we propose FreeFlow, a container networking solution that achieves both high performance and good portability. FreeFlow is built on the observation that strict isolation is unnecessary among containers that trust each other, and it can significantly improve the communication quality of containers by relaxing isolation slightly. Specifically, we enable containers on the same physical machine to communicate via shared memory, and containers on different physical machines to communicate via high-performance networking options such as RDMA and DPDK. Naively combining all of these mechanisms would result in poor portability of containers and considerable complexity in application development. Instead, FreeFlow provides a network abstraction that supports all common network APIs, together with a centralized network orchestrator that decides, transparently to the applications inside containers, how data is delivered.
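The core decision the abstract describes — shared memory for co-located containers, a high-performance fabric (RDMA or DPDK) across machines — can be sketched as a simple placement-aware dispatch. This is an illustrative sketch only; the names (`Container`, `choose_transport`) are hypothetical and do not reflect FreeFlow's actual API.

```python
# Hypothetical sketch of the orchestrator's transport selection:
# co-located containers use shared memory, cross-host pairs use a
# high-performance fabric (RDMA or DPDK). Names are illustrative.
from dataclasses import dataclass


@dataclass
class Container:
    name: str
    host: str  # physical machine the container runs on


def choose_transport(src: Container, dst: Container,
                     fabric: str = "RDMA") -> str:
    """Return the data path the orchestrator would select."""
    if src.host == dst.host:
        return "shared-memory"  # same machine: bypass the network stack
    return fabric               # different machines: RDMA or DPDK


web = Container("web", "host-a")
db = Container("db", "host-a")
cache = Container("cache", "host-b")

print(choose_transport(web, db))     # co-located pair
print(choose_transport(web, cache))  # cross-host pair
```

The point of the abstraction is that applications keep using ordinary network APIs; only the orchestrator knows (and decides) which of these paths actually carries the bytes.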