Computer Networks

Volume 51, Issue 7, 16 May 2007, Pages 1777-1799

UDT: UDP-based data transfer for high-speed wide area networks

https://doi.org/10.1016/j.comnet.2006.11.009

Abstract

In this paper, we summarize our work on the UDT high performance data transport protocol over the past four years. UDT was designed to effectively utilize the rapidly emerging high-speed wide area optical networks. It is built on top of UDP with reliability control and congestion control, which makes it quite easy to install. The congestion control algorithm is the major internal functionality that enables UDT to effectively utilize high bandwidth. Meanwhile, we also implemented a set of APIs to support easy application development, including both reliable data streaming and partially reliable messaging. The original UDT library has also been extended to Composable UDT, which can support various congestion control algorithms. We describe in detail the design and implementation of UDT, the UDT congestion control algorithm, Composable UDT, and a performance evaluation.

Introduction

The rapid increase of network bandwidth and the emergence of new distributed applications are the two driving forces for networking research and development. On the one hand, network bandwidth today has been expanded to 10 Gb/s with 100 Gb/s emerging, which enables many data intensive applications that were impossible in the past. On the other hand, new applications, such as scientific data distribution, expedite the deployment of high-speed wide area networks.

Today, national and international high-speed networks have connected most developed regions in the world with fiber [8], [10]. Data can be moved at up to 10 Gb/s among these networks and often at a higher speed inside the networks themselves. For example, in the United States, there are national multi-10 Gb/s networks, such as National Lambda Rail, Internet2/Abilene, TeraGrid, and ESnet. They can connect to many international networks such as CA*Net 4 of Canada, SURFnet of the Netherlands, and JGN2 of Japan.

Meanwhile, we are living in a world of exponentially increasing data. The old approach of storing data on disk or tape and delivering it manually by transport vehicles is no longer efficient. In many situations, the old fashioned method of shipping disks with data on them makes it impossible to meet the applications’ requirements (e.g., online data analysis and processing).

Researchers in high-energy physics, astronomy, earth science, and other high performance computing areas have started to use these high-speed wide area optical networks to transfer terabytes of data. We expect that home Internet users will also be able to make use of the high-speed networks in the near future for applications with high-resolution streaming video, for example. In fact, an experiment between two ISPs in the USA and Korea has demonstrated an effective 80 Mb/s data transfer speed.

Unfortunately, high-speed networks have not been efficiently used by applications with large amounts of data. The Transmission Control Protocol (TCP), the de facto transport protocol of the Internet, substantially underutilizes network bandwidth over high-speed connections with long delays [8], [25]. For example, a single TCP flow with default parameter settings on Linux 2.4 can only reach about 5 Mb/s over a 1 Gb/s link between Chicago and Amsterdam; with careful parameter tuning the throughput still only reaches about 70 Mb/s. A new transport protocol is required to address this challenge. The new protocol is expected to be easily deployed and easily integrated with the applications, in addition to utilizing the bandwidth efficiently and fairly.
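A rough calculation illustrates the problem (the figures of 1 Gb/s, 100 ms round-trip time, and 1500-byte segments are chosen only for illustration, not taken from the measurements above). Filling such a pipe requires a window of about

  W = (10⁹ b/s × 0.1 s) / (1500 × 8 b) ≈ 8300 segments.

After a single loss halves the window, standard TCP’s increase of one segment per RTT needs roughly W/2 ≈ 4150 round-trip times, i.e., on the order of seven minutes, to return to full speed, and any further loss during that period keeps the link underutilized.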

Network researchers have proposed quite a few solutions to this problem, most of which are new TCP congestion control algorithms [5], [12], [13], [24], [26], [33], [35], [42] and application level libraries using UDP [14], [38], [40], [41], [45]. Parallel TCP [1], [36] and XCP [25] are two special cases: the former tries to start multiple concurrent TCP flows to obtain more bandwidth, whereas the latter represents a radical change by introducing a new transport layer protocol involving changes in routers.

In UDT we take a unique approach to the problem of transferring large volumetric datasets over high bandwidth-delay product (BDP) networks. While UDT is a UDP-based approach, to the best of our knowledge, it is the only UDP-based protocol that employs a congestion control algorithm targeting shared networks. Furthermore, UDT is not only a new control algorithm, but also a new application level protocol with support for user configurable control algorithms and more powerful APIs.

This paper summarizes our work on UDT over the past four years. Section 2 gives an overview of the UDT protocol and describes its design and implementation. Section 3 explains its congestion control algorithm. Section 4 introduces Composable UDT that supports configurability of congestion control algorithms. Section 5 gives an experimental evaluation of the UDT performance. Section 6 concludes the paper.


Overview

UDT fits into the layered network protocol architecture (Fig. 1). UDT uses UDP through the socket interface provided by operating systems. Meanwhile, it provides a UDT socket interface to applications, which can call the UDT socket API in the same way they call the system socket API. An application can also provide a congestion control class instance (CC in Fig. 1) for UDT to process the control events, so that a customized congestion control scheme is used; otherwise the native UDT congestion control algorithm described in Section 3 is used.
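As a usage sketch, a minimal sender could look like the following. The calls (UDT::startup, UDT::socket, UDT::connect, UDT::send, UDT::close, UDT::cleanup) follow the UDT library's C++ socket-style API; the address and port are illustrative only.

#include <arpa/inet.h>
#include <cstring>
#include <udt.h>  // UDT library header

int main()
{
    UDT::startup();  // initialize the UDT library

    // Create a UDT socket exactly as one would create a system socket.
    UDTSOCKET client = UDT::socket(AF_INET, SOCK_STREAM, 0);

    sockaddr_in serv;
    std::memset(&serv, 0, sizeof(serv));
    serv.sin_family = AF_INET;
    serv.sin_port = htons(9000);                     // illustrative port
    inet_pton(AF_INET, "10.0.0.1", &serv.sin_addr);  // illustrative address

    if (UDT::connect(client, (sockaddr*)&serv, sizeof(serv)) != UDT::ERROR) {
        const char* msg = "hello over UDT";
        UDT::send(client, msg, std::strlen(msg), 0);  // reliable stream send
    }

    UDT::close(client);
    UDT::cleanup();
    return 0;
}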

The DAIMD and UDT algorithm

We consider the following general class of AIMD (additive increase multiplicative decrease) rate control algorithms:

For every rate control interval, if there is no negative feedback from the receiver (loss, increasing delay, etc.), but there are positive feedbacks (acknowledgments), then the packet-sending rate x is increased by α(x):

  x ← x + α(x),

where α(x) is non-increasing and approaches 0 as x increases, i.e., lim_{x→+∞} α(x) = 0.

For any negative feedback, the sending rate is decreased by a constant factor β (0 < β < 1):

  x ← (1 − β) · x.
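As an illustration of this class of algorithms, the following sketch implements the generic DAIMD update rule. The particular choice of α(x) and of the decrease factor here is only an example satisfying the stated conditions, not necessarily the parameters UDT itself uses.

// Generic DAIMD rate controller: additive increase with a decreasing
// increase function alpha(x), multiplicative decrease by a constant factor.
class DaimdRateControl
{
public:
    explicit DaimdRateControl(double initialRate) : m_rate(initialRate) {}

    // Invoked once per rate control interval when only positive feedback
    // (acknowledgments) was received: x <- x + alpha(x).
    void onPositiveFeedback() { m_rate += alpha(m_rate); }

    // Invoked on negative feedback (loss, increasing delay, etc.):
    // x <- (1 - beta) * x.
    void onNegativeFeedback() { m_rate *= (1.0 - kBeta); }

    double rate() const { return m_rate; }

private:
    // Example increase function: non-increasing in x and tending to 0 as
    // x grows, as the DAIMD definition requires.
    static double alpha(double x) { return 1.0 / (1.0 + x); }

    static constexpr double kBeta = 1.0 / 9.0;  // illustrative decrease factor
    double m_rate;  // packet-sending rate (e.g., packets per interval)
};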

Overview

While UDT has been successful for bulk data transfer over high-speed networks, we feel that it can benefit a much broader audience. We therefore expanded UDT so that it can be easily configured to satisfy more requirements from both network research and application development. We call this Composable UDT.

However, we emphasize here that this framework is not a replacement for, but a complement to, the kernel space network stacks. General protocols like UDP, TCP, DCCP, and SCTP should still be implemented in kernel space.
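The sketch below illustrates the kind of event-handler interface such a configurable framework exposes. The class and member names (CControlBase, onACK, onLoss, m_pktSndPeriod, m_cwndSize) are illustrative rather than the library's exact API.

#include <cstdint>

// Hypothetical control base class in the spirit of Composable UDT: the
// transport core invokes these handlers on control events, and a subclass
// adjusts the inter-packet sending period and/or the congestion window.
class CControlBase
{
public:
    virtual ~CControlBase() {}
    virtual void onACK(int32_t ack) {}                        // positive feedback
    virtual void onLoss(const int32_t* losses, int count) {}  // negative feedback
    virtual void onTimeout() {}

protected:
    double m_pktSndPeriod = 1.0;   // time between packets, microseconds (illustrative)
    double m_cwndSize     = 16.0;  // congestion window in packets (illustrative)
};

// A user-defined algorithm: a plain AIMD expressed through the same events.
class CSimpleAIMD : public CControlBase
{
public:
    void onACK(int32_t) override              { m_cwndSize += 1.0 / m_cwndSize; }
    void onLoss(const int32_t*, int) override { m_cwndSize *= 0.5; }
};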

Performance evaluation

In this section, we evaluate UDT’s performance using several experiments on real high-speed networks. While we have also done extensive simulations covering the majority of network situations, we choose real world experiments here because they give us more insight into UDT’s performance.

We use TCP as the baseline to compare against UDT. While there are many new protocols and congestion control algorithms, it is difficult to choose a mature one as the baseline, and a complete comparison of all these proposals is beyond the scope of this paper.

TCP modifications

Researchers have continually worked to improve TCP. A straightforward approach is to use a larger increase parameter and smaller decrease factor in the AIMD algorithm than those used in the standard TCP algorithm. Scalable TCP [26] and HighSpeed TCP [12] are the two typical examples of this class.

Scalable TCP increases its sending rate in proportion to its current value, whereas it only decreases the sending rate by 1/8 when there is packet loss. HighSpeed TCP uses logarithmic increase and decrease parameters that are functions of the current congestion window size.
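For concreteness, the Scalable TCP behavior described above can be written as the following window updates (the per-ACK increment 0.01 is the constant proposed for Scalable TCP [26]; the 1/8 decrease is as stated above):

  on each acknowledgment:  cwnd ← cwnd + 0.01
  on a loss event:         cwnd ← cwnd − (1/8) · cwnd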

Conclusions

Scalability has been one of the major research problems of the Internet community ever since the emergence of the World Wide Web (WWW). The insufficient number of IP addresses may be the most commonly known scalability problem. However, in many high-speed networks researchers have also found that, as a network’s bandwidth-delay product increases, TCP, the major Internet data transport protocol, does not scale well either.

As an effective, timely, and practical solution to this BDP scalability problem, we have designed and implemented the UDT protocol described in this paper.

Acknowledgments

The work was supported in part by the US National Science Foundation, the US Department of Energy, and the US Army Pantheon Project.


References (45)

  • William Allcock, John Bresnahan, Rajkumar Kettimuthu, Michael Link, Catalin Dumitrescu, Ioan Raicu, Ian Foster, The...
  • David G. Andersen, Deepak Bansal, Dorothy Curtis, Srinivasan Seshan, Hari Balakrishnan, System support for bandwidth...
  • Cosimo Anglano, Massimo Canonico, A comparative evaluation of high-performance file transfer systems for data-intensive...
  • Amitabha Banerjee, Wu-chun Feng, Biswanath Mukherjee, Dipak Ghosal, Routing and scheduling large file transfers over...
  • Sumitha Bhandarkar, Saurabh Jain, A.L. Narasimha Reddy, Improving TCP performance in high bandwidth high RTT links...
  • Robert Braden, Aaron Falk, Ted Faber, Aman Kapoor, Yuri Pryadkin, Studies of XCP deployment issues, in: Proceedings of...
  • L. Brakmo et al., TCP Vegas: End-to-end congestion avoidance on a global Internet, IEEE Journal on Selected Areas in Communication (1995)
  • A. Chien et al., Transport protocols for high performance: Whither TCP?, Communications of the ACM (2003)
  • J. Chu, Zero-copy TCP in Solaris, in: Proceedings of the USENIX Annual Conference’96, San Diego, CA, January...
  • Tom DeFanti et al., TransLight: a global-scale Lambda Grid for e-science, Communications of the ACM (2003)
  • C. Dovrolis, P. Ramanathan, D. Moore, What do packet dispersion techniques measure? in: Proceedings of the IEEE...
  • S. Floyd, HighSpeed TCP for large congestion windows, IETF, RFC 3649, Experimental Standard, December...
  • M. Gerla et al., TCP Westwood: congestion window control using bandwidth estimation, IEEE Globecom (2001)
  • Yunhong Gu et al., SABUL: a transport protocol for grid computing, Journal of Grid Computing (2003)
  • Yunhong Gu, Robert L. Grossman, Optimizing UDP-based protocol implementations, in: Proceedings of the Third...
  • Yunhong Gu, Robert L. Grossman, Supporting configurable congestion control in data transport services, in: SC 05,...
  • Yunhong Gu, Robert L. Grossman, UDT: an application level transport protocol for grid computing, in: PFLDNet 2004, The...
  • Yunhong Gu, Xinwei Hong, Robert Grossman, An analysis of AIMD algorithms with decreasing increases, in: Gridnets 2004,...
  • Yunhong Gu, Xinwei Hong, Robert Grossman, Experiences in design and implementation of a high performance transport...
  • Haryadi S. Gunawi, Andrea C. Arpaci-Dusseau, Remzi H. Arpaci-Dusseau, Deploying safe user-level network services with...
  • Sangtae Ha, Yusung Kim, Long Le, Injong Rhee, Lisong Xu, A step toward realistic performance evaluation of high-speed...
  • E. He, J. Leigh, O. Yu, T.A. DeFanti, Reliable blast UDP: predictable high performance bulk data transfer, in: IEEE...

Yunhong Gu is a research scientist at the National Center for Data Mining. He received a B.E. with Honors in Computer Science from Hangzhou Institute of Electronic Engineering of China in 1998, an M.E. in Computer Science from Beijing University of Aeronautics and Astronautics of China in 2001, and a Ph.D. in Computer Science from University of Illinois at Chicago in 2005. His current research projects include high performance transport protocols and distributed data management. He is the developer of UDT. He is a member of Sigma Xi, the IEEE, and the ACM.

Robert L. Grossman is the Director of the Laboratory for Advanced Computing and the National Center for Data Mining at the University of Illinois at Chicago, where he has been a faculty member since 1988. He is also the spokesperson for the Data Mining Group (DMG), an industry consortium responsible for the Predictive Model Markup Language (PMML), an XML language for data mining and predictive modeling. He is the President of Open Data Partners, which provides consulting and outsourced services focused on data. He has published over one hundred papers in refereed journals and proceedings on Internet computing, data mining, high performance networking, business intelligence, and related areas, and lectured extensively at conferences and workshops.

This paper is partly based upon five conference papers published in the proceedings of the PFLDNet workshops (2003 and 2004), the IEEE GridNets workshop (2004), and the IEEE/ACM SC conferences (2004 and 2005). See Refs. [15], [16], [17], [18], [19].
