UDT: UDP-based data transfer for high-speed wide area networks
Introduction
The rapid increase in network bandwidth and the emergence of new distributed applications are the two driving forces of networking research and development. On the one hand, network bandwidth has grown to 10 Gb/s, with 100 Gb/s emerging, enabling many data-intensive applications that were impossible in the past. On the other hand, new applications, such as scientific data distribution, expedite the deployment of high-speed wide area networks.
Today, national or international high-speed networks have connected most developed regions in the world with fiber [8], [10]. Data can be moved at up to 10 Gb/s among these networks and often at a higher speed inside the networks themselves. For example, in the United States, there are national multi-10 Gb/s networks, such as National Lambda Rail, Internet2/Abilene, Teragrid, ESNet, etc. They can connect to many international networks such as CA*Net 4 of Canada, SurfNet of the Netherlands, and JGN2 of Japan.
Meanwhile, we are living in a world of exponentially increasing data. The old approach of storing data on disk or tape and delivering it manually by transport vehicle is no longer efficient. In many situations, this old-fashioned method of shipping disks makes it impossible to meet applications’ requirements (e.g., online data analysis and processing).
Researchers in high-energy physics, astronomy, earth science, and other high performance computing areas have started to use these high-speed wide area optical networks to transfer terabytes of data. We expect that home Internet users will also be able to make use of high-speed networks in the near future, for applications such as high-resolution streaming video. In fact, an experiment between two ISPs in the USA and Korea has demonstrated an effective 80 Mb/s data transfer speed.
Unfortunately, high-speed networks have not been efficiently used by applications that move large amounts of data. The Transmission Control Protocol (TCP), the de facto transport protocol of the Internet, substantially underutilizes network bandwidth over high-speed connections with long delays [8], [25]. For example, a single TCP flow with default parameter settings on Linux 2.4 can only reach about 5 Mb/s over a 1 Gb/s link between Chicago and Amsterdam; even with careful parameter tuning the throughput only reaches about 70 Mb/s. A new transport protocol is needed to address this challenge. Beyond utilizing the bandwidth efficiently and fairly, the new protocol should be easy to deploy and easy to integrate with applications.
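The scale of the challenge can be seen from the well-known steady-state TCP throughput approximation (a back-of-the-envelope illustration we add here; the figures are not from the experiments above):

    throughput ≈ MSS / (RTT · √p)

where MSS is the segment size and p the packet loss rate. With MSS = 1460 bytes and RTT ≈ 110 ms (roughly the Chicago–Amsterdam path), sustaining 1 Gb/s requires p ≈ 10⁻⁸, i.e., no more than about one loss per hundred million packets, a reliability that long wide area paths rarely deliver.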
Network researchers have proposed quite a few solutions to this problem, most of which are new TCP congestion control algorithms [5], [12], [13], [24], [26], [33], [35], [42] and application level libraries using UDP [14], [38], [40], [41], [45]. Parallel TCP [1], [36] and XCP [25] are two special cases: the former tries to start multiple concurrent TCP flows to obtain more bandwidth, whereas the latter represents a radical change by introducing a new transport layer protocol involving changes in routers.
UDT takes a unique approach to the problem of transferring large volumetric datasets over high bandwidth-delay product (BDP) networks. While UDT is UDP-based, to the best of our knowledge it is the only UDP-based protocol that employs a congestion control algorithm targeting shared networks. Furthermore, UDT is not only a new control algorithm but also a new application level protocol, with support for user-configurable control algorithms and more powerful APIs.
This paper summarizes our work on UDT over the past four years. Section 2 gives an overview of the UDT protocol and describes its design and implementation. Section 3 explains its congestion control algorithm. Section 4 introduces Composable UDT that supports configurability of congestion control algorithms. Section 5 gives an experimental evaluation of the UDT performance. Section 6 concludes the paper.
Overview
UDT fits into the layered network protocol architecture (Fig. 1). UDT uses UDP through the socket interface provided by the operating system, and in turn provides a UDT socket interface to applications. Applications can call the UDT socket API in the same way they call the system socket API. An application can also supply a congestion control class instance (CC in Fig. 1) for UDT to process control events, in which case the customized congestion control scheme is used; otherwise UDT’s native congestion control algorithm, described in Section 3, is used.
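As an illustration of this socket-style usage, here is a minimal C++ client sketch with identifiers modeled on the open-source UDT implementation (UDT::startup, UDT::socket, UDTSOCKET, and so on); the header name, port, and address are illustrative assumptions:

    #include <arpa/inet.h>   // sockaddr_in, htons, inet_pton
    #include <udt.h>         // UDT socket API (assumed header name)

    int main()
    {
        UDT::startup();                                    // initialize the UDT library

        // Create a UDT socket exactly as one would create a system socket.
        UDTSOCKET client = UDT::socket(AF_INET, SOCK_STREAM, 0);

        sockaddr_in serv{};
        serv.sin_family = AF_INET;
        serv.sin_port   = htons(9000);                     // illustrative server port
        inet_pton(AF_INET, "192.168.1.1", &serv.sin_addr); // illustrative address

        if (UDT::connect(client, (sockaddr*)&serv, sizeof(serv)) == UDT::ERROR)
            return 1;

        const char msg[] = "hello over UDT";
        UDT::send(client, msg, sizeof(msg), 0);            // same shape as send(2)

        UDT::close(client);
        UDT::cleanup();
        return 0;
    }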
The DAIMD and UDT algorithm
We consider the following general class of AIMD (additive increase multiplicative decrease) rate control algorithms, which we call DAIMD (AIMD with decreasing increases):

For every rate control interval, if there is no negative feedback from the receiver (loss, increasing delay, etc.) but there is positive feedback (acknowledgments), then the packet-sending rate x is increased by α(x), where α(x) is non-increasing and approaches 0 as x increases, i.e., lim_{x→+∞} α(x) = 0.

For any negative feedback, the sending rate is decreased by a constant factor β (0 < β < 1), i.e., x ← (1 − β)·x.
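A compact C++ sketch of this control law follows; the particular α(x) and β below are illustrative placeholders, not the constants UDT actually uses:

    // One DAIMD rate controller; x is the sending rate in packets per second.
    struct Daimd {
        double x    = 1.0;        // current sending rate
        double beta = 1.0 / 9.0;  // illustrative multiplicative decrease factor

        // Illustrative alpha(x): non-increasing in x, and -> 0 as x -> infinity.
        static double alpha(double x) { return 10.0 / (1.0 + x); }

        // Rate control interval ended with only positive feedback (ACKs).
        void onIntervalNoLoss()   { x += alpha(x); }        // shrinking additive increase

        // Any negative feedback (loss, increasing delay, etc.).
        void onNegativeFeedback() { x *= (1.0 - beta); }    // multiplicative decrease
    };

Because α(x) vanishes as x grows, the increments become gentler near the bandwidth limit, which is what allows a DAIMD flow to probe high-BDP links without the oscillation of a fixed-increment AIMD.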
Overview
While UDT has been successful for bulk data transfer over high-speed networks, we feel that it can benefit a much broader audience. We have therefore expanded UDT so that it can be easily configured to satisfy more requirements of both network research and application development. We call this extension Composable UDT.
However, we emphasize that this framework is not a replacement for, but a complement to, the kernel space network stacks. General protocols like UDP, TCP, DCCP, and SCTP should still be provided by the operating system kernel for general purpose use; Composable UDT targets cases where new control algorithms must be prototyped or deployed at the application level.
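To make the configurability concrete, the following C++ sketch shows a callback-style control class loosely modeled on the CCC base class of the open-source UDT implementation; the member and callback names are assumptions for illustration:

    // Callback-style base class for custom congestion control, loosely modeled
    // on the CCC class in the open-source UDT implementation (names assumed).
    class CCC {
    public:
        virtual ~CCC() {}
        virtual void onACK(int ackSeqNo) {}                  // positive feedback
        virtual void onLoss(const int* lossList, int n) {}   // negative feedback
        virtual void onTimeout() {}
    protected:
        double m_dPktSndPeriod = 1.0;   // microseconds between packets (rate control)
        double m_dCWndSize     = 16.0;  // congestion window, in packets
    };

    // A TCP-like AIMD variant expressed in this framework: window-based control
    // written entirely at the application level, no kernel changes required.
    class MyAimd : public CCC {
    public:
        void onACK(int) override              { m_dCWndSize += 1.0 / m_dCWndSize; }
        void onLoss(const int*, int) override { m_dCWndSize *= 0.5; }
        void onTimeout() override             { m_dCWndSize = 2.0; }
    };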
Performance evaluation
In this section, we evaluate UDT’s performance using several experiments on real high-speed networks. While we have also done extensive simulations covering the majority of network situations, we choose real-world experiments here because they give us more insight into UDT’s performance.

We use TCP as the baseline to compare against UDT. While there are many new protocols and congestion control algorithms, it is difficult to choose a mature one as the baseline, and a complete comparison of all these proposals is beyond the scope of this paper.
TCP modifications
Researchers have continually worked to improve TCP. A straightforward approach is to use a larger increase parameter and a smaller decrease factor in the AIMD algorithm than those used in standard TCP. Scalable TCP [26] and HighSpeed TCP [12] are two typical examples of this class.

Scalable TCP increases its sending rate in proportion to the current value, whereas it only decreases the sending rate by 1/8 when there is packet loss. HighSpeed TCP uses logarithmic increase and a decrease factor that varies with the current window size.
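As a concrete sketch (ours, using the constants published for Scalable TCP: an increase of 0.01 per ACK and a decrease of 1/8 per loss event), the window update rules can be written as:

    // Scalable TCP window update (published constants a = 0.01, b = 0.125).
    struct ScalableTcpWindow {
        double cwnd = 2.0;  // congestion window, in packets

        // Each ACK grows the window by a fixed fraction of one packet, so the
        // per-RTT growth is 0.01 * cwnd: proportional to the current rate.
        void onAck()  { cwnd += 0.01; }

        // Each loss event shrinks the window by only 1/8 (vs. 1/2 in standard TCP).
        void onLoss() { cwnd *= 1.0 - 0.125; }
    };

Because the per-RTT increase is proportional to the current window, recovery time after a loss is independent of the window size, which is what makes the scheme “scalable” to high-BDP links.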
Conclusions
Scalability has been one of the major research problems of the Internet community ever since the emergence of the World Wide Web (WWW). The insufficient number of IP addresses may be the most commonly known scalability problem. However, researchers have also found that TCP, the major Internet data transport protocol, does not scale well as a network’s bandwidth-delay product increases.

As an effective, timely, and practical solution to this BDP scalability problem, we have designed and implemented UDT, an application level protocol whose congestion control algorithm is designed to use the abundant bandwidth of high-speed networks efficiently and fairly.
Acknowledgments
This work was supported in part by the US National Science Foundation, the US Department of Energy, and the US Army Pantheon Project.
References (45)
- William Allcock, John Bresnahan, Rajkumar Kettimuthu, Michael Link, Catalin Dumitrescu, Ioan Raicu, Ian Foster, The...
- David G. Andersen, Deepak Bansal, Dorothy Curtis, Srinivasan Seshan, Hari Balakrishnan, System support for bandwidth...
- Cosimo Anglano, Massimo Canonico, A comparative evaluation of high-performance file transfer systems for data-intensive...
- Amitabha Banerjee, Wu-chun Feng, Biswanath Mukherjee, Dipak Ghosal, Routing and scheduling large file transfers over...
- Sumitha Bhandarkar, Saurabh Jain, A.L. Narasimha Reddy, Improving TCP performance in high bandwidth high RTT links...
- Robert Braden, Aaron Falk, Ted Faber, Aman Kapoor, Yuri Pryadkin, Studies of XCP deployment issues, in: Proceedings of...
- et al., TCP Vegas: End-to-end congestion avoidance on a global Internet, IEEE Journal on Selected Areas in Communications (1995)
- et al., Transport protocols for high performance: Whither TCP?, Communications of the ACM (2003)
- J. Chu, Zero-copy TCP in Solaris, in: Proceedings of the USENIX Annual Conference’96, San Diego, CA, January...
- et al., TransLight: a global-scale Lambda Grid for e-science, Communications of the ACM (2003)
- TCP Westwood: congestion window control using bandwidth estimation, IEEE Globecom
- SABUL: a transport protocol for grid computing, Journal of Grid Computing
Yunhong Gu is a research scientist at the National Center for Data Mining. He received a B.E. with Honors in Computer Science from Hangzhou Institute of Electronic Engineering of China in 1998, an M.E. in Computer Science from Beijing University of Aeronautics and Astronautics of China in 2001, and a Ph.D. in Computer Science from University of Illinois at Chicago in 2005. His current research projects include high performance transport protocols and distributed data management. He is the developer of UDT. He is a member of Sigma Xi, the IEEE, and the ACM.
Robert L. Grossman is the Director of the Laboratory for Advanced Computing and the National Center for Data Mining at the University of Illinois at Chicago, where he has been a faculty member since 1988. He is also the spokesperson for the Data Mining Group (DMG), an industry consortium responsible for the Predictive Model Markup Language (PMML), an XML language for data mining and predictive modeling. He is the President of Open Data Partners, which provides consulting and outsourced services focused on data. He has published over one hundred papers in refereed journals and proceedings on Internet computing, data mining, high performance networking, business intelligence, and related areas, and lectured extensively at conferences and workshops.