Abstract
We systematically evaluate the performance of five implementations of a single, user-level communication interface. Each implementation makes different architectural assumptions about the reliability of the network hardware and the capabilities of the network interface. The implementations differ accordingly in their division of protocol tasks between host software, network-interface firmware, and network hardware. Using microbenchmarks, parallel-programming systems, and parallel applications, we assess the performance impact of different protocol decompositions. We show how moving protocol tasks to a relatively slow network interface yields both performance advantages and disadvantages, depending on the characteristics of the application and the underlying parallel-programming system. In particular, we show that a communication system that assumes highly reliable network hardware and that uses network-interface support to process multicast traffic performs best for all applications.
Published in ASPLOS IX: Proceedings of the Ninth International Conference on Architectural Support for Programming Languages and Operating Systems.