DOI: 10.1145/511334.511364
Article

Dynamic statistical profiling of communication activity in distributed applications

Published: 01 June 2002

ABSTRACT

Performance analysis of communication activity for a terascale application with traditional message tracing can be overwhelming in terms of overhead, perturbation, and storage. We propose a novel alternative that enables dynamic statistical profiling of an application's communication activity using message sampling. We have implemented an operational prototype, named PHOTON, and our evidence shows that this new approach can provide an accurate, low-overhead, tractable alternative for performance analysis of communication activity. PHOTON consists of two components: a Message Passing Interface (MPI) profiling layer that implements sampling and analysis, and a modified MPI runtime that appends a small but necessary amount of information to individual messages. More importantly, this alternative enables an assortment of runtime analysis techniques so that, in contrast to post-mortem, trace-based techniques, the raw performance data can be jettisoned immediately after analysis. Our investigation shows that message sampling can reduce overhead to imperceptible levels for many applications. Experiments on several applications demonstrate the viability of this approach. For example, with one application, our technique reduced the analysis overhead from 154% for traditional tracing to 6% for statistical profiling. We also evaluate different sampling techniques in this framework. The coverage of the sample space provided by purely random sampling is superior to counter- and timer-based sampling. Also, PHOTON's design reveals that frugal modifications to the MPI runtime system could facilitate such techniques on production computing systems, and it suggests that this sampling technique could execute continuously for long-running applications.
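
The abstract names three ways of deciding which messages to sample: counter-based, timer-based, and purely random. The sketch below is a toy illustration of one reason random sampling may cover the sample space better, not PHOTON's implementation. The sampler functions, the `sampled_sources` helper, and the perfectly periodic workload (message `i` arrives at time `i` from source `i % 8`) are all hypothetical assumptions made for this example.

```python
import random


def counter_sampler(interval):
    """Select every `interval`-th message (counter-based sampling)."""
    count = 0

    def should_sample(_timestamp):
        nonlocal count
        count += 1
        return count % interval == 0

    return should_sample


def timer_sampler(period):
    """Select the first message seen once `period` time units have elapsed."""
    next_deadline = period

    def should_sample(timestamp):
        nonlocal next_deadline
        if timestamp >= next_deadline:
            next_deadline = timestamp + period
            return True
        return False

    return should_sample


def random_sampler(probability, rng):
    """Select each message independently with the given probability."""

    def should_sample(_timestamp):
        return rng.random() < probability

    return should_sample


def sampled_sources(should_sample, num_messages=10_000, num_sources=8):
    """Hypothetical workload: message i arrives at time i from source i % num_sources.

    Returns the set of sources that appear in at least one sampled message.
    """
    seen = set()
    for i in range(num_messages):
        if should_sample(float(i)):
            seen.add(i % num_sources)
    return seen


rng = random.Random(0)  # fixed seed so the run is repeatable
coverage = {
    "counter": sampled_sources(counter_sampler(8)),
    "timer": sampled_sources(timer_sampler(8.0)),
    "random": sampled_sources(random_sampler(1 / 8, rng)),
}

for name, sources in coverage.items():
    print(f"{name:8s} sampled {len(sources)} of 8 sources")
```

All three samplers keep roughly one message in eight. Because the assumed workload repeats with the same period as the counter and the timer, those two samplers keep landing on the same source, while the random sampler reaches every source. Real communication patterns are less regular, so the gap in practice depends on the application.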

            Published in

            SIGMETRICS '02: Proceedings of the 2002 ACM SIGMETRICS international conference on Measurement and modeling of computer systems
            June 2002
            299 pages
            ISBN: 1581135319
            DOI: 10.1145/511334

            ACM SIGMETRICS Performance Evaluation Review, Volume 30, Issue 1
            Measurement and modeling of computer systems
            June 2002
            286 pages
            ISSN: 0163-5999
            DOI: 10.1145/511399

            Copyright © 2002 ACM

            Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

            Publisher

            Association for Computing Machinery

            New York, NY, United States


            Acceptance Rates

            SIGMETRICS '02 paper acceptance rate: 23 of 170 submissions, 14%
            Overall acceptance rate: 459 of 2,691 submissions, 17%
