DOI: 10.1145/1065944.1065971

Trust but verify: monitoring remotely executing programs for progress and correctness

Published: 15 June 2005

ABSTRACT

The increased popularity of grid systems and cycle sharing across organizations requires scalable systems that provide facilities to locate resources, to use those resources fairly, and to monitor jobs executing on remote systems. This paper describes the GridCop system, which allows a computation on a remote, and potentially fraudulent, host system to be monitored for progress and execution correctness. A novel feature of our system is that it constructs cooperating submitter and host programs from the original program; these programs allow both progress and execution correctness to be monitored with negligible overhead while providing protection against common fraudulent behaviors. Experimental results show that the overhead of this monitoring is low on both the submitting and host machines. We describe compiler algorithms that allow the required monitoring code to be automatically generated.

              Published in

              PPoPP '05: Proceedings of the tenth ACM SIGPLAN symposium on Principles and practice of parallel programming
              June 2005
              310 pages
              ISBN: 1595930809
              DOI: 10.1145/1065944

              Copyright © 2005 ACM

              Publisher

              Association for Computing Machinery

              New York, NY, United States

              Acceptance Rates

              Overall Acceptance Rate: 230 of 1,014 submissions, 23%
