ABSTRACT
The increased popularity of grid systems and cycle sharing across organizations requires scalable systems that locate resources, ensure fair use of those resources, and monitor jobs executing on remote systems. This paper describes the GridCop system, which allows a computation on a remote, and potentially fraudulent, host system to be monitored for progress and execution correctness. A novel feature of our system is that it constructs cooperating submitter and host programs from the original program. These programs allow both progress and execution correctness to be monitored with negligible overhead while protecting against common fraudulent behaviors. Experimental results show that the overhead of this monitoring is low on both the submitting and host machines. We describe compiler algorithms that allow the required monitoring code to be generated automatically.
Index Terms
- Trust but verify: monitoring remotely executing programs for progress and correctness