Abstract
Ensembles of distributed, heterogeneous resources, or Computational Grids, have emerged as popular platforms for deploying large-scale and resource-intensive applications. Large collaborative efforts are currently underway to provide the necessary software infrastructure. Grid computing raises challenging issues in many areas of computer science, and especially in the area of distributed computing, as Computational Grids cover increasingly large networks and span many organizations. In this paper we briefly motivate Grid computing and introduce its basic concepts. We then highlight a number of distributed computing research questions, and discuss both the relevance and the shortcomings of previous research results when applied to Grid computing. We choose to focus on issues concerning the dissemination and retrieval of information and data on Computational Grid platforms. We feel that these issues are particularly critical at this time, and we can point to preliminary ideas, work, and results in both the Grid community and the distributed computing community. This paper is of interest to distributed computing researchers because Grid computing provides new challenges that need to be addressed, as well as actual platforms for experimentation and research.