Abstract
We present a Scalable Distributed Information Management System (SDIMS) that aggregates information about large-scale networked systems and that can serve as a basic building block for a broad range of large-scale distributed applications by providing detailed views of nearby information and summary views of global information. To serve as a basic building block, a SDIMS should have four properties: scalability to many nodes and attributes, flexibility to accommodate a broad range of applications, administrative isolation for security and availability, and robustness to node and network failures. We design, implement and evaluate a SDIMS that (1) leverages Distributed Hash Tables (DHT) to create scalable aggregation trees, (2) provides flexibility through a simple API that lets applications control propagation of reads and writes, (3) provides administrative isolation through simple extensions to current DHT algorithms, and (4) achieves robustness to node and network reconfigurations through lazy reaggregation, on-demand reaggregation, and tunable spatial replication. Through extensive simulations and micro-benchmark experiments, we observe that our system is an order of magnitude more scalable than existing approaches, achieves isolation properties at the cost of modestly increased read latency in comparison to flat DHTs, and gracefully handles failures.
- K. Albrecht, R. Arnold, M. Gahwiler, and R. Wattenhofer. Join and Leave in Peer-to-Peer Systems: The DASIS approach. Technical report, CS, ETH Zurich, 2003.Google Scholar
- G. Back, W. H. Hsieh, and J. Lepreau. Processes in KaffeOS: Isolation, Resource Management, and Sharing in Java. In Proc. OSDI, Oct 2000. Google ScholarDigital Library
- G. Banga, P. Druschel, and J. Mogul. Resource Containers: A New Facility for Resource Management in Server Systems. In OSDI99, Feb. 1999. Google ScholarDigital Library
- R. Bhagwan, P. Mahadevan, G. Varghese, and G. M. Voelker. Cone: A Distributed Heap-Based Approach to Resource Selection. Technical Report CS2004-0784, UCSD, 2004.Google Scholar
- K. P. Birman. The Surprising Power of Epidemic Communication. In Proceedings of FuDiCo, 2003. Google ScholarDigital Library
- B. Bloom. Space/time tradeoffs in hash coding with allowable errors. Comm. of the ACM, 13(7):422--425, 1970. Google ScholarDigital Library
- M. Castro, P. Druschel, Y. C. Hu, and A. Rowstron. Exploiting Network Proximity in Peer-to-Peer Overlay Networks. Technical Report MSR-TR-2002-82, MSR.Google Scholar
- M. Castro, P. Druschel, A.-M. Kermarrec, A. Nandi, A. Rowstron, and A. Singh. SplitStream: High-bandwidth Multicast in a Cooperative Environment. In SOSP, 2003. Google ScholarDigital Library
- M. Castro, P. Druschel, A.-M. Kermarrec, and A. Rowstron. SCRIBE: A Large-scale and Decentralised Application-level Multicast Infrastructure. IEEE JSAC (Special issue on Network Support for Multicast Communications), 2002. Google ScholarDigital Library
- R. Cox, A. Muthitacharoen, and R. T. Morris. Serving DNS using a Peer-to-Peer Lookup Service. In IPTPS, 2002. Google ScholarDigital Library
- M. Dahlin, L. Gao, A. Nayate, A. Venkataramani, P. Yalagandula, and J. Zheng. PRACTI replication for large-scale systems. Technical Report TR-04-28, The University of Texas at Austin, 2004.Google Scholar
- C. Estan, G. Varghese, and M. Fisk. Bitmap algorithms for counting active flows on high speed links. In Internet Measurement Conference 2003, 2003. Google ScholarDigital Library
- Y. Fu, J. Chase, B. Chun, S. Schwab, and A. Vahdat. SHARP: An architecture for secure resource peering. In Proc. SOSP, Oct. 2003. Google ScholarDigital Library
- Ganglia: Distributed Monitoring and Execution System. http://ganglia.sourceforge.net.Google Scholar
- S. Gribble, A. Halevy, Z. Ives, M. Rodrig, and D. Suciu. What Can Peer-to-Peer Do for Databases, and Vice Versa? In Proceedings of the WebDB, 2001.Google Scholar
- K. Gummadi, R. Gummadi, S. D. Gribble, S. Ratnasamy, S. Shenker, and I. Stoica. The Impact of DHT Routing Geometry on Resilience and Proximity. In SIGCOMM, 2003. Google ScholarDigital Library
- N. J. A. Harvey, M. B. Jones, S. Saroiu, M. Theimer, and A. Wolman. SkipNet: A Scalable Overlay Network with Practical Locality Properties. In USITS, March 2003. Google ScholarDigital Library
- R. Huebsch, J. M. Hellerstein, N. Lanham, B. T. Loo, S. Shenker, and I. Stoica. Querying the Internet with PIER. In Proceedings of the VLDB Conference, May 2003. Google ScholarDigital Library
- C. Intanagonwiwat, R. Govindan, and D. Estrin. Directed diffusion: a scalable and robust communication paradigm for sensor networks. In MobiCom, 2000. Google ScholarDigital Library
- S. R. Madden, M. J. Franklin, J. M. Hellerstein, and W. Hong. TAG: a Tiny AGgregation Service for Ad-Hoc Sensor Networks. In OSDI, 2002. Google ScholarDigital Library
- D. Malkhi. Dynamic Lookup Networks. In FuDiCo, 2002.Google Scholar
- M. L. Massie, B. N. Chun, and D. E. Culler. The ganglia distributed monitoring system: Design, implementation, and experience. In submission.Google Scholar
- P. Maymounkov and D. Mazieres. Kademlia: A Peer-to-peer Information System Based on the XOR Metric. In Proceesings of the IPTPS, March 2002. Google ScholarDigital Library
- C. Olston and J. Widom. Offering a precision-performance tradeoff for aggregation queries over replicated data. In VLDB, pages 144--155, Sept. 2000. Google ScholarDigital Library
- K. Petersen, M. Spreitzer, D. Terry, M. Theimer, and A. Demers. Flexible Update Propagation for Weakly Consistent Replication. In Proc. SOSP, Oct. 1997. Google ScholarDigital Library
- Planetlab. http://www.planet-lab.org.Google Scholar
- C. G. Plaxton, R. Rajaraman, and A. W. Richa. Accessing Nearby Copies of Replicated Objects in a Distributed Environment. In ACM SPAA, 1997. Google ScholarDigital Library
- S. Ratnasamy, P. Francis, M. Handley, R. Karp, and S. Shenker. A Scalable Content Addressable Network. In Proceedings of ACM SIGCOMM, 2001. Google ScholarDigital Library
- S. Ratnasamy, S. Shenker, and I. Stoica. Routing Algorithms for DHTs: Some Open Questions. In IPTPS, March 2002. Google ScholarDigital Library
- T. Roscoe, R. Mortier, P. Jardetzky, and S. Hand. InfoSpect: Using a Logic Language for System Health Monitoring in Distributed Systems. In Proceedings of the SIGOPS European Workshop, 2002. Google ScholarDigital Library
- A. Rowstron and P. Druschel. Pastry: Scalable, Distributed Object Location and Routing for Large-scale Peer-to-peer Systems. In Middleware, 2001. Google ScholarDigital Library
- S.Ratnasamy, M.Handley, R.Karp, and S.Shenker. Application-level Multicast using Content-addressable Networks. In Proceedings of the NGC, November 2001. Google ScholarDigital Library
- W. Stallings. SNMP, SNMPv2, and CMIP. Addison-Wesley, 1993.Google Scholar
- I. Stoica, R. Morris, D. Karger, F. Kaashoek, and H. Balakrishnan. Chord: A scalable Peer-To-Peer lookup service for internet applications. In ACM SIGCOMM, 2001. Google ScholarDigital Library
- S.Zhuang, B.Zhao, A.Joseph, R.Katz, and J.Kubiatowicz. Bayeux: An Architecture for Scalable and Fault-tolerant Wide-Area Data Dissemination. In NOSSDAV, 2001. Google ScholarDigital Library
- IBM Tivoli Monitoring. www.ibm.com/software/tivoli/products/monitor.Google Scholar
- R. VanRenesse, K. P. Birman, and W. Vogels. Astrolabe: A Robust and Scalable Technology for Distributed System Monitoring, Management, and Data Mining. TOCS, 2003. Google ScholarDigital Library
- R. VanRenesse and A. Bozdog. Willow: DHT, Aggregation, and Publish/Subscribe in One Protocol. In IPTPS, 2004.Google Scholar
- A. Venkataramani, P. Weidmann, and M. Dahlin. Bandwidth constrained placement in a wan. In PODC, Aug. 2001. Google ScholarDigital Library
- A. Venkataramani, P. Yalagandula, R. Kokku, S. Sharif, and M. Dahlin. Potential costs and benefits of long-term prefetching for content-distribution. Elsevier Computer Communications, 25(4):367--375, Mar. 2002. Google ScholarDigital Library
- M. Wawrzoniak, L. Peterson, and T. Roscoe. Sophia: An Information Plane for Networked Systems. In HotNets-II, 2003.Google Scholar
- R. Wolski, N. Spring, and J. Hayes. The network weather service: A distributed resource performance forecasting service for metacomputing. Journal of Future Generation Computing Systems, 15(5-6):757--768, Oct 1999. Google ScholarDigital Library
- P. Yalagandula and M. Dahlin. SDIMS: A scalable distributed information management system. Technical Report TR-03-47, Dept. of Computer Sciences, UT Austin, Sep 2003.Google Scholar
- Z. Zhang, S.-M. Shi, and J. Zhu. SOMO: Self-Organized Metadata Overlay for Resource Management in P2P DHT. In IPTPS, 2003.Google ScholarCross Ref
- B. Y. Zhao, J. D. Kubiatowicz, and A. D. Joseph. Tapestry: An Infrastructure for Fault-tolerant Wide-area Location and Routing. Technical Report UCB/CSD-01-1141, UC Berkeley, Apr. 2001. Google ScholarDigital Library
Index Terms
- A scalable distributed information management system
Recommendations
A scalable distributed information management system
SIGCOMM '04: Proceedings of the 2004 conference on Applications, technologies, architectures, and protocols for computer communicationsWe present a Scalable Distributed Information Management System (SDIMS) that aggregates information about large-scale networked systems and that can serve as a basic building block for a broad range of large-scale distributed applications by providing ...
Enabling Dynamic Querying over Distributed Hash Tables
Dynamic querying (DQ) is a search technique used in unstructured peer-to-peer (P2P) networks to minimize the number of nodes that is necessary to visit to reach the desired number of results. In this paper, we introduce the use of the DQ technique in ...
An effective single-hop distributed hash table with high lookup performance and low traffic overhead
Distributed hash tables DHTs have been used in several applications, but most DHTs have opted to solve lookups with multiple hops, to minimize bandwidth costs while sacrificing lookup latency. This paper presents D1HT, an original DHT that has a peer-to-...
Comments