Abstract
CoMon is an evolving, mostly-scalable monitoring system for PlanetLab that has the goal of presenting environment-tailored information for both the administrators and users of the PlanetLab global testbed. In addition to passively reporting metrics provided by the operating system, CoMon also actively gathers a number of metrics useful for developers of networked systems. Using CoMon, PlanetLab administrators and users can easily spot problematic machines, where the problem may arise from the machine itself, local configuration/environment problems, or the workload running on the machine. Furthermore, users can easily observe many properties of all of the experiments running across multiple PlanetLab nodes, facilitating not only their own experiment monitoring and debugging, but also helping scale the task of finding PlanetLab problems. In this paper we describe CoMon's design and operation, including what kinds of data are gathered, the scale of the processing involved, and the approaches we have taken to keep CoMon running. Our goal is not only to illustrate the kinds of problems faced in this environment, but also to invite others to participate, either by experimenting with the data generated by CoMon, or by building on the CoMon system itself.