Manageability, availability, and performance in Porcupine: a highly scalable, cluster-based mail service

Published: 01 August 2000

Abstract

This paper describes the motivation, design, and performance of Porcupine, a scalable mail server. The goal of Porcupine is to provide a highly available and scalable electronic mail service using a large cluster of commodity PCs. We designed Porcupine to be easy to manage by emphasizing dynamic load balancing, automatic configuration, and graceful degradation in the presence of failures. Key to the system's manageability, availability, and performance is that sessions, data, and underlying services are distributed homogeneously and dynamically across nodes in a cluster.
