DOI: 10.1145/3380787.3393681
Research Article · Open Access

Paxos vs Raft: have we reached consensus on distributed consensus?

Published: 27 April 2020

ABSTRACT

Distributed consensus is a fundamental primitive for constructing fault-tolerant, strongly-consistent distributed systems. Though many distributed consensus algorithms have been proposed, just two dominate production systems: Paxos, the traditional, famously subtle, algorithm; and Raft, a more recent algorithm positioned as a more understandable alternative to Paxos.

In this paper, we consider the question: which algorithm, Paxos or Raft, is the better solution to distributed consensus? We analyse both to determine exactly how they differ by describing a simplified Paxos algorithm using Raft's terminology and pragmatic abstractions.

We find that both Paxos and Raft take a very similar approach to distributed consensus, differing only in their approach to leader election. Most notably, Raft only allows servers with up-to-date logs to become leaders, whereas Paxos allows any server to become leader provided it then updates its log to ensure it is up-to-date. Raft's approach is surprisingly efficient given its simplicity as, unlike Paxos, it does not require log entries to be exchanged during leader election. We surmise that much of the understandability of Raft comes from the paper's clear presentation rather than from anything fundamental to the underlying algorithm.
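The contrast in leader election described above can be sketched roughly as follows. All names here (`Entry`, `raft_grants_vote`, `paxos_new_leader_log`) are illustrative, not taken from either paper, and the Paxos side is simplified to a per-slot highest-term merge over a quorum's logs rather than a full Prepare/Promise exchange.

```python
from dataclasses import dataclass

@dataclass
class Entry:
    term: int   # term (Raft) / proposal number epoch (Paxos) when appended
    cmd: str    # replicated command

def raft_grants_vote(voter_log, candidate_last_term, candidate_log_len):
    """Raft: a voter supports a candidate only if the candidate's log is at
    least as up-to-date as its own (compare last term, then log length).
    No log entries are exchanged during the election itself."""
    if not voter_log:
        return True
    last = voter_log[-1]
    if candidate_last_term != last.term:
        return candidate_last_term > last.term
    return candidate_log_len >= len(voter_log)

def paxos_new_leader_log(candidate_log, quorum_logs):
    """Paxos: any server may become leader, but before proposing new values
    it must bring its log up to date by adopting, for each slot, the entry
    with the highest term seen across a quorum of followers."""
    merged = list(candidate_log)
    for log in quorum_logs:
        for i, entry in enumerate(log):
            if i >= len(merged):
                merged.append(entry)
            elif entry.term > merged[i].term:
                merged[i] = entry
    return merged
```

The sketch shows why Raft's rule is cheap: the vote request carries only two integers, while the Paxos-style leader must receive and merge log entries from a quorum before it can safely lead.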


Published in

PaPoC '20: Proceedings of the 7th Workshop on Principles and Practice of Consistency for Distributed Data
April 2020, 98 pages
ISBN: 9781450375245
DOI: 10.1145/3380787

          Copyright © 2020 ACM


Publisher: Association for Computing Machinery, New York, NY, United States


Acceptance Rates

PaPoC '20 paper acceptance rate: 14 of 23 submissions (61%). Overall acceptance rate: 34 of 47 submissions (72%).
