Abstract
Building distributed services and applications is challenging due to the pitfalls of distribution such as process and communication failures. A natural solution to these problems is to detect potential failures, and retry the failed computation and/or resend messages. Ensuring correctness in such an environment requires distributed services and applications to be idempotent.
In this paper, we study the inter-related aspects of process failures, duplicate messages, and idempotence. We first introduce a simple core language (based on lambda calculus inspired by modern distributed computing platforms. This language formalizes the notions of a service, duplicate requests, process failures, data partitioning, and local atomic transactions that are restricted to a single store.
We then formalize a desired (generic) correctness criterion for applications written in this language, consisting of idempotence (which captures the desired safety properties) and failure-freedom (which captures the desired progress properties).
We then propose language support in the form of a monad that automatically ensures failfree idempotence. A key characteristic of our implementation is that it is decentralized and does not require distributed coordination. We show that the language support can be enriched with other useful constructs, such as compensations, while retaining the coordination-free decentralized nature of the implementation.
We have implemented the idempotence monad (and its variants) in F# and C# and used our implementation to build realistic applications on Windows Azure. We find that the monad has low runtime overheads and leads to more declarative applications.
Supplemental Material
- Azurescope: Benchmarking and Guidance for Windows Azure. http://azurescope.cloudapp.net/BenchmarkTestCases/#4f2bdbcc-7c23-4c06-9c00-f2cc12d2d2a7, June 2011.Google Scholar
- Bid Now Sample. http://bidnow.codeplex.com, June 2011.Google Scholar
- The Tailspin Scenario. http://msdn.microsoft.com/en-us/library/ff966486.aspx, June 2011.Google Scholar
- Windows Azure Patterns and Practices. http://wag.codeplex.com/, 2011.Google Scholar
- Roberto Bruni, Hernán Melgratti, and Ugo Montanari. Theoretical Foundations for Compensations in Flow Composition Languages. In Proceedings of POPL, pages 209--220, 2005. Google ScholarDigital Library
- Luis Caires, Carla Ferreira, and Hugo Vieira. A Process Calculus Analysis of Compensations. In Trustworthy Global Computing, volume 5474 of Lecture Notes in Computer Science, pages 87--103. 2009. Google ScholarDigital Library
- John Field and Carlos A. Varela. Transactors: A Programming Model for Maintaining Globally Consistent Distributed State in Unreliable Environments. In Proceedings of POPL, pages 195--208, 2005. Google ScholarDigital Library
- M.J. Fischer, N.A. Lynch, and M.S. Paterson. Impossibility of Distributed Consensus with one Faulty Process. Journal of the ACM(JACM), 32(2):374--382, 1985. Google ScholarDigital Library
- Svend Frolund and Rachid Guerraoui. X-Ability: A Theory of Replication. Distributed Computing, 14, 2000. Google ScholarDigital Library
- Hector Garcia-Molina and Kenneth Salem. Sagas. In Proc. of ICMD, pages 249--259, 1987. Google ScholarDigital Library
- Pat Helland. Idempotence is not a medical condition. ACM Queue, 10(4):30--46, 2012. Google ScholarDigital Library
- Mohan Kamath and Krithi Ramamritham. Correctness Issues in Workflow Management. Distributed Systems Engineering, 3(4):213, 1996.Google ScholarCross Ref
- Sheng Liang, Paul Hudak, and Mark Jones. Monad Transformers and Modular Interpreters. In In Proc. of POPL, pages 333--343, 1995. Google ScholarDigital Library
- Barbara Liskov. Distributed programming in argus. Communications of ACM, 31:300--312, March 1988. Google ScholarDigital Library
- J. Eliot B. Moss. Nested Transactions: An Approach to Reliable Distributed Computing, 1981. Google ScholarDigital Library
- Dan Pritchett. Base: An acid alternative. Queue, 6(3):48--55, May 2008. Google ScholarDigital Library
- Philip Wadler and Peter Thiemann. The Marriage of Effects and Monads. ACM Trans. Comput. Log., 4(1):1--32, 2003. Google ScholarDigital Library
- David Walker, Lester Mackey, Jay Ligatti, George A. Reis, and David I. August. Static Typing for a Faulty Lambda Calculus. In In ACM International Conference on Functional Programming, 2006. Google ScholarDigital Library
- Gerhard Weikum and Hans-J. Schek. Concepts and Applications of Multilevel Transactions and Open Nested Transactions. In Database Transaction Models for Advanced Applications, pages 515--553, 1992. Google ScholarDigital Library
- Gerhard Weikum and Gottfried Vossen. Transactional Information Systems: Theory, Algorithms, and the Practice of Concurrency Control. 2001. Google ScholarDigital Library
Index Terms
- Fault tolerance via idempotence
Recommendations
Fault tolerance via idempotence
POPL '13: Proceedings of the 40th annual ACM SIGPLAN-SIGACT symposium on Principles of programming languagesBuilding distributed services and applications is challenging due to the pitfalls of distribution such as process and communication failures. A natural solution to these problems is to detect potential failures, and retry the failed computation and/or ...
A non-commutative and non-idempotent theory of quantale sets
In fuzzy set theory non-idempotency arises when the conjunction is interpreted by arbitrary t-norms. There are many instances in mathematics where set theory ought to be non-commutative and/or non-idempotent. The purpose of this paper is to combine both ...
Monads and distributive laws for Rota-Baxter and differential algebras
In recent years, algebraic studies of the differential calculus and integral calculus in the forms of differential algebra and Rota-Baxter algebra have been merged together to reflect the close relationship between the two calculi through the First ...
Comments