skip to main content
research-article

Fault tolerance via idempotence

Published:23 January 2013Publication History
Skip Abstract Section

Abstract

Building distributed services and applications is challenging due to the pitfalls of distribution such as process and communication failures. A natural solution to these problems is to detect potential failures, and retry the failed computation and/or resend messages. Ensuring correctness in such an environment requires distributed services and applications to be idempotent.

In this paper, we study the inter-related aspects of process failures, duplicate messages, and idempotence. We first introduce a simple core language (based on lambda calculus inspired by modern distributed computing platforms. This language formalizes the notions of a service, duplicate requests, process failures, data partitioning, and local atomic transactions that are restricted to a single store.

We then formalize a desired (generic) correctness criterion for applications written in this language, consisting of idempotence (which captures the desired safety properties) and failure-freedom (which captures the desired progress properties).

We then propose language support in the form of a monad that automatically ensures failfree idempotence. A key characteristic of our implementation is that it is decentralized and does not require distributed coordination. We show that the language support can be enriched with other useful constructs, such as compensations, while retaining the coordination-free decentralized nature of the implementation.

We have implemented the idempotence monad (and its variants) in F# and C# and used our implementation to build realistic applications on Windows Azure. We find that the monad has low runtime overheads and leads to more declarative applications.

Skip Supplemental Material Section

Supplemental Material

r1d2_talk4.mp4

mp4

196.4 MB

References

  1. Azurescope: Benchmarking and Guidance for Windows Azure. http://azurescope.cloudapp.net/BenchmarkTestCases/#4f2bdbcc-7c23-4c06-9c00-f2cc12d2d2a7, June 2011.Google ScholarGoogle Scholar
  2. Bid Now Sample. http://bidnow.codeplex.com, June 2011.Google ScholarGoogle Scholar
  3. The Tailspin Scenario. http://msdn.microsoft.com/en-us/library/ff966486.aspx, June 2011.Google ScholarGoogle Scholar
  4. Windows Azure Patterns and Practices. http://wag.codeplex.com/, 2011.Google ScholarGoogle Scholar
  5. Roberto Bruni, Hernán Melgratti, and Ugo Montanari. Theoretical Foundations for Compensations in Flow Composition Languages. In Proceedings of POPL, pages 209--220, 2005. Google ScholarGoogle ScholarDigital LibraryDigital Library
  6. Luis Caires, Carla Ferreira, and Hugo Vieira. A Process Calculus Analysis of Compensations. In Trustworthy Global Computing, volume 5474 of Lecture Notes in Computer Science, pages 87--103. 2009. Google ScholarGoogle ScholarDigital LibraryDigital Library
  7. John Field and Carlos A. Varela. Transactors: A Programming Model for Maintaining Globally Consistent Distributed State in Unreliable Environments. In Proceedings of POPL, pages 195--208, 2005. Google ScholarGoogle ScholarDigital LibraryDigital Library
  8. M.J. Fischer, N.A. Lynch, and M.S. Paterson. Impossibility of Distributed Consensus with one Faulty Process. Journal of the ACM(JACM), 32(2):374--382, 1985. Google ScholarGoogle ScholarDigital LibraryDigital Library
  9. Svend Frolund and Rachid Guerraoui. X-Ability: A Theory of Replication. Distributed Computing, 14, 2000. Google ScholarGoogle ScholarDigital LibraryDigital Library
  10. Hector Garcia-Molina and Kenneth Salem. Sagas. In Proc. of ICMD, pages 249--259, 1987. Google ScholarGoogle ScholarDigital LibraryDigital Library
  11. Pat Helland. Idempotence is not a medical condition. ACM Queue, 10(4):30--46, 2012. Google ScholarGoogle ScholarDigital LibraryDigital Library
  12. Mohan Kamath and Krithi Ramamritham. Correctness Issues in Workflow Management. Distributed Systems Engineering, 3(4):213, 1996.Google ScholarGoogle ScholarCross RefCross Ref
  13. Sheng Liang, Paul Hudak, and Mark Jones. Monad Transformers and Modular Interpreters. In In Proc. of POPL, pages 333--343, 1995. Google ScholarGoogle ScholarDigital LibraryDigital Library
  14. Barbara Liskov. Distributed programming in argus. Communications of ACM, 31:300--312, March 1988. Google ScholarGoogle ScholarDigital LibraryDigital Library
  15. J. Eliot B. Moss. Nested Transactions: An Approach to Reliable Distributed Computing, 1981. Google ScholarGoogle ScholarDigital LibraryDigital Library
  16. Dan Pritchett. Base: An acid alternative. Queue, 6(3):48--55, May 2008. Google ScholarGoogle ScholarDigital LibraryDigital Library
  17. Philip Wadler and Peter Thiemann. The Marriage of Effects and Monads. ACM Trans. Comput. Log., 4(1):1--32, 2003. Google ScholarGoogle ScholarDigital LibraryDigital Library
  18. David Walker, Lester Mackey, Jay Ligatti, George A. Reis, and David I. August. Static Typing for a Faulty Lambda Calculus. In In ACM International Conference on Functional Programming, 2006. Google ScholarGoogle ScholarDigital LibraryDigital Library
  19. Gerhard Weikum and Hans-J. Schek. Concepts and Applications of Multilevel Transactions and Open Nested Transactions. In Database Transaction Models for Advanced Applications, pages 515--553, 1992. Google ScholarGoogle ScholarDigital LibraryDigital Library
  20. Gerhard Weikum and Gottfried Vossen. Transactional Information Systems: Theory, Algorithms, and the Practice of Concurrency Control. 2001. Google ScholarGoogle ScholarDigital LibraryDigital Library

Index Terms

  1. Fault tolerance via idempotence

      Recommendations

      Comments

      Login options

      Check if you have access through your login credentials or your institution to get full access on this article.

      Sign in

      Full Access

      • Published in

        cover image ACM SIGPLAN Notices
        ACM SIGPLAN Notices  Volume 48, Issue 1
        POPL '13
        January 2013
        561 pages
        ISSN:0362-1340
        EISSN:1558-1160
        DOI:10.1145/2480359
        Issue’s Table of Contents
        • cover image ACM Conferences
          POPL '13: Proceedings of the 40th annual ACM SIGPLAN-SIGACT symposium on Principles of programming languages
          January 2013
          586 pages
          ISBN:9781450318327
          DOI:10.1145/2429069

        Copyright © 2013 ACM

        Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

        Publisher

        Association for Computing Machinery

        New York, NY, United States

        Publication History

        • Published: 23 January 2013

        Check for updates

        Qualifiers

        • research-article

      PDF Format

      View or Download as a PDF file.

      PDF

      eReader

      View online with eReader.

      eReader