research-article

Fault-tolerance in the borealis distributed stream processing system

Authors:
Magdalena Balazinska

University of Washington, Seattle, WA

University of Washington, Seattle, WA
View Profile

,
Hari Balakrishnan

Massachusetts Institute of Technology, Cambridge, MA

Massachusetts Institute of Technology, Cambridge, MA
View Profile

,
Samuel R. Madden

Massachusetts Institute of Technology, Cambridge, MA

Massachusetts Institute of Technology, Cambridge, MA
View Profile

,
Michael Stonebraker

Massachusetts Institute of Technology, Cambridge, MA

Massachusetts Institute of Technology, Cambridge, MA
View Profile

Authors Info & Claims

ACM Transactions on Database Systems Volume 33 Issue 1Article No.: 3pp 1–44https://doi.org/10.1145/1331904.1331907

Published:21 March 2008Publication History

ACM Transactions on Database Systems

Abstract

Over the past few years, Stream Processing Engines (SPEs) have emerged as a new class of software systems, enabling low latency processing of streams of data arriving at high rates. As SPEs mature and get used in monitoring applications that must continuously run (e.g., in network security monitoring), a significant challenge arises: SPEs must be able to handle various software and hardware faults that occur, masking them to provide high availability (HA). In this article, we develop, implement, and evaluate DPC (Delay, Process, and Correct), a protocol to handle crash failures of processing nodes and network failures in a distributed SPE.

Like previous approaches to HA, DPC uses replication and masks many types of node and network failures. In the presence of network partitions, the designer of any replication system faces a choice between providing availability or data consistency across the replicas. In DPC, this choice is made explicit: the user specifies an availability bound (no result should be delayed by more than a specified delay threshold even under failure if the corresponding input is available), and DPC attempts to minimize the resulting inconsistency between replicas (not all of which might have seen the input data) while meeting the given delay threshold. Although conceptually simple, the DPC protocol tolerates the occurrence of multiple simultaneous failures as well as any further failures that occur during recovery.

This article describes DPC and its implementation in the Borealis SPE. We show that DPC enables a distributed SPE to maintain low-latency processing at all times, while also achieving eventual consistency, where applications eventually receive the complete and correct output streams. Furthermore, we show that, independent of system size and failure location, it is possible to handle failures almost up-to the user-specified bound in a manner that meets the required availability without introducing any inconsistency.

References

Abadi, D. J., Ahmad, Y., Balazinska, M., Çetintemel, U., Cherniack, M., Hwang, J.-H., Lindner, W., Maskey, A. S., Rasin, A., Ryvkina, E., Tatbul, N., Xing, Y., and Zdonik, S. 2005. The design of the Borealis stream processing engine. In Proceedings of the 2nd Biennial Conference on Innovative Data Systems Research (CIDR).Google Scholar
Abadi, D. J., Carney, D., Çetintemel, U., Cherniack, M., Convey, C., Lee, S., Stonebraker, M., Tatbul, N., and Zdonik, S. 2003. Aurora: A new model and architecture for data stream management. VLDB J. 12, 2. Google ScholarDigital Library
Aleri. http://www.aleri.com/index.html.Google Scholar
Alonso, G., Günthör, R., Kamath, M., Agrawal, D., El Abbadi, A., and Mohan, C. 1995. Exotica/FMDC: Handling disconnected clients in a workflow management system. In Proceedings of the 3rd International Conference on Cooperative Information Systems.Google Scholar
Alonso, G. and Mohan, C. 1997. WFMS: The next generation of distributed processing tools. S. Jajodia and L. Kerschberg, Eds. In Advanced Transaction Models and Architectures, Kluwer.Google Scholar
Alonso, G., Mohan, C., Günthör, R., Agrawal, D., El Abbadi, A., and Kamath, M. 1995. Exotica/FMQM: A persistent message-based architecture for distributed workflow management. In Proceedings of IFIP WG8.1 Working Conference on Information Systems for Decentralized Organizations.Google Scholar
Alonso, G., Reinwald, B., and Mohan, C. 1997. Distributed data management in workflow environments. In Proceedings of the 7th ACM RIDE. Google ScholarDigital Library
Balakrishnan, H., Balazinska, M., Carney, D., Çetintemel, U., Cherniack, M., Convey, C., Galvez, E., Salz, J., Stonebraker, M., Tatbul, N., Tibbets, R., and Zdonik, S. 2004. Retrospective on Aurora. VLDB J. 13, 4. Google ScholarDigital Library
Balazinska, M. 2006. Fault-tolerance and load management in a distributed stream processing system. Ph.D. thesis, Massachusetts Institute of Technology. Google ScholarDigital Library
Balazinska, M., Balakrishnan, H., Madden, S., and Stonebraker, M. 2005. Fault-tolerance in the Borealis distributed stream processing system. In Proceedings of the ACM SIGMOD International Conference on Management of Data. Google ScholarDigital Library
Bernstein, P. A., Hsu, M., and Mann, B. 1990. Implementing recoverable requests using queues. In Proceedings of the ACM SIGMOD International Conference on Management of Data. Google ScholarDigital Library
Brewer, E. A. 2001. Lessons from giant-scale services. IEEE Internet Comput. 5, 4, 46--55. Google ScholarDigital Library
Chandrasekaran, S., Cooper, O., Deshpande, A., Franklin, M. J., Hellerstein, J. M., Hong, W., Krishnamurthy, S., Madden, S., Raman, V., Reiss, F., and Shah, M. 2003. TelegraphCQ: Continuous dataflow processing for an uncertain world. In Proceedings of the 1st Biennial Conference on Innovative Data Systems Research (CIDR).Google Scholar
Cherniack, M., Balakrishnan, H., Balazinska, M., Carney, D., Çetintemel, U., Xing, Y., and Zdonik, S. 2003. Scalable distributed stream processing. In Proceedings of the 1st Biennial Conference on Innovative Data Systems Research (CIDR).Google Scholar
Coral8. http://coral8.com/.Google Scholar
Cranor, C., Johnson, T., Shkapenyuk, V., and Spatscheck, O. 2003. Gigascope: A stream database for network applications. In Proceedings of the ACM SIGMOD International Conference on Management of Data. Google ScholarDigital Library
Elnozahy, E. N. M., Alvisi, L., Wang, Y.-M., and Johnson, D. B. 2002. A survey of rollback-recovery protocols in message-passing systems. ACM Comput. Surv. 34, 3, 375--408. Google ScholarDigital Library
Fekete, A., Gupta, D., Luchangco, V., Lynch, N., and Shvartsman, A. 1996. Eventually-serializable data services. In Proceedings of the 15th ACM Symposium on Principles of Distributed Computing (PODC'96). Google ScholarDigital Library
Garcia-Molina, H. and Barbara, D. 1985. How to assign votes in a distributed system. J. ACM 32, 4, 841--860. Google ScholarDigital Library
Gifford, D. K. 1979. Weighted voting for replicated data. In Proceedings of the 7th ACM Symposium on Operating Systems Principles (SOSP). Google ScholarDigital Library
Gilbert, S. and Lynch, N. 2002. Brewer's conjecture and the feasibility of consistent, available, partition-tolerant Web services. ACM SIGACT News 33, 2. Google ScholarDigital Library
Gray, J., Helland, P., O'Neil, P., and Shasha, D. 1996. The dangers of replication and a solution. In Proceedings of the ACM SIGMOD International Conference on Management of Data. Google ScholarDigital Library
Gray, J. and Reuters, A. 1993. Transaction Processing: Concepts and Techniques. Morgan Kaufmann. Google ScholarDigital Library
Hsu, M. 1995. Special issue on workflow systems. IEEE Data Eng. Bull. 18, 1.Google Scholar
Hwang, J.-H., Balazinska, M., Rasin, A., Çetintemel, U., Stonebraker, M., and Zdonik, S. 2005. High-availability algorithms for distributed stream processing. In Proceedings of the 21st International Conference on Data Engineering (ICDE). Google ScholarDigital Library
Kamath, M., Alonso, G., Guenthor, R., and Mohan, C. 1996. Providing high availability in very large workflow management systems. In Proceedings of the 5th International Conference on Extending Database Technology. Google ScholarDigital Library
Kao, B. and Garcia-Molina, H. 1995. An overview of real-time database systems. In Advances in Real-Time Systems. Prentice-Hall. Google ScholarDigital Library
Kawell, L., Beckhardt, S., Halvorsen, T., Ozzie, R., and Greif, I. 1988. Replicated document management in a group communication system. In Proceedings of the ACM Conference on Computer-Supported Cooperative Work (CSCW). Google ScholarDigital Library
Lam, K.-W., Son, S. H., Hung, S.-L., and Wang, Z. 2000. Scheduling transactions with stringent real-time constraints. Inform. Syst. 25, 6, 431--452. Google ScholarDigital Library
Lomet, D. and Tuttle, M. 2003. A theory of redo recovery. In Proceedings of the ACM SIGMOD International Conference on Management of Data. Google ScholarDigital Library
Melliar-Smith, P. M. and Moser, L. E. 2004. Progress in real-time fault tolerance. In Proceedings of the 23rd IEEE International Symposium on Reliable Distributed Systems (SRDS'04). Google ScholarDigital Library
Mossé, D., Melhem, R., and Ghosh, S. 2003. A nonpreemptive real-time scheduler with recovery from transient faults and its implementation. IEEE Trans. Softw. Engin. 29, 8. Google ScholarDigital Library
Motwani, R., Widom, J., Arasu, A., Babcock, B., Babu, S., Datar, M., Manku, G., Olston, C., Rosenstein, J., and Varma, R. 2003. Query processing, approximation, and resource management in a data stream management system. In Proceedings of the 1st Biennial Conference on Innovative Data Systems Research (CIDR).Google Scholar
Naughton et al. 2001. The Niagara Internet query system. IEEE Data Engin. Bull. 24, 2.Google Scholar
Olston, C. 2003. Approximate replication. Ph.D. thesis, Stanford University. Google ScholarDigital Library
Olston, C., Jiang, J., and Widom, J. 2003. Adaptive filters for continuous queries over distributed data streams. In Proceedings of the ACM SIGMOD International Conference on Management of Data. Google ScholarDigital Library
Radoslavov, P., Govindan, R., and Estrin, D. 2001. Topology-informed Internet replica placement. In Proceedings of the 6th International Workshop on Web Caching and Content Distribution (WCW'01).Google Scholar
Raman, V. and Hellerstein, J. M. 2002. Partial results for online query processing. In Proceedings of the ACM SIGMOD International Conference on Management of Data. Google ScholarDigital Library
Saito, Y. and Shapiro, M. 2005. Optimistic replication. ACM Comput. Surv. 37, 1, 42--81. Google ScholarDigital Library
Shah, M., Hellerstein, J., and Brewer, E. 2004. Highly-available, fault-tolerant, parallel dataflows. In Proceedings of the ACM SIGMOD International Conference on Management of Data. Google ScholarDigital Library
Srivastava, U. and Widom, J. 2004. Flexible time management in data stream systems. In Proceedings of the 23rd ACM Symposium on Principles of Database Systems (PODS). Google ScholarDigital Library
StreamBase. http://www.streambase.com/.Google Scholar
Strom, R. E. 2004. Fault-tolerance in the SMILE stateful publish-subscribe system. In Proceedings of the 3rd International Workshop on Distributed Event-Based Systems (DEBS'04).Google ScholarCross Ref
Tang, X., Chi, H., and Chanson, S. T. 2007. Optimal replica placement under TTL-based consistency. IEEE Trans. Parall. Distrib. Syst. 18, 3, 351--363. Google ScholarDigital Library
Terry, D. B., Theimer, M., Petersen, K., Demers, A. J., Spreitzer, M., and Hauser, C. 1995. Managing update conflicts in Bayou, a weakly connected replicated storage system. In Proceedings of the 15th ACM Symposium on Operating Systems Principles (SOSP). Google ScholarDigital Library
The NTP Project. NTP: The Network Time Protocol. http://www.ntp.org/.Google Scholar
Tucker, P. A. and Maier, D. 2003. Dealing with disorder. In Proceedings of the Workshop on Management and Processing of Data Streams (MPDS).Google Scholar
Urbano, R. 2003. Oracle Streams Replication Administrator's Guide, 10g Release 1 (10.1). Oracle Corporation.Google Scholar

Index Terms

Fault-tolerance in the borealis distributed stream processing system
1. Information systems
  1. Data management systems
    1. Database management system engines
      1. Parallel and distributed DBMSs
2. Software and its engineering
  1. Software organization and properties
    1. Extra-functional properties
      1. Software fault tolerance

Recommendations

Fault-tolerant stream processing using a distributed, replicated file system

We present SGuard, a new fault-tolerance technique for distributed stream processing engines (SPEs) running in clusters of commodity servers. SGuard is less disruptive to normal stream processing and leaves more resources available for normal stream ...
Read More
Fault-tolerance in the Borealis distributed stream processing system
SIGMOD '05: Proceedings of the 2005 ACM SIGMOD international conference on Management of data

We present a replication-based approach to fault-tolerant distributed stream processing in the face of node failures, network failures, and network partitions. Our approach aims to reduce the degree of inconsistency in the system while guaranteeing that ...
Read More
Borealis-R: a replication-transparent stream processing system for wide-area monitoring applications
SIGMOD '08: Proceedings of the 2008 ACM SIGMOD international conference on Management of data

Borealis-R is a replication-based system for both fast and

highly-available processing of data streams over wide-area networks. In Borealis-R, multiple operator replicas send outputs to downstream replicas, allowing each replica to use whichever data ...
Read More

Comments

Login options

Check if you have access through your login credentials or your institution to get full access on this article.

Full Access

Get this Article

Published in

ACM Transactions on Database Systems Volume 33, Issue 1
March 2008
211 pages
ISSN:0362-5915
EISSN:1557-4644
DOI:10.1145/1331904
Issue’s Table of Contents

Copyright © 2008 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]
Sponsors
In-Cooperation
Publisher
Association for Computing Machinery
New York, NY, United States
Publication History
- Published: 21 March 2008
- Accepted: 1 August 2007
- Revised: 1 May 2007
- Received: 1 September 2006
Published in tods Volume 33, Issue 1

Permissions
Request permissions about this article.
Request Permissions

Check for updates
Author Tags
Distributed stream processing
availability
consistency
fault-tolerance
Qualifiers
- research-article
- Research
- Refereed
Conference
Funding Sources
Other Metrics
View Article Metrics

Article Metrics
- 144
  Total Citations
  View Citations
- 1,621
  Total Downloads
- Downloads (Last 12 months)25
- Downloads (Last 6 weeks)5
Other Metrics
View Author Metrics
Cited By
View all

PDF Format

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader

Fault-tolerance in the borealis distributed stream processing system

ACM Transactions on Database Systems

Abstract

References

Cited By

Index Terms

Recommendations

Fault-tolerant stream processing using a distributed, replicated file system

Fault-tolerance in the Borealis distributed stream processing system

Borealis-R: a replication-transparent stream processing system for wide-area monitoring applications

Comments

Login options

Full Access

Published in

Sponsors

In-Cooperation

Publisher

Publication History

Permissions

Check for updates

Author Tags

Qualifiers

Conference

Funding Sources

Other Metrics

Article Metrics

Other Metrics

Cited By

PDF Format

eReader

Digital Edition

Caption

Fault-tolerance in the borealis distributed stream processing system

ACM Transactions on Database Systems

Abstract

References

Cited By

Index Terms

Recommendations

Fault-tolerant stream processing using a distributed, replicated file system

Fault-tolerance in the Borealis distributed stream processing system

Borealis-R: a replication-transparent stream processing system for wide-area monitoring applications

Comments

Login options

Full Access

Published in

Sponsors

In-Cooperation

Publisher

Publication History

Permissions

Check for updates

Author Tags

Qualifiers

Conference

Funding Sources

Article Metrics

Other Metrics

PDF Format

eReader

Digital Edition

Share this Publication link

Share on Social Media