skip to main content
research-article

State management in Apache Flink®: consistent stateful distributed stream processing

Published:01 August 2017Publication History
Skip Abstract Section

Abstract

Stream processors are emerging in industry as an apparatus that drives analytical but also mission critical services handling the core of persistent application logic. Thus, apart from scalability and low-latency, a rising system need is first-class support for application state together with strong consistency guarantees, and adaptivity to cluster reconfigurations, software patches and partial failures. Although prior systems research has addressed some of these specific problems, the practical challenge lies on how such guarantees can be materialized in a transparent, non-intrusive manner that relieves the user from unnecessary constraints. Such needs served as the main design principles of state management in Apache Flink, an open source, scalable stream processor.

We present Flink's core pipelined, in-flight mechanism which guarantees the creation of lightweight, consistent, distributed snapshots of application state, progressively, without impacting continuous execution. Consistent snapshots cover all needs for system reconfiguration, fault tolerance and version management through coarse grained rollback recovery. Application state is declared explicitly to the system, allowing efficient partitioning and transparent commits to persistent storage. We further present Flink's backend implementations and mechanisms for high availability, external state queries and output commit. Finally, we demonstrate how these mechanisms behave in practice with metrics and large-deployment insights exhibiting the low performance trade-offs of our approach and the general benefits of exploiting asynchrony in continuous, yet sustainable system deployments.

References

  1. AthenaX : Ubers stream processing platform on Flink. http://sf.flink-forward.org/kb_sessions/athenax-ubers-streaming-processing-platform-on-flink/.Google ScholarGoogle Scholar
  2. Blink: How Alibaba Uses Apache Flink. http://data-artisans.com/blink-flink-alibaba-search/, 2016.Google ScholarGoogle Scholar
  3. Introduction to Spark's Structured Streaming. https://www.oreilly.com/learning/apache-spark-2-0--introduction-to-structured-streaming, 2016.Google ScholarGoogle Scholar
  4. Rbea: Scalable Real-Time Analytics at King. https://techblog.king.com/rbea-scalable-real-time-analytics-king/, 2016.Google ScholarGoogle Scholar
  5. Apache Apex. https://apex.apache.org, 2017.Google ScholarGoogle Scholar
  6. Apache Beam. https://beam.apache.org/, 2017.Google ScholarGoogle Scholar
  7. Apache Flink. http://flink.apache.org/, 2017.Google ScholarGoogle Scholar
  8. Apache Storm. http://storm.apache.org/, 2017.Google ScholarGoogle Scholar
  9. Flink Forward. http://flink-forward.org/, 2017.Google ScholarGoogle Scholar
  10. Flink Survey. http://data-artisans.com/flink-user-survey-2016-part-1/, http://data-artisans.com/flink-user-survey-2016-part-2/, 2017.Google ScholarGoogle Scholar
  11. Google Cloud Dataflow. https://cloud.google.com/dataflow/, 2017.Google ScholarGoogle Scholar
  12. Real-time monitoring with Flink, Kafka and HB. http://2016.flink-forward.org/kb_sessions/a-brief-history-of-time-with-apache-flink-real-time-monitoring-and-analysis-with-flink-kafka-hb/, 2017.Google ScholarGoogle Scholar
  13. Rockdb. http://rocksdb.org/, 2017.Google ScholarGoogle Scholar
  14. Stream processing with Flink at Netflix. http://sf.flink-forward.org/kb_sessions/keynote-stream-processing-with-flink-at-netflix/, 2017.Google ScholarGoogle Scholar
  15. StreamING models, how ING adds models at runtime to catch fraudsters. http://sf.flink-forward.org/kb_sessions/streaming-models-how-ing-adds-models-at-runtime-to-catch-fraudsters/, 2017.Google ScholarGoogle Scholar
  16. The Trident Stream Processing Programming Model. http://storm.apache.org/releases/0.10.0/Trident-tutorial.html, 2017.Google ScholarGoogle Scholar
  17. D. J. Abadi, D. Carney, U. Çetintemel, M. Cherniack, C. Convey, S. Lee, M. Stonebraker, N. Tatbul, and S. Zdonik. Aurora: a new model and architecture for data stream management. VLDBJ, 2003. Google ScholarGoogle ScholarDigital LibraryDigital Library
  18. T. Akidau, A. Balikov, K. Bekiroglu, S. Chernyak, J. Haberman, R. Lax, S. McVeety, D. Mills, P. Nordstrom, and S. Whittle. MillWheel: Fault-tolerant stream processing at internet scale. In VLDB, 2013. Google ScholarGoogle ScholarDigital LibraryDigital Library
  19. T. Akidau, R. Bradshaw, C. Chambers, S. Chernyak, R. J. Fernández-Moctezuma, R. Lax, S. McVeety, D. Mills, F. Perry, E. Schmidt, et al. The dataflow model: a practical approach to balancing correctness, latency, and cost in massive-scale, unbounded, out-of-order data processing. VLDB, 2015. Google ScholarGoogle ScholarDigital LibraryDigital Library
  20. A. Alexandrov, R. Bergmann, S. Ewen, J.-C. Freytag, F. Hueske, A. Heise, O. Kao, M. Leich, U. Leser, V. Markl, et al. The Stratosphere platform for big data analytics. The VLDB Journal -- The International Journal on Very Large Data Bases, 2014. Google ScholarGoogle ScholarDigital LibraryDigital Library
  21. A. Arasu, B. Babcock, S. Babu, J. Cieslewicz, M. Datar, K. Ito, R. Motwani, U. Srivastava, and J. Widom. Stream: The stanford data stream management system. Book chapter, 2004.Google ScholarGoogle Scholar
  22. D. Battré, S. Ewen, F. Hueske, O. Kao, V. Markl, and D. Warneke. Nephele/pacts: a programming model and execution framework for web-scale analytical processing. In Proceedings of the 1st ACM symposium on Cloud computing, pages 119--130. ACM, 2010. Google ScholarGoogle ScholarDigital LibraryDigital Library
  23. P. Carbone, S. Ewen, S. Haridi, A. Katsifodimos, V. Markl, and K. Tzoumas. Apache flink: Stream and batch processing in a single engine. IEEE Data Engineering Bulletin, page 28, 2015.Google ScholarGoogle Scholar
  24. R. Castro Fernandez, M. Migliavacca, E. Kalyvianaki, and P. Pietzuch. Integrating scale out and fault tolerance in stream processing using operator state management. In Proceedings of the 2013 ACM SIGMOD international conference on Management of data, pages 725--736. ACM, 2013. Google ScholarGoogle ScholarDigital LibraryDigital Library
  25. C. Chambers, A. Raniwala, F. Perry, S. Adams, R. R. Henry, R. Bradshaw, and N. Weizenbaum. FlumeJava: easy, efficient data-parallel pipelines. In ACM Sigplan Notices. ACM, 2010. Google ScholarGoogle ScholarDigital LibraryDigital Library
  26. S. Chandrasekaran, O. Cooper, A. Deshpande, M. J. Franklin, J. M. Hellerstein, W. Hong, S. Krishnamurthy, S. R. Madden, F. Reiss, and M. A. Shah. TelegraphCQ: continuous dataflow processing. In Proceedings of the 2003 ACM SIGMOD international conference on Management of data, pages 668--668. ACM, 2003. Google ScholarGoogle ScholarDigital LibraryDigital Library
  27. K. M. Chandy and L. Lamport. Distributed snapshots: determining global states of distributed systems. ACM Transactions on Computer Systems (TOCS), 3(1):63--75, 1985. Google ScholarGoogle ScholarDigital LibraryDigital Library
  28. F. Chang, J. Dean, S. Ghemawat, W. C. Hsieh, D. A. Wallach, M. Burrows, T. Chandra, A. Fikes, and R. E. Gruber. Bigtable: A distributed storage system for structured data. ACM Transactions on Computer Systems (TOCS), 26(2):4, 2008. Google ScholarGoogle ScholarDigital LibraryDigital Library
  29. M. Cherniack, H. Balakrishnan, M. Balazinska, D. Carney, U. Cetintemel, Y. Xing, and S. B. Zdonik. Scalable distributed stream processing. In CIDR, volume 3, pages 257--268, 2003.Google ScholarGoogle Scholar
  30. G. De Francisci Morales and A. Bifet. Samoa: Scalable advanced massive online analysis. The Journal of Machine Learning Research, 16(1):149--153, 2015. Google ScholarGoogle ScholarDigital LibraryDigital Library
  31. J. Dean and S. Ghemawat. Mapreduce: simplified data processing on large clusters. Communications of the ACM, 51(1):107--113, 2008. Google ScholarGoogle ScholarDigital LibraryDigital Library
  32. G. DeCandia, D. Hastorun, M. Jampani, G. Kakulapati, A. Lakshman, A. Pilchin, S. Sivasubramanian, P. Vosshall, and W. Vogels. Dynamo: amazon's highly available key-value store. ACM SIGOPS operating systems review, 41(6):205--220, 2007. Google ScholarGoogle ScholarDigital LibraryDigital Library
  33. E. N. Elnozahy, L. Alvisi, Y.-M. Wang, and D. B. Johnson. A survey of rollback-recovery protocols in message-passing systems. ACM Computing Surveys (CSUR), 34(3):375--408, 2002. Google ScholarGoogle ScholarDigital LibraryDigital Library
  34. B. He, M. Yang, Z. Guo, R. Chen, B. Su, W. Lin, and L. Zhou. Comet: batched stream processing for data intensive distributed computing. In Proceedings of the 1st ACM symposium on Cloud computing, pages 63--74. ACM, 2010. Google ScholarGoogle ScholarDigital LibraryDigital Library
  35. B. Hindman, A. Konwinski, M. Zaharia, A. Ghodsi, A. D. Joseph, R. H. Katz, S. Shenker, and I. Stoica. Mesos: A platform for fine-grained resource sharing in the data center. In NSDI, 2011. Google ScholarGoogle ScholarDigital LibraryDigital Library
  36. M. Hirzel, R. Soulé, S. Schneider, B. Gedik, and R. Grimm. A catalog of stream processing optimizations. ACM Computing Surveys (CSUR), 46(4):46, 2014. Google ScholarGoogle ScholarDigital LibraryDigital Library
  37. P. Hunt, M. Konar, F. P. Junqueira, and B. Reed. Zookeeper: Wait-free coordination for internet-scale systems. In USENIX annual technical conference, volume 8, page 9, 2010. Google ScholarGoogle ScholarDigital LibraryDigital Library
  38. G. Jacques-Silva, F. Zheng, D. Debrunner, K.-L. Wu, V. Dogaru, E. Johnson, M. Spicer, and A. E. Sariyce. Consistent regions: guaranteed tuple processing in ibm streams. Proceedings of the VLDB Endowment, 9(13):1341--1352, 2016. Google ScholarGoogle ScholarDigital LibraryDigital Library
  39. J. Kreps, N. Narkhede, J. Rao, et al. Kafka: A distributed messaging system for log processing. NetDB, 2011.Google ScholarGoogle Scholar
  40. Y. Low, D. Bickson, J. Gonzalez, C. Guestrin, A. Kyrola, and J. M. Hellerstein. Distributed graphlab: a framework for machine learning and data mining in the cloud. Proceedings of the VLDB Endowment, 5(8):716--727, 2012. Google ScholarGoogle ScholarDigital LibraryDigital Library
  41. N. Marz and J. Warren. Big Data: Principles and best practices of scalable realtime data systems. Manning Publications Co., 2015. Google ScholarGoogle ScholarDigital LibraryDigital Library
  42. D. G. Murray, F. McSherry, R. Isaacs, M. Isard, P. Barham, and M. Abadi. Naiad: a timely dataflow system. In ACM SOSP, 2013. Google ScholarGoogle ScholarDigital LibraryDigital Library
  43. V. K. Vavilapalli, A. C. Murthy, C. Douglas, S. Agarwal, M. Konar, R. Evans, T. Graves, J. Lowe, H. Shah, S. Seth, et al. Apache hadoop yarn: Yet another resource negotiator. In Proceedings of the 4th annual Symposium on Cloud Computing, page 5. ACM, 2013. Google ScholarGoogle ScholarDigital LibraryDigital Library
  44. H. Wang, L.-S. Peh, E. Koukoumidis, S. Tao, and M. C. Chan. Meteor shower: A reliable stream processing system for commodity data centers. In Parallel & Distributed Processing Symposium (IPDPS), 2012 IEEE 26th International. IEEE, 2012. Google ScholarGoogle ScholarDigital LibraryDigital Library
  45. Y. Yu, M. Isard, D. Fetterly, M. Budiu, Ú. Erlingsson, P. K. Gunda, and J. Currey. Dryadlinq: A system for general-purpose distributed data-parallel computing using a high-level language.Google ScholarGoogle Scholar
  46. M. Zaharia, M. Chowdhury, M. J. Franklin, S. Shenker, and I. Stoica. Spark: Cluster computing with working sets. HotCloud, 2010. Google ScholarGoogle ScholarDigital LibraryDigital Library
  47. M. Zaharia, T. Das, H. Li, S. Shenker, and I. Stoica. Discretized streams: an efficient and fault-tolerant model for stream processing on large clusters. In Proceedings of the 4th USENIX conference on Hot Topics in Cloud Ccomputing, pages 10--10. USENIX Association, 2012. Google ScholarGoogle ScholarDigital LibraryDigital Library

Index Terms

  1. State management in Apache Flink®: consistent stateful distributed stream processing
        Index terms have been assigned to the content through auto-classification.

        Recommendations

        Comments

        Login options

        Check if you have access through your login credentials or your institution to get full access on this article.

        Sign in

        Full Access

        • Published in

          cover image Proceedings of the VLDB Endowment
          Proceedings of the VLDB Endowment  Volume 10, Issue 12
          August 2017
          427 pages
          ISSN:2150-8097
          Issue’s Table of Contents

          Publisher

          VLDB Endowment

          Publication History

          • Published: 1 August 2017
          Published in pvldb Volume 10, Issue 12

          Qualifiers

          • research-article

        PDF Format

        View or Download as a PDF file.

        PDF

        eReader

        View online with eReader.

        eReader