skip to main content
10.1145/3062341.3062366acmconferencesArticle/Chapter ViewAbstractPublication PagespldiConference Proceedingsconference-collections
research-article

Low-synchronization, mostly lock-free, elastic scheduling for streaming runtimes

Published:14 June 2017Publication History

ABSTRACT

We present the scalable, elastic operator scheduler in IBM Streams 4.2. Streams is a distributed stream processing system used in production at many companies in a wide range of industries. The programming language for Streams, SPL, presents operators, tuples and streams as the primary abstractions. A fundamental SPL optimization is operator fusion, where multiple operators execute in the same process. Streams 4.2 introduces automatic submission-time fusion to simplify application development and deployment. However, potentially thousands of operators could then execute in the same process, with no user guidance for thread placement. We needed a way to automatically figure out how many threads to use, with arbitrarily sized applications on a wide variety of hardware, and without any input from programmers. Our solution has two components. The first is a scalable operator scheduler that minimizes synchronization, locks and global data, while allowing threads to execute any operator and dynamically come and go. The second is an elastic algorithm to dynamically adjust the number of threads to optimize performance, using the principles of trusted measurements to establish trends. We demonstrate our scheduler's ability to scale to over a hundred threads, and our elasticity algorithm's ability to adapt to different workloads on an Intel Xeon system with 176 logical cores, and an IBM Power8 system with 184 logical cores.

References

  1. Streaming Analytics. https://console.ng.bluemix.net/catalog/ services/streaming-analytics. Retrieved April, 2017.Google ScholarGoogle Scholar
  2. Robert D. Blumofe, Christopher F. Joerg, Bradley C. Kuszmaul, Charles E. Leiserson, Keith H. Randall, and Yuli Zhou. Cilk: An efficient multithreaded runtime system. In Proceedings of the Fifth ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, PPOPP ’95, New York, NY, USA, 1995. ACM. Google ScholarGoogle ScholarDigital LibraryDigital Library
  3. Boost.Lockfree. http://www.boost.org/doc/libs/1_63_0/doc/ html/lockfree.html. Retrieved March, 2017.Google ScholarGoogle Scholar
  4. C++ std::atomic. http://en.cppreference.com/w/cpp/atomic/ atomic. Retrieved March, 2017.Google ScholarGoogle Scholar
  5. Apache Flink. http://flink.apache.org. Retrieved March, 2017.Google ScholarGoogle Scholar
  6. Matteo Frigo, Charles E. Leiserson, and Keith H. Randall. The Implementation of the Cilk-5 Multithreaded Language. In Programming Language Design and Implementation (PLDI), 1998.Google ScholarGoogle ScholarDigital LibraryDigital Library
  7. Bugra Gedik, Scott Schneider, Martin Hirzel, and Kun-Lung Wu. Elastic scaling for data stream processing. IEEE Transactions on Parallel and Distributed Systems (TPDS), 2014. Google ScholarGoogle ScholarDigital LibraryDigital Library
  8. IBM Streams Demo. https://github.com/IBMStreams/streamsx. demo.logwatch. Retrieved March, 2017.Google ScholarGoogle Scholar
  9. IBM Streams Samples. https://github.com/IBMStreams/samples. Retrieved March, 2017.Google ScholarGoogle Scholar
  10. Heron. https://twitter.github.io/heron. Retrieved March, 2017.Google ScholarGoogle Scholar
  11. Martin Hirzel, Henrique Andrade, Bu˘gra Gedik, Gabriela Jacques-Silva, Rohit Khandekar, Vibhore Kumar, Mark Mendell, Howard Nasgaard, Scott Schneider, Robert Soulé, and Kun-Lung Wu. IBM Streams Processing Language: Analyzing big data in motion. IBM Journal of Research and Development, 57(3/4), 2013. Google ScholarGoogle ScholarDigital LibraryDigital Library
  12. Martin Hirzel, Scott Schneider, and Bu˘gra Gedik. SPL: An extensible language for distributed stream processing. ACM Transactions on Programming Languages and Systems (TOPLAS), 39(1), March 2017. Google ScholarGoogle ScholarDigital LibraryDigital Library
  13. IBM Stream Computing. http://www.ibm.com/analytics/us/en/ technology/stream-computing. Retrieved March, 2017.Google ScholarGoogle Scholar
  14. Gabriela Jacques-Silva, Fang Zheng, Daniel Debrunner, Kun-Lung Wu, Victor Dogaru, Eric Johnson, Michael Spicer, and Ahmet Erdem Sariyuce. Consistent Regions: Guaranteed Tuple Processing in IBM Streams. In Very Large Data Bases Conference (VLDB), 2016.Google ScholarGoogle Scholar
  15. Derek G. Murray, Frank McSherry, Rebecca Isaacs, Michael Isard, Paul Barham, and Martín Abadi. Naiad: A timely dataflow system. In Proceedings of the Twenty-Fourth ACM Symposium on Operating Systems Principles, SOSP ’13, New York, NY, USA, 2013. ACM. Google ScholarGoogle ScholarDigital LibraryDigital Library
  16. OpenMP. http://openmp.org/. Retrieved March, 2017.Google ScholarGoogle Scholar
  17. Semih Sahin. C-stream: A coroutine-based elastic stream processing engine. Master’s thesis, Bilkent University, June 2015.Google ScholarGoogle Scholar
  18. Scott Schneider. The ElasticLoadBalance Operator. https://developer.ibm.com/streamsdev/2015/01/27/ elasticloadbalance-operator, 2015.Google ScholarGoogle Scholar
  19. Scott Schneider, Henrique Andrade, Bugra Gedik, Alain Biem, and Kun-Lung Wu. Elastic scaling of data parallel operators in stream processing. In IEEE International Parallel and Distributed Processing Symposium, 2009. Google ScholarGoogle ScholarDigital LibraryDigital Library
  20. Scott Schneider, Bugra Gedik, and Martin Hirzel. Language runtime and optimizations in IBM Streams. IEEE Database Engineering Bulletin, 38(4), 2015.Google ScholarGoogle Scholar
  21. SPL Reference. http://www.ibm.com/support/knowledgecenter/ SSCRJU_4.2.0/com.ibm.streams.ref.doc/doc/spl-container. html. Retrieved March, 2017.Google ScholarGoogle Scholar
  22. Apache Storm. http://storm.apache.org. Retrieved March, 2017.Google ScholarGoogle Scholar
  23. StreamsDev: IBM Streams Developer Community. https:// developer.ibm.com/streamsdev. Retrieved March, 2017.Google ScholarGoogle Scholar
  24. Yuzhe Tang and Bugra Gedik. Auto-pipelining for data stream processing. IEEE Transactions on Parallel and Distributed Systems (TPDS), 24(11), 2013. Google ScholarGoogle ScholarDigital LibraryDigital Library
  25. Ankit Toshniwal, Siddarth Taneja, Amit Shukla, Karthik Ramasamy, Jignesh M. Patel, Sanjeev Kulkarni, Jason Jackson, Krishna Gade, Maosong Fu, Jake Donham, Nikunj Bhagat, Sailesh Mittal, and Dmitriy Ryaboy. Storm@twitter. In Proceedings of the 2014 ACM SIGMOD International Conference on Management of Data, SIGMOD ’14, New York, NY, USA, 2014. ACM. Google ScholarGoogle ScholarDigital LibraryDigital Library

Index Terms

  1. Low-synchronization, mostly lock-free, elastic scheduling for streaming runtimes

        Recommendations

        Comments

        Login options

        Check if you have access through your login credentials or your institution to get full access on this article.

        Sign in
        • Published in

          cover image ACM Conferences
          PLDI 2017: Proceedings of the 38th ACM SIGPLAN Conference on Programming Language Design and Implementation
          June 2017
          708 pages
          ISBN:9781450349888
          DOI:10.1145/3062341

          Copyright © 2017 ACM

          Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

          Publisher

          Association for Computing Machinery

          New York, NY, United States

          Publication History

          • Published: 14 June 2017

          Permissions

          Request permissions about this article.

          Request Permissions

          Check for updates

          Qualifiers

          • research-article

          Acceptance Rates

          Overall Acceptance Rate406of2,067submissions,20%

          Upcoming Conference

          PLDI '24

        PDF Format

        View or Download as a PDF file.

        PDF

        eReader

        View online with eReader.

        eReader