DOI: 10.1145/2611286.2611294
research-article

Latency-aware elastic scaling for distributed data stream processing systems

Published: 26 May 2014

ABSTRACT

Elastic scaling allows a data stream processing system to react to a dynamically changing query or event workload by automatically scaling in or out. In this way, both unpredictable load peaks and underload situations can be handled. However, each scaling decision comes with a latency penalty due to the required operator movements. Therefore, in practice an elastic system might improve system utilization, but it cannot provide the latency guarantees defined by a service level agreement (SLA).

In this paper we introduce an elastic scaling system that optimizes utilization under latency constraints defined by an SLA. Specifically, we present a model that estimates the latency spike created by a set of operator movements. We use this model to build a latency-aware elastic operator placement algorithm that minimizes the number of latency violations. We show that our solution reduces the 90th percentile of the end-to-end latency by up to 30% and reduces the number of latency violations by 50%. The system utilization achieved by our approach is comparable to that of a scaling strategy that does not use latency as an optimization target.
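
The two latency metrics named above, the 90th percentile of the end-to-end latency and the number of latency violations, can be illustrated with a minimal sketch. This is not the evaluation code of the paper: the sample values, the 500 ms threshold, the function names, and the choice of the nearest-rank percentile method are all assumptions made for illustration.

```python
import math


def percentile_nearest_rank(samples, p):
    """Return the p-th percentile (0 < p <= 100) using the nearest-rank method."""
    if not samples:
        raise ValueError("samples must not be empty")
    ordered = sorted(samples)
    # 1-based rank of the smallest value with at least p% of samples at or below it
    rank = math.ceil(p * len(ordered) / 100)
    return ordered[rank - 1]


def count_violations(samples, sla_threshold_ms):
    """Count measurements whose latency exceeds the SLA threshold."""
    return sum(1 for s in samples if s > sla_threshold_ms)


# Hypothetical end-to-end latencies in milliseconds (not data from the paper)
latencies_ms = [120, 95, 110, 480, 105, 130, 2100, 98, 115, 102]
SLA_THRESHOLD_MS = 500  # assumed SLA bound, chosen only for this example

p90 = percentile_nearest_rank(latencies_ms, 90)
violations = count_violations(latencies_ms, SLA_THRESHOLD_MS)
print(f"p90 = {p90} ms, violations = {violations}")
```

In this sample, a single spike of the kind an operator movement might cause counts as one violation, while the spike's effect on the 90th percentile depends on how many measurements fall inside it.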


    • Published in

      DEBS '14: Proceedings of the 8th ACM International Conference on Distributed Event-Based Systems
      May 2014
      371 pages
      ISBN: 9781450327374
      DOI: 10.1145/2611286

      Copyright © 2014 ACM

      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Acceptance Rates

      DEBS '14 paper acceptance rate: 16 of 174 submissions, 9%. Overall acceptance rate: 130 of 553 submissions, 24%.
