skip to main content
10.1145/2320765.2320789acmotherconferencesArticle/Chapter ViewAbstractPublication PagesedbtConference Proceedingsconference-collections
research-article

Stormy: an elastic and highly available streaming service in the cloud

Published:30 March 2012Publication History

ABSTRACT

In recent years, new highly scalable storage systems have significantly contributed to the success of Cloud Computing. Systems like Dynamo or Bigtable have underpinned their ability to handle tremendous amounts of data and scale to a very large number of nodes. Although these systems are designed the store data, the fundamental architectural properties and the techniques used (e.g., request routing, replication and load balancing) can also be applied to data streaming systems. In this paper, we present Stormy, a distributed stream processing service for continuous data processing. Stormy is based on proven techniques from existing Cloud storage systems that are adapted to efficiently execute streaming workloads. The primary design focus lies in providing a scalable, elastic, and fault-tolerant framework for continuous data processing, while at the same time optimizing resource utilization and increasing cost efficiency. Stormy is able to process any kind of streaming workloads, thus, covering a wide range of use cases ranging from realtime data analytics to long-term data aggregation jobs.

References

  1. D. J. Abadi, Y. Ahmad, M. Balazinska, M. Cherniack, J.-H. Hwang, W. Lindner, A. S. Maskey, E. Rasin, E. Ryvkina, N. Tatbul, Y. Xing, and S. Zdonik. The design of the Borealis stream processing engine. In CIDR '05, pages 277--289, 2005.Google ScholarGoogle Scholar
  2. Apache Zookeeper. http://zookeeper.apache.org/.Google ScholarGoogle Scholar
  3. A. Arasu, M. Cherniack, E. Galvez, D. Maier, A. S. Maskey, E. Ryvkina, M. Stonebraker, and R. Tibbetts. Linear Road: A stream data management benchmark. In VLDB '04, pages 480--491, 2004. Google ScholarGoogle ScholarDigital LibraryDigital Library
  4. H. Balakrishnan, M. F. Kaashoek, D. Karger, R. Morris, and I. Stoica. Looking up data in P2P systems. CACM, 46:43--48, February 2003. Google ScholarGoogle ScholarDigital LibraryDigital Library
  5. M. Balazinska, H. Balakrishnan, S. R. Madden, and M. Stonebraker. Fault-tolerance in the Borealis distributed stream processing system. ACM Transactions on Database Systems, 33:1--44, March 2008. Google ScholarGoogle ScholarDigital LibraryDigital Library
  6. M. Burrows. The Chubby lock service for loosely-coupled distributed systems. In OSDI '06, pages 335--350, 2006. Google ScholarGoogle ScholarDigital LibraryDigital Library
  7. S. Chandrasekaran, O. Cooper, A. Deshpande, M. J. Franklin, J. M. Hellerstein, W. Hong, S. Krishnamurthy, S. R. Madden, F. Reiss, and M. A. Shah. TelegraphCQ: Continuous dataflow processing. In SIGMOD '03, page 668, 2003. Google ScholarGoogle ScholarDigital LibraryDigital Library
  8. M. Cherniack, H. Balakrishnan, M. Balazinska, D. Carney, U. Çetintemel, Y. Xing, and S. Zdonik. Scalable distributed stream processing. In CIDR '03, pages 257--268, 2003.Google ScholarGoogle Scholar
  9. J. Dean and S. Ghemawat. MapReduce: Simplified data processing on large clusters. In OSDI '04, pages 137--150, 2004. Google ScholarGoogle ScholarDigital LibraryDigital Library
  10. G. DeCandia, D. Hastorun, M. Jampani, G. Kakulapati, A. Lakshman, A. Pilchin, S. Sivasubramanian, P. Vosshall, and W. Vogels. Dynamo: Amazon's highly available key-value store. In SOSP '07, pages 205--220, 2007. Google ScholarGoogle ScholarDigital LibraryDigital Library
  11. M. Grinev, M. Grineva, M. Hentschel, and D. Kossmann. Analytics for the real-time web. PVLDB, 4(12):1391--1394, September 2011.Google ScholarGoogle Scholar
  12. V. Gulisano, R. Jimenez-Peris, M. Patino-Martinez, and P. Valduriez. StreamCloud: A large scale data streaming system. In ICDCS '10, pages 126--137, 2010. Google ScholarGoogle ScholarDigital LibraryDigital Library
  13. M. Isard, M. Budiu, Y. Yu, A. Birrell, and D. Fetterly. Dryad: Distributed data-parallel programs from sequential building blocks. In EuroSys '07, pages 59--72, 2007. Google ScholarGoogle ScholarDigital LibraryDigital Library
  14. D. Karger, E. Lehman, T. Leighton, R. Panigrahy, M. Levine, and D. Lewin. Consistent hashing and random trees: Distributed caching protocols for relieving hot spots on the World Wide Web. In STOC '97, pages 654--663, 1997. Google ScholarGoogle ScholarDigital LibraryDigital Library
  15. D. Kossmann, T. Kraska, S. Loesing, S. Merkli, R. Mittal, and F. Pfaffhauser. Cloudy: A modular cloud storage system. PVLDB, 3:1533--1536, September 2010. Google ScholarGoogle ScholarDigital LibraryDigital Library
  16. A. Lakshman and P. Malik. Cassandra: A decentralized structured storage system. SIGOPS Operating Systems Review, 44:35--40, 2010. Google ScholarGoogle ScholarDigital LibraryDigital Library
  17. L. Lamport. The part-time parliament. ACM Transactions on Computer Systems, 16:133--169, May 1998. Google ScholarGoogle ScholarDigital LibraryDigital Library
  18. N. Marz. A Storm is coming. http://engineering.twitter.com/2011/08/storm-is-coming-more-details-and-plans.html, August 2011.Google ScholarGoogle Scholar
  19. F. McSherry, R. Isaacs, M. Isard, and D. G. Murray. Naiad: The animating spirit of rivers and streams. In SOSP '11, 2011.Google ScholarGoogle Scholar
  20. Microsoft SQL Server StreamInsight. http://www.microsoft.com/sqlserver/en/us/solutions-technologies/business-intelligence/complex-event-processing.aspx.Google ScholarGoogle Scholar
  21. MXQuery: A lightweight, full-featured XQuery engine. http://mxquery.org/.Google ScholarGoogle Scholar
  22. D. Peng and F. Dabek. Large-scale incremental processing using distributed transactions and notifications. In OSDI '10, pages 251--264, 2010. Google ScholarGoogle ScholarDigital LibraryDigital Library
  23. M. Shah, J. Hellerstein, S. Chandrasekaran, and M. Franklin. Flux: An adaptive partitioning operator for continuous query systems. In ICDE '03, pages 25--36, 2003.Google ScholarGoogle ScholarCross RefCross Ref
  24. I. Stoica, R. Morris, D. Karger, M. F. Kaashoek, and H. Balakrishnan. Chord: A scalable peer-to-peer lookup service for internet applications. In SIGCOMM '01, pages 149--160, 2001. Google ScholarGoogle ScholarDigital LibraryDigital Library
  25. StreamBase Systems, Inc. http://www.streambase.com/.Google ScholarGoogle Scholar
  26. N. Tatbul, U. Çetintemel, and S. Zdonik. Staying FIT: efficient load shedding techniques for distributed stream processing. In VLDB '07, pages 159--170, 2007. Google ScholarGoogle ScholarDigital LibraryDigital Library
  27. Y. Xing, S. Zdonik, and J.-H. Hwang. Dynamic load distribution in the Borealis stream processor. In ICDE '05, pages 791--802, 2005. Google ScholarGoogle ScholarDigital LibraryDigital Library
  1. Stormy: an elastic and highly available streaming service in the cloud

      Recommendations

      Comments

      Login options

      Check if you have access through your login credentials or your institution to get full access on this article.

      Sign in
      • Published in

        cover image ACM Other conferences
        EDBT-ICDT '12: Proceedings of the 2012 Joint EDBT/ICDT Workshops
        March 2012
        265 pages
        ISBN:9781450311434
        DOI:10.1145/2320765

        Copyright © 2012 ACM

        Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

        Publisher

        Association for Computing Machinery

        New York, NY, United States

        Publication History

        • Published: 30 March 2012

        Permissions

        Request permissions about this article.

        Request Permissions

        Check for updates

        Qualifiers

        • research-article

        Acceptance Rates

        Overall Acceptance Rate7of10submissions,70%

      PDF Format

      View or Download as a PDF file.

      PDF

      eReader

      View online with eReader.

      eReader