research-article

Scotty: General and Efficient Open-source Window Aggregation for Stream Processing Systems

Authors:
Jonas Traub

Technische Universität Berlin, Germany

Technische Universität Berlin, Germany
View Profile

,
Philipp Marian Grulich

Technische Universität Berlin, Germany

Technische Universität Berlin, Germany
View Profile

,
Alejandro Rodríguez Cuéllar

Galápago Agroconsultores S.A.S., Bicaramanga, Colombia

Galápago Agroconsultores S.A.S., Bicaramanga, Colombia
View Profile

,
Sebastian Breß

Technische Universität Berlin, Germany

Technische Universität Berlin, Germany
View Profile

,
Asterios Katsifodimos

Delft University of Technology, Netherlands

Delft University of Technology, Netherlands
View Profile

,
Tilmann Rabl

HPI, University of Potsdam, Germany

HPI, University of Potsdam, Germany
View Profile

,
Volker Markl

Technische Universität Berlin & DFKI, Berlin, Germany

Technische Universität Berlin & DFKI, Berlin, Germany
View Profile

Authors Info & Claims

ACM Transactions on Database Systems Volume 46 Issue 1Article No.: 1pp 1–46https://doi.org/10.1145/3433675

Published:27 March 2021Publication History

ACM Transactions on Database Systems

Abstract

Window aggregation is a core operation in data stream processing. Existing aggregation techniques focus on reducing latency, eliminating redundant computations, or minimizing memory usage. However, each technique operates under different assumptions with respect to workload characteristics, such as properties of aggregation functions (e.g., invertible, associative), window types (e.g., sliding, sessions), windowing measures (e.g., time- or count-based), and stream (dis)order. In this article, we present Scotty, an efficient and general open-source operator for sliding-window aggregation in stream processing systems, such as Apache Flink, Apache Beam, Apache Samza, Apache Kafka, Apache Spark, and Apache Storm. One can easily extend Scotty with user-defined aggregation functions and window types. Scotty implements the concept of general stream slicing and derives workload characteristics from aggregation queries to improve performance without sacrificing its general applicability. We provide an in-depth view on the algorithms of the general stream slicing approach. Our experiments show that Scotty outperforms alternative solutions.

References

Tyler Akidau, Robert Bradshaw, et al. 2015. The dataflow model: A practical approach to balancing correctness, latency, and cost in massive-scale, unbounded, out-of-order data processing. Proc. VLDB Endow. 8, 12 (2015), 1792--1803.Google ScholarDigital Library
Alexander Alexandrov, Rico Bergmann, Stephan Ewen, et al. 2014. The Stratosphere platform for big data analytics. VLDB J. 23, 6 (2014), 939--964.Google ScholarDigital Library
Apache Apex. 2018. Enterprise-grade unified stream and batch processing engine. Retrieved from https://apex.apache.org/.Google Scholar
Apache Beam. 2018. An advanced unified programming model. Retrieved from https://beam.apache.org/.Google Scholar
Arvind Arasu and Jennifer Widom. 2004. Resource sharing in continuous sliding-window aggregates. Proceedings of the International Conference on Very Large Data Bases (PVLDB’04). 336--347.Google ScholarCross Ref
Michael Armbrust, Tathagata Das, Joseph Torres, Burak Yavuz, Shixiong Zhu, Reynold Xin, et al. 2018. Structured streaming: A declarative API for real-time applications in Apache Spark. In Proceedings of the ACM Special Interest Group on Management of Data (SIGMOD’18). 601--613.Google Scholar
Ahmed Awad, Jonas Traub, and Sherif Sakr. 2019. Adaptive watermarks: A concept drift-based approach for predicting event-time progress in data streams. In Proceedings of the International Conference on Extending Database Technology (EDBT’19).Google Scholar
Cagri Balkesen and Nesime Tatbul. 2011. Scalable data partitioning techniques for parallel sliding window processing over data streams. In Proceedings of the International Workshop on Data Management for Sensor Networks (DMSN’11).Google Scholar
Lawrence Benson, Philipp M. Grulich, Steffen Zeuch, Volker Markl, and Tilmann Rabl. 2020. Disco: Efficient distributed window aggregation. In Proceedings of the International Conference on Extending Database Technology (EDBT’20).Google Scholar
Pramod Bhatotia, Umut A. Acar, Flavio P. Junqueira, and Rodrigo Rodrigues. 2014. Slider: Incremental sliding window analytics. In Proceedings of the International Middleware Conference. ACM, 61--72.Google ScholarDigital Library
Brice Bingman. 2018. Poor performance with sliding time windows. In Flink Jira Issues. Retrieved from issues.apache.org/jira/browse/FLINK-6990.Google Scholar
Irina Botan, Roozbeh Derakhshan, Nihal Dindar, Laura Haas, Renée J. Miller, and Nesime Tatbul. 2010. SECRET: A model for analysis of the execution semantics of stream processing systems. Proc. VLDB Endow. 3, 1--2 (2010), 232--243.Google ScholarDigital Library
Paris Carbone. 2018. Scalable and Reliable Data Stream Processing. Ph.D. Dissertation. KTH Stockholm.Google Scholar
Paris Carbone, Stephan Ewen, Gyula Fóra, Seif Haridi, Stefan Richter, and Kostas Tzoumas. 2017. State management in Apache Flink: Consistent stateful distributed stream processing. Proc. VLDB Endow. 10, 12 (2017), 1718--1729.Google ScholarDigital Library
Paris Carbone, Gyula Fóra, Stephan Ewen, Seif Haridi, and Kostas Tzoumas. 2015. Lightweight asynchronous snapshots for distributed dataflows. Retrieved from https://arxiv.org/abs/1506.08603.Google Scholar
Paris Carbone, Asterios Katsifodimos, Stephan Ewen, Volker Markl, Seif Haridi, and Kostas Tzoumas. 2015. Apache Flink: Stream and batch processing in a single engine. IEEE Data Eng. Bull. 38, 4 (2015), 28--38.Google Scholar
Paris Carbone, Jonas Traub, Asterios Katsifodimos, Seif Haridi, and Volker Markl. 2016. Cutty: Aggregate sharing for user-defined windows. In Proceedings of the ACM Conference on Information and Knowledge Management (CIKM’16). 1201--1210.Google ScholarDigital Library
Badrish Chandramouli, Jonathan Goldstein, Mike Barnett, Robert DeLine, Danyel Fisher, John C. Platt, James F. Terwilliger, and John Wernsing. 2014. Trill: A high-performance incremental query processor for diverse analytics. Proceedings of the International Conference on Very Large Data Bases (PVLDB’14) 8, 4 (2014), 401--412.Google ScholarDigital Library
Sanket Chintapalli, Derek Dagit, Bobby Evans et al. 2016. Benchmarking streaming computation engines: Storm, Flink and Spark streaming. In Proceedings of the IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW’16). 1789--1792.Google ScholarCross Ref
Xenofontas Dimitropoulos, Paul Hurley, Andreas Kind, and Marc Ph Stoecklin. 2009. On the 95-percentile billing method. In Passive and Active Network Measurement. Springer, 207--216.Google Scholar
Buğra Gedik. 2014. Generic windowing support for extensible stream processing systems. Softw.: Pract. Exp. 44, 9 (2014), 1105--1128.Google ScholarDigital Library
Thanaa M. Ghanem, Moustafa A. Hammad, Mohamed F. Mokbel, Walid G. Aref, and Ahmed K. Elmagarmid. 2007. Incremental evaluation of sliding-window queries over data streams. IEEE Trans. Knowl. Data Eng. 19, 1 (2007), 57--72.Google ScholarDigital Library
Jim Gray, Surajit Chaudhuri, Adam Bosworth, Andrew Layman, Don Reichart, Murali Venkatrao, Frank Pellow, and Hamid Pirahesh. 1997. Data cube: A relational aggregation operator generalizing group-by, cross-tab, and sub-totals. Data Min. Knowl. Discov. 1, 1 (1997), 29--53.Google ScholarDigital Library
Michael Grossniklaus, David Maier, James Miller, Sharmadha Moorthy, and Kristin Tufte. 2016. Frames: Data-driven windows. In Proceedings of the ACM International Conference on Distributed and Event-based Systems (DEBS’16).Google ScholarDigital Library
Philipp M. Grulich, Sebastian Breß, Jonas Traub, Tilmann Rabl, Janis von Bleichert, Zongxiong Chen, Steffen Zeuch, and Volker Markl. 2020. Grizzly: Efficient stream processing through adaptive query compilation. In Proceedings of the ACM SIGMOD International Conference on Management of Data. ACM New York, NY.Google ScholarDigital Library
Philipp M. Grulich, Jonas Traub, Sebastian Breß, Asterios Katsifodimos, Tilmann Rabl, and Volker Markl. 2019. Generating reproducible out-of-order data streams. In Proceedings of the ACM International Conference on Distributed and Event-based Systems (DEBS’19). 256--257.Google ScholarDigital Library
Shenoda Guirguis, Mohamed A. Sharaf, Panos K. Chrysanthis, and Alexandros Labrinidis. 2011. Optimized processing of multiple aggregate continuous queries. In Proceedings of the Conference on Information and Knowledge Management (CIKM’11). ACM, 1515--1524.Google ScholarDigital Library
Shenoda Guirguis, Mohamed A. Sharaf, Panos K. Chrysanthis, and Alexandros Labrinidis. 2012. Three-level processing of multiple aggregate continuous queries. In Proceedings of the IEEE International Conference on Data Engineering (ICDE’12). 929--940.Google ScholarDigital Library
Guenter Hesse, Christoph Matthies, Kelvin Glass, Johannes Huegle, and Matthias Uflacker. 2019. Quantitative impact evaluation of an abstraction layer for data stream processing systems. In Proceedings of the International Conference on Distributed Computing Systems (ICDCS’19). IEEE, 1381--1392.Google ScholarCross Ref
Martin Hirzel, Henrique Andrade, Buğra Gedik, Vibhore Kumar, Giuliano Losa, Howard Nasgaard, Robert Soulé, and Kun-Lung Wu. 2009. SPL stream processing language specification. IBM Res. Report (2009). http://cs.yale.edu/homes/soule/pubs/rc24897.pdf.Google Scholar
Martin Hirzel, Scott Schneider, and Kanat Tangwongsan. 2017. Sliding-window aggregation algorithms: Tutorial. In Proceedings of the ACM International Conference on Distributed and Event-based Systems (DEBS’17). 11--14.Google ScholarDigital Library
Martin Hirzel, Robert Soulé, Scott Schneider, Buğra Gedik, and Robert Grimm. 2014. A catalog of stream processing optimizations. Comput. Surveys 46, 4 (2014), 46.Google ScholarDigital Library
Kartik Hosanagar, John Chuang, Ramayya Krishnan, and Michael D. Smith. 2008. Service adoption and pricing of content delivery network (CDN) services. Manage. Sci. 54, 9 (2008), 1579--1593.Google ScholarDigital Library
Ryan Huebsch, Minos Garofalakis, Joseph M. Hellerstein, and Ion Stoica. 2007. Sharing aggregate computation for distributed queries. In Proceedings of the ACM SIGMOD International Conference on Management of Data. 485--496.Google ScholarDigital Library
Julius Hülsmann, Jonas Traub, and Volker Markl. 2020. Demand-based sensor data gathering with multi-query optimization. Proc. VLDB Endow. 13, 12 (2020), 2801--2804.Google ScholarDigital Library
Zbigniew Jerzak, Thomas Heinze, Matthias Fehr, Daniel Gröber, Raik Hartung, and Nenad Stojanovic. 2012. The DEBS 2012 grand challenge. In Proceedings of the ACM International Conference on Distributed and Event-based Systems (DEBS’12). 393--398.Google Scholar
Uwe Jugel, Zbigniew Jerzak, Gregor Hackenbroich, and Volker Markl. 2014. M4: A visualization-oriented time series data aggregation. In Proceedings of the International Conference on Very Large Data Bases (PVLDB’14). 797--808.Google ScholarDigital Library
Jeyhun Karimov, Tilmann Rabl, Asterios Katsifodimos, Roman Samarev, Henri Heiskanen, and Volker Markl. 2018. Benchmarking distributed stream processing engines. In IEEE Proceedings of the IEEE International Conference on Data Engineering (ICDE’18).Google Scholar
Alexandros Koliousis, Matthias Weidlich, Raul Castro Fernandez, Alexander L. Wolf, Paolo Costa, and Peter Pietzuch. 2016. SABER: Window-based hybrid stream processing for heterogeneous architectures. In Proceedings of the ACM Special Interest Group on Management of Data (SIGMOD’16). 555--569.Google Scholar
Jay Kreps. 2016. Introducing Kafka streams: Stream processing made simple. Confluent Blog. Retrieved from https://www.confluent.io/blog/introducing-kafka-streams-stream-processing-made-simple/.Google Scholar
Sailesh Krishnamurthy, Michael J. Franklin, Jeffrey Davis, Daniel Farina, Pasha Golovko, Alan Li, and Neil Thombre. 2010. Continuous analytics over discontinuous streams. In Proceedings of the ACM Special Interest Group on Management of Data (SIGMOD’10). 1081--1092.Google ScholarDigital Library
Sailesh Krishnamurthy, Chung Wu, and Michael Franklin. 2006. On-the-fly sharing for streamed aggregation. In Proceedings of the ACM SIGMOD International Conference on Management of Data. 623--634.Google ScholarDigital Library
Jin Li, David Maier, Kristin Tufte, Vassilis Papadimos, and Peter A. Tucker. 2005. No pane, no gain: Efficient evaluation of sliding-window aggregates over data streams. SIGMOD Record 34, 1 (2005), 39--44.Google ScholarDigital Library
Jin Li, David Maier, Kristin Tufte, Vassilis Papadimos, and Peter A. Tucker. 2005. Semantics and evaluation techniques for window aggregates in data streams. In Proceedings of the ACM Special Interest Group on Management of Data (SIGMOD’05). 311--322.Google Scholar
Jin Li, Kristin Tufte, David Maier, and Vassilis Papadimos. 2008. AdaptWID: An adaptive, memory-efficient window aggregation implementation. IEEE Internet Comput. 12, 6 (2008), 22--29.Google ScholarDigital Library
Jin Li, Kristin Tufte, Vladislav Shkapenyuk, Vassilis Papadimos, Theodore Johnson, and David Maier. 2008. Out-of-order processing: A new architecture for high-performance stream systems. Proc. VLDB Endow. 1, 1 (2008), 274--288.Google ScholarDigital Library
Christopher Mutschler, Holger Ziekow, and Zbigniew Jerzak. 2013. The DEBS 2013 grand challenge. In Proceedings of the ACM International Conference on Distributed and Event-based Systems (DEBS’13). 289--294.Google Scholar
Shadi A. Noghabi, Kartik Paramasivam, Yi Pan, Navina Ramesh, Jon Bringhurst, Indranil Gupta, and Roy H. Campbell. 2017. Samza: Stateful scalable stream processing at LinkedIn. Proc. VLDB Endow. 10, 12 (2017), 1634--1645.Google ScholarDigital Library
OpenJDK. 2018. JMH benchmarking suite project website. Retrieved from http://openjdk.java.net/projects/code-tools/jmh/.Google Scholar
OpenJDK. 2018. Nashorn project, ObjectSizeCalculator. Retrieved from http://openjdk.java.net/projects/nashorn/.Google Scholar
Kostas Patroumpas et al. 2006. Window specification over data streams. In Proceedings of the International Conference on Extending Database Technology (EDBT’06).Google Scholar
David Salomon. 2007. Variable-length Codes for Data Compression. Springer.Google ScholarDigital Library
Anatoli U. Shein, Panos K. Chrysanthis, and Alexandros Labrinidis. 2015. F1: Accelerating the optimization of aggregate continuous queries. In Proceedings of the Conference on Information and Knowledge Management (CIKM’15). 1151--1160.Google ScholarDigital Library
Anatoli U. Shein, Panos K. Chrysanthis, and Alexandros Labrinidis. 2017. Flatfit: Accelerated incremental sliding-window aggregation for real-time analytics. In Proceedings of the International Conference on Scientific and Statistical Database Management.Google ScholarDigital Library
Anatoli U. Shein, Panos K. Chrysanthis, and Alexandros Labrinidis. 2018. SlickDeque: High throughput and low latency incremental sliding-window aggregation. In Proceedings of the International Conference on Extending Database Technology (EDBT’18).Google Scholar
Leo Syinchwun. 2016. Lightweight event time window. In Flink Jira Issues. Retrieved from issues.apache.org/jira/browse/FLINK-5387.Google Scholar
Kanat Tangwongsan, Martin Hirzel, and Scott Schneider. 2017. Low-latency sliding-window aggregation in worst-case constant time. In Proceedings of the ACM International Conference on Distributed and Event-based Systems (DEBS’17). 66--77.Google ScholarDigital Library
Kanat Tangwongsan, Martin Hirzel, and Scott Schneider. 2019. Optimal and general out-of-order sliding-window aggregation. Proc. VLDB Endow. 12, 10 (2019), 1167--1180.Google ScholarDigital Library
Kanat Tangwongsan, Martin Hirzel, Scott Schneider, and Kun-Lung Wu. 2015. General incremental sliding-window aggregation. Proceedings of the International Conference on Very Large Data Bases (PVLDB’15) 8, 7 (2015), 702--713.Google ScholarDigital Library
Joseph Torres, Michael Armbrust, Tathagata Das, and Shixiong Zhu. 2018. Introducing low-latency continuous processing mode in structured streaming in Apache Spark 2.3. Databricks Blog. Retrieved from https://databricks.com/blog/2018/03/20/low-latency-continuous-processing-mode-in-structured-streaming-in-apache-spark-2-3-0.html.Google Scholar
Ankit Toshniwal, Siddarth Taneja, Amit Shukla, Karthik Ramasamy, Jignesh M. Patel, Sanjeev Kulkarni, Jason Jackson, Krishna Gade, Maosong Fu, Jake Donham, et al. 2014. Storm@twitter. In Proceedings of the ACM Special Interest Group on Management of Data (SIGMOD’14). 147--156.Google Scholar
Jonas Traub. 2019. Demand-based data stream gathering, processing, and transmission. TU Berlin, PhD Thesis.Google Scholar
Jonas Traub, Sebastian Breß, Tilmann Rabl, Asterios Katsifodimos, and Volker Markl. 2017. Optimized on-demand data streaming from sensor nodes. Proceedings of the Symposium on Cloud Computing (SoCC’17). 586--597.Google ScholarDigital Library
Jonas Traub, Philipp M. Grulich, Alejandro Rodriguez Cuellar, Sebastian Breß, Asterios Katsifodimos, Tilmann Rabl, and Volker Markl. 2018. Scotty: Efficient window aggregation for out-of-order stream processing. In Proceedings of the IEEE International Conference on Data Engineering (ICDE’18).Google ScholarCross Ref
Jonas Traub, Philipp M. Grulich, Alejandro Rodriguez Cuellar, Sebastian Breß, Asterios Katsifodimos, Tilmann Rabl, and Volker Markl. 2019. Efficient window aggregation with general stream slicing. In Proceedings of the International Conference on Extending Database Technology (EDBT’17).Google Scholar
Jonas Traub, Julius Hülsmann, Sebastian Breß, Tilmann Rabl, and Volker Markl. 2019. SENSE: Scalable data acquisition from distributed sensors with guaranteed time coherence. Retrieved from https://arxiv.org/abs/1912.04648.Google Scholar
Jonas Traub, Nikolaas Steenbergen, Philipp M. Grulich, Tilmann Rabl, and Volker Markl. 2017. I2: Interactive real-time visualization for streaming data. In Proceedings of the International Conference on Extending Database Technology (EDBT’17). 526--529.Google Scholar
Peter A. Tucker, David Maier, Tim Sheard, and Leonidas Fegaras. 2003. Exploiting punctuation semantics in continuous data streams. IEEE Trans. Knowl. Data Eng. 15, 3 (2003), 555--568.Google ScholarDigital Library
Kostas Tzoumas et al. 2015. High-throughput, low-latency, and exactly-once stream processing with Apache Flink. Retrieved from data-artisans.com/blog/high-throughput-low-latency-and-exactly-once-stream-processing-with-apache-flink.Google Scholar
Mikhail Vorontsov. 2013. Memory consumption of popular Java data types - part 2. Java Performance Tuning Guide. Retrieved from http://java-performance.info/memory-consumption-of-java-data-types-2/.Google Scholar
Guozhang Wang. 2017. Enabling exactly-once in Kafka streams. Confluent Blog. Retrieved from https://www.confluent.io/blog/enabling-exactly-once-kafka-streams/.Google Scholar
Jark Wu. 2017. Improve performance of sliding time window with pane optimization. In Flink Jira Issues. Retrieved from issues.apache.org/jira/browse/FLINK-7001.Google Scholar
Yuan Yu, Pradeep Kumar Gunda, and Michael Isard. 2009. Distributed aggregation for data-parallel computing: Interfaces and implementations. In Proceedings of the ACM Special Interest Group on Operating Systems (SIGOPS’09). 247--260.Google Scholar
Matei Zaharia, Tathagata Das, Haoyuan Li, Scott Shenker, and Ion Stoica. 2012. Discretized streams: An efficient and fault-tolerant model for stream processing on large clusters. In Proceedings of the USENIX Conference on Hot Topics in Cloud Computing.Google Scholar
Matei Zaharia, Reynold S. Xin, Patrick Wendell, et al. 2016. Apache Spark: A unified engine for big data processing. Commun. ACM 59, 11 (2016), 56--65.Google ScholarDigital Library
Steffen Zeuch, Ankit Chaudhary, Bonaventura Del Monte, et al. 2020. The NebulaStream Platform: Data and application management for the Internet of Things. In Proceedings of the Conference on Innovative Data Systems Research (CIDR’20).Google Scholar
Steffen Zeuch, Bonaventura Del Monte, Jeyhun Karimov, Clemens Lutz, Manuel Renz, Jonas Traub, Sebastian Breß, Tilmann Rabl, and Volker Markl. 2019. Analyzing efficient stream processing on modern hardware. In Proceedings of the International Conference on Very Large Data Bases (PVLDB’19).Google ScholarDigital Library
Shuhao Zhang, Feng Zhang, Yingjun Wu, Bingsheng He, and Paul Johns. 2020. Hardware-conscious stream processing: A survey. ACM SIGMOD Record 48, 4 (2020), 18--29.Google ScholarDigital Library

Index Terms

Scotty: General and Efficient Open-source Window Aggregation for Stream Processing Systems
1. Information systems
  1. Data management systems
    1. Database design and models
      1. Data model extensions
        Data streams
    2. Database management system engines
      1. Database query processing
        Query operators
      2. Stream management
  2. Information systems applications
    1. Spatial-temporal systems
      1. Data streaming
2. Theory of computation
  1. Models of computation
    1. Streaming models

Recommendations

Experimental study on the performance and resource utilization of data streaming frameworks
CCGrid '18: Proceedings of the 18th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing

With the advent of the Internet of Things (IoT), data stream processing have gained increased attention due to the ever-increasing need to process heterogeneous and voluminous data streams. This work addresses the problem of selecting a correct stream ...
Read More
Speeding up the multimedia feature extraction: a comparative study on the big data approach

The current explosion of multimedia data is significantly increasing the amount of potential knowledge. However, to get to the actual information requires to apply novel content-based techniques which in turn require time consuming extraction of ...
Read More
HYAS: Hybrid Autoscaler Agent for Apache Flink
Web Engineering
Abstract
Apache Flink is a distributed processing engine for stateful computations over unbounded and bounded data streams. Despite its versatility, Apache Flink cannot automatically and optimally adjust its computing resources to match the requirements of ...
Read More

Comments

Login options

Check if you have access through your login credentials or your institution to get full access on this article.

Full Access

Get this Article

Published in
ACM Transactions on Database Systems Volume 46, Issue 1
March 2021
143 pages
ISSN:0362-5915
EISSN:1557-4644
DOI:10.1145/3457891
Editor:
Christopher Jermaine
Rice University, USA
Issue’s Table of Contents
Copyright © 2021 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].
Sponsors
In-Cooperation
Publisher
Association for Computing Machinery
New York, NY, United States
Publication History
- Published: 27 March 2021
- Accepted: 1 November 2020
- Revised: 1 October 2020
- Received: 1 March 2020
Published in tods Volume 46, Issue 1

Permissions
Request permissions about this article.
Request Permissions

Check for updates
Author Tags
Apache Beam
Apache Flink
Apache Kafka Streams
Apache Samza
Apache Spark
Apache Storm
Scotty
Window
aggregate sharing
aggregation
open-source
session window
sliding-window
stream processing
tumbling window
Qualifiers
- research-article
- Research
- Refereed
Conference
Funding Sources
Other Metrics
View Article Metrics

Article Metrics
- 7
  Total Citations
  View Citations
- 572
  Total Downloads
- Downloads (Last 12 months)90
- Downloads (Last 6 weeks)6
Other Metrics
View Author Metrics
Cited By
View all

PDF Format

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader

HTML Format

View this article in HTML Format .

View HTML Format

Scotty: General and Efficient Open-source Window Aggregation for Stream Processing Systems

ACM Transactions on Database Systems

Abstract

References

Cited By

Index Terms

Recommendations

Experimental study on the performance and resource utilization of data streaming frameworks

Speeding up the multimedia feature extraction: a comparative study on the big data approach

HYAS: Hybrid Autoscaler Agent for Apache Flink

Comments

Login options

Full Access

Published in

Sponsors

In-Cooperation

Publisher

Publication History

Permissions

Check for updates

Author Tags

Qualifiers

Conference

Funding Sources

Other Metrics

Article Metrics

Other Metrics

Cited By

PDF Format

eReader

Digital Edition

HTML Format

Caption

Scotty: General and Efficient Open-source Window Aggregation for Stream Processing Systems

ACM Transactions on Database Systems

Abstract

References

Cited By

Index Terms

Recommendations

Experimental study on the performance and resource utilization of data streaming frameworks

Speeding up the multimedia feature extraction: a comparative study on the big data approach

HYAS: Hybrid Autoscaler Agent for Apache Flink

Comments

Login options

Full Access

Published in

Sponsors

In-Cooperation

Publisher

Publication History

Permissions

Check for updates

Author Tags

Qualifiers

Conference

Funding Sources

Article Metrics

Other Metrics

PDF Format

eReader

Digital Edition

HTML Format

Share this Publication link

Share on Social Media