ABSTRACT
Complex event processing uses patterns to detect composite events in streams of simple events. Typically, the events are logically partitioned by some key. For instance, the key can be the stock symbol in stock quotes, the author in tweets, the vehicle in transportation, or the patient in health-care. Composite event patterns often become meaningful only after partitioning. For instance, a pattern over stock quotes is typically meaningful over quotes for the same stock symbol. This paper proposes a pattern syntax and translation scheme organized around the notion of partitions. Besides making patterns meaningful, partitioning also benefits performance, since different keys can be processed in parallel. We have implemented partitioned parallel complex event processing as an extension to IBM's System S high-performance streaming platform. Our experiments with several benchmarks from finance and social media demonstrate processing speeds of up to 830,000 events per second, and substantial speedups for expensive patterns parallelized on multi-core machines as well as multi-machine clusters. Partitioning the event stream before detecting composite events makes event processing both more intuitive and parallel.
- D. J. Abadi, Y. Ahmad, M. Balazinska, U. Cetintemel, M. Cherniack, J.-H. Hwang, W. Lindner, A. S. Maskey, A. Rasin, E. Ryvkina, N. Tatbul, Y. Xing, and S. Zdonik. The design of the Borealis stream processing engine. In Conference on Innovative Data Systems Research (CIDR), pages 277--289, 2005.Google Scholar
- A. Adi and O. Etzion. Amit -- the situation manager. Journal on Very Large Data Bases (VLDB J.), pages 177--203, 2004. Google ScholarDigital Library
- J. Agrawal, Y. Diao, D. Gyllstrom, and N. Immerman. Efficient pattern matching over event streams. In International Conference on Management of Data (SIGMOD), pages 147--160, 2008. Google ScholarDigital Library
- A. Aho, M. S. Lam, R. Sethi, and J. Ullman. Compilers: principles, techniques, & tools. Addison-Wesley, second edition, 2007. Google ScholarDigital Library
- L. Amini, H. Andrade, R. Bhagwan, F. Eskesen, R. King, P. Selo, Y. Park, and C. Venkatramani. SPC: A distributed, scalable platform for data mining. In Workshop on Data Mining Standards, Services and Platforms (DM-SSP), pages 27--37, 2006. Google ScholarDigital Library
- A. Arasu, S. Babu, and J. Widom. The CQL continuous query language: semantic foundations and query execution. Journal on Very Large Data Bases (VLDB J.), pages 121--142, 2006. Google ScholarDigital Library
- R. H. Arpaci-Dusseau, E. Anderson, N. Treuhaft, D. E. Culler, J. M. Hellerstein, D. Patterson, and K. Yelick. Cluster I/O with River: Making the fast case common. In Workshop on I/O in Parallel and Distributed Systems (IOPADS), pages 10--22, 1999. Google ScholarDigital Library
- L. Brenna, J. Gehrke, D. Johansen, and M. Hong. Distributed event stream processing with non-deterministic finite automata. In Conference on Distributed Event-Based Systems (DEBS), 2009. Google ScholarDigital Library
- Cayuga webpage with benchmark descriptions. http://www.cs.cornell.edu/bigreddata/cayuga/.Google Scholar
- J. Chen, D. J. DeWitt, F. Tian, and Y. Wang. NiagaraCQ: A scalable continuous query system for internet databases. In International Conference on Management of Data (SIGMOD), pages 379--390, 2000. Google ScholarDigital Library
- N. H. Cohen and K. T. Kalleberg. EventScript: An event-processing language based on regular expressions with actions. In Languages, Compiler, and Tool Support for Embedded Systems (LCTES), pages 111--120, 2008. Google ScholarDigital Library
- A. Demers, J. Gehrke, B. Panda, M. Riedewald, V. Sharma, and W. White. Cayuga: A general purpose event monitoring system. In Conference on Innovative Data Systems Research (CIDR), pages 412--422, 2007.Google Scholar
- B. Gedik, H. Andrade, K.-L. Wu, P. S. Yu, and M. Doo. SPADE: The System S declarative stream processing engine. In International Conference on Management of Data (SIGMOD), pages 1123--1134, 2008. Google ScholarDigital Library
- M. Hirzel, H. Andrade, B. Gedik, V. Kumar, G. Losa, M. Mendell, H. Nasgaard, R. Soulé, and K.-L. Wu. SPL Streams Processing Language Specification. Technical Report RC24897, IBM Research, 2009.Google Scholar
- K. R. Jayaram and P. Eugster. Scalable efficient composite event detection. In Coordination Models and Languages, pages 168--182, 2010. Google ScholarDigital Library
- P. Li, K. Agrawal, J. Buhler, and R. D. Chamberlain. Deadlock avoidance for streaming computations with filtering. In Symposium on Parallelism in Algorithms and Architectures (SPAA), pages 243--252, 2010. Google ScholarDigital Library
- M. Mendell, H. Nasgaard, E. Bouillet, M. Hirzel, and B. Gedik. Extending a general-purpose streaming system for XML. In Extending Database Technology (EDBT), 2012. Google ScholarDigital Library
- R. Sadri, C. Zaniolo, A. Zarkesh, and J. Adibi. Optimization of sequence queries in database systems. In Principles of Database Systems (PODS), pages 71--81, 2001. Google ScholarDigital Library
- S. Schneider, M. Hirzel, B. Gedik, and K.-L. Wu. Auto-parallelizing stateful distributed streaming applications. In Parallel Architectures and Compilation Techniques (PACT), 2012.Google ScholarDigital Library
- N. Schultz-Møller, M. Migliavacca, and P. Pietzuch. Distributed complex event processing with query rewriting. In Conference on Distributed Event-Based Systems (DEBS), 2009. Google ScholarDigital Library
- M. A. Shah, J. M. Hellerstein, and E. Brewer. Highly available, fault-tolerant, parallel dataflows. In International Conference on Management of Data (SIGMOD), pages 827--838, 2004. Google ScholarDigital Library
- R. Soulé, M. Hirzel, R. Grimm, B. Gedik, H. Andrade, V. Kumar, and K.-L. Wu. A universal calculus for stream processing languages. In European Symposium on Programming (ESOP), pages 507--528, 2010. Google ScholarDigital Library
- L. Woods, J. Teubner, and G. Alonso. Complex event detection at wire speed with FPGAs. In Very Large Data Bases (VLDB), pages 660--669, 2010. Google ScholarDigital Library
- E. Wu, Y. Diao, and S. Rizvi. High-performance complex event processing over streams. In International Conference on Management of Data (SIGMOD), pages 407--418, 2006. Google ScholarDigital Library
- F. Zemke, A. Witkowski, M. Cherniak, and L. Colby. Pattern matching in sequences of rows. Technical report, ANSI Standard Proposal, 2007.Google Scholar
Index Terms
- Partition and compose: parallel complex event processing
Recommendations
RIP: run-based intra-query parallelism for scalable complex event processing
DEBS '13: Proceedings of the 7th ACM international conference on Distributed event-based systemsRecognition of patterns in event streams has become important in many application areas of Complex Event Processing (CEP) including financial markets, electronic health-care systems, and security monitoring systems. In most applications, patterns have ...
Index-Accelerated Pattern Matching in Event Stores
SIGMOD '21: Proceedings of the 2021 International Conference on Management of DataIoT applications require a new type of database systems termed event stores for ingesting fast arriving event streams and efficiently supporting analytical ad-hoc queries over time. One of the most important operations in this regard is sequential ...
SGTNE: semi-global time of the next event algorithm
PADS '95: Proceedings of the ninth workshop on Parallel and distributed simulationThis paper describes an extension of the TNE algorithm, the objective of which is to increase its parallelism and to break the inter-processor deadlocks inherent with the use of TNE. The algorithm, which we call the SGTNE algorithm (Semi Global TNE), is ...
Comments