Abstract
This work revisits the processing of stream joins on modern hardware architectures. Our work is based on the recently proposed handshake join algorithm, which is a mechanism to parallelize the processing of stream joins in a NUMA-aware and hardware-friendly manner. Handshake join achieves high throughput and scalability, but it suffers from a high latency penalty and a non-deterministic ordering of the tuples in the physical result stream. In this paper, we first characterize the latency behavior of the handshake join and then propose a new low-latency handshake join algorithm, which substantially reduces latency without sacrificing throughput or scalability. We also present a technique to generate punctuated result streams with very little overhead; such punctuations allow the generation of correctly ordered physical output streams with negligible effect on overall throughput and latency.
- D. J. Abadi, D. Carney, U. Çetintemel, M. Cherniack, C. Convey, S. Lee, M. Stonebraker, N. Tatbul, and S. Zdonik. Aurora: a new model and architecture for data stream management. The VLDB Journal, 12(2):120--139, Aug. 2003. Google ScholarDigital Library
- A. Arasu, M. Cherniack, E. Galvez, D. Maier, A. S. Maskey, E. Ryvkina, M. Stonebraker, and R. Tibbetts. Linear road: a stream data management benchmark. In Proc. of VLDB, volume 30, pages 480--491, 2004. Google ScholarDigital Library
- C. Balkesen, J. Teubner, G. Alonso, and M. Ozsu. Main-memory hash joins on multi-core cpus: Tuning to the underlying hardware. In Proc. of ICDE, pages 362--373, 2013. Google ScholarDigital Library
- A. Baumann, P. Barham, P.-E. Dagand, T. Harris, R. Isaacs, S. Peter, T. Roscoe, A. Schüpbach, and A. Singhania. The Multikernel: A new OS architecture for scalable multicore systems. In Proc. of ACM SOSP, Big Sky, MT, USA, 2009. Google ScholarDigital Library
- P. Conway, N. Kalyanasundharam, G. Donley, K. Lepak, and B. Hughes. Cache hierarchy and memory subsystem of the AMD Opteron processor. IEEE Micro, 30(2):16--29, 2010. Google ScholarDigital Library
- L. Ding, N. Mehta, E. A. Rundensteiner, and G. T. Heineman. Joining punctuated streams. In Proc. of EDBT, pages 587--604. 2004.Google ScholarCross Ref
- L. Ding and E. A. Rundensteiner. Evaluating window joins over punctuated streams. In Proc. of CIKM, pages 98--107. ACM, 2004. Google ScholarDigital Library
- J.-P. Dittrich, B. Seeger, D. S. Taylor, and P. Widmayer. Progressive merge join: A generic and non-blocking sort-based join algorithm. In Proc. of VLDB, pages 299--310, 2002. Google ScholarDigital Library
- B. Gedik, P. S. Yu, and R. Bordawekar. CellJoin: A parallel stream join operator for the Cell processor. The VLDB Journal, 18(2):501--519, 2009. Google ScholarDigital Library
- J. Kang, J. F. Naughton, and S. D. Viglas. Evaluating window joins over unbounded streams. In Proc. of ICDE, Bangalore, India, 2003.Google ScholarCross Ref
- T. Karnagel, D. Habich, B. Schlegel, and W. Lehner. The HELLS-join: a heterogeneous stream join for extremely large windows. In Proc. of DaMoN, New York, NY, USA, June 2013. Google ScholarDigital Library
- T. Karnagel, B. Schlegel, D. Habich, and W. Lehner. Stream join processing on heterogeneous processors. In BTW Workshops, pages 17--26, Magdeburg, Germany, Mar. 2013.Google Scholar
- S. Krishnamurthy, M. J. Franklin, J. Davis, D. Farina, P. Golovko, A. Li, and N. Thombre. Continuous analytics over discontinuous streams. In Proc. of ACM SIGMOD, pages 1081--1092, Indianapolis, IN, USA, June 2010. Google ScholarDigital Library
- J. Li, D. Maier, K. Tufte, V. Papadimos, and P. A. Tucker. Semantics and evaluation techniques for window aggregates in data streams. In Proc. of ACM SIGMOD, Baltimore, MD, USA, 2005. Google ScholarDigital Library
- J. Li, K. Tufte, V. Shkapenyuk, V. Papadimos, T. Johnson, and D. Maier. Out-of-order processing: a new architecture for high performance stream systems. Proc. VLDB Endowment (PVLDB), 1(1), 2008. Google ScholarDigital Library
- Y. Li, I. Pandis, R. Müller, V. Raman, and G. M. Lohman. Numa-aware algorithms: the case of data shuffing. In CIDR, 2013.Google Scholar
- M. Mokbel, M. Lu, and W. Aref. Hash-merge join: a non-blocking join algorithm for producing fast and early join results. In Proc. of ICDE, pages 251--262, 2004. Google ScholarDigital Library
- Y. Oge, T. Miyoshi, H. Kawashima, and T. Yoshinaga. A fast handshake join implementation on FPGA with adaptive merging network. In Proc. of SSDBM, Baltimore, MD, USA, July 2013. Google ScholarDigital Library
- M. Sadoghi, M. Labrecque, H. Singh, W. Shum, and H.-A. Jacobsen. Efficient event processing through reconfigurable hardware for algorithmic trading. Proc. VLDB Endowment (PVLDB), 3(1-2):1525--1528, 2010. Google ScholarDigital Library
- J. Teubner and R. Mueller. How soccer players would do stream joins. In Proc. of ACM SIGMOD, Athens, Greece, 2011. Google ScholarDigital Library
- A. N. Wilschut and P. M. Apers. Dataflow query execution in a parallel main-memory environment. Distributed and Parallel Databases, 1(1):103--128, 1993. Google ScholarDigital Library
- L. Woods, J. Teubner, and G. Alonso. Complex event detection at wire speed with FPGAs. Proc. VLDB Endowment (PVLDB), 3(1), 2010. Google ScholarDigital Library
Index Terms
- Low-latency handshake join
Recommendations
Low-latency asynchronous FIFO buffers
ASYNC '95: Proceedings of the 2nd Working Conference on Asynchronous Design MethodologiesA parallel asynchronous implementation of a FIFO buffer is described and compared with the conventional alternative asynchronous implementation, Sutherland's micropipeline. The parallel design has the potential for significant reductions in propagation ...
An Implementation of Handshake Join on FPGA
ICNC '11: Proceedings of the 2011 Second International Conference on Networking and ComputingThis paper shows an implementation of handshake join on field-programmable gate array (FPGA). Handshake join is one of stream join algorithms, proposed by Teubner and Mueller. It can support very high degrees of parallelism and attain unprecedented ...
Performance of Low-Latency HTTP-based Streaming Players
MMSys '21: Proceedings of the 12th ACM Multimedia Systems ConferenceReducing end-to-end streaming latency is critical to HTTP-based live video streaming. There are currently two technologies in this domain: Low-Latency HTTP Live Streaming (LL-HLS) and Low-Latency Dynamic Adaptive Streaming over HTTP (LL-DASH). Many ...
Comments