Abstract
With the widespread deployment of mobile devices with positioning capabilities, increasingly massive volumes of trajectory data are being collected that capture the movements of people and vehicles. This data enables co-movement pattern detection, which is important in applications such as trajectory compression and future-movement prediction. Existing co-movement pattern detection studies generally consider historical data and thus propose offline algorithms. However, applications such as future movement prediction need real-time processing over streaming trajectories. Thus, we investigate real-time distributed co-movement pattern detection over streaming trajectories.
Existing off-line methods assume that all data is available when the processing starts. Nevertheless, in a streaming setting, unbounded data arrives in real time, making pattern detection challenging. To this end, we propose a framework based on Apache Flink, which is designed for efficient distributed streaming data processing. The framework encompasses two phases: clustering and pattern enumeration. To accelerate the clustering, we use a range join based on two-layer indexing, and provide techniques that eliminate unnecessary verifications. To perform pattern enumeration efficiently, we present two methods FBA and VBA that utilize id-based partitioning. When coupled with bit compression and candidate-based enumeration techniques, we reduce the enumeration cost from exponential to linear. Extensive experiments offer insight into the efficiency of the proposed framework and its constituent techniques compared with existing methods.
- C. C. Aggarwal, J. Han, J. Wang, and P. S. Yu. A framework for clustering evolving data streams. PVLDB, 29:81--92, 2003. Google ScholarDigital Library
- J. Agrawal, Y. Diao, D. Gyllstrom, and N. Immerman. Efficient pattern matching over event streams. In SIGMOD, pages 147--160, 2008. Google ScholarDigital Library
- N. Beckmann, H. P. Kriegel, R. Schneider, and B. Seeger. The R*-tree: an efficient and robust access method for points and rectangles. In SIGMOD, pages 322--331, 1990. Google ScholarDigital Library
- C. Bohm, B. Braunmuller, F. Krebs, and H. P. Kriegel. Epsilon grid order: An algorithm for the similarity join on massive high-dimensional data. SIGMOD Record, 30(2):379--388, 2001. Google ScholarDigital Library
- T. Brinkhoff. A framework for generating network-based moving objects. GeoInformatica, 6(2):153--180, 2002. Google ScholarDigital Library
- B. Chen, Z. Lv, X. Yu, and Y. Liu. Sliding window top-k monitoring over distributed data streams. DSE, 2(4):289--300, 2017.Google ScholarCross Ref
- Y. Chen and L. Tu. Density-based clustering for real-time stream data. In SIGKDD, pages 133--142, 2009. Google ScholarDigital Library
- I. Cordova and T. S. Moh. Dbscan on resilient distributed datasets. In HPCS, pages 531--540, 2015.Google ScholarCross Ref
- M. Ester, H. P. Kriegel, J. Sander, and X. Xu. A density-based algorithm for discovering clusters in large spatial databases with noise. In KDD, pages 226--231, 1996. Google ScholarDigital Library
- Q. Fan, D. Zhang, H. Wu, and K. L. Tan. A general and parallel platform for mining co-movement patterns over large-scale trajectories. PVLDB, 10(4):313--324, 2016. Google ScholarDigital Library
- B. Gedik, H. Andrade, K. L. Wu, P. S. Yu, and M. Doo. SPADE: the System S declarative stream processing engine. In SIGMOD, pages 1123--1134, 2008. Google ScholarDigital Library
- J. Gu, J. Wang, and C. Zaniolo. Ranking support for matched patterns over complex event streams: The CEPR system. In ICDE, pages 1354--1357, 2016.Google ScholarCross Ref
- J. Gudmundsson and M. van Kreveld. Computing longest duration flocks in trajectory data. In GIS, pages 35--42, 2006. Google ScholarDigital Library
- C. H., N. Mamoulis, and D. W. Cheung. Discover of periodic patterns in spatiotemporal sequences. TKDE, 19(4):453--467, 2007. Google ScholarDigital Library
- Y. He, H. Tan, W. Luo, S. Feng, and J. Fan. Mr-dbscan: A scalable mapreduce-based dbscan algorithm for heavily skewed data. Frontiers of Computer Science, 8(1):83--99, 2014. Google ScholarDigital Library
- H. Jeung, Q. Liu, H. Shen, and Z. X. A hybrid prediction model for moving objects. In ICDE, pages 70--79, 2008. Google ScholarDigital Library
- H. Jeung, M. L. Yiu, X. Zhou, C. S. Jensen, and H. T. Shen. Discovery of convoys in trajectory databases. PVLDB, 1(1):1068--1080, 2008. Google ScholarDigital Library
- X. Li, V. Ceikute, C. S. Jensen, and K. L. Tan. Effective online group discovery in trajectory databases. TKDE, 12:2752--2766, 2013. Google ScholarDigital Library
- Y. Li, J. Bailey, and L. Kulik. Efficient mining of platoon patterns in trajectory databases. DKE, 100:167--187, 2015. Google ScholarDigital Library
- Z. Li, B. Ding, J. Han, and R. Kays. Swarm: Mining relaxed temporal moving object clusters. PVLDB, 3(1--2):723--734, 2010. Google ScholarDigital Library
- Z. Li, B. Ding, J. Han, R. Kays, and P. Nye. Mining periodic behaviors for moving objects. In KDD, pages 1099--1108, 2010. Google ScholarDigital Library
- D. G. Murray, F. McSherry, R. Isaacs, M. Isard, P. Barham, and M. Abadi. Naiad: a timely dataflow system. In SOSP, pages 439--455, 2013. Google ScholarDigital Library
- M. Nikos and D. Papadias. Slot index spatial join. TKDE, 15(1):211--231, 2003. Google ScholarDigital Library
- S. Puntheeranurak, T. T. Shein, and M. Imamura. Efficient discovery of traveling companion from evolving trajectory data stream. In COMPSAC, pages 448--453, 2018.Google ScholarCross Ref
- M. Riyadh, N. Mustapha, M. Sulaiman, and N. B. M. Sharef. CC-TRS: Continuous clustering of trajectory stream data based on micro cluster life. Mathematical Problems in Engineering, pages 1--10, 2017.Google ScholarCross Ref
- T. L. C. Silva, K. Zeitouni, and J. A. de Macedo. Online clustering of trajectory data stream. In MDM, pages 112--121, 2016.Google ScholarCross Ref
- L. A. Tang, Y. Zheng, J. Yuan, J. Han, A. Leung, C. C. Hung, and W. C. Peng. On discovery of traveling companions from streaming trajectories. In ICDE, pages 186--197, 2012. Google ScholarDigital Library
- M. R. Vieira, P. Bakalov, and V. J. Tsotras. On-line discovery of flock patterns in spatio-temporal data. In SIGSPATIAL, pages 286--295, 2009. Google ScholarDigital Library
- Y. Wang, E. Lim, and S. Y. Hwang. Efficient mining of group patterns from user movement data. PVLDB, 57(3):240--282, 2006. Google ScholarDigital Library
- L. Wu, Y. Ge, E. Chen, R. Hong, J. Du, and M. Wang. Modeling the evolution of users preferences and social links in social networking services. TKDE, 29(6):1240--1253, 2017. Google ScholarDigital Library
- D. Yang, J. Guo, Z. J. Wang, Y. Wang, J. Zhang, L. Hu, J. Yin, and J. Cao. Fastpm: An approach to pattern matching via distributed stream processing. Information Sciences, 453:263--280, 2018.Google ScholarCross Ref
- D. Yang, Z. Guo, Z. Xie, E. A. Rundensteiner, and M. O. Ward. Interactive visual exploration of neighbor-based patterns in data streams. In SIGMOD, pages 1151--1154, 2010. Google ScholarDigital Library
- H. Yoon and C. Shahabi. Accurate discovery of valid convoys from moving object trajectories. In ICDM workshops, pages 636--643, 2009. Google ScholarDigital Library
- Y. Yu, Q. Wang, X. Wang, H. Wang, and J. He. Online clustering for trajectory data stream of moving objects. Computer science and information systems, 10(3):1293--1317, 2013.Google Scholar
- Z. Yu, X. Yu, Y. Liu, W. Li, and J. Pei. Mining frequent co-occurrence patterns across multiple data streams. In EDBT, pages 73--84, 2015.Google Scholar
- F. Zhang, Y. Zheng, D. Xu, Z. Du, Y. Wang, R. Liu, and X. Ye. Real-time spatial queries for moving objects using storm topology. ISPRS International Journal of Geo-Information, 5(10):178, 2016.Google ScholarCross Ref
- Y. Zheng. Trajectory data mining: an overview. TITS, 6(3):29:1--29:41, 2015. Google ScholarDigital Library
- Y. Zheng, L. Capra, O. Wolfson, and H. Yang. Urban computing: Concepts, methodologies, and applications. ACM TIST, 5(3):38:1--38:55, 2014. Google ScholarDigital Library
Recommendations
CoMing: A Real-time Co-Movement Mining System for Streaming Trajectories
SIGMOD '20: Proceedings of the 2020 ACM SIGMOD International Conference on Management of DataThe aim of real-time co-movement pattern mining for streaming trajectories is to discover co-moving objects that satisfy specific spatio-temporal constraints in real time. This functionality serves a range of real-world applications, such as traffic ...
Online clustering of streaming trajectories
With the increasing availability of modern mobile devices and location acquisition technologies, massive trajectory data of moving objects are collected continuously in a streaming manner. Clustering streaming trajectories facilitates finding the ...
Comments