Abstract
Mobile and sensing devices have already become ubiquitous. They have made tracking moving objects an easy task. As a result, mobile applications like Uber and many IoT projects have generated massive amounts of trajectory data that can no longer be processed by a single machine efficiently. Among the typical query operations over trajectories, similarity search is a common yet expensive operator in querying trajectory data. It is useful for applications in different domains such as traffic and transportation optimizations, weather forecast and modeling, and sports analytics. It is also a fundamental operator for many important mining operations such as clustering and classification of trajectories. In this paper, we propose a distributed query framework to process trajectory similarity search over a large set of trajectories. We have implemented the proposed framework in Spark, a popular distributed data processing engine, by carefully considering different design choices. Our query framework supports both the Hausdorff distance the Fréchet distance. Extensive experiments have demonstrated the excellent scalability and query efficiency achieved by our design, compared to other methods and design alternatives.
- P. K. Agarwal, R. Ben Avraham, H. Kaplan, and M. Sharir. Computing the discrete frechet distance in subquadratic time. Siam Journal of Computing, 43:429--449, 2014.Google ScholarCross Ref
- P. K. Agarwal and R. Sharathkumar. Approximation algorithms for bipartite matching with metric and geometric costs. In STOC, 2014. Google ScholarDigital Library
- H. Alt and M. Godau. Computing the fröchet distance between two polygonal curves. JCG Appl., 5:75--91, 1995.Google Scholar
- H. Alt, C. Knauer, and C. Wenk. Comparison of distance measures for planar curves. Algoritkmica, 2004. Google ScholarDigital Library
- H. Alt and L. Scharf. Computing the hausdorff distance between curved objects. JCG Appl., 18(4):307--320, 2008.Google Scholar
- Y.-B. Bai, J.-H. Yong, C.-Y Liu, X.-M. Liu, and Y. Meng. Polyline approach for approximating Hausdorff distance between planar free-form curves. CAD, 43:687--698, 2011. Google ScholarDigital Library
- M. d. Berg, O. Cheong, M. v. Kreveld, and M. Overmars. Computational Geometry: Algorithms and Applications. Springer-Verlag TELOS, 2008. Google ScholarDigital Library
- T. Bozkaya and Z. M. Özsoyoglu. Indexing large metric spaces for similarity search queries. TODS, 24(3):361--404, 1999. Google ScholarDigital Library
- S. Cabello, P. Giannopoulos, C. Knauer, and G. Rote. Matching point sets with respect to the Earth mover's distance. Computational Geometry: Theory and Applications, 39:118--133, 2008. Google ScholarDigital Library
- P. Cao and Z. Wang. Efficient top-k query calculation in distributed networks. In PODC, pages 206--215, 2004. Google ScholarDigital Library
- S. Chambi, D. Lemire, O. Kaser, and R. Godin. Better bitmap performance with roaring bitmaps. Softw., Pract. Exper., 2016. Google ScholarDigital Library
- L. Chen and R. Ng. On the marriage of lp-norms and edit distance. In VLDB, pages 792--803, 2004. Google ScholarDigital Library
- L. Chen, M. T. Özsu, and V. Oria. Robust and fast similarity search for moving object trajectories. In SIGMOD, pages 491--502, 2005. Google ScholarDigital Library
- P. Ciaccia, M. Patella, and P. Zezula. M-tree: An efficient access method for similarity search in metric spaces. In VLDB, pages 426--35, 1997. Google ScholarDigital Library
- P. Cudré-Mauroux, E. Wu, and S. Madden. Trajstore: An adaptive storage system for very large trajectory data sets. In ICDE, pages 109--120, 2010.Google ScholarCross Ref
- M. de Berg, A. F. Cook IV, and J. Gudmundsson. Fast frechet queries. In Symposium on Algorithms and Computation, 2011. Google ScholarDigital Library
- A. Driemel and S. Har-Peled. Jaywalking your dog - computing the Frëchet distance with shortcuts. Siam Journal of Computing, 42:1830--1866, 2013.Google ScholarCross Ref
- E. Frentzos, K. Gratsias, and Y Theodoridis. Index-based most similar trajectory search. In ICDE, pages 816--825, 2007.Google ScholarCross Ref
- A. W. Fu, P. M. Chan, Y Cheung, and Y. S. Moon. Dynamic vp-tree indexing for n-nearest neighbor search given pair-wise distances. VLDBJ, 2000. Google ScholarDigital Library
- A. Guttman. R-trees: A dynamic index structure for spatial searching. In SIGMOD, pages 47--57, 1984. Google ScholarDigital Library
- J. Hangouët. Computation of the Hausdorff distance betwene plane vector polylines. In Proceedings AutoCarto 12, 1995.Google Scholar
- I. Kamel and C. Faloutsos. Hilbert r-tree: An improved r-tree using fractals. In VLDB, pages 500--509, 1994. Google ScholarDigital Library
- J.-G. Lee, J. Han, and X. Li. Trajectory outlier detection: A partition-and-detect framework. In ICDE, pages 140--149, 2008. Google ScholarDigital Library
- J.-G. Lee, J. Han, X. Li, and H. Gonzalez. Traclass: trajectory classification using hierarchical region-based and trajectory-based clustering. In VLDB, 2008. Google ScholarDigital Library
- J.-G. Lee, J. Han, and K.-Y. Whang. Trajectory clustering: a partition-and-group framework. In SIGMOD, pages 593--604, 2007. Google ScholarDigital Library
- S. T. Leutenegger, M. Lopez, J. Edgington, et al. Str: A simple and efficient algorithm for r-tree packing. In ICDE, pages 497--506, 1997. Google ScholarDigital Library
- C. Long, R. C. Wong, and H. V. Jagadish. Direction-preserving trajectory simplification. PVLDB, 6(10):949--960, 2013. Google ScholarDigital Library
- C. Long, R. C. Wong, and H. V. Jagadish. Trajectory simplification: On minimizing the direction-based error. PVLDB, 8(1):49--60, 2014. Google ScholarDigital Library
- S. Ranu, D. P, A. D. Telang, P. Deshpande, and S. Raghavan. Indexing and matching trajectories under inconsistent sampling rates. In ICDE, 2015.Google ScholarCross Ref
- B. K. Sriperumbudur, K. Fukumizu, and G. R. G. Lanckriet. Universality, characteristic kernels and RKHS embedding of measures. Journal of Machine Learning Research, 12:2389--2410, 2011. Google ScholarDigital Library
- H. Su, K. Zheng, J. Huang, H. Wang, and X. Zhou. Calibrating trajectory data for spatio-temporal similarity analysis. VLDBJ, 24(1):93--116, 2015. Google ScholarDigital Library
- H. Su, K. Zheng, H. Wang, J. Huang, and X. Zhou. Calibrating trajectory data for similarity-based analysis. In SIGMOD, 2013. Google ScholarDigital Library
- H. Su, K. Zheng, K. Zeng, J. Huang, S. W. Sadiq, N. J. Yuan, and X. Zhou. Making sense of trajectory data: A partition-and-summarization approach. In ICDE, pages 963--974, 2015.Google ScholarCross Ref
- H. Su, K. Zheng, K. Zheng, J. Huang, and X. Zhou. Stmaker - A system to make sense of trajectory data. PVLDB, 7(13):1701--1704, 2014. Google ScholarDigital Library
- C. F. Torres and R. Trujillo-Rasua. The fréchet/manhattan distance and the trajectory anonymisation problem. In DBSec, pages 19--34, 2016.Google ScholarCross Ref
- J. K. Uhlmann. Satisfying general proximity/similarity queries with metric trees. Inf. Process. Lett., 40(4):175--179, 1991.Google ScholarCross Ref
- M. Vlachos, D. Gunopulos, and G. Kollios. Discovering similar multidimensional trajectories. In ICDE, 2002. Google ScholarDigital Library
- H. Wang, H. Su, K. Zheng, S. W. Sadiq, and X. Zhou. An effectiveness study on trajectory similarity measures. In ADC, 2013. Google ScholarDigital Library
- H. Wang, K. Zheng, X. Zhou, and S. W. Sadiq. Sharkdb: An in-memory storage system for massive trajectory data. In SIGMOD, pages 1099--1104, 2015. Google ScholarDigital Library
- D. Xie, F. Li, B. Yao, G. Li, L. Zhou, and M. Guo. Simba: Efficient in-memory spatial analytics. In SIGMOD, pages 1071--1085, 2016. Google ScholarDigital Library
- B.-K. Yi, H. Jagadish, and C. Faloutsos. Efficient retrieval of similar time sequences under time warping. In ICDE, 1998. Google ScholarDigital Library
- P. N. Yianilos. Data structures and algorithms for nearest neighbor search in general metric spaces. In SODA, 1993. Google ScholarDigital Library
- R. Ying, J. Pan, K. Fox, and P. K. Agarwal. A simple efficient approximation algorithm for dynamic time warping. In SIGSPATIAL, 2016. Google ScholarDigital Library
- J. Yuan, Y. Zheng, C. Zhang, W. Xie, X. Xie, G. Sun, and Y Huang. T-drive: driving directions based on taxi trajectories. In SIGSPATIAL, 2010. Google ScholarDigital Library
- M. Zaharia, M. Chowdhury, T. Das, A. Dave, J. Ma, M. McCauley, M. J. Franklin, S. Shenker, and I. Stoica. Resilient distributed datasets: A fault-tolerant abstraction for in-memory cluster computing. In NSDI, 2012. Google ScholarDigital Library
- Z. Zhang, K. Huang, and T. Tan. Comparison of similarity measures for trajectory clustering in outdoor surveillance scenes. In ICPR, 2006. Google ScholarDigital Library
- Y. Zheng and X. Zhou, editors. Computing with Spatial Trajectories. Springer, 2011. Google ScholarDigital Library
Index Terms
- Distributed trajectory similarity search
Recommendations
Trajectory similarity measures
Storing, querying, and analyzing trajectories is becoming increasingly important, as the availability and volumes of trajectory data increases. One important class of trajectory analysis is computing trajectory similarity. This paper introduces and ...
Parallel trajectory search based on distributed index
Study distributed data management from big data trajectory based on distributed R-tree.The query trajectory is based on distance threshold and activities involved in the trajectory.The algorithms to store and maintain data into distributed index achieve ...
SST: Synchronized Spatial-Temporal Trajectory Similarity Search
AbstractThe volume of trajectory data has become tremendously large in recent years. How to effectively and efficiently search similar trajectories has become an important task. Firstly, to measure the similarity between a trajectory and a query, ...
Comments