skip to main content
research-article

Distributed trajectory similarity search

Published:01 August 2017Publication History
Skip Abstract Section

Abstract

Mobile and sensing devices have already become ubiquitous. They have made tracking moving objects an easy task. As a result, mobile applications like Uber and many IoT projects have generated massive amounts of trajectory data that can no longer be processed by a single machine efficiently. Among the typical query operations over trajectories, similarity search is a common yet expensive operator in querying trajectory data. It is useful for applications in different domains such as traffic and transportation optimizations, weather forecast and modeling, and sports analytics. It is also a fundamental operator for many important mining operations such as clustering and classification of trajectories. In this paper, we propose a distributed query framework to process trajectory similarity search over a large set of trajectories. We have implemented the proposed framework in Spark, a popular distributed data processing engine, by carefully considering different design choices. Our query framework supports both the Hausdorff distance the Fréchet distance. Extensive experiments have demonstrated the excellent scalability and query efficiency achieved by our design, compared to other methods and design alternatives.

References

  1. P. K. Agarwal, R. Ben Avraham, H. Kaplan, and M. Sharir. Computing the discrete frechet distance in subquadratic time. Siam Journal of Computing, 43:429--449, 2014.Google ScholarGoogle ScholarCross RefCross Ref
  2. P. K. Agarwal and R. Sharathkumar. Approximation algorithms for bipartite matching with metric and geometric costs. In STOC, 2014. Google ScholarGoogle ScholarDigital LibraryDigital Library
  3. H. Alt and M. Godau. Computing the fröchet distance between two polygonal curves. JCG Appl., 5:75--91, 1995.Google ScholarGoogle Scholar
  4. H. Alt, C. Knauer, and C. Wenk. Comparison of distance measures for planar curves. Algoritkmica, 2004. Google ScholarGoogle ScholarDigital LibraryDigital Library
  5. H. Alt and L. Scharf. Computing the hausdorff distance between curved objects. JCG Appl., 18(4):307--320, 2008.Google ScholarGoogle Scholar
  6. Y.-B. Bai, J.-H. Yong, C.-Y Liu, X.-M. Liu, and Y. Meng. Polyline approach for approximating Hausdorff distance between planar free-form curves. CAD, 43:687--698, 2011. Google ScholarGoogle ScholarDigital LibraryDigital Library
  7. M. d. Berg, O. Cheong, M. v. Kreveld, and M. Overmars. Computational Geometry: Algorithms and Applications. Springer-Verlag TELOS, 2008. Google ScholarGoogle ScholarDigital LibraryDigital Library
  8. T. Bozkaya and Z. M. Özsoyoglu. Indexing large metric spaces for similarity search queries. TODS, 24(3):361--404, 1999. Google ScholarGoogle ScholarDigital LibraryDigital Library
  9. S. Cabello, P. Giannopoulos, C. Knauer, and G. Rote. Matching point sets with respect to the Earth mover's distance. Computational Geometry: Theory and Applications, 39:118--133, 2008. Google ScholarGoogle ScholarDigital LibraryDigital Library
  10. P. Cao and Z. Wang. Efficient top-k query calculation in distributed networks. In PODC, pages 206--215, 2004. Google ScholarGoogle ScholarDigital LibraryDigital Library
  11. S. Chambi, D. Lemire, O. Kaser, and R. Godin. Better bitmap performance with roaring bitmaps. Softw., Pract. Exper., 2016. Google ScholarGoogle ScholarDigital LibraryDigital Library
  12. L. Chen and R. Ng. On the marriage of lp-norms and edit distance. In VLDB, pages 792--803, 2004. Google ScholarGoogle ScholarDigital LibraryDigital Library
  13. L. Chen, M. T. Özsu, and V. Oria. Robust and fast similarity search for moving object trajectories. In SIGMOD, pages 491--502, 2005. Google ScholarGoogle ScholarDigital LibraryDigital Library
  14. P. Ciaccia, M. Patella, and P. Zezula. M-tree: An efficient access method for similarity search in metric spaces. In VLDB, pages 426--35, 1997. Google ScholarGoogle ScholarDigital LibraryDigital Library
  15. P. Cudré-Mauroux, E. Wu, and S. Madden. Trajstore: An adaptive storage system for very large trajectory data sets. In ICDE, pages 109--120, 2010.Google ScholarGoogle ScholarCross RefCross Ref
  16. M. de Berg, A. F. Cook IV, and J. Gudmundsson. Fast frechet queries. In Symposium on Algorithms and Computation, 2011. Google ScholarGoogle ScholarDigital LibraryDigital Library
  17. A. Driemel and S. Har-Peled. Jaywalking your dog - computing the Frëchet distance with shortcuts. Siam Journal of Computing, 42:1830--1866, 2013.Google ScholarGoogle ScholarCross RefCross Ref
  18. E. Frentzos, K. Gratsias, and Y Theodoridis. Index-based most similar trajectory search. In ICDE, pages 816--825, 2007.Google ScholarGoogle ScholarCross RefCross Ref
  19. A. W. Fu, P. M. Chan, Y Cheung, and Y. S. Moon. Dynamic vp-tree indexing for n-nearest neighbor search given pair-wise distances. VLDBJ, 2000. Google ScholarGoogle ScholarDigital LibraryDigital Library
  20. A. Guttman. R-trees: A dynamic index structure for spatial searching. In SIGMOD, pages 47--57, 1984. Google ScholarGoogle ScholarDigital LibraryDigital Library
  21. J. Hangouët. Computation of the Hausdorff distance betwene plane vector polylines. In Proceedings AutoCarto 12, 1995.Google ScholarGoogle Scholar
  22. I. Kamel and C. Faloutsos. Hilbert r-tree: An improved r-tree using fractals. In VLDB, pages 500--509, 1994. Google ScholarGoogle ScholarDigital LibraryDigital Library
  23. J.-G. Lee, J. Han, and X. Li. Trajectory outlier detection: A partition-and-detect framework. In ICDE, pages 140--149, 2008. Google ScholarGoogle ScholarDigital LibraryDigital Library
  24. J.-G. Lee, J. Han, X. Li, and H. Gonzalez. Traclass: trajectory classification using hierarchical region-based and trajectory-based clustering. In VLDB, 2008. Google ScholarGoogle ScholarDigital LibraryDigital Library
  25. J.-G. Lee, J. Han, and K.-Y. Whang. Trajectory clustering: a partition-and-group framework. In SIGMOD, pages 593--604, 2007. Google ScholarGoogle ScholarDigital LibraryDigital Library
  26. S. T. Leutenegger, M. Lopez, J. Edgington, et al. Str: A simple and efficient algorithm for r-tree packing. In ICDE, pages 497--506, 1997. Google ScholarGoogle ScholarDigital LibraryDigital Library
  27. C. Long, R. C. Wong, and H. V. Jagadish. Direction-preserving trajectory simplification. PVLDB, 6(10):949--960, 2013. Google ScholarGoogle ScholarDigital LibraryDigital Library
  28. C. Long, R. C. Wong, and H. V. Jagadish. Trajectory simplification: On minimizing the direction-based error. PVLDB, 8(1):49--60, 2014. Google ScholarGoogle ScholarDigital LibraryDigital Library
  29. S. Ranu, D. P, A. D. Telang, P. Deshpande, and S. Raghavan. Indexing and matching trajectories under inconsistent sampling rates. In ICDE, 2015.Google ScholarGoogle ScholarCross RefCross Ref
  30. B. K. Sriperumbudur, K. Fukumizu, and G. R. G. Lanckriet. Universality, characteristic kernels and RKHS embedding of measures. Journal of Machine Learning Research, 12:2389--2410, 2011. Google ScholarGoogle ScholarDigital LibraryDigital Library
  31. H. Su, K. Zheng, J. Huang, H. Wang, and X. Zhou. Calibrating trajectory data for spatio-temporal similarity analysis. VLDBJ, 24(1):93--116, 2015. Google ScholarGoogle ScholarDigital LibraryDigital Library
  32. H. Su, K. Zheng, H. Wang, J. Huang, and X. Zhou. Calibrating trajectory data for similarity-based analysis. In SIGMOD, 2013. Google ScholarGoogle ScholarDigital LibraryDigital Library
  33. H. Su, K. Zheng, K. Zeng, J. Huang, S. W. Sadiq, N. J. Yuan, and X. Zhou. Making sense of trajectory data: A partition-and-summarization approach. In ICDE, pages 963--974, 2015.Google ScholarGoogle ScholarCross RefCross Ref
  34. H. Su, K. Zheng, K. Zheng, J. Huang, and X. Zhou. Stmaker - A system to make sense of trajectory data. PVLDB, 7(13):1701--1704, 2014. Google ScholarGoogle ScholarDigital LibraryDigital Library
  35. C. F. Torres and R. Trujillo-Rasua. The fréchet/manhattan distance and the trajectory anonymisation problem. In DBSec, pages 19--34, 2016.Google ScholarGoogle ScholarCross RefCross Ref
  36. J. K. Uhlmann. Satisfying general proximity/similarity queries with metric trees. Inf. Process. Lett., 40(4):175--179, 1991.Google ScholarGoogle ScholarCross RefCross Ref
  37. M. Vlachos, D. Gunopulos, and G. Kollios. Discovering similar multidimensional trajectories. In ICDE, 2002. Google ScholarGoogle ScholarDigital LibraryDigital Library
  38. H. Wang, H. Su, K. Zheng, S. W. Sadiq, and X. Zhou. An effectiveness study on trajectory similarity measures. In ADC, 2013. Google ScholarGoogle ScholarDigital LibraryDigital Library
  39. H. Wang, K. Zheng, X. Zhou, and S. W. Sadiq. Sharkdb: An in-memory storage system for massive trajectory data. In SIGMOD, pages 1099--1104, 2015. Google ScholarGoogle ScholarDigital LibraryDigital Library
  40. D. Xie, F. Li, B. Yao, G. Li, L. Zhou, and M. Guo. Simba: Efficient in-memory spatial analytics. In SIGMOD, pages 1071--1085, 2016. Google ScholarGoogle ScholarDigital LibraryDigital Library
  41. B.-K. Yi, H. Jagadish, and C. Faloutsos. Efficient retrieval of similar time sequences under time warping. In ICDE, 1998. Google ScholarGoogle ScholarDigital LibraryDigital Library
  42. P. N. Yianilos. Data structures and algorithms for nearest neighbor search in general metric spaces. In SODA, 1993. Google ScholarGoogle ScholarDigital LibraryDigital Library
  43. R. Ying, J. Pan, K. Fox, and P. K. Agarwal. A simple efficient approximation algorithm for dynamic time warping. In SIGSPATIAL, 2016. Google ScholarGoogle ScholarDigital LibraryDigital Library
  44. J. Yuan, Y. Zheng, C. Zhang, W. Xie, X. Xie, G. Sun, and Y Huang. T-drive: driving directions based on taxi trajectories. In SIGSPATIAL, 2010. Google ScholarGoogle ScholarDigital LibraryDigital Library
  45. M. Zaharia, M. Chowdhury, T. Das, A. Dave, J. Ma, M. McCauley, M. J. Franklin, S. Shenker, and I. Stoica. Resilient distributed datasets: A fault-tolerant abstraction for in-memory cluster computing. In NSDI, 2012. Google ScholarGoogle ScholarDigital LibraryDigital Library
  46. Z. Zhang, K. Huang, and T. Tan. Comparison of similarity measures for trajectory clustering in outdoor surveillance scenes. In ICPR, 2006. Google ScholarGoogle ScholarDigital LibraryDigital Library
  47. Y. Zheng and X. Zhou, editors. Computing with Spatial Trajectories. Springer, 2011. Google ScholarGoogle ScholarDigital LibraryDigital Library

Index Terms

  1. Distributed trajectory similarity search
        Index terms have been assigned to the content through auto-classification.

        Recommendations

        Comments

        Login options

        Check if you have access through your login credentials or your institution to get full access on this article.

        Sign in

        Full Access

        • Published in

          cover image Proceedings of the VLDB Endowment
          Proceedings of the VLDB Endowment  Volume 10, Issue 11
          August 2017
          432 pages
          ISSN:2150-8097
          Issue’s Table of Contents

          Publisher

          VLDB Endowment

          Publication History

          • Published: 1 August 2017
          Published in pvldb Volume 10, Issue 11

          Qualifiers

          • research-article

        PDF Format

        View or Download as a PDF file.

        PDF

        eReader

        View online with eReader.

        eReader