DOI: 10.1145/3340531.3412761
research-article

GeoFlink: A Distributed and Scalable Framework for the Real-time Processing of Spatial Streams

Published: 19 October 2020

ABSTRACT

Apache Flink is an open-source system for scalable processing of batch and streaming data. Flink does not natively support efficient processing of spatial data streams, which many applications dealing with spatial data require. Other scalable spatial data processing platforms, including GeoSpark and Spatial Hadoop, do not support streaming workloads and can handle only static/batch workloads. To fill this gap, we present GeoFlink, which extends Apache Flink to support spatial data types, indexes, and continuous queries over spatial data streams. To enable efficient processing of spatial continuous queries and effective data distribution across Flink cluster nodes, a grid-based index is introduced. GeoFlink currently supports spatial range, spatial kNN, and spatial join queries on the point data type. An experimental study on real spatial data streams shows that GeoFlink achieves significantly higher query throughput than ordinary Flink processing.
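
To make the idea of a grid-based index for range queries concrete, the following is a minimal single-machine sketch in Python. It is not GeoFlink's API or implementation: the class name `UniformGrid`, its methods, and the choice of a square grid over a fixed bounding box with planar Euclidean distance are all assumptions made for illustration. The sketch assigns each point to a grid cell, then answers a range query by visiting only the cells that overlap the query circle's bounding box and filtering the candidates by exact distance.

```python
import math
from collections import defaultdict


class UniformGrid:
    """Hypothetical uniform grid over a bounding box (illustration only)."""

    def __init__(self, min_x, min_y, max_x, max_y, cells_per_side):
        self.min_x, self.min_y = min_x, min_y
        self.n = cells_per_side
        self.cell_w = (max_x - min_x) / cells_per_side
        self.cell_h = (max_y - min_y) / cells_per_side
        self.cells = defaultdict(list)  # (col, row) -> list of points

    def cell_of(self, x, y):
        # Map a coordinate to its cell, clamping to the grid boundary.
        col = min(max(int((x - self.min_x) / self.cell_w), 0), self.n - 1)
        row = min(max(int((y - self.min_y) / self.cell_h), 0), self.n - 1)
        return col, row

    def insert(self, x, y):
        self.cells[self.cell_of(x, y)].append((x, y))

    def range_query(self, qx, qy, r):
        # Candidate cells: those overlapping the bounding box of the circle.
        lo_col, lo_row = self.cell_of(qx - r, qy - r)
        hi_col, hi_row = self.cell_of(qx + r, qy + r)
        hits = []
        for col in range(lo_col, hi_col + 1):
            for row in range(lo_row, hi_row + 1):
                for (x, y) in self.cells.get((col, row), ()):
                    # Exact check (planar Euclidean distance, an assumption).
                    if math.hypot(x - qx, y - qy) <= r:
                        hits.append((x, y))
        return hits


grid = UniformGrid(0.0, 0.0, 10.0, 10.0, cells_per_side=5)
for point in [(1.0, 1.0), (2.0, 2.0), (5.0, 5.0), (9.0, 9.0)]:
    grid.insert(*point)
nearby = sorted(grid.range_query(1.5, 1.5, 1.0))
```

In a distributed setting, a cell identifier of this kind could also serve as a partitioning key so that nearby points are processed on the same node; whether and how GeoFlink does this is described in the paper itself, not in this sketch.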

Supplemental Material

3340531.3412761.mp4 (mp4, 20.7 MB)

        • Published in

          CIKM '20: Proceedings of the 29th ACM International Conference on Information & Knowledge Management
          October 2020
          3619 pages
          ISBN: 9781450368599
          DOI: 10.1145/3340531

          Copyright © 2020 ACM

          Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

          Publisher

          Association for Computing Machinery

          New York, NY, United States

          Acceptance Rates

          Overall Acceptance Rate: 1,861 of 8,427 submissions, 22%
