Published in: GeoInformatica 4/2018

05-07-2018

ST-Hadoop: a MapReduce framework for spatio-temporal data

Authors: Louai Alarabi, Mohamed F. Mokbel, Mashaal Musleh

Abstract

This paper presents ST-Hadoop, the first full-fledged open-source MapReduce framework with native support for spatio-temporal data. ST-Hadoop is a comprehensive extension to Hadoop and SpatialHadoop that injects spatio-temporal data awareness into each of their layers, mainly the language, indexing, and operations layers. In the language layer, ST-Hadoop provides built-in spatio-temporal data types and operations. In the indexing layer, ST-Hadoop spatio-temporally loads and divides data across computation nodes in the Hadoop Distributed File System in a way that mimics spatio-temporal index structures, which results in orders of magnitude better performance than Hadoop and SpatialHadoop when dealing with spatio-temporal data and queries. In the operations layer, ST-Hadoop ships with support for three fundamental spatio-temporal queries, namely spatio-temporal range, top-k nearest neighbor, and join queries. The extensibility of ST-Hadoop allows others to extend its features and operations easily using approaches similar to those described in the paper. Extensive experiments conducted on a large-scale dataset of size 10 TB that contains over 1 billion spatio-temporal records show that ST-Hadoop achieves orders of magnitude better performance than Hadoop and SpatialHadoop when dealing with spatio-temporal data and operations. The key idea behind the performance gain in ST-Hadoop is its ability to index spatio-temporal data within the Hadoop Distributed File System.
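To make the indexing idea concrete, the following is a minimal single-machine sketch in Python of how partitioning records by time slice and spatial grid cell lets a spatio-temporal range query skip partitions that cannot contain results. It is an illustration only, not ST-Hadoop's implementation: the function names (`partition_key`, `build_index`, `range_query`), the uniform grid, and the fixed `cell` and `slice_len` parameters are all assumptions introduced here, and the abstract does not specify the actual partitioning scheme.

```python
from collections import defaultdict

# Hypothetical helper: maps a record to a (time slice, grid column, grid row)
# partition. A uniform grid and fixed-length time slices are assumptions.
def partition_key(x, y, t, cell=10.0, slice_len=86400):
    return (int(t // slice_len), int(x // cell), int(y // cell))

def build_index(records, cell=10.0, slice_len=86400):
    """Group (x, y, t) records by their partition key."""
    index = defaultdict(list)
    for x, y, t in records:
        index[partition_key(x, y, t, cell, slice_len)].append((x, y, t))
    return index

def range_query(index, x1, y1, x2, y2, t1, t2, cell=10.0, slice_len=86400):
    """Return (matching records, number of partitions scanned).

    Only partitions whose key lies between the keys of the query's
    lower and upper corners are scanned; all others are pruned.
    """
    lo = partition_key(x1, y1, t1, cell, slice_len)
    hi = partition_key(x2, y2, t2, cell, slice_len)
    hits, scanned = [], 0
    for key, bucket in index.items():
        if all(l <= k <= h for k, l, h in zip(key, lo, hi)):
            scanned += 1
            hits.extend(
                r for r in bucket
                if x1 <= r[0] <= x2 and y1 <= r[1] <= y2 and t1 <= r[2] <= t2
            )
    return sorted(hits), scanned
```

With records spread over several partitions, a query that covers a small region and a short time window touches only the overlapping partitions, which is the intuition behind the performance gain the abstract attributes to indexing.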

Metadata
Title
ST-Hadoop: a MapReduce framework for spatio-temporal data
Authors
Louai Alarabi
Mohamed F. Mokbel
Mashaal Musleh
Publication date
05-07-2018
Publisher
Springer US
Published in
GeoInformatica / Issue 4/2018
Print ISSN: 1384-6175
Electronic ISSN: 1573-7624
DOI
https://doi.org/10.1007/s10707-018-0325-6
