Skip to main content
Erschienen in: Distributed and Parallel Databases 2/2013

01.06.2013

\(\mathcal{MD}\)-HBase: design and implementation of an elastic data infrastructure for cloud-scale location services

verfasst von: Shoji Nishimura, Sudipto Das, Divyakant Agrawal, Amr El Abbadi

Erschienen in: Distributed and Parallel Databases | Ausgabe 2/2013

Einloggen

Aktivieren Sie unsere intelligente Suche, um passende Fachinhalte oder Patente zu finden.

search-config
loading …

Abstract

The ubiquity of location enabled devices has resulted in a wide proliferation of location based applications and services. To handle the growing scale, database management systems driving such location based services (LBS) must cope with high insert rates for location updates of millions of devices, while supporting efficient real-time analysis on latest location. Traditional DBMSs, equipped with multi-dimensional index structures, can efficiently handle spatio-temporal data. However, popular open-source relational database systems are overwhelmed by the high insertion rates, real-time querying requirements, and terabytes of data that these systems must handle. On the other hand, key-value stores can effectively support large scale operation, but do not natively provide multi-attribute accesses needed to support the rich querying functionality essential for the LBSs.
We present the design and implementation of \(\mathcal {MD}\) -HBase, a scalable data management infrastructure for LBSs that bridges this gap between scale and functionality. Our approach leverages a multi-dimensional index structure layered over a key-value store. The underlying key-value store allows the system to sustain high insert throughput and large data volumes, while ensuring fault-tolerance, and high availability. On the other hand, the index layer allows efficient multi-dimensional query processing. Our optimized query processing technique accesses only the index and storage level entries that intersect with the query region, thus ensuring efficient query processing. We present the design of \(\mathcal {MD}\)-HBase that demonstrates how two standard index structures—the K-d tree and the Quad tree—can be layered over a range partitioned key-value store to provide scalable multi-dimensional data infrastructure. Our prototype implementation using HBase, a standard open-source key-value store, can handle hundreds of thousands of inserts per second using a modest 16 node cluster, while efficiently processing multi-dimensional range queries and nearest neighbor queries in real-time with response times as low as few hundreds of milliseconds.

Sie haben noch keine Lizenz? Dann Informieren Sie sich jetzt über unsere Produkte:

Springer Professional "Wirtschaft+Technik"

Online-Abonnement

Mit Springer Professional "Wirtschaft+Technik" erhalten Sie Zugriff auf:

  • über 102.000 Bücher
  • über 537 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Maschinenbau + Werkstoffe
  • Versicherung + Risiko

Jetzt Wissensvorsprung sichern!

Springer Professional "Technik"

Online-Abonnement

Mit Springer Professional "Technik" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 390 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Maschinenbau + Werkstoffe




 

Jetzt Wissensvorsprung sichern!

Springer Professional "Wirtschaft"

Online-Abonnement

Mit Springer Professional "Wirtschaft" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 340 Zeitschriften

aus folgenden Fachgebieten:

  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Versicherung + Risiko




Jetzt Wissensvorsprung sichern!

Fußnoten
1
The name \(\mathcal {MD}\)-HBase signifies adding multi-dimensional data processing capabilities to HBase, a range partitioned key-value store.
 
3
Since in HBase, a region can only be split into two sub-regions, we could not implement RPB for Quad trees as our experiments are for a 3D space.
 
Literatur
1.
2.
Zurück zum Zitat Brinkhoff, T., Str, O.: A framework for generating network-based moving objects. GeoInformatica 6, 2002 (2002) CrossRef Brinkhoff, T., Str, O.: A framework for generating network-based moving objects. GeoInformatica 6, 2002 (2002) CrossRef
3.
Zurück zum Zitat Chang, F., Dean, J., Ghemawat, S., Hsieh, W.C., Wallach, D.A., Burrows, M., Chandra, T., Fikes, A., Gruber, R.E.: Bigtable: a distributed storage system for structured data. In: OSDI, pp. 205–218 (2006) Chang, F., Dean, J., Ghemawat, S., Hsieh, W.C., Wallach, D.A., Burrows, M., Chandra, T., Fikes, A., Gruber, R.E.: Bigtable: a distributed storage system for structured data. In: OSDI, pp. 205–218 (2006)
4.
Zurück zum Zitat Das, S., Agrawal, D., El Abbadi, A.: G-Store: a scalable data store for transactional multi key access in the cloud. In: SOCC, pp. 163–174 (2010) CrossRef Das, S., Agrawal, D., El Abbadi, A.: G-Store: a scalable data store for transactional multi key access in the cloud. In: SOCC, pp. 163–174 (2010) CrossRef
5.
Zurück zum Zitat Dean, J., Ghemawat, S.: MapReduce: simplified data processing on large clusters. In: OSDI, pp. 137–150 (2004) Dean, J., Ghemawat, S.: MapReduce: simplified data processing on large clusters. In: OSDI, pp. 137–150 (2004)
6.
Zurück zum Zitat DeWitt, D., Gray, J.: Parallel database systems: the future of high performance database systems. Commun. ACM 35(6), 85–98 (1992) CrossRef DeWitt, D., Gray, J.: Parallel database systems: the future of high performance database systems. Commun. ACM 35(6), 85–98 (1992) CrossRef
7.
Zurück zum Zitat Finkel, R.A., Bentley, J.L.: Quad trees: a data structure for retrieval on composite keys. Acta Inform. 4, 1–9 (1974) MATHCrossRef Finkel, R.A., Bentley, J.L.: Quad trees: a data structure for retrieval on composite keys. Acta Inform. 4, 1–9 (1974) MATHCrossRef
8.
Zurück zum Zitat Guttman, A.: R-trees: a dynamic index structure for spatial searching. In: SIGMOD, pp. 47–57 (1984) Guttman, A.: R-trees: a dynamic index structure for spatial searching. In: SIGMOD, pp. 47–57 (1984)
11.
Zurück zum Zitat Jensen, C.S., Lin, D., Ooi, B.C.: Query and update efficient b+-tree based indexing of moving objects. In: VLDB, pp. 768–779 (2004). VLDB Endowment Jensen, C.S., Lin, D., Ooi, B.C.: Query and update efficient b+-tree based indexing of moving objects. In: VLDB, pp. 768–779 (2004). VLDB Endowment
12.
Zurück zum Zitat Lawder, J.K.: Querying multi-dimensional data indexed using the Hilbert space-filling curve. SIGMOD Rec. 30, 2001 (2001) CrossRef Lawder, J.K.: Querying multi-dimensional data indexed using the Hilbert space-filling curve. SIGMOD Rec. 30, 2001 (2001) CrossRef
13.
Zurück zum Zitat Morton, G.M.: A computer oriented geodetic data base and a new technique in file sequencing. Tech. rep., IBM Ottawa, Canada (1966) Morton, G.M.: A computer oriented geodetic data base and a new technique in file sequencing. Tech. rep., IBM Ottawa, Canada (1966)
14.
Zurück zum Zitat Nishimura, S., Das, S., Agrawal, D., El Abbadi, A.: MD-HBase: a scalable multi-dimensional data infrastructure for location aware services. In: MDM, pp. 7–16 (2011) Nishimura, S., Das, S., Agrawal, D., El Abbadi, A.: MD-HBase: a scalable multi-dimensional data infrastructure for location aware services. In: MDM, pp. 7–16 (2011)
15.
Zurück zum Zitat Ramabhadran, S., Ratnasamy, S., Hellerstein, J.M., Shenker, S.: Prefix hash tree: an indexing data structure over distributed hash tables. Tech. rep., Intel Research, Berkeley (2004) Ramabhadran, S., Ratnasamy, S., Hellerstein, J.M., Shenker, S.: Prefix hash tree: an indexing data structure over distributed hash tables. Tech. rep., Intel Research, Berkeley (2004)
16.
Zurück zum Zitat Samet, H.: Foundations of Multidimensional and Metric Data Structures. Morgan Kaufmann, San Francisco (2005) Samet, H.: Foundations of Multidimensional and Metric Data Structures. Morgan Kaufmann, San Francisco (2005)
17.
Zurück zum Zitat Tao, Y., Yi, K., Sheng, C., Kalnis, P.: Quality and efficiency in high dimensional nearest neighbor search. In: SIGMOD, pp. 563–576 (2009) CrossRef Tao, Y., Yi, K., Sheng, C., Kalnis, P.: Quality and efficiency in high dimensional nearest neighbor search. In: SIGMOD, pp. 563–576 (2009) CrossRef
18.
Zurück zum Zitat Wang, J., Wu, S., Gao, H., Li, J., Ooi, B.C.: Indexing multi-dimensional data in a cloud system. In: SIGMOD, pp. 591–602 (2010) Wang, J., Wu, S., Gao, H., Li, J., Ooi, B.C.: Indexing multi-dimensional data in a cloud system. In: SIGMOD, pp. 591–602 (2010)
20.
Zurück zum Zitat Zhang, X., Ai, J., Wang, Z., Lu, J., Meng, X.: An efficient multi-dimensional index for cloud data management. In: CloudDB, pp. 17–24 (2009) CrossRef Zhang, X., Ai, J., Wang, Z., Lu, J., Meng, X.: An efficient multi-dimensional index for cloud data management. In: CloudDB, pp. 17–24 (2009) CrossRef
Metadaten
Titel
-HBase: design and implementation of an elastic data infrastructure for cloud-scale location services
verfasst von
Shoji Nishimura
Sudipto Das
Divyakant Agrawal
Amr El Abbadi
Publikationsdatum
01.06.2013
Verlag
Springer US
Erschienen in
Distributed and Parallel Databases / Ausgabe 2/2013
Print ISSN: 0926-8782
Elektronische ISSN: 1573-7578
DOI
https://doi.org/10.1007/s10619-012-7109-z

Weitere Artikel der Ausgabe 2/2013

Distributed and Parallel Databases 2/2013 Zur Ausgabe

Premium Partner