Skip to main content
Top
Published in: Distributed and Parallel Databases 4/2016

01-12-2016

CloudMdsQL: querying heterogeneous cloud data stores with a common language

Authors: Boyan Kolev, Patrick Valduriez, Carlyna Bondiombouy, Ricardo Jiménez-Peris, Raquel Pau, José Pereira

Published in: Distributed and Parallel Databases | Issue 4/2016

Log in

Activate our intelligent search to find suitable subject content or patents.

search-config
loading …

Abstract

The blooming of different cloud data management infrastructures, specialized for different kinds of data and tasks, has led to a wide diversification of DBMS interfaces and the loss of a common programming paradigm. In this paper, we present the design of a cloud multidatastore query language (CloudMdsQL), and its query engine. CloudMdsQL is a functional SQL-like language, capable of querying multiple heterogeneous data stores (relational and NoSQL) within a single query that may contain embedded invocations to each data store’s native query interface. The query engine has a fully distributed architecture, which provides important opportunities for optimization. The major innovation is that a CloudMdsQL query can exploit the full power of local data stores, by simply allowing some local data store native queries (e.g. a breadth-first search query against a graph database) to be called as functions, and at the same time be optimized, e.g. by pushing down select predicates, using bind join, performing join ordering, or planning intermediate data shipping. Our experimental validation, with three data stores (graph, document and relational) and representative queries, shows that CloudMdsQL satisfies the five important requirements for a cloud multidatastore query language.

Dont have a licence yet? Then find out more about our products and how to get one now:

Springer Professional "Wirtschaft+Technik"

Online-Abonnement

Mit Springer Professional "Wirtschaft+Technik" erhalten Sie Zugriff auf:

  • über 102.000 Bücher
  • über 537 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Maschinenbau + Werkstoffe
  • Versicherung + Risiko

Jetzt Wissensvorsprung sichern!

Springer Professional "Technik"

Online-Abonnement

Mit Springer Professional "Technik" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 390 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Maschinenbau + Werkstoffe




 

Jetzt Wissensvorsprung sichern!

Springer Professional "Wirtschaft"

Online-Abonnement

Mit Springer Professional "Wirtschaft" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 340 Zeitschriften

aus folgenden Fachgebieten:

  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Versicherung + Risiko




Jetzt Wissensvorsprung sichern!

Literature
1.
go back to reference Armbrust, M., Xin, R., Lian, C., Huai, Y., Liu, D., Bradley, J., Meng, X., Kaftan, T., Franklin, M., Ghodsi, A., Zaharia, M.: Spark SQL: Relational Data Processing in Spark. ACM SIGMOD Int. Conf. on Management of Data, pp. 1383-1394 (2015) Armbrust, M., Xin, R., Lian, C., Huai, Y., Liu, D., Bradley, J., Meng, X., Kaftan, T., Franklin, M., Ghodsi, A., Zaharia, M.: Spark SQL: Relational Data Processing in Spark. ACM SIGMOD Int. Conf. on Management of Data, pp. 1383-1394 (2015)
2.
go back to reference Binnig, C., Rehrmann, R., Faerber, F., Riewe, R.: FunSQL: It is time to make SQL functional. Int. Conf. on Extending Database Technology / Database Theory (EDBT/ICDT), pp. 41-46 (2012) Binnig, C., Rehrmann, R., Faerber, F., Riewe, R.: FunSQL: It is time to make SQL functional. Int. Conf. on Extending Database Technology / Database Theory (EDBT/ICDT), pp. 41-46 (2012)
3.
go back to reference Bondiombouy, C., Kolev, B., Levchenko, O., Valduriez, P. : Integrating Big Data and Relational Data with a Functional SQL-like Query Language. Int. Conf. on Databases and Expert Systems Applications (DEXA), pp. 170-185 (2015) Bondiombouy, C., Kolev, B., Levchenko, O., Valduriez, P. : Integrating Big Data and Relational Data with a Functional SQL-like Query Language. Int. Conf. on Databases and Expert Systems Applications (DEXA), pp. 170-185 (2015)
4.
go back to reference Bugiotti, F., Bursztyn, D., Deutsch, A., Ileana, I., Manolescu, I.: Invisible Glue: Scalable Self-Tuning Multi-Stores. Conf. on Innovative Data Systems Research (CIDR), 7pp (2015) Bugiotti, F., Bursztyn, D., Deutsch, A., Ileana, I., Manolescu, I.: Invisible Glue: Scalable Self-Tuning Multi-Stores. Conf. on Innovative Data Systems Research (CIDR), 7pp (2015)
6.
go back to reference Danforth, S., Valduriez, P.: A FAD for Data-Intensive Applications. IEEE Trans. on Knowledge and Data Engineering 4(1), 34–51 (1992)CrossRefMATH Danforth, S., Valduriez, P.: A FAD for Data-Intensive Applications. IEEE Trans. on Knowledge and Data Engineering 4(1), 34–51 (1992)CrossRefMATH
7.
go back to reference Doan, A., Halevy, A., Ives, Z.: Principles of Data Integration. Morgan Kaufmann, (2012) Doan, A., Halevy, A., Ives, Z.: Principles of Data Integration. Morgan Kaufmann, (2012)
8.
go back to reference Godfrey, P., Gryz, J., Hoppe, A., Ma, W., Zuzarte, C.: Query rewrites with views for XML in DB2. IEEE Int. Conf. on Data Engineering, pp. 1339–1350 (2009) Godfrey, P., Gryz, J., Hoppe, A., Ma, W., Zuzarte, C.: Query rewrites with views for XML in DB2. IEEE Int. Conf. on Data Engineering, pp. 1339–1350 (2009)
9.
go back to reference Gulisano, V., Jiménez-Peris, R., Patiño-Martinez, M., Valduriez, P.: StreamCloud: A Large Scale Data Streaming System. IEEE Int. Conf. on Distributed Computing Systems (ICDCS), pp. 126-137 (2010) Gulisano, V., Jiménez-Peris, R., Patiño-Martinez, M., Valduriez, P.: StreamCloud: A Large Scale Data Streaming System. IEEE Int. Conf. on Distributed Computing Systems (ICDCS), pp. 126-137 (2010)
10.
go back to reference Gulisano, V., Jiménez-Peris, R., Patiño-Martinez, M., Soriente, C., Valduriez, P.: StreamCloud: An Elastic and Scalable Data Streaming System. IEEE Trans. On Parallel and Distributed Systems 23(12), 2351–2365 (2012)CrossRef Gulisano, V., Jiménez-Peris, R., Patiño-Martinez, M., Soriente, C., Valduriez, P.: StreamCloud: An Elastic and Scalable Data Streaming System. IEEE Trans. On Parallel and Distributed Systems 23(12), 2351–2365 (2012)CrossRef
11.
go back to reference Haas, L. M., Kossmann, D., Wimmers, E. L., Yang, J.: Optimizing Queries across Diverse Data Sources. Int. Conf. on Very Large Databases (VLDB), pp. 276-285 (1997) Haas, L. M., Kossmann, D., Wimmers, E. L., Yang, J.: Optimizing Queries across Diverse Data Sources. Int. Conf. on Very Large Databases (VLDB), pp. 276-285 (1997)
12.
go back to reference Haase, P., Mathäß, T., Ziller, M.: An Evaluation of Approaches to Federated Query Processing over Linked Data. Int. Conf. on Semantic Systems (I-SEMANTICS) (2010) Haase, P., Mathäß, T., Ziller, M.: An Evaluation of Approaches to Federated Query Processing over Linked Data. Int. Conf. on Semantic Systems (I-SEMANTICS) (2010)
13.
go back to reference Hacıgümüs, H., Sankaranarayanan, J., Tatemura, J., LeFevre, J., Polyzotis, N.: Odyssey: A Multi-Store System for Evolutionary Analytics. Proceedings of the VLDB Endowment (PVLDB) 6(11), 1180–1181 (2013)CrossRef Hacıgümüs, H., Sankaranarayanan, J., Tatemura, J., LeFevre, J., Polyzotis, N.: Odyssey: A Multi-Store System for Evolutionary Analytics. Proceedings of the VLDB Endowment (PVLDB) 6(11), 1180–1181 (2013)CrossRef
14.
go back to reference Hart, B., Valduriez, P., Danforth, S.: Parallelizing FAD using Compile Time Analysis Techniques. IEEE Data Engineering Bulletin (12) 1, 9–15 (1989) Hart, B., Valduriez, P., Danforth, S.: Parallelizing FAD using Compile Time Analysis Techniques. IEEE Data Engineering Bulletin (12) 1, 9–15 (1989)
16.
go back to reference LeFevre, J., Sankaranarayanan, J., Hacıgümüs, H., Tatemura, J., Polyzotis, N., Carey, M.: MISO: Souping Up Big Data Query Processing with a Multistore System. ACM SIGMOD Int. Conf. on Management of Data, pp. 1591-1602 (2014) LeFevre, J., Sankaranarayanan, J., Hacıgümüs, H., Tatemura, J., Polyzotis, N., Carey, M.: MISO: Souping Up Big Data Query Processing with a Multistore System. ACM SIGMOD Int. Conf. on Management of Data, pp. 1591-1602 (2014)
17.
go back to reference Liu, Z.H., Chang, H.J., Sthanikam, B.: Efficient support of XQuery Update Facility in XML enabled RDBMS. IEEE Int. Conf. on Data Engineering, pp. 1394–1404 (2012) Liu, Z.H., Chang, H.J., Sthanikam, B.: Efficient support of XQuery Update Facility in XML enabled RDBMS. IEEE Int. Conf. on Data Engineering, pp. 1394–1404 (2012)
18.
go back to reference Martínez-Bazan, N., Muntés-Mulero, V., Gómez-Villamor, S., Águila-Llorente, M.A., Domínguez-Sal, D., Larriba-Pey, J-L.: Efficient Graph Management Based on Bitmap Indices. Int. Database Engineering & Applications Symposium (IDEAS), pp. 110-119 (2012) Martínez-Bazan, N., Muntés-Mulero, V., Gómez-Villamor, S., Águila-Llorente, M.A., Domínguez-Sal, D., Larriba-Pey, J-L.: Efficient Graph Management Based on Bitmap Indices. Int. Database Engineering & Applications Symposium (IDEAS), pp. 110-119 (2012)
19.
go back to reference Meijer, E., Beckman, B., Bierman, G. M.: LINQ: Reconciling Object, Relations and XML in the .NET Framework. ACM SIGMOD Int. Conf. on Data Management, pp. 706-706 (2006) Meijer, E., Beckman, B., Bierman, G. M.: LINQ: Reconciling Object, Relations and XML in the .NET Framework. ACM SIGMOD Int. Conf. on Data Management, pp. 706-706 (2006)
21.
go back to reference Özsu, T., Valduriez, P.: Principles of Distributed Database Systems – Third Edition. Springer, 850 pages (2011) Özsu, T., Valduriez, P.: Principles of Distributed Database Systems – Third Edition. Springer, 850 pages (2011)
22.
go back to reference Tomasic, A., Raschid, L., Valduriez, P.: Scaling Access to Heterogeneous Data Sources with DISCO. IEEE Transactions on Knowledge and Data Engineering 10(5), 808–823 (1998)CrossRef Tomasic, A., Raschid, L., Valduriez, P.: Scaling Access to Heterogeneous Data Sources with DISCO. IEEE Transactions on Knowledge and Data Engineering 10(5), 808–823 (1998)CrossRef
23.
go back to reference Valduriez, P., Danforth, S.: Functional SQL, an SQL Upward Compatible Database Programming Language. Information Sciences 62(3), 183–203 (1992)CrossRefMATH Valduriez, P., Danforth, S.: Functional SQL, an SQL Upward Compatible Database Programming Language. Information Sciences 62(3), 183–203 (1992)CrossRefMATH
24.
go back to reference Wyss, C.M., Robertson, E.L.: Relational Languages for Metadata Integration. ACM Trans. On Database Systems 30(2), 624–660 (2005)CrossRef Wyss, C.M., Robertson, E.L.: Relational Languages for Metadata Integration. ACM Trans. On Database Systems 30(2), 624–660 (2005)CrossRef
25.
go back to reference Yuanyuan, T., Zou, T., Özcan, F., Goncalves, R., Pirahesh, H.: Joins for Hybrid Warehouses: Exploiting Massive Parallelism and Enterprise Data Warehouses. Int. Conf. on Extending Database Technology / Database Theory (EDBT/ICDT), pp. 373-384 (2015) Yuanyuan, T., Zou, T., Özcan, F., Goncalves, R., Pirahesh, H.: Joins for Hybrid Warehouses: Exploiting Massive Parallelism and Enterprise Data Warehouses. Int. Conf. on Extending Database Technology / Database Theory (EDBT/ICDT), pp. 373-384 (2015)
26.
go back to reference Zhu, M., Risch, T.: Querying Combined Cloud-Based and Relational Databases, Int. Conf. on Cloud and Service Computing, pp. 330–335 (2011) Zhu, M., Risch, T.: Querying Combined Cloud-Based and Relational Databases, Int. Conf. on Cloud and Service Computing, pp. 330–335 (2011)
27.
go back to reference Zhu, Q., Larson, P.-A.: Global Query Processing and Optimization in the CORDS Multidatabase System, Int. Conf. on Parallel and Distributed Computing Systems, pp. 640–647 (1996) Zhu, Q., Larson, P.-A.: Global Query Processing and Optimization in the CORDS Multidatabase System, Int. Conf. on Parallel and Distributed Computing Systems, pp. 640–647 (1996)
Metadata
Title
CloudMdsQL: querying heterogeneous cloud data stores with a common language
Authors
Boyan Kolev
Patrick Valduriez
Carlyna Bondiombouy
Ricardo Jiménez-Peris
Raquel Pau
José Pereira
Publication date
01-12-2016
Publisher
Springer US
Published in
Distributed and Parallel Databases / Issue 4/2016
Print ISSN: 0926-8782
Electronic ISSN: 1573-7578
DOI
https://doi.org/10.1007/s10619-015-7185-y

Other articles of this Issue 4/2016

Distributed and Parallel Databases 4/2016 Go to the issue

Premium Partner