2016 | OriginalPaper | Book chapter

Learning-Based SPARQL Query Performance Prediction

Authors: Wei Emma Zhang, Quan Z. Sheng, Kerry Taylor, Yongrui Qin, Lina Yao

Published in: Web Information Systems Engineering – WISE 2016

Publisher: Springer International Publishing

Abstract

Predictions of query performance make it possible to rewrite queries to reduce their time cost, or to reschedule them to a time when resources are not under contention. As more large RDF datasets appear on the Web, predicting the performance of SPARQL query processing has become a major challenge in managing such datasets efficiently. In this paper, we focus on representing SPARQL queries as feature vectors and using these vectors to train models that predict SPARQL query performance. Evaluations on real-world SPARQL queries demonstrate that the proposed approach predicts SPARQL query performance effectively and outperforms state-of-the-art approaches.
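
The general idea of turning queries into feature vectors and training a predictor on them can be sketched as follows. This is only an illustration under stated assumptions, not the feature set or model from the paper: the keyword-count features, the helper names `featurize` and `knn_predict`, the choice of k-nearest-neighbour regression, and all execution times in `history` are invented for the example.

```python
import math
import re

# Hypothetical feature set (an assumption, not taken from the paper):
# occurrence counts of a few SPARQL keywords plus the number of distinct variables.
KEYWORDS = ("OPTIONAL", "UNION", "FILTER", "DISTINCT", "LIMIT")


def featurize(query):
    """Map a SPARQL query string to a fixed-length numeric feature vector."""
    text = query.upper()
    # Crude keyword matching on the raw text; a real system would parse the query.
    keyword_counts = [
        float(len(re.findall(r"\b" + keyword + r"\b", text))) for keyword in KEYWORDS
    ]
    distinct_vars = float(len(set(re.findall(r"\?\w+", query))))
    return keyword_counts + [distinct_vars]


def knn_predict(train, vector, k=2):
    """Average the targets of the k training vectors closest to `vector`."""
    nearest = sorted(train, key=lambda pair: math.dist(pair[0], vector))[:k]
    return sum(target for _, target in nearest) / len(nearest)


# Made-up training pairs (query, execution time in ms); the times are NOT measurements.
history = [
    ("SELECT ?s WHERE { ?s ?p ?o } LIMIT 10", 12.0),
    ("SELECT DISTINCT ?s WHERE { ?s ?p ?o OPTIONAL { ?s ?q ?r } }", 85.0),
    ("SELECT ?s WHERE { { ?s ?p ?o } UNION { ?s ?q ?r } FILTER(?o > 5) }", 140.0),
]
train = [(featurize(query), time_ms) for query, time_ms in history]

new_query = "SELECT ?s WHERE { ?s ?p ?o OPTIONAL { ?s ?q ?r } } LIMIT 5"
predicted_ms = knn_predict(train, featurize(new_query), k=2)
print(f"predicted execution time: {predicted_ms:.1f} ms")
```

Any regression model could take the place of the nearest-neighbour step here; it was chosen for the sketch only because it needs no external library.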

Footnotes
2
Graph edit distance is the minimum number of edit operations (i.e., deletions, insertions, and substitutions of nodes and edges) needed to transform one graph into the other.
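
To make the definition concrete, here is a minimal brute-force sketch in Python. It assumes unit cost for every edit operation and a hypothetical graph encoding (node labels in a dict, directed labelled edges in a dict keyed by node pairs); the function name `graph_edit_distance` and the two example graphs are illustrative only. It tries every node mapping, so it is usable only on very small graphs.

```python
from itertools import permutations


def graph_edit_distance(nodes1, edges1, nodes2, edges2):
    """Exact graph edit distance with unit costs, by exhaustive search.

    nodes: dict node_id -> label; edges: dict (source_id, target_id) -> label.
    Node ids must not be None, because None marks a deleted node below.
    """
    ids1 = list(nodes1)
    # Each node of graph 1 maps to a distinct node of graph 2, or to None (deletion).
    targets = list(nodes2) + [None] * len(ids1)
    best = None
    for image in permutations(targets, len(ids1)):
        mapping = dict(zip(ids1, image))
        cost = 0
        # Node deletions (mapped to None) and node substitutions (label differs).
        for node, target in mapping.items():
            if target is None or nodes1[node] != nodes2[target]:
                cost += 1
        # Node insertions: nodes of graph 2 that no node of graph 1 maps to.
        cost += len(nodes2) - sum(1 for target in image if target is not None)
        # Edge substitutions and deletions induced by the node mapping.
        matched = set()
        for (src, dst), label in edges1.items():
            target_edge = (mapping[src], mapping[dst])
            if target_edge in edges2:
                matched.add(target_edge)
                if edges2[target_edge] != label:
                    cost += 1
            else:
                cost += 1
        # Edge insertions: edges of graph 2 not covered by any edge of graph 1.
        cost += len(edges2) - len(matched)
        if best is None or cost < best:
            best = cost
    return best


# Two tiny hypothetical query graphs: the second adds one node and one edge.
g1_nodes = {"a": "?x", "b": "?y"}
g1_edges = {("a", "b"): "knows"}
g2_nodes = {"p": "?x", "q": "?y", "r": "?z"}
g2_edges = {("p", "q"): "knows", ("q", "r"): "name"}

print(graph_edit_distance(g1_nodes, g1_edges, g2_nodes, g2_edges))  # prints 2
```

The distance of 2 corresponds to one node insertion and one edge insertion. Exhaustive search grows factorially with the number of nodes, so practical systems rely on approximations instead.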
 
Metadata
Title
Learning-Based SPARQL Query Performance Prediction
Authors
Wei Emma Zhang
Quan Z. Sheng
Kerry Taylor
Yongrui Qin
Lina Yao
Copyright year
2016
DOI
https://doi.org/10.1007/978-3-319-48740-3_23
