Skip to main content

2025 | OriginalPaper | Buchkapitel

Selectivity Estimation for Spatial Filters Using Optimizer Feedback: A Machine Learning Perspective

verfasst von : Nadir Guermoudi, Houcine Matallah, Amin Mesmoudi, Seif-Eddine Benkabou, Allel Hadjali

Erschienen in: Web Information Systems Engineering – WISE 2024

Verlag: Springer Nature Singapore

Aktivieren Sie unsere intelligente Suche, um passende Fachinhalte oder Patente zu finden.

search-config
loading …

Abstract

In query optimization, the precision of selectivity estimates for query predicates is foundational for selecting efficient execution plans. Spatial selectivity estimation, which assesses the count of relevant objects meeting specific spatial criteria, poses a significant challenge due to its multi-dimensional and complex nature. Moreover, it is essential that these estimation methods be both fast and minimize memory usage. In this paper, we leverage optimizer feedback to tackle the challenging task of estimating selectivity for multi-dimensional spatial predicates. We redefine spatial selectivity estimation as a regression problem and investigate the application of three types of machine learning (ML) models: neural networks, tree-based models, and instance-based models to address this challenge. We compare these ML approaches against baseline methods that rely on Minimum Bounding Rectangles (MBRs), encompassing both RTree-based and histogram-based estimations. Through extensive empirical evaluations using a real dataset, our study guides the choice of ML models in accordance with the data collected by the optimizer.

Sie haben noch keine Lizenz? Dann Informieren Sie sich jetzt über unsere Produkte:

Springer Professional "Wirtschaft+Technik"

Online-Abonnement

Mit Springer Professional "Wirtschaft+Technik" erhalten Sie Zugriff auf:

  • über 102.000 Bücher
  • über 537 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Maschinenbau + Werkstoffe
  • Versicherung + Risiko

Jetzt Wissensvorsprung sichern!

Springer Professional "Technik"

Online-Abonnement

Mit Springer Professional "Technik" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 390 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Maschinenbau + Werkstoffe




 

Jetzt Wissensvorsprung sichern!

Springer Professional "Wirtschaft"

Online-Abonnement

Mit Springer Professional "Wirtschaft" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 340 Zeitschriften

aus folgenden Fachgebieten:

  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Versicherung + Risiko




Jetzt Wissensvorsprung sichern!

Fußnoten
1
An MBR is the smallest axis-parallel rectangle that fully encompasses the specified spatial area.
 
2
e.g., POLYGON((\(-80.0 35.0, -80.0 45.0, -70.0 45.0, -70.0 35.0, -80.0 35.0\))).
 
Literatur
3.
Zurück zum Zitat An, N., Yang, Z.Y., Sivasubramaniam, A.: Selectivity estimation for spatial joins. In: ICDE, pp. 368–375. IEEE (2001) An, N., Yang, Z.Y., Sivasubramaniam, A.: Selectivity estimation for spatial joins. In: ICDE, pp. 368–375. IEEE (2001)
5.
Zurück zum Zitat Bimonte, S., Gallinucci, E., Marcel, P., Rizzi, S.: Data variety, come as you are in multi-model data warehouses. Inf. Syst. 104, 101734 (2022)CrossRef Bimonte, S., Gallinucci, E., Marcel, P., Rizzi, S.: Data variety, come as you are in multi-model data warehouses. Inf. Syst. 104, 101734 (2022)CrossRef
6.
Zurück zum Zitat Chen, T., Guestrin, C.: XGBoost: a scalable tree boosting system. In: SIGKDD, pp. 785–794 (2016) Chen, T., Guestrin, C.: XGBoost: a scalable tree boosting system. In: SIGKDD, pp. 785–794 (2016)
7.
Zurück zum Zitat Cheng, C., Song, X., Zhou, C.: Generic cumulative annular bucket histogram for spatial selectivity estimation of spatial database management system. Int. J. Geogr. Inf. Sci. 27(2), 339–362 (2013)CrossRef Cheng, C., Song, X., Zhou, C.: Generic cumulative annular bucket histogram for spatial selectivity estimation of spatial database management system. Int. J. Geogr. Inf. Sci. 27(2), 339–362 (2013)CrossRef
8.
Zurück zum Zitat Currim, S., Snodgrass, R.T., Suh, Y.K.: Identifying the root causes of DBMS suboptimality. ACM Trans. Database Syst. 49(1), 1–40 (2024) Currim, S., Snodgrass, R.T., Suh, Y.K.: Identifying the root causes of DBMS suboptimality. ACM Trans. Database Syst. 49(1), 1–40 (2024)
9.
Zurück zum Zitat Dutt, A., Wang, C., Nazi, A., Kandula, S., Narasayya, V., Chaudhuri, S.: Selectivity estimation for range predicates using lightweight models. Proc. VLDB 12(9), 1044–1057 (2019)CrossRef Dutt, A., Wang, C., Nazi, A., Kandula, S., Narasayya, V., Chaudhuri, S.: Selectivity estimation for range predicates using lightweight models. Proc. VLDB 12(9), 1044–1057 (2019)CrossRef
10.
Zurück zum Zitat Gaede, V., Günther, O.: Multidimensional access methods. ACM Comput. Surv. (CSUR) 30(2), 170–231 (1998)CrossRef Gaede, V., Günther, O.: Multidimensional access methods. ACM Comput. Surv. (CSUR) 30(2), 170–231 (1998)CrossRef
11.
Zurück zum Zitat Georgiadis, T., Mamoulis, N.: Raster intervals: an approximation technique for polygon intersection joins. Proc. ACM Manag. Data 1(1), 1–18 (2023)CrossRef Georgiadis, T., Mamoulis, N.: Raster intervals: an approximation technique for polygon intersection joins. Proc. ACM Manag. Data 1(1), 1–18 (2023)CrossRef
12.
Zurück zum Zitat Hoffart, J., Suchanek, F.M., Berberich, K., Weikum, G.: YAGO2: a spatially and temporally enhanced knowledge base from Wikipedia. Artif. Intell. 194, 28–61 (2013)MathSciNetCrossRef Hoffart, J., Suchanek, F.M., Berberich, K., Weikum, G.: YAGO2: a spatially and temporally enhanced knowledge base from Wikipedia. Artif. Intell. 194, 28–61 (2013)MathSciNetCrossRef
13.
Zurück zum Zitat Jacox, E.H., Samet, H.: Spatial join techniques. ACM Trans. Database Syst. (TODS) 32(1), 7–es (2007) Jacox, E.H., Samet, H.: Spatial join techniques. ACM Trans. Database Syst. (TODS) 32(1), 7–es (2007)
14.
Zurück zum Zitat Khelil, A., Mesmoudi, A., Galicia, J., Bellatreche, L., Hacid, M.S., Coquery, E.: Combining graph exploration and fragmentation for scalable RDF query processing. ISF 23(1), 165–183 (2021) Khelil, A., Mesmoudi, A., Galicia, J., Bellatreche, L., Hacid, M.S., Coquery, E.: Combining graph exploration and fragmentation for scalable RDF query processing. ISF 23(1), 165–183 (2021)
15.
Zurück zum Zitat Kramer, O., Kramer, O.: K-nearest neighbors. Dimensionality reduction with unsupervised nearest neighbors, pp. 13–23 (2013) Kramer, O., Kramer, O.: K-nearest neighbors. Dimensionality reduction with unsupervised nearest neighbors, pp. 13–23 (2013)
16.
Zurück zum Zitat Li, S., Ying, M.: Region connection calculus: its models and composition table. Artif. Intell. 145(1–2), 121–146 (2003)MathSciNetCrossRef Li, S., Ying, M.: Region connection calculus: its models and composition table. Artif. Intell. 145(1–2), 121–146 (2003)MathSciNetCrossRef
17.
Zurück zum Zitat Loh, W.Y.: Classification and regression trees. Wiley Interdisc. Rev. Data Min. Knowl. Discov. 1(1), 14–23 (2011)CrossRef Loh, W.Y.: Classification and regression trees. Wiley Interdisc. Rev. Data Min. Knowl. Discov. 1(1), 14–23 (2011)CrossRef
19.
Zurück zum Zitat Meng, Z., Cao, X., Cong, G.: Selectivity estimation for queries containing predicates over set-valued attributes. Proc. ACM Manag. Data 1(4), 1–26 (2023)CrossRef Meng, Z., Cao, X., Cong, G.: Selectivity estimation for queries containing predicates over set-valued attributes. Proc. ACM Manag. Data 1(4), 1–26 (2023)CrossRef
20.
Zurück zum Zitat Patel, J.M., DeWitt, D.J.: Partition based spatial-merge join. SIGMOD Rec. 25(2), 259–270 (1996)CrossRef Patel, J.M., DeWitt, D.J.: Partition based spatial-merge join. SIGMOD Rec. 25(2), 259–270 (1996)CrossRef
21.
Zurück zum Zitat Pavlovic, M., Sidlauskas, D., Heinis, T., Ailamaki, A.: QUASII: query-aware spatial incremental index. In: 21st International Conference on Extending Database Technology (EDBT) (2018) Pavlovic, M., Sidlauskas, D., Heinis, T., Ailamaki, A.: QUASII: query-aware spatial incremental index. In: 21st International Conference on Extending Database Technology (EDBT) (2018)
22.
Zurück zum Zitat Richly, K., Schlosser, R., Boissier, M.: Joint index, sorting, and compression optimization for memory-efficient spatio-temporal data management. In: ICDE, pp. 1901–1906. IEEE (2021) Richly, K., Schlosser, R., Boissier, M.: Joint index, sorting, and compression optimization for memory-efficient spatio-temporal data management. In: ICDE, pp. 1901–1906. IEEE (2021)
23.
Zurück zum Zitat Shi, J., Cong, G., Li, X.L.: Learned index benefits: machine learning based index performance estimation. Proc. VLDB Endowment 15(13), 3950–3962 (2022)CrossRef Shi, J., Cong, G., Li, X.L.: Learned index benefits: machine learning based index performance estimation. Proc. VLDB Endowment 15(13), 3950–3962 (2022)CrossRef
24.
Zurück zum Zitat Stillger, M., Lohman, G.M., Markl, V., Kandil, M.: LEO-DB2’s learning optimizer. In: VLDB. vol. 1, pp. 19–28 (2001) Stillger, M., Lohman, G.M., Markl, V., Kandil, M.: LEO-DB2’s learning optimizer. In: VLDB. vol. 1, pp. 19–28 (2001)
26.
Zurück zum Zitat Wu, Y.c., Feng, J.w.: Development and application of artificial neural network. Wireless Pers. Commun. 102, 1645–1656 (2018) Wu, Y.c., Feng, J.w.: Development and application of artificial neural network. Wireless Pers. Commun. 102, 1645–1656 (2018)
27.
Zurück zum Zitat Yousfi, H., Mesmoudi, A., Hadjali, A., Matallah, H., Benkabou, S.E.: SRDF_QDAG: an efficient end-to-end RDF data management when graph exploration meets spatial processing. Comput. Sci. Inf. Syst. 20, 46–46 (2023) Yousfi, H., Mesmoudi, A., Hadjali, A., Matallah, H., Benkabou, S.E.: SRDF_QDAG: an efficient end-to-end RDF data management when graph exploration meets spatial processing. Comput. Sci. Inf. Syst. 20, 46–46 (2023)
28.
Zurück zum Zitat Zacharatou, E.T., Šidlauskas, D., Tauheed, F., Heinis, T., Ailamaki, A.: Efficient bundled spatial range queries. In: Proceedings of the 27th ACM SIGSPATIAL International Conference on Advances in Geographic Information Systems, pp. 139–148 (2019) Zacharatou, E.T., Šidlauskas, D., Tauheed, F., Heinis, T., Ailamaki, A.: Efficient bundled spatial range queries. In: Proceedings of the 27th ACM SIGSPATIAL International Conference on Advances in Geographic Information Systems, pp. 139–148 (2019)
Metadaten
Titel
Selectivity Estimation for Spatial Filters Using Optimizer Feedback: A Machine Learning Perspective
verfasst von
Nadir Guermoudi
Houcine Matallah
Amin Mesmoudi
Seif-Eddine Benkabou
Allel Hadjali
Copyright-Jahr
2025
Verlag
Springer Nature Singapore
DOI
https://doi.org/10.1007/978-981-96-0573-6_8