Skip to main content

2020 | OriginalPaper | Buchkapitel

Function Interpolation for Learned Index Structures

verfasst von : Naufal Fikri Setiawan, Benjamin I. P. Rubinstein, Renata Borovica-Gajic

Erschienen in: Databases Theory and Applications

Verlag: Springer International Publishing

Aktivieren Sie unsere intelligente Suche um passende Fachinhalte oder Patente zu finden.

search-config
loading …

Abstract

Range indexes such as B-trees are widely recognised as effective data structures for enabling fast retrieval of records by the query key. While such classical indexes offer optimal worst-case guarantees, recent research suggests that average-case performance might be improved by alternative machine learning-based models such as deep neural networks. This paper explores an alternative approach by modelling the task as one of function approximation via interpolation between compressed subsets of keys. We explore the Chebyshev and Bernstein polynomial bases, and demonstrate substantial benefits over deep neural networks. In particular, our proposed function interpolation models exhibit memory footprint two orders of magnitude smaller compared to neural network models, and 30–40% accuracy improvement over neural networks trained with the same amount of time, while keeping query time generally on-par with neural network models.

Sie haben noch keine Lizenz? Dann Informieren Sie sich jetzt über unsere Produkte:

Springer Professional "Wirtschaft+Technik"

Online-Abonnement

Mit Springer Professional "Wirtschaft+Technik" erhalten Sie Zugriff auf:

  • über 102.000 Bücher
  • über 537 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Maschinenbau + Werkstoffe
  • Versicherung + Risiko

Jetzt Wissensvorsprung sichern!

Springer Professional "Technik"

Online-Abonnement

Mit Springer Professional "Technik" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 390 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Maschinenbau + Werkstoffe




 

Jetzt Wissensvorsprung sichern!

Springer Professional "Wirtschaft"

Online-Abonnement

Mit Springer Professional "Wirtschaft" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 340 Zeitschriften

aus folgenden Fachgebieten:

  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Versicherung + Risiko




Jetzt Wissensvorsprung sichern!

Fußnoten
1
It is sufficient but not necessary to prohibit insertions/deletions as done in [16].
 
Literatur
1.
Zurück zum Zitat Aldà, F., Rubinstein, B.I.P.: The Bernstein mechanism: function release under differential privacy. In: AAAI, pp. 1705–1711 (2017) Aldà, F., Rubinstein, B.I.P.: The Bernstein mechanism: function release under differential privacy. In: AAAI, pp. 1705–1711 (2017)
2.
Zurück zum Zitat Bayer, R., McCreight, E.: Organization and maintenance of large ordered indices. In: SIGFIDET, pp. 107–141 (1970) Bayer, R., McCreight, E.: Organization and maintenance of large ordered indices. In: SIGFIDET, pp. 107–141 (1970)
3.
Zurück zum Zitat Boyd, J.P., Ong, J.R.: Exponentially-convergent strategies for defeating the Runge phenomenon for the approximation of non-periodic functions, part I: single-interval schemes. Commun. Comput. Phys. 5(2–4), 484–497 (2009)MathSciNetMATH Boyd, J.P., Ong, J.R.: Exponentially-convergent strategies for defeating the Runge phenomenon for the approximation of non-periodic functions, part I: single-interval schemes. Commun. Comput. Phys. 5(2–4), 484–497 (2009)MathSciNetMATH
4.
Zurück zum Zitat Brisebarre, N., Joldeş, M.: Chebyshev interpolation polynomial-based tools for rigorous computing. In: Proceedings of the 2010 International Symposium on Symbolic and Algebraic Computation, pp. 147–154. ACM (2010) Brisebarre, N., Joldeş, M.: Chebyshev interpolation polynomial-based tools for rigorous computing. In: Proceedings of the 2010 International Symposium on Symbolic and Algebraic Computation, pp. 147–154. ACM (2010)
5.
Zurück zum Zitat Cheney, E.W.: Introduction to Approximation Theory. McGraw-Hill, New York (1966)MATH Cheney, E.W.: Introduction to Approximation Theory. McGraw-Hill, New York (1966)MATH
6.
Zurück zum Zitat Collobert, R., Kavukcuoglu, K., Farabet, C.: Torch7: a Matlab-like environment for machine learning. In: BigLearn NIPS Workshop (2011) Collobert, R., Kavukcuoglu, K., Farabet, C.: Torch7: a Matlab-like environment for machine learning. In: BigLearn NIPS Workshop (2011)
7.
Zurück zum Zitat Galakatos, A., Markovitch, M., Binnig, C., Fonseca, R., Kraska, T.: Fiting-tree: a data-aware index structure. In: SIGMOD, pp. 1189–1206 (2019) Galakatos, A., Markovitch, M., Binnig, C., Fonseca, R., Kraska, T.: Fiting-tree: a data-aware index structure. In: SIGMOD, pp. 1189–1206 (2019)
8.
Zurück zum Zitat Gammerman, A., Vovk, V., Vapnik, V.: Learning by transduction. In: UAI, pp. 148–155 (1998) Gammerman, A., Vovk, V., Vapnik, V.: Learning by transduction. In: UAI, pp. 148–155 (1998)
9.
Zurück zum Zitat Gil, A., Segura, J., Temme, N.M.: Numerical Methods for Special Functions. Society for Industrial and Applied Mathematics (2007) Gil, A., Segura, J., Temme, N.M.: Numerical Methods for Special Functions. Society for Industrial and Applied Mathematics (2007)
10.
Zurück zum Zitat Goldstein, J., Ramakrishnan, R., Shaft, U.: Compressing relations and indexes. In: ICDE (1998) Goldstein, J., Ramakrishnan, R., Shaft, U.: Compressing relations and indexes. In: ICDE (1998)
11.
Zurück zum Zitat Goodfellow, I., Bengio, Y., Courville, A.: Deep Learning. MIT Press, Cambridge (2016)MATH Goodfellow, I., Bengio, Y., Courville, A.: Deep Learning. MIT Press, Cambridge (2016)MATH
12.
Zurück zum Zitat Graefe, G., Larson, P.A.: B-tree indexes and CPU caches. In: ICDE, pp. 349–358 (2001) Graefe, G., Larson, P.A.: B-tree indexes and CPU caches. In: ICDE, pp. 349–358 (2001)
13.
Zurück zum Zitat Hadian, A., Heinis, T.: Interpolation-friendly B-trees: bridging the gap between algorithmic and learned indexes. In: EDBT, pp. 710–713 (2019) Hadian, A., Heinis, T.: Interpolation-friendly B-trees: bridging the gap between algorithmic and learned indexes. In: EDBT, pp. 710–713 (2019)
14.
Zurück zum Zitat Idreos, S., Kersten, M.L., Manegold, S.: Database cracking. In: CIDR, pp. 68–78 (2007) Idreos, S., Kersten, M.L., Manegold, S.: Database cracking. In: CIDR, pp. 68–78 (2007)
15.
Zurück zum Zitat Kim, C., et al.: Fast: fast architecture sensitive tree search on modern CPUs and GPUs. In: SIGMOD, pp. 339–350 (2010) Kim, C., et al.: Fast: fast architecture sensitive tree search on modern CPUs and GPUs. In: SIGMOD, pp. 339–350 (2010)
16.
Zurück zum Zitat Kraska, T., Beutel, A., Chi, E.H., Dean, J., Polyzotis, N.: The case for learned index structures. In: SIGMOD (2018) Kraska, T., Beutel, A., Chi, E.H., Dean, J., Polyzotis, N.: The case for learned index structures. In: SIGMOD (2018)
17.
Zurück zum Zitat Kubica, J.M., Moore, A., Connolly, A.J., Jedicke, R.: Spatial data structures for efficient trajectory-based queries. Technical report, CMU-RI-TR-04-61, Carnegie Mellon University (2004) Kubica, J.M., Moore, A., Connolly, A.J., Jedicke, R.: Spatial data structures for efficient trajectory-based queries. Technical report, CMU-RI-TR-04-61, Carnegie Mellon University (2004)
18.
Zurück zum Zitat Leis, V., Kemper, A., Neumann, T.: The adaptive radix tree: artful indexing for main-memory databases. In: ICDE, pp. 38–49 (2013) Leis, V., Kemper, A., Neumann, T.: The adaptive radix tree: artful indexing for main-memory databases. In: ICDE, pp. 38–49 (2013)
19.
Zurück zum Zitat Microsoft: Hardware and software requirements for installing SQL server Microsoft: Hardware and software requirements for installing SQL server
20.
Zurück zum Zitat Mitzenmacher, M.: A model for learned bloom filters, and optimizing by sandwiching. In: NIPS, pp. 462–471 (2018)CrossRef Mitzenmacher, M.: A model for learned bloom filters, and optimizing by sandwiching. In: NIPS, pp. 462–471 (2018)CrossRef
21.
Zurück zum Zitat Rao, J., Ross, K.A.: Making b+-trees cache conscious in main memory. In: SIGMOD, pp. 475–486 (2000)CrossRef Rao, J., Ross, K.A.: Making b+-trees cache conscious in main memory. In: SIGMOD, pp. 475–486 (2000)CrossRef
22.
Zurück zum Zitat Schonfelder, J.: Chebyshev expansions for the error and related functions. Math. Comput. 32(144), 1232–1240 (1978)MathSciNetCrossRef Schonfelder, J.: Chebyshev expansions for the error and related functions. Math. Comput. 32(144), 1232–1240 (1978)MathSciNetCrossRef
23.
Zurück zum Zitat Shalev-Shwartz, S., Ben-David, S.: Understanding Machine Learning: From Theory to Algorithms. Cambridge University Press, Cambridge (2014)CrossRef Shalev-Shwartz, S., Ben-David, S.: Understanding Machine Learning: From Theory to Algorithms. Cambridge University Press, Cambridge (2014)CrossRef
24.
Zurück zum Zitat Stonebraker, M.: The case for partial indexes. SIGMOD Rec. 18(4), 4–11 (1989)CrossRef Stonebraker, M.: The case for partial indexes. SIGMOD Rec. 18(4), 4–11 (1989)CrossRef
Metadaten
Titel
Function Interpolation for Learned Index Structures
verfasst von
Naufal Fikri Setiawan
Benjamin I. P. Rubinstein
Renata Borovica-Gajic
Copyright-Jahr
2020
DOI
https://doi.org/10.1007/978-3-030-39469-1_6