Skip to main content
Erschienen in: World Wide Web 5/2017

21.12.2016

Tide-tree: A self-tuning indexing scheme for hybrid storage system

verfasst von: Sheng Wang, Xiaolin Qin, Zhifeng Bao, Bohan Li

Erschienen in: World Wide Web | Ausgabe 5/2017

Einloggen

Aktivieren Sie unsere intelligente Suche, um passende Fachinhalte oder Patente zu finden.

search-config
loading …

Abstract

Main memory index is built with the assumption that the RAM is sufficiently large to hold data. Due to the volatility and high unit price of main memory, indices under secondary memory such as SSD and HDD are widely used. However, the I/O operation with main memory is still the bottleneck for query efficiency. In this paper, we propose a self-tuning indexing scheme called Tide-tree for RAM/Disk-based hybrid storage system. Tide-tree aims to overcome the obstacles main memory and disk-based indices face, and performs like the tide to achieve a double-win in space and performance, which is self-adaptive with respect to the running environment. Particularly, Tide-tree delaminates the tree structure adaptively with high efficiency based on storage sense, and applies an effective self-tuning algorithm to dynamically load various nodes into main memory. We employ memory mapping technology to solve the persistent problem of main memory index, and improves the efficiency of data synchronism and pointer translation. To further enhance the independence of Tide-tree, we employ the index head and the level address table to manage the whole index. With the index head, three efficient operations are proposed, namely index rebuild, index load and range search. We have conducted extensive experiments to compare the Tide-tree with several state-of-the-art indices, and the results have validated the high efficiency, reusability and stability of Tide-tree.

Sie haben noch keine Lizenz? Dann Informieren Sie sich jetzt über unsere Produkte:

Springer Professional "Wirtschaft"

Online-Abonnement

Mit Springer Professional "Wirtschaft" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 340 Zeitschriften

aus folgenden Fachgebieten:

  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Versicherung + Risiko




Jetzt Wissensvorsprung sichern!

Springer Professional "Wirtschaft+Technik"

Online-Abonnement

Mit Springer Professional "Wirtschaft+Technik" erhalten Sie Zugriff auf:

  • über 102.000 Bücher
  • über 537 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Maschinenbau + Werkstoffe
  • Versicherung + Risiko

Jetzt Wissensvorsprung sichern!

Springer Professional "Technik"

Online-Abonnement

Mit Springer Professional "Technik" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 390 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Maschinenbau + Werkstoffe




 

Jetzt Wissensvorsprung sichern!

Literatur
1.
Zurück zum Zitat Agrawal, D., Ganesan, D., Sitaraman, R., Diao, Y., Singh, S.: Lazy-adaptive tree: An optimized index structure for flash devices. Proceedings of the VLDB Endowment 2(1), 361–372 (2009)CrossRef Agrawal, D., Ganesan, D., Sitaraman, R., Diao, Y., Singh, S.: Lazy-adaptive tree: An optimized index structure for flash devices. Proceedings of the VLDB Endowment 2(1), 361–372 (2009)CrossRef
2.
Zurück zum Zitat Athanassoulis, M., Ailamaki, A.: Bf-tree: approximate tree indexing. Proceedings of the VLDB Endowment 7(14), 1881–1892 (2014)CrossRef Athanassoulis, M., Ailamaki, A.: Bf-tree: approximate tree indexing. Proceedings of the VLDB Endowment 7(14), 1881–1892 (2014)CrossRef
3.
Zurück zum Zitat Boehm, M., Schlegel, B., Volk, P.B., Fischer, U., Habich, D., Lehner, W.: Efficient in-memory indexing with generalized prefix trees. In: BTW, vol. 180, pp 227–246 (2011) Boehm, M., Schlegel, B., Volk, P.B., Fischer, U., Habich, D., Lehner, W.: Efficient in-memory indexing with generalized prefix trees. In: BTW, vol. 180, pp 227–246 (2011)
4.
Zurück zum Zitat Chaudhuri, S., Weikum, G.: Rethinking database system architecture: Towards a self-tuning risc-style database system. In: VLDB, pp 1–10. Citeseer (2000) Chaudhuri, S., Weikum, G.: Rethinking database system architecture: Towards a self-tuning risc-style database system. In: VLDB, pp 1–10. Citeseer (2000)
6.
Zurück zum Zitat Das, S., Nishimura, S., Agrawal, D., El Abbadi, A.: Albatross: lightweight elasticity in shared storage databases for the cloud using live data migration. Proceedings of the VLDB Endowment 4(8), 494–505 (2011)CrossRef Das, S., Nishimura, S., Agrawal, D., El Abbadi, A.: Albatross: lightweight elasticity in shared storage databases for the cloud using live data migration. Proceedings of the VLDB Endowment 4(8), 494–505 (2011)CrossRef
7.
Zurück zum Zitat Dewitt, S.J.W.D.J.: A performance study of alternative object faulting and pointer swizzling strategies. In: Proceedings 18th Int. Conf. Very Large Data Bases, Vancouver, BC, Canada (1992) Dewitt, S.J.W.D.J.: A performance study of alternative object faulting and pointer swizzling strategies. In: Proceedings 18th Int. Conf. Very Large Data Bases, Vancouver, BC, Canada (1992)
8.
Zurück zum Zitat Diaconu, C., Freedman, C., Ismert, E., Larson, P.A., Mittal, P., Stonecipher, R., Verma, N., Zwilling, M.: Hekaton: Sql server’s memory-optimized oltp engine. In: Proceedings of the 2013 ACM SIGMOD International Conference on Management of Data, pp 1243–1254. ACM (2013) Diaconu, C., Freedman, C., Ismert, E., Larson, P.A., Mittal, P., Stonecipher, R., Verma, N., Zwilling, M.: Hekaton: Sql server’s memory-optimized oltp engine. In: Proceedings of the 2013 ACM SIGMOD International Conference on Management of Data, pp 1243–1254. ACM (2013)
9.
Zurück zum Zitat Fu, Z., Ren, K., Shu, J., Sun, X., Huang, F.: Enabling personalized search over encrypted outsourced data with efficiency improvement. IEEE Trans. Parallel Distrib. Syst. 27(9), 2546–2559 (2016)CrossRef Fu, Z., Ren, K., Shu, J., Sun, X., Huang, F.: Enabling personalized search over encrypted outsourced data with efficiency improvement. IEEE Trans. Parallel Distrib. Syst. 27(9), 2546–2559 (2016)CrossRef
10.
Zurück zum Zitat Fu, Z., Wu, X., Guan, C., Sun, X., Ren, K.: Towards efficient multi-keyword fuzzy search over encrypted outsourced data with accuracy improvement. IEEE Transactions on Information Forensics and Security, doi:10.1109/TIFS.2016.2596138 (2016) Fu, Z., Wu, X., Guan, C., Sun, X., Ren, K.: Towards efficient multi-keyword fuzzy search over encrypted outsourced data with accuracy improvement. IEEE Transactions on Information Forensics and Security, doi:10.​1109/​TIFS.​2016.​2596138 (2016)
11.
Zurück zum Zitat Garcia-Molina, H., Ullman, J.D., Widom, J.: Database system implementation, vol. 654. Prentice Hall Upper Saddle River, NJ (2000) Garcia-Molina, H., Ullman, J.D., Widom, J.: Database system implementation, vol. 654. Prentice Hall Upper Saddle River, NJ (2000)
12.
Zurück zum Zitat Graefe, G.: Modern b-tree techniques. Foundations and Trends in Databases 3(4), 203–402 (2011)CrossRef Graefe, G.: Modern b-tree techniques. Foundations and Trends in Databases 3(4), 203–402 (2011)CrossRef
13.
Zurück zum Zitat Graefe, G., Kuno, H.: Self-selecting, self-tuning, incrementally optimized indexes. In: Proceedings of the 13th International Conference on Extending Database Technology, pp 371–381. ACM (2010) Graefe, G., Kuno, H.: Self-selecting, self-tuning, incrementally optimized indexes. In: Proceedings of the 13th International Conference on Extending Database Technology, pp 371–381. ACM (2010)
14.
Zurück zum Zitat Graefe, G., Volos, H., Kimura, H., Kuno, H., Tucek, J., Lillibridge, M., Veitch, A.: In-memory performance for big data. Proceedings of the VLDB Endowment 8(1), 37–48 (2014)CrossRef Graefe, G., Volos, H., Kimura, H., Kuno, H., Tucek, J., Lillibridge, M., Veitch, A.: In-memory performance for big data. Proceedings of the VLDB Endowment 8(1), 37–48 (2014)CrossRef
15.
Zurück zum Zitat Halim, F., Idreos, S., Karras, P., Yap, R.H.: Stochastic database cracking: Towards robust adaptive indexing in main-memory column-stores. Proceedings of the VLDB Endowment 5(6), 502–513 (2012)CrossRef Halim, F., Idreos, S., Karras, P., Yap, R.H.: Stochastic database cracking: Towards robust adaptive indexing in main-memory column-stores. Proceedings of the VLDB Endowment 5(6), 502–513 (2012)CrossRef
16.
Zurück zum Zitat Idreos, S., Kersten, M.L., Manegold, S., et al.: Database cracking. In: CIDR, vol. 3, pp 1–8 (2007) Idreos, S., Kersten, M.L., Manegold, S., et al.: Database cracking. In: CIDR, vol. 3, pp 1–8 (2007)
17.
Zurück zum Zitat Jin, P., Yang, P., Yue, L.: Optimizing b+-tree for hybrid storage systems. Distributed and Parallel Databases 33(3), 449–475 (2015)CrossRef Jin, P., Yang, P., Yue, L.: Optimizing b+-tree for hybrid storage systems. Distributed and Parallel Databases 33(3), 449–475 (2015)CrossRef
18.
Zurück zum Zitat Jørgensen, M.V., Rasmussen, R.B., Šaltenis, S., Schjønning, C.: Fb-tree: a b+-tree for flash-based ssds. In: Proceedings of the 15th Symposium on International Database Engineering & Applications, pp 34–42. ACM (2011) Jørgensen, M.V., Rasmussen, R.B., Šaltenis, S., Schjønning, C.: Fb-tree: a b+-tree for flash-based ssds. In: Proceedings of the 15th Symposium on International Database Engineering & Applications, pp 34–42. ACM (2011)
19.
Zurück zum Zitat Kissinger, T., Schlegel, B., Boehm, M., Habich, D., Lehner, W.: A high-throughput in-memory index, durable on flash-based ssd: insights into the winning solution of the sigmod programming contest 2011. ACM SIGMOD Record 41(3), 44–50 (2012)CrossRef Kissinger, T., Schlegel, B., Boehm, M., Habich, D., Lehner, W.: A high-throughput in-memory index, durable on flash-based ssd: insights into the winning solution of the sigmod programming contest 2011. ACM SIGMOD Record 41(3), 44–50 (2012)CrossRef
20.
Zurück zum Zitat Lahiri, T., Neimat, M.A., Folkman, S.: Oracle timesten: An in-memory database for enterprise applications. IEEE Data Eng. Bull. 36(2), 6–13 (2013) Lahiri, T., Neimat, M.A., Folkman, S.: Oracle timesten: An in-memory database for enterprise applications. IEEE Data Eng. Bull. 36(2), 6–13 (2013)
21.
Zurück zum Zitat Lee, H.S., Lee, D.H.: An efficient index buffer management scheme for implementing a b-tree on nand flash memory. Data Knowl. Eng. 69(9), 901–916 (2010)CrossRef Lee, H.S., Lee, D.H.: An efficient index buffer management scheme for implementing a b-tree on nand flash memory. Data Knowl. Eng. 69(9), 901–916 (2010)CrossRef
22.
Zurück zum Zitat Lehman, T.J., Carey, M.J.: A study of index structures for main memory database management systems. In: Proceedings VLDB (1986) Lehman, T.J., Carey, M.J.: A study of index structures for main memory database management systems. In: Proceedings VLDB (1986)
23.
Zurück zum Zitat Leis, V., Kemper, A., Neumann, T.: The adaptive radix tree: Artful indexing for main-memory databases. In: 2013 IEEE 29th International Conference on Data Engineering (ICDE), pp 38–49. IEEE (2013) Leis, V., Kemper, A., Neumann, T.: The adaptive radix tree: Artful indexing for main-memory databases. In: 2013 IEEE 29th International Conference on Data Engineering (ICDE), pp 38–49. IEEE (2013)
24.
Zurück zum Zitat Li, Y., He, B., Luo, Q., Yi, K.: Tree indexing on flash disks. In: 2009 IEEE 25th International Conference on Data Engineering, pp 1303–1306. IEEE (2009) Li, Y., He, B., Luo, Q., Yi, K.: Tree indexing on flash disks. In: 2009 IEEE 25th International Conference on Data Engineering, pp 1303–1306. IEEE (2009)
25.
Zurück zum Zitat Li, Y., He, B., Yang, R.J., Luo, Q., Yi, K.: Tree indexing on solid state drives. Proceedings of the VLDB Endowment 3(1-2), 1195–1206 (2010)CrossRef Li, Y., He, B., Yang, R.J., Luo, Q., Yi, K.: Tree indexing on solid state drives. Proceedings of the VLDB Endowment 3(1-2), 1195–1206 (2010)CrossRef
26.
Zurück zum Zitat Lin, Z., Kahng, M., Sabrin, K.M., Chau, D.H.P., Lee, H., Kang, U.: Mmap: Fast billion-scale graph computation on a pc via memory mapping. In: 2014 IEEE International Conference on Big Data (Big Data), pp 159–164. IEEE (2014) Lin, Z., Kahng, M., Sabrin, K.M., Chau, D.H.P., Lee, H., Kang, U.: Mmap: Fast billion-scale graph computation on a pc via memory mapping. In: 2014 IEEE International Conference on Big Data (Big Data), pp 159–164. IEEE (2014)
27.
Zurück zum Zitat Long, X., Suel, T.: Three-level caching for efficient query processing in large web search engines. World Wide Web 9(4), 369–395 (2006)CrossRef Long, X., Suel, T.: Three-level caching for efficient query processing in large web search engines. World Wide Web 9(4), 369–395 (2006)CrossRef
28.
Zurück zum Zitat Mullin, J.K.: A second look at bloom filters. Commun. ACM 26(8), 570–571 (1983)CrossRef Mullin, J.K.: A second look at bloom filters. Commun. ACM 26(8), 570–571 (1983)CrossRef
29.
Zurück zum Zitat Nath, S., Kansal, A.: Flashdb: dynamic self-tuning database for nand flash. In: Proceedings of the 6th international conference on Information processing in sensor networks, pp 410–419. ACM (2007) Nath, S., Kansal, A.: Flashdb: dynamic self-tuning database for nand flash. In: Proceedings of the 6th international conference on Information processing in sensor networks, pp 410–419. ACM (2007)
30.
Zurück zum Zitat Peng, P., Zou, L., Chen, L., Lin, X., Zhao, D.: Answering subgraph queries over massive disk resident graphs. World Wide Web 19(3), 417–448 (2016)CrossRef Peng, P., Zou, L., Chen, L., Lin, X., Zhao, D.: Answering subgraph queries over massive disk resident graphs. World Wide Web 19(3), 417–448 (2016)CrossRef
31.
Zurück zum Zitat Rao, J., Ross, K.A.: Making b+-trees cache conscious in main memory. In: ACM SIGMOD Record, vol. 29, pp 475–486. ACM (2000) Rao, J., Ross, K.A.: Making b+-trees cache conscious in main memory. In: ACM SIGMOD Record, vol. 29, pp 475–486. ACM (2000)
32.
Zurück zum Zitat Shvachko, K., Kuang, H., Radia, S., Chansler, R.: The hadoop distributed file system. In: 2010 IEEE 26th symposium on mass storage systems and technologies (MSST), pp 1–10. IEEE (2010) Shvachko, K., Kuang, H., Radia, S., Chansler, R.: The hadoop distributed file system. In: 2010 IEEE 26th symposium on mass storage systems and technologies (MSST), pp 1–10. IEEE (2010)
34.
Zurück zum Zitat WANG, S., QIN, X., SHEN, Y., LI, B., SHI, W.: Research on durable csb+-tree indexing technology. Journal of Frontiers of Computer Science and Technology 2, 005 (2015) WANG, S., QIN, X., SHEN, Y., LI, B., SHI, W.: Research on durable csb+-tree indexing technology. Journal of Frontiers of Computer Science and Technology 2, 005 (2015)
35.
Zurück zum Zitat Wu, C.H., Kuo, T.W., Chang, L.P.: An efficient b-tree layer implementation for flash-memory storage systems. ACM Trans. Embed. Comput. Syst. (TECS) 6(3), 19 (2007)CrossRef Wu, C.H., Kuo, T.W., Chang, L.P.: An efficient b-tree layer implementation for flash-memory storage systems. ACM Trans. Embed. Comput. Syst. (TECS) 6(3), 19 (2007)CrossRef
36.
Zurück zum Zitat Xia, Z., Wang, X., Sun, X., Wang, Q.: A secure and dynamic multi-keyword ranked search scheme over encrypted cloud data. IEEE Trans. Parallel Distrib. Syst. 27(2), 340–352 (2016)CrossRef Xia, Z., Wang, X., Sun, X., Wang, Q.: A secure and dynamic multi-keyword ranked search scheme over encrypted cloud data. IEEE Trans. Parallel Distrib. Syst. 27(2), 340–352 (2016)CrossRef
37.
Zurück zum Zitat Yang, C., Jin, P., Yue, L., Yang, P.: Efficient buffer management for tree indexes on solid state drives. Int. J. Parallel Prog. 44(1), 5–25 (2016)CrossRef Yang, C., Jin, P., Yue, L., Yang, P.: Efficient buffer management for tree indexes on solid state drives. Int. J. Parallel Prog. 44(1), 5–25 (2016)CrossRef
38.
Zurück zum Zitat Zaharia, M., Chowdhury, M., Das, T., Dave, A., Ma, J., McCauley, M., Franklin, M.J., Shenker, S., Stoica, I.: Resilient distributed datasets: A fault-tolerant abstraction for in-memory cluster computing. In: Proceedings of the 9th USENIX conference on Networked Systems Design and Implementation, pp 2–2. USENIX Association (2012) Zaharia, M., Chowdhury, M., Das, T., Dave, A., Ma, J., McCauley, M., Franklin, M.J., Shenker, S., Stoica, I.: Resilient distributed datasets: A fault-tolerant abstraction for in-memory cluster computing. In: Proceedings of the 9th USENIX conference on Networked Systems Design and Implementation, pp 2–2. USENIX Association (2012)
39.
Zurück zum Zitat Zaharia, M., Chowdhury, M., Franklin, M.J., Shenker, S., Stoica, I.: Spark: cluster computing with working sets. HotCloud 10, 10–10 (2010) Zaharia, M., Chowdhury, M., Franklin, M.J., Shenker, S., Stoica, I.: Spark: cluster computing with working sets. HotCloud 10, 10–10 (2010)
Metadaten
Titel
Tide-tree: A self-tuning indexing scheme for hybrid storage system
verfasst von
Sheng Wang
Xiaolin Qin
Zhifeng Bao
Bohan Li
Publikationsdatum
21.12.2016
Verlag
Springer US
Erschienen in
World Wide Web / Ausgabe 5/2017
Print ISSN: 1386-145X
Elektronische ISSN: 1573-1413
DOI
https://doi.org/10.1007/s11280-016-0426-9

Weitere Artikel der Ausgabe 5/2017

World Wide Web 5/2017 Zur Ausgabe

Premium Partner