Skip to main content
main-content

Tipp

Weitere Kapitel dieses Buchs durch Wischen aufrufen

2020 | OriginalPaper | Buchkapitel

Pipelined Query Processing Using Non-volatile Memory SSDs

verfasst von: Xinyu Liu, Yu Pan, Wenxiu Fang, Rebecca J. Stones, Gang Wang, Yusen Li, Xiaoguang Liu

Erschienen in: Web and Big Data

Verlag: Springer International Publishing

share
TEILEN

Abstract

NVM Optane SSDs are faster than traditional flash-based SSDs and more economical than DRAM main memory, so we explore query processing with the inverted index on NVM aiming at reducing costs, but this leads to NVM-to-DRAM I/O which negatively affects the search engine’s responsiveness. To alleviate this problem, we propose a pipelining scheme to overlap CPU computation with NVM-to-DRAM I/O. We further propose some optimizations: variable coalesced block size, data prefetching, and block skipping.
The experiments on the Gov2 and ClueWeb document corpuses indicate a reduction in CPU waiting time caused by NVM-to-DRAM I/O by around 85% for Maxscore, Wand, and BlockMaxWand queries vs. not using pipelining, while maintaining comparable query throughput (loss within 6%) vs. an in-memory inverted index (DRAM-based scheme). For RankAnd queries, we occupy 3% of the inverted index in memory for caching to achieve similar query efficiency (within 6%) vs. the DRAM-based scheme.

Sie möchten Zugang zu diesem Inhalt erhalten? Dann informieren Sie sich jetzt über unsere Produkte:

Springer Professional "Wirtschaft+Technik"

Online-Abonnement

Mit Springer Professional "Wirtschaft+Technik" erhalten Sie Zugriff auf:

  • über 69.000 Bücher
  • über 500 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Maschinenbau + Werkstoffe
  • Versicherung + Risiko

Testen Sie jetzt 15 Tage kostenlos.

Springer Professional "Technik"

Online-Abonnement

Mit Springer Professional "Technik" erhalten Sie Zugriff auf:

  • über 50.000 Bücher
  • über 380 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Maschinenbau + Werkstoffe




Testen Sie jetzt 15 Tage kostenlos.

Springer Professional "Wirtschaft"

Online-Abonnement

Mit Springer Professional "Wirtschaft" erhalten Sie Zugriff auf:

  • über 58.000 Bücher
  • über 300 Zeitschriften

aus folgenden Fachgebieten:

  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Versicherung + Risiko




Testen Sie jetzt 15 Tage kostenlos.

Literatur
1.
Zurück zum Zitat Anh, V.N., Moffat, A.: Index compression using 64-bit words. Softw. Pract. Exp. 40(2), 131–147 (2010) Anh, V.N., Moffat, A.: Index compression using 64-bit words. Softw. Pract. Exp. 40(2), 131–147 (2010)
2.
Zurück zum Zitat Ao, N., et al.: Efficient parallel lists intersection and index compression algorithms using graphics processing units. PVLDB 4(8), 470–481 (2011) Ao, N., et al.: Efficient parallel lists intersection and index compression algorithms using graphics processing units. PVLDB 4(8), 470–481 (2011)
3.
Zurück zum Zitat Broder, A.Z., Carmel, D., Herscovici, M., Soffer, A., Zien, J.Y.: Efficient query evaluation using a two-level retrieval process. In: Proceedings of CIKM. pp. 426–434 (2003) Broder, A.Z., Carmel, D., Herscovici, M., Soffer, A., Zien, J.Y.: Efficient query evaluation using a two-level retrieval process. In: Proceedings of CIKM. pp. 426–434 (2003)
4.
Zurück zum Zitat Büttcher, S., Clarke, C.L.A., Cormack, G.V.: Information Retrieval-Implementing and Evaluating Search Engines. MIT Press, Cambridge (2010) MATH Büttcher, S., Clarke, C.L.A., Cormack, G.V.: Information Retrieval-Implementing and Evaluating Search Engines. MIT Press, Cambridge (2010) MATH
5.
Zurück zum Zitat Cambazoglu, B.B., Baeza-Yates, R.A.: Scalability challenges in web search engines. Synth. Lect. Inf. Concepts Retr. Serv. 7, 1–138 (2015) Cambazoglu, B.B., Baeza-Yates, R.A.: Scalability challenges in web search engines. Synth. Lect. Inf. Concepts Retr. Serv. 7, 1–138 (2015)
6.
Zurück zum Zitat Cambazoglu, B.B., Kayaaslan, E., Jonassen, S., Aykanat, C.: A term-based inverted index partitioning model for efficient distributed query processing. TWEB 7(3), 15:1–15:23 (2013) Cambazoglu, B.B., Kayaaslan, E., Jonassen, S., Aykanat, C.: A term-based inverted index partitioning model for efficient distributed query processing. TWEB 7(3), 15:1–15:23 (2013)
7.
Zurück zum Zitat Cutting, D.R., Pedersen, J.O.: Optimizations for dynamic inverted index maintenance. In: Proceedings of SIGIR, pp. 405–411 (1990) Cutting, D.R., Pedersen, J.O.: Optimizations for dynamic inverted index maintenance. In: Proceedings of SIGIR, pp. 405–411 (1990)
8.
Zurück zum Zitat Ding, S., Suel, T.: Faster top-k document retrieval using block-max indexes. In: Proceedings of SIGIR, pp. 993–1002 (2011) Ding, S., Suel, T.: Faster top-k document retrieval using block-max indexes. In: Proceedings of SIGIR, pp. 993–1002 (2011)
9.
Zurück zum Zitat Eisenman, A., et al.: Reducing DRAM footprint with NVM in Facebook. In: Proceedings of EuroSys, pp. 42:1–42:13 (2018) Eisenman, A., et al.: Reducing DRAM footprint with NVM in Facebook. In: Proceedings of EuroSys, pp. 42:1–42:13 (2018)
11.
Zurück zum Zitat Lemire, D., Boytsov, L.: Decoding billions of integers per second through vectorization. Softw. Pract. Exp. 45(1), 1–29 (2015) CrossRef Lemire, D., Boytsov, L.: Decoding billions of integers per second through vectorization. Softw. Pract. Exp. 45(1), 1–29 (2015) CrossRef
12.
Zurück zum Zitat Liu, X., Peng, Z.: An efficient random access inverted index for information retrieval. In: Proceedings of WWW, pp. 1153–1154 (2010) Liu, X., Peng, Z.: An efficient random access inverted index for information retrieval. In: Proceedings of WWW, pp. 1153–1154 (2010)
13.
Zurück zum Zitat Moffat, A., Stuiver, L.: Binary interpolative coding for effective index compression. Inf. Retr. 3(1), 25–47 (2000) Moffat, A., Stuiver, L.: Binary interpolative coding for effective index compression. Inf. Retr. 3(1), 25–47 (2000)
14.
Zurück zum Zitat Ottaviano, G., Tonellotto, N., Venturini, R.: Optimal space-time tradeoffs for inverted indexes. In: Proceedings of WSDM, pp. 47–56 (2015) Ottaviano, G., Tonellotto, N., Venturini, R.: Optimal space-time tradeoffs for inverted indexes. In: Proceedings of WSDM, pp. 47–56 (2015)
15.
Zurück zum Zitat Risvik, K.M., Chilimbi, T.M., Tan, H., Kalyanaraman, K., Anderson, C.: Maguro, a system for indexing and searching over very large text collections. In: Proceedings of WSDM, pp. 727–736 (2013) Risvik, K.M., Chilimbi, T.M., Tan, H., Kalyanaraman, K., Anderson, C.: Maguro, a system for indexing and searching over very large text collections. In: Proceedings of WSDM, pp. 727–736 (2013)
16.
Zurück zum Zitat Robertson, S.E., Jones, K.S.: Relevance weighting of search terms. JASIS 27(3), 129–146 (1976) Robertson, S.E., Jones, K.S.: Relevance weighting of search terms. JASIS 27(3), 129–146 (1976)
17.
Zurück zum Zitat Stepanov, A.A., Gangolli, A.R., Rose, D.E., Ernst, R.J., Oberoi, P.S.: SIMD-based decoding of posting lists. In: Proceedings of CIKM, pp. 317–326 (2011) Stepanov, A.A., Gangolli, A.R., Rose, D.E., Ernst, R.J., Oberoi, P.S.: SIMD-based decoding of posting lists. In: Proceedings of CIKM, pp. 317–326 (2011)
18.
Zurück zum Zitat Turtle, H.R., Flood, J.: Query evaluation: strategies and optimizations. Inf. Process. Manag. 31(6), 831–850 (1995) Turtle, H.R., Flood, J.: Query evaluation: strategies and optimizations. Inf. Process. Manag. 31(6), 831–850 (1995)
19.
Zurück zum Zitat Wang, J., Lo, E., Yiu, M.L., Tong, J., Wang, G., Liu, X.: The impact of solid state drive on search engine cache management. In: Proceedings of SIGIR, pp. 693–702 (2013) Wang, J., Lo, E., Yiu, M.L., Tong, J., Wang, G., Liu, X.: The impact of solid state drive on search engine cache management. In: Proceedings of SIGIR, pp. 693–702 (2013)
20.
Zurück zum Zitat Xia, F., Jiang, D., Xiong, J., Sun, N.: HIKV: a hybrid index key-value store for DRAM-NVM memory systems. In: Proceedings of USENIX, pp. 349–362 (2017) Xia, F., Jiang, D., Xiong, J., Sun, N.: HIKV: a hybrid index key-value store for DRAM-NVM memory systems. In: Proceedings of USENIX, pp. 349–362 (2017)
21.
Zurück zum Zitat Xu, J., Swanson, S.: NOVA: a log-structured file system for hybrid volatile/non-volatile main memories. In: Proceedings of FAST, pp. 323–338 (2016) Xu, J., Swanson, S.: NOVA: a log-structured file system for hybrid volatile/non-volatile main memories. In: Proceedings of FAST, pp. 323–338 (2016)
22.
Zurück zum Zitat Yan, H., Ding, S., Suel, T.: Inverted index compression and query processing with optimized document ordering. In: Proceedings of WWW, pp. 401–410 (2009) Yan, H., Ding, S., Suel, T.: Inverted index compression and query processing with optimized document ordering. In: Proceedings of WWW, pp. 401–410 (2009)
23.
Zurück zum Zitat Zhang, J., Long, X., Suel, T.: Performance of compressed inverted list caching in search engines. In: Proceedings of WWW, pp. 387–396 (2008) Zhang, J., Long, X., Suel, T.: Performance of compressed inverted list caching in search engines. In: Proceedings of WWW, pp. 387–396 (2008)
24.
Zurück zum Zitat Zhang, R., Sun, P., Tong, J., Stones, R.J., Wang, G., Liu, X.: Compact snippet caching for flash-based search engines. In: Proceedings of SIGIR, pp. 1015–1018 (2015) Zhang, R., Sun, P., Tong, J., Stones, R.J., Wang, G., Liu, X.: Compact snippet caching for flash-based search engines. In: Proceedings of SIGIR, pp. 1015–1018 (2015)
25.
Zurück zum Zitat Zukowski, M., Héman, S., Nes, N., Boncz, P.A.: Super-scalar RAM-CPU cache compression. In: Proceedings of ICDE (2006) Zukowski, M., Héman, S., Nes, N., Boncz, P.A.: Super-scalar RAM-CPU cache compression. In: Proceedings of ICDE (2006)
Metadaten
Titel
Pipelined Query Processing Using Non-volatile Memory SSDs
verfasst von
Xinyu Liu
Yu Pan
Wenxiu Fang
Rebecca J. Stones
Gang Wang
Yusen Li
Xiaoguang Liu
Copyright-Jahr
2020
DOI
https://doi.org/10.1007/978-3-030-60290-1_35