Published in: The Journal of Supercomputing 8/2016

01-08-2016

Exploring large-scale small file storage for search engines

Authors: Weizhe Zhang, Gangzhao Lu, Hui He, Qizhen Zhang, Chuanliang Yu


Abstract

Storing original web pages as large numbers of small files degrades the performance of search engines. In this paper, we first analyze the disadvantages of the existing EXT3 file system when accessing small files. We then measure the compression rate and speed of several compression algorithms in order to choose a suitable one for storage. We also design an original-page-oriented file organization structure and a read–write query tree for storing large-scale small files that never need to be modified. When search engines use these techniques to store original-page small files, access response time and wasted disk space decrease remarkably.
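The abstract does not give the details of the file organization structure or the query tree, so the following is only a minimal sketch of the general idea it points to: packing many compressed, write-once pages into one container file so that each page no longer occupies its own file system entry. Everything here is an assumption made for illustration, not the authors' design: the `PageStore` class, its `put` and `get` methods, the use of `zlib` as the compression algorithm, and a plain in-memory dictionary standing in for the paper's read–write query tree.

```python
import os
import tempfile
import zlib


class PageStore:
    """Hypothetical append-only container packing many small pages into one file."""

    def __init__(self, path):
        self.path = path
        # Assumption: a dict stands in for the paper's query tree.
        # It maps a page key to (offset, compressed length) in the container.
        self.index = {}
        # Create the container file if it does not exist yet.
        open(self.path, "ab").close()

    def put(self, key, data):
        """Compress a page and append it; pages are never modified afterwards."""
        blob = zlib.compress(data)
        with open(self.path, "ab") as f:
            f.seek(0, os.SEEK_END)
            offset = f.tell()
            f.write(blob)
        self.index[key] = (offset, len(blob))

    def get(self, key):
        """Read one page back with a single seek and a single read."""
        offset, length = self.index[key]
        with open(self.path, "rb") as f:
            f.seek(offset)
            return zlib.decompress(f.read(length))


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        store = PageStore(os.path.join(tmp, "pages.bin"))
        store.put("page-1", b"<html>first page</html>")
        store.put("page-2", b"<html>second page</html>")
        print(store.get("page-2"))
```

Because the pages are write-once, an append-only layout is enough in this sketch: no page is ever rewritten in place, and the index only has to record where each compressed page begins and how long it is.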


Metadata
Title: Exploring large-scale small file storage for search engines
Authors: Weizhe Zhang, Gangzhao Lu, Hui He, Qizhen Zhang, Chuanliang Yu
Publication date: 01-08-2016
Publisher: Springer US
Published in: The Journal of Supercomputing, Issue 8/2016
Print ISSN: 0920-8542
Electronic ISSN: 1573-0484
DOI: https://doi.org/10.1007/s11227-015-1394-z
