ABSTRACT
Our previous research on one-probe access to large collections of data indexed by alphanumeric keys has produced the first practical minimal perfect hash functions for this problem. Here, a new algorithm is described for quickly finding minimal perfect hash functions whose specification space is very close to the theoretical lower bound, i.e., around 2 bits per key. The various stages of processing are detailed, along with analytical and empirical results, including timing for a set of over 3.8 million keys that was processed on a NeXTstation in about 6 hours.
- 1.Richard Barnhart. The Advanced Naval Message Analyzer. Videotape production, VPI&SU, November 1990. Richard Barnhart as host, producer, script-writer; Edward Fox as executive producer, project director. Discusses the CODER system.Google Scholar
- 2.N. Cercone, M. Krause, and J. Boates. Minimal and almost minimal perfect hash function search with application to natural language lexicon design. Computers and Mathematics with Applications, 9:215-231, 1983.Google ScholarCross Ref
- 3.C.C. Chang. Letter oriented reciprocal hashing scheme. Information Sciences, 38:243-255, 1986. Google ScholarDigital Library
- 4.Qi Fan Chen. An object-oriented database system for efficient information retrieval applications. PhD thesis, Virginia Tech Dept. of Computer Science, March 1992.Google Scholar
- 5.R. J. Cichelli. Minimal perfect hash functions made simple. Communications of the Association for Computing Machinery, 23:17-19.19g0. Google ScholarDigital Library
- 6.E. A. Fox and Robert K. France. Architecture of an expert system for composite document analysis, representation and retrieval. International Journal of Approximate Reasoning, 1(1): 151-175, 1987.Google ScholarCross Ref
- 7.Edward A. Fox. Development of the CODER system: A testbe~ for artificial intelligence methods in information retrieval, information Processing & Management, 23(4):341-366, 1987. Google ScholarDigital Library
- 8.Edward A. Fox, Qi Fan Chen, Amjad M. Daoud, and Lenwood S. Heath. Order-preserving minimal perfect hash functions and information retrieval. ACM Transactions on lnformationSystems, 9(3):281-308, July 199 I. Google ScholarDigital Library
- 9.Edward A. Fox, Lenwood S. Heath, Qi Fan Chen, and Amjad M. Daoud. Practical minimal perfect hash functions for large databases. Communications of the Association for Computing Machinery, 35(1): 105-121, January 1992. Google ScholarDigital Library
- 10.Edward A. Fox, M. Prabhakar Koushik, Qi Fan Chen, and Robert K. France. Integrated access to a large medical literature database. Technical Report TR-91-15, Department of Computer Science, Virginia Polytechnic Institute and State University, Blacksburg, VA, May 1991. Google ScholarDigital Library
- 11.M.L. Fredman and J. Koml6s. On the size of separating systems and families of perfect hash functions. SIAM Journal on Algebraic and Discrete Methods', 5:61-68, 1984.Google Scholar
- 12.M.L. Fredman, J. Koml6s, and E. Szemer&li. Storing a sparse table with O(1) worst case access time. Journal of the Association for Computing Machinery, 31:538- 544, 1984. Google ScholarDigital Library
- 13.G. Jaeschke. Reciprocal hashing - a method for generating minimal perfect hash functions. Communications of the Association for Computing Machinery, 24:829- 833, 1981. Google ScholarDigital Library
- 14.K. Mehlhorn. On the program size of perfect and universal hash functions. In Proceedings of the 23rd Annual IEEE Symposium on Foundations of Computer Science, pages 170-175, 1982.Google ScholarDigital Library
- 15.P. K. Pearson. Fast hashing of variable-length text strings. Communications of the Association for Computing Machinery, 33(6):677-680, June 1990. Google ScholarDigital Library
- 16.T. J. Sager. A polynomial time generator for minimal perfect hash functions. Communications of the Association for Computing Machinery, 28:523-532, 1985. Google ScholarDigital Library
- 17.M. Weaver, R. France, Q. Chen, and E. Fox. Using a frame-based language for information retrieval. International Journal of Intelligent Systems, 4(3):223-257, 1989.Google ScholarDigital Library
Index Terms
- A faster algorithm for constructing minimal perfect hash functions
Recommendations
Indexing internal memory with minimal perfect hash functions
SBBD '08: Proceedings of the 23rd Brazilian symposium on DatabasesA perfect hash function (PHF) is an injective function that maps keys from a set S to unique values, which are in turn used to index a hash table. Since no collisions occur, each key can be retrieved from the table with a single probe. A minimal perfect ...
Comments