Abstract
Andersson and Nilsson introduced in 1993 a level-compressed trie (for short, LC trie) in which a full subtree of a node is compressed to a single node of degree being the size of the subtree. Recent experimental results indicated a “dramatic improvement” when full subtrees are replaced by “partially filled subtrees.” In this article, we provide a theoretical justification of these experimental results, showing, among others, a rather moderate improvement in search time over the original LC tries. For such an analysis, we assume that n strings are generated independently by a binary memoryless source, with p denoting the probability of emitting a “1” (and q = 1 − p). We first prove that the so-called α-fillup level Fn(α) (i.e., the largest level in a trie with α fraction of nodes present at this level) is concentrated on two values with high probability: either Fn(α) = kn or Fn(α) = kn + 1, where kn = log1/√pq n − |ln (p/q)|/2 ln3/2 (1√pq) Φ−1 (α) √ ln n + O(1) is an integer and Φ(x) denotes the normal distribution function. This result directly yields the typical depth (search time) Dn(α) in the α-LC tries, namely, we show that with high probability Dn(α) ∼ C2 log log n, where C2 = 1/|log(1 − h/log(1/√pq))| for p ≠ q and h = −plog p−qlog q is the Shannon entropy rate. This should be compared with recently found typical depth in the original LC tries, which is C1log log n, where C1 = 1/|log(1−h/log(1/min{p, 1−p}))|. In conclusion, we observe that α affects only the lower term of the α-fillup level Fn(α), and the search time in α-LC tries is of the same order as in the original LC tries.
- Andersson, A., and Nilsson, S. 1993. Improved behavior of tries by adaptive branching, Inf. Proc. Lett. 46, 295--300. Google ScholarDigital Library
- Devroye, L. 1992. A note on the probabilistic analysis of Patricia tries. Random Struct. Alg. 3, 203--214.Google ScholarDigital Library
- Devroye, L. 2001. An analysis of random LC tries. Random Struc. Alg. 19, 359--375. Google ScholarDigital Library
- Devroye, L., and Szpankowski, W. 2005. Probabilistic behavior of asymmetric level compressed tries. Random Struct. Alg. 27, 2, 185--200. Google ScholarDigital Library
- de Moivre, A. 1738. The Doctrine of Chances, 2nd ed., H. Woodfall, London.Google Scholar
- Feller, W. 1971. An Introduction to Probability Theory and its Applications. Vol. II. 2nd ed., Wiley, New York.Google Scholar
- Gusfield, D. 1997. Algorithms on Strings, Trees, and Sequences. Cambridge University Press, New York. Google ScholarDigital Library
- Iivonen, P., Nilsson, S., and Tikkanen, M. 1999. An experimental study of compression methods for functional tries. In Workshop on Algorithmic Aspects of Advanced Programming Languages (WAAAPL).Google Scholar
- Jacquet, P., and Szpankowski, W. 1991. Analysis of digital tries with Markovian dependency. IEEE Trans. Inf. Theory 37, 1470--1475.Google ScholarDigital Library
- Jacquet, P., and Szpankowski, W. 1998. Analytical dePoissonization and its applications. Theor. Comput. Sci. 201, 1--62. Google ScholarDigital Library
- Knessl, C., and Szpankowski, W. 2004. On the number of full levels in tries. Random Struct. Alg. 25, 247--276. Google ScholarDigital Library
- Knuth, D. E. 1997. The Art of Computer Programming. Vol. 1: Fundamental Algorithms, 3rd ed. Addison-Wesley, Reading, MA. Google ScholarDigital Library
- Knuth, D. E. 2000. Selected Papers on Analysis of Algorithms, CSLI, Stanford, CA. Google ScholarDigital Library
- Mahmoud, H. 1992. Evolution of Random Search Trees. John Wiley and Sons, New York.Google Scholar
- Nilsson, S. 1996. Radix Sorting and Searching. PhD Thesis, Lund University.Google Scholar
- Nilsson, S., and Karlsson, G. 1998. Fast address look-up for Internet routers. In Proceedings of the IFIP 4th International Conference on Broadband Communications, 11--22. Google ScholarDigital Library
- Nilsson, S., and Karlsson, G. 1999. IP-address lookup using LC-tries. IEEE J. Select. Areas in Commun. 17, 6, 1083--1092. Google ScholarDigital Library
- Nilsson, S., and Tikkanen, M. 2002. An experimental study of compression methods for dynamic tries. Algorithmica 33, 1, 19--33.Google ScholarDigital Library
- Pittel, B. 1985. Asymptotic growth of a class of random trees. Ann. Probab. 18, 414--427.Google ScholarCross Ref
- Pittel, B. 1986. Paths in a random digital tree: limiting distributions, Adv. Appl. Probab. 18, 139--155.Google ScholarCross Ref
- Reznik, Y. 2002. Some results on tries with adaptive branching. Theor. Comput. Sci. 289, 1009--1026. Google ScholarDigital Library
- Reznik, Y. 2005. On the average density and selectivity of nodes in multi-digit tries. In Proceedings of the 7th Workshop on Algorithm Engineering and Experiments and the 2nd Workshop on Analytic Algorithmics and Combinatorics (ALENEX/ANALCO) (Vancouver, British Columbia, Canada). SIAM, 230--239.Google Scholar
- Srinivasan, V., and Varghese, G. 1998. Fast address lookups using controlled prefix expansions. ACM SIGMETRICS. Google ScholarDigital Library
- Szpankowski, W. 1991. On the height of digital trees and related problems. Algorithmica 6, 256--277.Google ScholarDigital Library
- Szpankowski, W. 2001. Average Case Analysis of Algorithms on Sequences. John Wiley, New York. Google ScholarDigital Library
Index Terms
- Partial fillup and search time in LC tries
Recommendations
Partial fillup and search time in LC tries
ANALCO '06: Proceedings of the Meeting on Analytic Algorithmics and CombinatoricsAndersson and Nilsson introduced in 1993 a level-compressed trie (in short: LC trie) in which a full subtree of a node is compressed to a single node of degree being the size of the subtree. Recent experimental results indicated a "dramatic improvement" ...
Profiles of PATRICIA Tries
A PATRICIA trie is a trie in which non-branching paths are compressed. The external profile $$B_{n,k}$$Bn,k, defined to be the number of leaves at level k of a PATRICIA trie on n nodes, is an important "summarizing" parameter, in terms of which several ...
Succinct indexes for strings, binary relations and multilabeled trees
We define and design succinct indexes for several abstract data types (ADTs). The concept is to design auxiliary data structures that ideally occupy asymptotically less space than the information-theoretic lower bound on the space required to encode the ...
Comments