On the height of digital trees and related problems

Szpankowski, Wojciech

doi:10.1007/BF01759045

On the height of digital trees and related problems

Published: June 1991

Volume 6, pages 256–277, (1991)
Cite this article

Algorithmica Aims and scope Submit manuscript

Wojciech Szpankowski¹

112 Accesses
43 Citations
Explore all metrics

Abstract

This paper studies in a probabilistic framework some topics concerning the way words (strings) can overlap, and relationship of this to the height of digital trees associated with this set of words. A word is defined as a random sequence of (possibly infinite) symbols over a finite alphabet. A key notion of analignment matrix {C _ij} ⁿ_i,j=1 is introduced whereC _ij is the length of the longest string that is a prefix of theith and thejth word. It is proved that the height of an associated digital tree is simply related to the alignment matrix through some order statistics. In particular, using this observation and proving some inequalities for order statistics, we establish that the height of adigital trie under anindependent model (i.e., all words are statistically independent) is asymptotically equal to 2 log_α n wheren is the number of words stored in the trie and α is a parameter of the probabilistic model. This result is generalized in three directions, namely we considerb-tries,Markovian model (i.e., dependency among letters in a word), and adependent model (i.e., dependency among words). In particular, when consecutive letters in a word are Markov dependent (Markovian model), then we demonstrate that the height converges in probability to 2 log_θ n where θ is a parameter of the underlying Markov chain. On the other hand, for suffix trees which fall into the dependent model, we show that the height does not exceed 2 log_κ n, where κ is a parameter of the probabilistic model. These results find plenty of applications in the analysis of data structures built over digital words.

This is a preview of subscription content, log in via an institution to check access.

Access this article

Log in via an institution

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Institutional subscriptions

On the Variety of Shapes in Digital Trees

Article 09 July 2016

On the Tree of Ternary Square-Free Words

Longest Common Extensions in Trees

References

A. V. Aho, J. E. Hopcroft, and J. D. Ullman,The Design and Analysis of Computer Algorithms, Addison-Wesley, Reading, MA (1974).
MATH Google Scholar
D. Knuth,The Art of Computer Programming. Sorting and Searching, vol. III, Addison-Wesley, Reading, MA (1973).
Google Scholar
A. Apostolico, The Myriad Virtues of Suffix Trees,Combinatorial Algorithms on Words, pp. 85–96, Springer-Verlag, New York (1985).
Google Scholar
R. Fagin, J. Nievergelt, N. Pippenger, and H. Strong, Extendible Hashing: A Fast Access Method for Dynamic Files,ACM TODS,4, 315–344 (1979).
Article Google Scholar
P. Flajolet, On the Performance Evaluation of Extendible Hashing and Trie Searching,Acta Inform.,20, 345–369 (1983).
Article MATH MathSciNet Google Scholar
R. Gallager,Information Theory and Reliable Communications, Wiley, New York (1968).
Google Scholar
J. Capetanakis, Tree Algorithms for Packet Broadcast Channels,IEEE Trans. Inform. Theory,25, 505–525 (1979).
Article MATH MathSciNet Google Scholar
IEEE Transaction Inform. Theory,31, 2 (1985).
Ph. Jacquet and M. Regnier,Trie Partitioning Process: Limiting Distributions, Lecture Notes in Computer Science, vol. 214, pp. 196–210, Springer-Verlag, Berlin, 1986.
Google Scholar
L. Devroye, A Probabilistic Analysis of the Height of Tries and of the Complexity of Trie Sort,Acta Inform.,21, 229–232 (1984).
Article MATH MathSciNet Google Scholar
B. Pittel, Asymptotic Growth of a Class of Random Trees,Ann. Probab.,13, 414–427 (1985).
Article MATH MathSciNet Google Scholar
B. Pittel, Path in a Random Digital Tree: Limiting Distributions,Adv. in Appl. Probab.,18, 139–155 (1986).
Article MATH MathSciNet Google Scholar
M. Regnier, On the Average Height of Trees in Digital Searching and Dynamic Hashing,Inform. Process. Lett.,13, 64–66 (1981).
Article MATH MathSciNet Google Scholar
A. Yao, A Note on the Analysis of Extendible Hashing,Inform. Process. Lett.,11, 84–86 (1980).
Article MATH MathSciNet Google Scholar
W. Szpankowski, On the Analysis of the Average Height of a Digital Trie: Another Approach, CSD TR-646, Purdue University (1986).
A. Apostolico and W. Szpankowski, Self-Alignments in Words and Their Applications, CSD TR-732, Purdue University (1987), submitted to a journal.
W. Szpankowski, Some Results onV-ary Asymmetric Tries,J. Algorithms,9, 224–244 (1988).
Article MATH MathSciNet Google Scholar
P. Kirschenhofer, H. Prodinger, and W. Szpankowski, On the Variance of the External Path Length in a Symmetric Digital Trie,Discrete Appl. Math.,25, 129–143 (1989).
Article MATH MathSciNet Google Scholar
H. David,Order Statistics, Wiley, New York (1980).
Google Scholar
J. Galambos,The Asymptotic Theory of Extreme Order Statistics, Wiley, New York (1978).
MATH Google Scholar
T. Lai and H. Robbins, A Class of Dependent Random Variables and Their Maxima,Z. Wahrsch. Vern. Gebiete,42, 89–111 (1978).
Article MATH MathSciNet Google Scholar
P. Billingsley,Probability and Measures, Wiley, New York (1986).
Google Scholar
W. Szpankowski, (Probably) Optimal Solutions to Some Problems NOT Only on Graphs, CSD TR-780, Purdue University (1988); revision TR-872 (1989).
D. Knuth,Art of Computer Programming. Fundamental Algorithms, vol. I, Addison-Wesley, Reading, MA (1973).
Google Scholar
D. Sankoff and J. Kruskal (eds.),Time Warps, String Edits, and Macromolecules: The Theory and Practice of Sequence Comparisons, Addison-Wesley, Reading, MA (1983).
Google Scholar
Bull. Math. Biol,46, No. 4 (1984) (a special commemorative issue honoring Margonet O. Dayhoff).
D. Aldous,Probability Approximations via the Poisson Clumping Heristic, Springer-Verlag, New York (1989).
Google Scholar
N. Kamarkar, R. M. Karp, G. S. Lueker, and A. D. Odlyzko, Probabilistic Analysis of Optimum Partitioning,J. Appl. Probab.,23, 626–645 (1986).
Article MathSciNet Google Scholar
S. Karlin and F. Ost, Counts of Long Aligned Word Matches Among Random Letter Sequences,Adv. Appl. Probab.,19, 293–351 (1987).
Article MATH MathSciNet Google Scholar
J. F. C. Kingman, Subadditive Processes, inEcole d'Eté de Probabilités de Saint-Flour V—1975, Lecture Notes in Mathematics, vol. 539, Springer-Verlag, Berlin, 1976.
Google Scholar
L. Devroye, W. Szpankowski, and B. Rais, A Note on the Height of Suffix Trees, CSD TR-905, Purdue University (1989).

Download references

Author information

Authors and Affiliations

Department of Computer Science, Purdue University, 47907, West Lafayette, IN, USA
Wojciech Szpankowski

Authors

Wojciech Szpankowski
View author publications
You can also search for this author in PubMed Google Scholar

Additional information

Communicated by Philippe Flajolet.

This research was supported by NSF Grants NCR-8702115 and CCR-8900305, in part by Grant R01 LM05118 from the National Library of Medicine, and by AFOSR Grant 90-0107.

Rights and permissions

Reprints and permissions

About this article

Cite this article

Szpankowski, W. On the height of digital trees and related problems. Algorithmica 6, 256–277 (1991). https://doi.org/10.1007/BF01759045

Download citation

Received: 30 November 1987
Revised: 15 August 1989
Issue Date: June 1991
DOI: https://doi.org/10.1007/BF01759045

Key words

Access this article

Log in via an institution

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Institutional subscriptions

On the height of digital trees and related problems

Abstract

Access this article

Similar content being viewed by others

On the Variety of Shapes in Digital Trees

On the Tree of Ternary Square-Free Words

Longest Common Extensions in Trees

References

Author information

Authors and Affiliations

Additional information

Rights and permissions

About this article

Cite this article

Key words

Navigation

On the height of digital trees and related problems

Abstract

Access this article

Similar content being viewed by others

On the Variety of Shapes in Digital Trees

On the Tree of Ternary Square-Free Words

Longest Common Extensions in Trees

References

Author information

Authors and Affiliations

Additional information

Rights and permissions

About this article

Cite this article

Share this article

Key words

Search

Navigation