Skip to main content
Top

1990 | Book

Trees and Hierarchical Structures

Proceedings of a Conference held at Bielefeld, FRG, Oct. 5–9th, 1987

Editors: Andreas Dress, Arndt von Haeseler

Publisher: Springer Berlin Heidelberg

Book Series : Lecture Notes in Biomathematics

insite
SEARCH

About this book

The "raison d'etre" of hierarchical dustering theory stems from one basic phe­ nomenon: This is the notorious non-transitivity of similarity relations. In spite of the fact that very often two objects may be quite similar to a third without being that similar to each other, one still wants to dassify objects according to their similarity. This should be achieved by grouping them into a hierarchy of non-overlapping dusters such that any two objects in ~ne duster appear to be more related to each other than they are to objects outside this duster. In everyday life, as well as in essentially every field of scientific investigation, there is an urge to reduce complexity by recognizing and establishing reasonable das­ sification schemes. Unfortunately, this is counterbalanced by the experience of seemingly unavoidable deadlocks caused by the existence of sequences of objects, each comparatively similar to the next, but the last rather different from the first.

Table of Contents

Frontmatter
1. Introduction
Abstract
The “raison d’être” of hierarchical clustering theory stems from one basic phenomenon: This is the notorious non-transitivity of similarity relations. In spite of the fact that very often two objects may be quite similar to a third without being that similar to each other, one still wants to classify objects according to their similarity. This should be achieved by grouping them into a hierarchy of non-overlapping clusters such that any two objects in one cluster appear to be more related to each other than they are to objects outside this cluster.
Andreas W. M. Dress, Arndt von Haeseler
2. Reconstruction of Phylogenies by Distance Data: Mathematical Framework and Statistical Analysis
Abstract
The reconstruction of phylogenies is an important problem in biological systematics since Darwin recognized natural selection as scientific cause of biological evolution. Since the early beginnings, cp. Haeckel’s rule of ‘recapitulation of phylogeny in ontogeny’, a main problem in the reconstruction of phylogenies has been to measure homology, i.e. the natural similarity of species given by common ancestors and not a similarity given by adaptation to a common environment.
Paul O. Degens, Berthold Lausen, Werner Vach
3. Additive-Tree Representations
Abstract
Additive-trees are used to represent objects as “leaves” on a tree, so that the distance on the tree between two leaves reflects the similarity between the objects. Formally, an observed similarity δ is represented by a tree-distance d. As such, additive-trees belong to the descriptive multivariate statistic tradition. Additive-tree representations are useful in a wide variety of domains.
Hervé Abdi
4. Finding the Minimal Change in a Given Tree
Abstract
It is a common task in biology to determine the genealogy of species, populations, people, or genes and estimate the condition of the ancestral forms. That is often done for molecules such as proteins and nucleic acids. There are many procedures. I address here only the parsimony procedures which ask, “ How can I account for the descent of these various sequences from a common ancestor with the fewest number of changes?” The general problem being addressed is as follows. One has a set of s sequences, each sequence being a linear string of letters from some alphabet. In molecular biology the sequences are either proteins, of which there are 20 letters (amino acids), or nucleic acids, of which there are 4 letters (nucleotides). We assume that the sequences have been aligned by some method so that they are all of the same length, t. It is assumed that these s sequences arose from a common ancestral sequence by a branching process that is properly described as a strictly bifurcating tree, that is, as a graph in which there is one and only one path connecting any two nodes on the tree. The tree has s tips (exterior nodes of degree one), one for each of the s sequences and s — 2 interior nodes of degree three, plus one node of degree two, called the root, that is the ultimate ancestor, the node at which the branching process began. The edges connecting two adjacent nodes are called branches. The task is to discover for any given tree topology, the minimum amount of change, and its nature, on each branch.
Patrick L. Williams, Walter M. Fitch
5. Search, Parallelism, Comparison, and Evaluation: Algorithms for Evolutionary Trees
Abstract
In this lecture we discuss four aspects of the construction and testing of evolutionary trees. These are the large number of possible trees, the potential for parallel computing, the need for an objective comparison of trees, and testing hypotheses about ‘unique’ events that occurred at remote times in the past. In a short space it is not possible to give a complete review of all four subjects. Instead we will concentrate on our own approaches but give references that allow an introduction to the wider literature. More formal details are given in the relevant publications.
David Penny, Pauline Penny
6. The Phylogeny of Prochloron: Is There Numerical Evidence from S AB Values? A Response to Van Valen
Abstract
Numerical methods for constructing phylogenies are widely used in molecular biology: often similarity coefficients are derived from (partial) sequence data, and standard clustering algorithms are then applied to produce appropriate dendrograms (Fox et al., 1980; Sneath & Sokal, 1973). Alternative dendrograms based on the same data set may be evaluated by means of numerical criteria. Such criteria do not always seem to be on a sound logical basis: an instance is provided by Van Valen’s methodology of evaluating conflicting estimated phylogenies (Van Valen, 1982); the data in question are S AB values for Prochloron some cyanobacteria and chloroplasts, as reported by Seewaldt & Stackebrandt (1982).
Hans-Jürgen Bandelt, Arndt von Haeseler
7. Evolution of the Collagen Fibril by Duplication and Diversification of a Small Primordial Exon Unit
Abstract
A connection is established between the exon/intron organization of the gene and the aggregation of the molecules of the fiber forming collagen types I, II, and III. The exon structure is found to be reflected in the alternating interaction pattern that governs the D-staggered aggregation of the molecules in the fibril. The origin of the fibril formation is discussed.
Hans Hofmann, Klaus Kühn
8. The Poincare Paradox and the Cluster Problem
Abstract
The motivation behind this paper is to put together two streams of ideas — Poincaré’s perception of the physical continuum and the problem to give a characterization of similarity relations in terms of clusters. As the reader will see below both streams encompass the question of finding characteristics of non-transitive systems.
Ulrich Höhle
9. An Incremental Error Correcting Evaluation Algorithm For Recursion Networks without Circuits
Abstract
A recursion network consists of a directed graph together with an evaluation function v on the set of nodes of the graph, such that the value of every nonterminal node recursively results from the values of its successors. An example is a game graph with a grundy-function. Consider the problem to determine the value of a specified node ‘root’, when instead of v only an erroneous estimation function \( \hat v\) on the set of all nodes is given. (Typically v and \( \hat v\) will coincide in many, but not in all nodes.)
This paper presents a simple algorithm which yields the correct value v (root), if on every longpath with starting node ‘root’ the estimates \( \hat v\) provide “more correct than wrong information” about the v-values. A longpath is a path that cannot be prolonged.
Ingo Althöfer
Backmatter
Metadata
Title
Trees and Hierarchical Structures
Editors
Andreas Dress
Arndt von Haeseler
Copyright Year
1990
Publisher
Springer Berlin Heidelberg
Electronic ISBN
978-3-662-10619-8
Print ISBN
978-3-540-52453-3
DOI
https://doi.org/10.1007/978-3-662-10619-8