Home » Computational Evolutionary Biology

Finding a Tree via Distances

Introduction

An evolutionary tree for a set of entities, each with defined character states, can be constructed using 'tree clustering', i.e. using a cluster analysis method which builds trees. A simple approach is:

  1. Form a matrix of distances between all pairs of entities.
  2. Find the closest pair (entities or clusters).
  3. Combine this pair to form a new cluster. The distances between the new cluster and existing entities or clusters can be determined in several ways, e.g. as the minimum of the distances for each of the pair or the unweighted or weighted average.
  4. Repeat from (2) until only one cluster remains.

Input

The initial distance for every pair of entities is found by summing over all characters, adding 1 for every different state if the character is discrete and the ASCII code difference if the character is ordered. This typically results in a substantial fraction of the distances being the same. To help in reducing ambiguity, for every pair of entities, the average difference between the two distances from each of the pair to every other entity is found. This is divided by 100 and added to the initial distance.

The Fitch/Wagner algorithms are applied to the tree resulting from the clustering process. ('Fitch' and 'Wagner' are used in the sense of Ronquist & Beerli, http://people.sc.fsu.edu/~beerli/ISC5317/Lectures/02pars.pdf. However, the algorithm for the Wagner up-pass is not as given by their 2007 lecture notes: see instead Programming the Wagner Uppass Algorithm.)

Note: there is no validation of the input, so it needs to be in the right format!

Input the list of character states:
Select the distance method used to combine distances:
Page maintained by: Dr Peter Coxhead / Last update: 10 Dec 2008