Random triangles and the curse of dimensionality
Bob Durrant, School of Computer Science
Date and time: Wednesday 2nd March 2011 at 12:00
Location: Room 124, School of Computer Science
Prerequisites: Passing familiarity with basic probability and the ideas of an integral, a limit, and 3D geometry.
Outline: Given three vectors chosen randomly from a standard normal distribution in R^2, what is the probability that the triangle they form is acute or obtuse? It turns out that the probability of an obtuse triangle is exactly 3/4, and this can be shown using either a slightly scary integral or via a simple "proof by pictures" argument.
Now what about if we choose three vectors randomly from a standard normal distribution in some outrageously high dimensional space R^d? Say R^1000 or R^10000? What is the probability that our random triangle is still obtuse? We can no longer draw the picture, so we revert to our scary integral and find that the probability of an obtuse random triangle drops sharply as d increases. In fact in the limit when d -> infinity the probability is zero.
Why should this be? It turns out to be the result of a very general phenomenon in high dimensional space that the random vectors are very likely to have nearly the same length and be nearly orthogonal to each other. This means that in the limit our random triangle is equilateral with probability 1.
Finally, these last facts are aspects of the "curse of dimensionality" and have practical implications for a range of popular data-mining techniques, where the typical domain is indeed often something like R^1000.