Here is a preliminary outline of the structure and lecture timetable for my part of the module. I will develop most of the ideas on the blackboard. You are encouraged to take notes during the lectures. Any handouts used will be made available here as PDF files shortly after the paper versions have been distributed.
Week | Session 1 Tuesdays 10:00-11:00 | Session 2 Thursdays 12:00-13:00 |
---|---|---|
1 | B. Hendley | B. Hendley |
2 | Dimensionality reduction of vectorial data. Basic concepts of vector and matrix algebras. | B. Hendley |
3 | Linear models. Principal Component Analysis I. | B. Hendley |
3 | Principal Component Analysis II. | B. Hendley |
4 | Nonlinear methods. Self-organizing topographic maps I. | B. Hendley |
5 | Self-organizing topographic maps II. | B. Hendley |
6 | Probabilistic approaches. Basic probability and statistics. | B. Hendley |
7 | Latent-space reformulations of topographic maps. Generative topographic mapping. | B. Hendley |
9 | Enhancing the information content of visualisation plots. | B. Hendley |
10 | Hierarchical visualisation | Fractal images. Iterated function systems. |
11 | Mandelbrot and Julia sets. | Fractal modelling of real world objects. |
12 | Two Revision Lectures Covering the Whole Module |
Make yourself familiar with my implementation (in C) of SOM. You are very welcome to use your own implementation, or to download working code from the web. Un-tar the file som.tar.gz and go to the folder "SOM". The subfolder "SOURCE" contains the C implementation of SOM, "som.c". There is an example in the folder "GAUSS.2D". Please consult the "read.me" file there.
You can choose any data set(s) from the list below. Un-tar the relevant file and go to the corresponding folder. The folder contains the data set, as well as additional information about the data. Read the available information, especially the description of the features (data dimensions). You will need to clean the data so that it contains only numerical features (dimensions) and the features are space-separated (not comma-separated).
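The cleaning step could be sketched as follows. This is a minimal Python sketch, not part of the supplied C code; the names `is_number` and `clean_rows` and the sample values are made up for illustration. It assumes that every row has the same number of comma-separated fields, and it drops any column that contains a non-numeric entry (so a column with missing-value markers such as "?" would be dropped as well).

```python
import csv
import io


def is_number(text):
    """Return True if text parses as a float (e.g. '5.1', '-3', '1e-2')."""
    try:
        float(text)
        return True
    except ValueError:
        return False


def clean_rows(raw_text):
    """Keep only the columns that are numeric in every row.

    Returns a list of space-separated strings, one per data point.
    """
    rows = [row for row in csv.reader(io.StringIO(raw_text)) if row]
    n_cols = len(rows[0])
    # Indices of columns whose entries all parse as numbers.
    numeric = [c for c in range(n_cols)
               if all(is_number(row[c].strip()) for row in rows)]
    return [" ".join(row[c].strip() for c in numeric) for row in rows]


# Made-up sample: two numerical features followed by a class name.
sample = "5.1,3.5,setosa\n4.9,3.0,setosa\n6.3,3.3,virginica\n"
cleaned = clean_rows(sample)
print("\n".join(cleaned))
```

If a non-numeric column holds the class information, save it separately before cleaning so that it remains available for labelling the plots.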
To make the plots informative, you should come up with a labelling scheme for the data points. If the data can be classified into several classes (find out in the data and feature description!), use that information as the basis for your labelling scheme. In that case, exclude the class information from the data dimensions. Alternatively, you can make labels out of any dimension, e.g. by quantising it into several intervals. For example, if the data dimension represents the age of a person, you can quantise it into 5 labels (classes) [child, teenager, young adult, middle age, old].
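Quantising a dimension into labels might look like the sketch below. The interval boundaries in `AGE_BOUNDS` are hypothetical and chosen only for illustration; pick cut points that suit your data. The function name `age_label` is likewise an assumption.

```python
from bisect import bisect_right

# Hypothetical age boundaries (in years) separating the five intervals.
AGE_BOUNDS = [13, 20, 36, 61]
AGE_LABELS = ["child", "teenager", "young adult", "middle age", "old"]


def age_label(age):
    """Map an age to one of the five labels by locating its interval.

    An age equal to a boundary falls into the upper interval.
    """
    return AGE_LABELS[bisect_right(AGE_BOUNDS, age)]


labels = [age_label(a) for a in [4, 15, 28, 47, 70]]
print(labels)
```

The same pattern applies to any numerical dimension: a sorted list of cut points with one more label than there are cut points.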
Associate the data labels with different markers and use the markers to show what kind of data points get projected to different regions of the visualisation plot (computer screen).
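As a rough illustration of the idea, the sketch below draws a text version of such a plot. It assumes that each data point has already been projected to a (row, column) position on a rectangular map grid, for example the position of its winning SOM unit. The marker assignments, the name `render_map`, and the coordinates are all made up for the example.

```python
# Hypothetical marker assigned to each class label.
MARKERS = {"child": "o", "teenager": "+", "old": "x"}


def render_map(projections, labels, n_rows, n_cols):
    """Draw a text grid: each cell shows the marker of the last point
    projected to it, or '.' if no point landed there."""
    grid = [["." for _ in range(n_cols)] for _ in range(n_rows)]
    for (r, c), label in zip(projections, labels):
        grid[r][c] = MARKERS[label]
    return "\n".join(" ".join(row) for row in grid)


# Made-up grid positions (row, column) of four projected data points.
projections = [(0, 0), (0, 1), (1, 2), (2, 2)]
point_labels = ["child", "child", "teenager", "old"]
picture = render_map(projections, point_labels, 3, 3)
print(picture)
```

Regions of the grid dominated by a single marker indicate that points of that class are projected close together.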
Try a Java applet for interactive PCA/SOM created by Daoxiao Jin.
Source code (tar+gzip)
As you get familiar with different types of models, try them out on benchmark data sets people in the machine learning community have been using to support their claims about yet another excellent learning system :-)
Here are two of the widely used data repositories that contain data descriptions, the data itself, and other useful things, such as previously obtained results.
For formal details about the aims and objectives and assessment you should look at the official Module Description Page and Syllabus Page.
There are two components to the assessment of this module: a two-hour examination (80%) and continuous assessment by mini-project report (20%).
As the material is developed, I will give you an idea of the standard and type of questions you can expect in this year's examination. I will address questions related to the material covered in previous lectures in great detail during the timetabled Exercise Sessions.
The Recommended Books for this module are:
Title | Author(s) | Publisher, Date | Comments |
---|---|---|---|
Introduction to Visualisation and Virtual Environments | Chaomei Chen | Springer, 2002 | - |
Data Visualization: The state of the art | Frits H. Post, Gregory M. Nielson, Georges-Pierre Bonneau | Kluwer Academic, 2002 | - |
Fractals Everywhere | Michael F. Barnsley | Morgan Kaufmann, 2000 | Fractal image generation/compression, mostly via iterated function systems. Highly recommended for mathematically minded students. |
The Computational Beauty of Nature: Computer Explorations of Fractals, Chaos, Complex Systems, and Adaptation | Gary William Flake | Bradford Book, 2000 | A beautiful book exploring links between science and art. |
Neural Networks: A Comprehensive Foundation | Simon Haykin | Prentice Hall, 1999 | Very comprehensive, a bit heavy in maths. |
Neural Networks for Pattern Recognition | Christopher Bishop | Clarendon Press, Oxford, 1995 | Highly recommended for mathematically minded students. |