3rd Year UG, MSc in Advanced Computer Science

Imaging and Visualisation Systems

Course Material and Useful Links

Peter Tino


Taught jointly with Bob Hendley

Lecture Timetable and Handouts

Here is a preliminary outline of the structure and lecture timetable for my part of the module. I will develop most of the ideas on the blackboard. You are encouraged to take notes during the lectures. Any handouts used will be made available here as pdf files shortly after the paper versions have been distributed.

Week | Session 1 (Tuesdays 10:00-11:00) | Session 2 (Thursdays 12:00-13:00)
1 | B. Hendley | B. Hendley
2 | Dimensionality reduction of vectorial data. Basic concepts of vector and matrix algebras. | B. Hendley
3 | Linear models. Principal Component Analysis I. | B. Hendley
3 | Principal Component Analysis II. | B. Hendley
4 | Nonlinear methods. Self-organizing topographic maps I. | B. Hendley
5 | Self-organizing topographic maps II. | B. Hendley
6 | Probabilistic approaches. Basic probability and statistics. | B. Hendley
7 | Latent-space reformulations of topographic maps. Generative topographic mapping. | B. Hendley
9 | Enhancing the information content of visualisation plots. | B. Hendley
10 | Hierarchical visualisation. | Fractal images. Iterated function systems.
11 | Mandelbrot and Julia sets. | Fractal modelling of real world objects.
12 | Two revision lectures covering the whole module

Useful Links

Suggested reading + software

Assignment - hand in during revision lectures on May 3 and May 5

Make yourself familiar with my implementation (in C) of SOM. You are very welcome to use your own implementation, or to download working code from the web.
Un-tar the file som.tar.gz and go to the folder "SOM".
The subfolder "SOURCE" contains the C implementation of SOM, "som.c".
There is an example in the folder "GAUSS.2D". Please consult the "read.me" file there.

You can choose any data set(s) from the list below.
Un-tar the relevant file and go to the corresponding folder.
The folder contains the data set, as well as additional information about the data. Read the available information, especially the description of the features (data dimensions).
You will need to clean the data, so that it contains only numerical features (dimensions) and the features are space-separated (not comma-separated).
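
The cleaning step can be sketched in a few lines. This is a minimal illustration in Python (chosen for brevity, not because the assignment requires it), and it makes some assumptions: the raw file is comma-separated, has no header line, and a column counts as numerical only if every row holds a number in it. The function names and the sample rows are hypothetical. Check the "read.me" file for the exact input format that som.c expects.

```python
import csv


def is_number(token):
    """Return True if the token can be parsed as a floating-point number."""
    try:
        float(token)
        return True
    except ValueError:
        return False


def clean_rows(lines):
    """Turn comma-separated lines into space-separated lines of numerical columns only.

    A column is kept only if every row holds a number in it, so a column
    containing text or a missing-value marker such as "?" is dropped entirely.
    """
    rows = [[cell.strip() for cell in row] for row in csv.reader(lines) if row]
    if not rows:
        return []
    n_cols = min(len(row) for row in rows)
    keep = [j for j in range(n_cols) if all(is_number(row[j]) for row in rows)]
    return [" ".join(row[j] for j in keep) for row in rows]


# Hypothetical three-row sample: two numerical features and a text class column.
raw = ["5.1,3.5,setosa", "4.9,3.0,setosa", "6.3,3.3,virginica"]
for line in clean_rows(raw):
    print(line)
```

If the raw file starts with a line of column names, remove that line first, otherwise every column will look non-numerical and be dropped. Dropping a whole column because of a few missing values may be too drastic for some data sets; removing the affected rows instead is an equally reasonable choice.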

To make the plots informative, you should come up with a labelling scheme for data points.
If the data can be classified into several classes (find out in the data and feature description!), use that information as the basis for your labelling scheme. In that case exclude the class information from the data dimensions.
Alternatively, you can make labels out of any dimension, e.g. by quantising it into several intervals. For example, if the data dimension represents age of a person, you can quantise it into 5 labels (classes) [child, teenager, young adult, middle age, old].
Associate the data labels with different markers and use the markers to show what kind of data points get projected to different regions of the visualization plot (computer screen).
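
As a rough sketch of such a labelling scheme, the Python fragment below separates a class column from the data dimensions, quantises an age value into the five labels mentioned above, and associates each label with a plotting marker. The age boundaries, the marker characters, and the function names are all assumptions made for illustration; choose boundaries that make sense for your own data.

```python
import bisect

# Hypothetical cut points: a value below 13 is "child", 13 to 19 is "teenager", etc.
AGE_BOUNDARIES = [13, 20, 36, 61]
AGE_LABELS = ["child", "teenager", "young adult", "middle age", "old"]

# Hypothetical marker for each label, to be used when drawing projected points.
MARKERS = {
    "child": "o",
    "teenager": "+",
    "young adult": "x",
    "middle age": "^",
    "old": "s",
}


def quantise(value, boundaries, labels):
    """Map a value to the label of the interval it falls into.

    boundaries must be sorted, and labels needs one more entry than boundaries.
    """
    return labels[bisect.bisect_right(boundaries, value)]


def split_label(row, label_index):
    """Remove the label column from a row; return (remaining features, label)."""
    features = row[:label_index] + row[label_index + 1:]
    return features, row[label_index]


# Class information is excluded from the data dimensions and kept as the label.
features, label = split_label([5.1, 3.5, "setosa"], 2)

# A numerical dimension (here: age) is turned into a label and then a marker.
age_marker = MARKERS[quantise(42, AGE_BOUNDARIES, AGE_LABELS)]
```

Only the feature values go into the visualisation model; the labels are used afterwards, when deciding which marker to draw at each projected point.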

Before starting to work on the assignment, please carefully study the example I prepared using the Boston database. Un-tar the file boston.ex.tar.gz and go to the folder "BOSTON.EX".
The subfolder "FIGURES" contains all the relevant figures as eps or gif files.
Please consult the "boston.read.me" file in BOSTON.EX.

The report should describe experiments with your chosen data set(s) along the lines of the 'boston example'.
In the labelling scheme, concentrate on more than one coordinate (dimension). For instance, in the 'boston example', consider not just the 'price' feature, but also run separate experiments with 'per capita crime rate in the town' or 'pupil-teacher ratio in the town' in place of the 'price' coordinate.

In the report concentrate on the following questions:
- How did you preprocess the data?
- What features (coordinates) did you use for labelling the projected points with different markers?
- How did you design the labelling schemes?
- What visualisation techniques did you use?
- What interesting aspects of the data did you detect based on the data visualisations?

You should demonstrate that you
- understand the visualisation techniques used
- are able to extract useful information from high-dimensional data that would otherwise be hard to comprehend, using dimensionality-reducing visualisation techniques.

Data Visualisation using PCA and SOM

Try a Java applet for interactive PCA/SOM created by Daoxiao Jin.

Source code (tar+gzip)

Preparing for the exam - Sample Questions

Problems to solve for those really interested

Try the models out! - Benchmark data sets

As you get familiar with different types of models, try them out on benchmark data sets people in the machine learning community have been using to support their claims about yet another excellent learning system :-)

Here are two widely used data repositories that contain data descriptions, the data itself, and other useful things, such as previously obtained results.

  • DELVE - Data for Evaluating Learning in Valid Experiments
  • UCI Knowledge Discovery in Databases Archive

    Aims, Objectives, and Assessment

    For formal details about the aims and objectives and assessment you should look at the official Module Description Page and Syllabus Page.

    There are two components to the assessment of this module: a two-hour examination (80%) and continuous assessment by mini-project report (20%).

    As the material is developed I will give you ideas of the standard and type of questions you can expect in this year's examination. I will address questions related to the material covered in previous lectures in great detail during the timetabled Exercise Sessions.

    Recommended Books

    The Recommended Books for this module are:

    Title | Author(s) | Publisher, Date | Comments
    Introduction to Visualisation and Virtual Environments | Chaomei Chen | Springer, 2002 | -
    Data Visualization: The State of the Art | Frits H. Post, Gregory M. Nielson, Georges-Pierre Bonneau | Kluwer Academic, 2002 | -
    Fractals Everywhere | Michael F. Barnsley | Morgan Kaufmann, 2000 | Fractal image generation/compression, mostly via iterated function systems. Highly recommended for mathematically minded students.
    The Computational Beauty of Nature: Computer Explorations of Fractals, Chaos, Complex Systems, and Adaptation | Gary William Flake | Bradford Book, 2000 | A beautiful book exploring links between science and art.
    Neural Networks: A Comprehensive Foundation | Simon Haykin | Prentice Hall, 1999 | Very comprehensive, a bit heavy in maths.
    Neural Networks for Pattern Recognition | Christopher Bishop | Clarendon Press, Oxford, 1995 | Highly recommended for mathematically minded students.

    This page is maintained by Peter Tino. Last updated on 1 March 2004.