This module is assessed by 80% examination and 20% continuous assessment. This document specifies the continuous assessment component.
The objective of this exercise is for you to gain practical experience in designing, implementing, training and optimizing a neural network to carry out a specific real world task.
This year the task is the Optical Recognition of Handwritten Digits, and the data sets you must use are provided by the UCI Machine Learning Repository.
A paper copy of your report must be handed in to the School Office, and the associated
files submitted using the Canvas VLE, by 12 noon on Wednesday 13 January 2016.
If you miss this deadline, your mark will be reduced by 5% (out of 100%) for each School working
day (or part thereof) your submission is late.
Feedback with marks will be returned within three weeks of the hand-in deadline.
What You Need To Do
(1) Download the data sets and descriptions.
(2) Design a suitable Multi-Layer Perceptron neural network that can learn from the training data set to generalize well to the testing data set. Hint: Some form of Back-Propagation learning algorithm will probably be most appropriate.
(3) Implement your neural network, using any programming language you like, or by setting it up in an existing neural network simulator. You may use any code or simulator you find on the internet, as long as you properly reference the sources in your report, or you can program your own neural network from scratch. If, as most students have done in the past, you decide to write your own code, you may find John Bullinaria's Step by Step Guide to Implementing a Simple Neural Network in C helpful to get you started. Hint: Writing your own code often proves easier than figuring out how to get someone else's simulator to do what you need, and you will probably learn more, but there are only limited marks available for the implementation aspect of this exercise.
(4) Experiment systematically with the neural network details that you think are most likely to improve the generalization performance you achieve. Hints: Consider how you are going to avoid under-fitting and over-fitting of the training data, and what parameters are going to have the biggest effect on that. It is usually better to aim to optimize two or three things well, rather than a large number of things not very well.
(5) Write a report explaining what you did and what you found. You should include the following sections:
1. Introduction - Say what the data sets involve and what you aimed to achieve. [5%]
2. Design - Describe and justify the neural network you designed for the task, and the factors you decided to experiment with. [15%]
3. Implementation - Describe how you implemented your neural network and the associated performance analysis mechanisms. Explain why you chose to do it that way. Remember to cite any sources you used. [20%]
4. Experiments - Describe the experiments you carried out to optimize your network's generalization performance, and present the results you obtained. Explain in detail how you used the training and testing data sets. The results should be presented in a statistically rigorous manner, e.g. by computing simple statistics of performance measures (such as means and standard deviations of percentages correct) across multiple runs of network training/testing, and plotting graphs with error-bars. [50%]
5. Conclusions - Summarize your key findings, including which factors proved most crucial, and what was the best generalization performance you achieved. [10%]
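As a rough illustration of the "simple statistics" mentioned under Experiments, the sketch below computes the mean, sample standard deviation and standard error of the percentage correct across several training/testing runs. It is only one possible approach, written in Python with the standard library. The function name summarise_runs and the numbers in example_runs are hypothetical placeholders, not part of this specification and not real results.

```python
import statistics


def summarise_runs(accuracies):
    """Summarise per-run test accuracies (percent correct).

    Returns (mean, sample standard deviation, standard error of the mean).
    """
    if len(accuracies) < 2:
        raise ValueError("need at least two runs to estimate a standard deviation")
    mean = statistics.mean(accuracies)
    # statistics.stdev uses the n - 1 denominator (sample standard deviation)
    std = statistics.stdev(accuracies)
    # Standard error of the mean: one common choice for error-bar length
    sem = std / len(accuracies) ** 0.5
    return mean, std, sem


# Placeholder percentages for three runs with different random initial
# weights. These are made-up values for illustration, NOT real results.
example_runs = [90.0, 92.0, 94.0]

mean, std, sem = summarise_runs(example_runs)
print(f"{mean:.2f} +/- {std:.2f} % correct over {len(example_runs)} runs")
```

Whichever quantity you plot as error-bars (standard deviation or standard error), say which one it is and how many runs it is based on.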
Don't forget that this exercise corresponds to only 20% of a 10 credit module.
Spend an appropriate amount of time on it!
Assessment
You need to submit a paper copy of your report to the School Office, and upload to the Canvas VLE a .pdf version of your report, source files for your report (e.g., LaTeX or Microsoft Word files), the source code of your neural network with instructions for using it, and any results files you think appropriate.
A reasonable length for the report would be between 3000 and 4000 words, plus as many diagrams,
tables and graphs as you think appropriate. Marks will be awarded according to the percentage
proportions indicated above, based on what you did and how well you described it.
Help and Advice
If you get stuck, feel free to ask for advice at one of John Bullinaria's office hours, which are at 12 noon every Tuesday during Autumn Term in Room 113 of the School of Computer Science. Alternatively, talk to him at the end of any of the module lectures.