Number of hidden neurons

Introduction

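Introducing a hidden layer into a feed-forward neural network (FFNN) allows a much wider range of functions to be learned and represented. This section examines how the number of hidden neurons affects the error on the training and testing sets.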
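To make the role of the hidden-neuron count concrete, the following is a minimal sketch of a single-hidden-layer network's forward pass and its mean squared error. It is not the code used for the experiments here: the sigmoid activation, the weight range, the way the MSE is averaged, and the names init_network, forward and mse are all assumptions made for illustration.

```python
import math
import random

def init_network(n_inputs, n_hidden, n_outputs, rng):
    """Small random weights; each row is one neuron's weights plus a trailing bias."""
    hidden = [[rng.uniform(-0.5, 0.5) for _ in range(n_inputs + 1)]
              for _ in range(n_hidden)]
    output = [[rng.uniform(-0.5, 0.5) for _ in range(n_hidden + 1)]
              for _ in range(n_outputs)]
    return hidden, output

def sigmoid(x):
    # Assumed activation function (not stated in the text).
    return 1.0 / (1.0 + math.exp(-x))

def forward(network, inputs):
    """Propagate one input vector through the hidden layer to the outputs."""
    hidden, output = network
    h = [sigmoid(sum(w * x for w, x in zip(row[:-1], inputs)) + row[-1])
         for row in hidden]
    return [sigmoid(sum(w * a for w, a in zip(row[:-1], h)) + row[-1])
            for row in output]

def mse(network, samples):
    """Squared error averaged over every output of every (inputs, targets) pair."""
    total, count = 0.0, 0
    for inputs, targets in samples:
        outputs = forward(network, inputs)
        total += sum((o - t) ** 2 for o, t in zip(outputs, targets))
        count += len(targets)
    return total / count

# Four inputs and three outputs match the Iris features and classes;
# only n_hidden changes between the configurations compared in this section.
net = init_network(n_inputs=4, n_hidden=4, n_outputs=3, rng=random.Random(0))
print(forward(net, [5.1, 3.5, 1.4, 0.2]))
```

Only the n_hidden argument changes from one configuration to the next, which is why the number of hidden neurons can be studied in isolation.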

Method

1 hidden neuron

2 hidden neurons

3 hidden neurons

4 hidden neurons

5 hidden neurons

6 hidden neurons

7 hidden neurons

Summary

 
Number of hidden neurons   1        2        3        4        5        6        7
Average MSE (Training)     0.46     0.2061   0.3194   0.437    0.0670   0.0890   0.07418
Average MSE (Testing)      0.5637   0.1834   0.2671   0.1215   0.1088   0.111    0.1162
S.D. MSE (Training)        0.1775   0.1995   0.0276   0.0303   0.0114   0.0155   0.0127
S.D. MSE (Testing)         0.1392   0.1826   0.3051   0.0349   0.0197   0.0093   0.0157
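Each table entry summarises the MSE over repeated runs of one configuration. A minimal sketch of that summary step follows. The run values are hypothetical placeholders, not results from these experiments, and the use of the sample standard deviation is an assumption, since the text does not say whether the sample or population form was used.

```python
from statistics import mean, stdev

def summarise_runs(mse_per_run):
    """Return (average MSE, sample standard deviation) over repeated runs."""
    return mean(mse_per_run), stdev(mse_per_run)

# Hypothetical final MSE values from five runs of one configuration.
runs = [0.10, 0.12, 0.11, 0.13, 0.09]
avg, sd = summarise_runs(runs)
print(f"Average MSE {avg:.4f}, S.D. {sd:.4f}")
```

Replacing stdev with pstdev from the same module would give the population form instead.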

Graph: mean MSE with standard deviation on the training data

Graph: mean MSE with standard deviation on the testing data

Conclusion

With only one or two hidden neurons the network is unable to learn, as the high error on both the training and testing sets indicates. With three hidden neurons the network appears able to learn the training data but behaves inconsistently on the test set. A marked improvement appears once four or more neurons are used, which suggests that four is the minimum number of hidden neurons required for successful learning and generalisation by an FFNN on this problem. There is no statistically significant difference between the results for networks with four or more hidden neurons. Based on the principle of Occam's Razor, we should therefore favour networks of four hidden neurons when using the Iris dataset. A further reason to prefer the smaller network is that a large number of hidden neurons is widely known to risk overfitting the training data.
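The Occam's Razor argument can be illustrated with a simple selection heuristic applied to the testing figures in the summary table: choose the smallest network whose average MSE lies within one standard deviation of the best average. This rule is an assumption made for illustration and is not a significance test; the text does not state which statistical procedure was actually used. The function name smallest_adequate is hypothetical.

```python
# Testing-set values copied from the summary table, keyed by hidden-neuron count.
test_mean = {1: 0.5637, 2: 0.1834, 3: 0.2671, 4: 0.1215,
             5: 0.1088, 6: 0.111, 7: 0.1162}
test_sd = {1: 0.1392, 2: 0.1826, 3: 0.3051, 4: 0.0349,
           5: 0.0197, 6: 0.0093, 7: 0.0157}

def smallest_adequate(means, sds):
    """Smallest size whose mean MSE is within one S.D. of the best mean."""
    best = min(means, key=means.get)        # size with the lowest mean MSE
    threshold = means[best] + sds[best]     # best mean plus its own S.D.
    return min(n for n in means if means[n] <= threshold)

chosen = smallest_adequate(test_mean, test_sd)
print(chosen)  # prints 4
```

Under this heuristic the lowest testing MSE belongs to the five-neuron network, but the four-neuron network falls within its margin and is preferred as the simpler model.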
