Validation techniques

Introduction

So far, we have used the holdout method to test the networks that have been trained. When data is limited, we can instead use cross-validation (Stone 1974, 1977), which uses the training data itself to produce a prediction, or generalisation, estimate. Here, we look at 5-fold, 10-fold and 50-fold (leave-one-out) cross-validation on a much smaller dataset of 50 stratified instances taken from the Iris data. The aim is to see how the cross-validation error estimates compare with those of a method that sets aside 10 of the instances for testing in the usual holdout manner. Four cases are examined:

1. Holdout set method.

2. 5-fold cross-validation.

3. 10-fold cross-validation.

4. n-fold (n=50) cross-validation (also known as 'leave-one-out cross-validation').
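
The four cases differ only in how the 50 instances are divided between training and testing. The sketch below illustrates the k-fold partitioning for k = 5, 10 and 50. It is not the code used for the experiments: the function name k_fold_splits, the seed and the round-robin way of dealing instances into folds are all assumptions made for illustration.

```python
import random


def k_fold_splits(n_instances, k, seed=0):
    """Yield (train_indices, test_indices) once for each of the k folds.

    Hypothetical helper: a plain random (non-stratified) partition.
    """
    indices = list(range(n_instances))
    random.Random(seed).shuffle(indices)
    # Fold i takes every k-th shuffled index, so fold sizes differ by at most one.
    folds = [indices[i::k] for i in range(k)]
    for i, test_fold in enumerate(folds):
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test_fold


N = 50  # size of the reduced Iris sample described above

# (training size, test size) of the first split for each value of k
fold_sizes = {}
for k in (5, 10, 50):
    splits = list(k_fold_splits(N, k))
    fold_sizes[k] = (len(splits[0][0]), len(splits[0][1]))
    print(f"{k}-fold: train on {fold_sizes[k][0]}, test on {fold_sizes[k][1]}")
```

With 50 instances, 5-fold cross-validation tests on 10 instances at a time (the same test size as the holdout case), 10-fold on 5, and 50-fold on a single instance. In every case each instance is used for testing exactly once.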

Method

Results

For the holdout method the values are the error on the test set; for the cross-validation (CV) methods they are the combined error estimate.

Method          Run
                1       2       3       4       5       6       7       8       9       10      Mean    S.D.
1. Holdout      0.1284  0.1364  0.2158  0.1107  0.157   0.138   0.1199  0.1368  0.1153  0.148   0.1406  0.0300
2. 5-fold CV    0.1444  0.1384  0.1588  0.1408  0.2482  0.1475  0.1271  0.1423  0.1722  0.1487  0.1568  0.0343
3. 10-fold CV   0.1348  0.1743  0.1004  0.1368  0.1142  0.1511  0.1242  0.104   0.118   0.1178  0.1275  0.0225
4. 50-fold CV   0.1267  0.1189  0.0983  0.1242  0.1065  0.127   0.1228  0.1093  0.1178  0.1319  0.1183  0.0106
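
The Mean and S.D. columns can be recomputed from the ten runs. The sketch below does this with Python's statistics module, assuming that S.D. is the sample standard deviation (n-1 denominator); the source does not say which form was used, and the dictionary keys are labels chosen here for illustration.

```python
from statistics import mean, stdev

# Per-run errors copied from the results table.
runs = {
    "holdout": [0.1284, 0.1364, 0.2158, 0.1107, 0.157,
                0.138, 0.1199, 0.1368, 0.1153, 0.148],
    "5-fold": [0.1444, 0.1384, 0.1588, 0.1408, 0.2482,
               0.1475, 0.1271, 0.1423, 0.1722, 0.1487],
    "10-fold": [0.1348, 0.1743, 0.1004, 0.1368, 0.1142,
                0.1511, 0.1242, 0.104, 0.118, 0.1178],
    "50-fold": [0.1267, 0.1189, 0.0983, 0.1242, 0.1065,
                0.127, 0.1228, 0.1093, 0.1178, 0.1319],
}

# stdev() is the sample standard deviation (assumed to match the S.D. column).
summary = {name: (mean(values), stdev(values)) for name, values in runs.items()}

for name, (m, s) in summary.items():
    print(f"{name:<8} mean={m:.4f}  s.d.={s:.4f}")
```

The recomputed figures may differ from the table in the last decimal place, possibly because the tabulated values were truncated or were derived from errors recorded to more digits.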

Conclusion

Cross-validation compares favourably with the technique of using holdout sets. For 10-fold and n-fold (n=50) cross-validation the error rates were lower on average than for the holdout method (0.1275 and 0.1183 against 0.1406), which suggests that these methods provide a good error estimate in comparison. 5-fold cross-validation, by contrast, gave a higher average error (0.1568). On a smaller sample of data these methods can be used to evaluate how well a model generalises without the need to set aside some instances purely for testing. As the number of folds increases, the bias of the networks decreases, which is seen in the lower average error rates and in the lower deviation of error rates across different runs. The larger n is, though, the more computation time is required, and in the case of leave-one-out cross-validation the process is completely deterministic (i.e. there is no random partition of the training data), which can sometimes produce undesirable results (although this seems not to have happened here). In no case was the cross-validation stratified. This can result in poor error estimates in some cases, as can be seen from the large deviation of results when n=5 and, to a lesser extent, when n=10.
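
For comparison, a stratified partition keeps the class proportions of each fold close to those of the whole sample. The sketch below shows one simple way of doing this. It was not part of the experiments reported here: the helper name stratified_folds, the seed and the 17/17/16 class split of the 50 instances are assumptions made purely for illustration.

```python
import random
from collections import defaultdict


def stratified_folds(labels, k, seed=0):
    """Return k lists of instance indices with near-equal class counts.

    Hypothetical helper: shuffles within each class, then deals the
    instances into the folds in round-robin order.
    """
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    rng = random.Random(seed)
    folds = [[] for _ in range(k)]
    position = 0  # carried across classes so overall fold sizes stay balanced
    for label in sorted(by_class):
        members = by_class[label]
        rng.shuffle(members)
        for idx in members:
            folds[position % k].append(idx)
            position += 1
    return folds


# Assumed class split of the 50-instance sample (not given in the source).
labels = ["setosa"] * 17 + ["versicolor"] * 17 + ["virginica"] * 16
folds = stratified_folds(labels, k=5)

for i, fold in enumerate(folds):
    counts = {c: sum(labels[j] == c for j in fold) for c in sorted(set(labels))}
    print(f"fold {i}: {len(fold)} instances, {counts}")
```

Under these assumptions every fold of 10 instances holds 3 or 4 examples of each class, whereas a purely random partition can leave a class under-represented in some folds.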

