Created by W.Langdon from gp-bibliography.bib Revision:1.4780
Wed, 17 Apr 1996 09:20:19 PDT
When I read your email (Koza's), I went back and checked the output on two other problems that we ran as part of that paper: Gaussian 3D and Phoneme Classification. Each of these was a two-output problem, and given the way the classification was set up, one would expect less than 50% correct classification from a randomly created individual.
In those problems, we used 10 different random seeds with 3000 individuals per run. The following are the classification rates of the best individual in generation 0.
            Mean    Best    Worst
gauss       0.59    0.64    0.55
iris        0.98    0.99    0.97
phoneme     0.73    0.75    0.71
Note that these figures represent the results of a random search of 30,000 individuals.
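To illustrate why the best of many random individuals can sit well above the chance rate, here is a minimal sketch of the arithmetic behind that note (10 seeds x 3000 individuals = 30,000 evaluations). It is not the setup used in the paper. It assumes each random individual behaves like a fair coin flip on every test case, which real random GP individuals need not do, and the test-set size `n_cases=100` and all function names are hypothetical, since the email does not state them.

```python
import random


def best_of_random_search(n_individuals, n_cases, rng):
    """Best accuracy among coin-flip classifiers on a two-class problem.

    ASSUMPTION: each individual guesses every case correctly with
    probability 0.5, independently. This is an idealized stand-in,
    not a model of actual random GP individuals.
    """
    best = 0.0
    for _ in range(n_individuals):
        # n_cases fair random bits; each 1 bit counts as a correct guess.
        correct = bin(rng.getrandbits(n_cases)).count("1")
        best = max(best, correct / n_cases)
    return best


def summarize(n_runs=10, n_individuals=3000, n_cases=100, seed=0):
    """Mean, best and worst of the per-run best accuracy.

    n_runs and n_individuals follow the email (10 seeds, 3000 per run);
    n_cases is a hypothetical test-set size.
    """
    rng = random.Random(seed)
    bests = [best_of_random_search(n_individuals, n_cases, rng)
             for _ in range(n_runs)]
    return {
        "mean": sum(bests) / len(bests),
        "best": max(bests),
        "worst": min(bests),
        "evaluated": n_runs * n_individuals,
    }


if __name__ == "__main__":
    print(summarize())
```

Under these assumptions the per-run best lands above 0.5 even though each individual is at chance on average, because it is the maximum over 3000 draws. How far above depends strongly on the assumed number of test cases.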
As Peter Nordin points out in the email to which this is a reply, on the IRIS problem even the worst figure is very good. In fact, it was statistically indistinguishable from a highly optimized KNN benchmark run on a training set twice as large. This is because the IRIS problem is trivial. As pointed out in the above referenced paper, IRIS should probably not be used as a measure of the learning ability of any ML system, notwithstanding its status as a 'classic' problem. It is probably better characterized as a 'classic' way to make an ML system look good.
On the other two problems, which were much more difficult, the genetic search improved considerably on the random search. The individuals with the best ability to generalize on the test data set scored as follows:
                Best Generalizer
Gaussian 3D     72%
Phoneme         85%
I report these figures here because the generation 0 figures are not reported in the above paper directly.
Genetic Programming entries for Frank D Francone, Peter Nordin, Wolfgang Banzhaf