Created by W.Langdon from gp-bibliography.bib Revision:1.2031
@InProceedings{francone:1996:bench,
author = "Frank D. Francone and Peter Nordin and
Wolfgang Banzhaf",
title = "Benchmarking the Generalization Capabilities of a
Compiling Genetic Programming System using Sparse Data
Sets",
booktitle = "Genetic Programming 1996: Proceedings of the First
Annual Conference",
editor = "John R. Koza and David E. Goldberg and
David B. Fogel and Rick L. Riolo",
year = "1996",
month = "28--31 " # jul,
keywords = "genetic algorithms, genetic programming",
pages = "72--80",
address = "Stanford University, CA, USA",
publisher = "MIT Press",
URL = "http://www.cs.mun.ca/~banzhaf/papers/benchmarking.pdf",
size = "9 pages",
notes = "GP-96 Notes based upon version submitted to GP-96
Wed, 17 Apr 1996 09:20:19 PDT
When I read your email (Koza's), I went back and
checked the output on two other problems that we ran as
part of that paper, Gaussian 3D and Phoneme
Classification. Each of these was a two-output problem
and the way the classification was set up, one would
expect less than 50% correct classification from a
randomly created individual.
In those problems, we used 10 different random seeds,
3000 individuals per run. The following were the
classification rates of the best individual in
generation 0.
Problem   Mean   Best   Worst
gauss     0.59   0.64   0.55
iris      0.98   0.99   0.97
phoneme   0.73   0.75   0.71
Note that these figures represent the results of a
random search of 30,000 individuals.
As Peter Nordin points out in his email to which this
is a reply, on the IRIS problem, even the worst figure
is very good. In fact it was statistically
indistinguishable from a highly optimized KNN benchmark
run on twice as large a training set. This is because
the IRIS problem is trivial. As pointed out in the
above referenced paper, IRIS should probably not be
used as a measure of the learning ability of any ML
system, notwithstanding its status as a 'classic'
problem. It is probably better characterized as a
'classic' way to make an ML system look good.
On the other two problems, which were much more
difficult, the genetic search improved on the random
search considerably. The individuals with the best
ability to generalize on the test data set scored
as follows.
Problem       Best Generalizer
Gaussian 3D   72%
Phoneme       85%
I report these figures here because the generation 0
figures are not reported in the above paper
directly.
Regards
Frank Francone
",
}