Evolutionary Program Induction of Binary Machine Code and its Applications

Created by W.Langdon from gp-bibliography.bib Revision:1.4549

  author =       "Peter Nordin",
  title =        "Evolutionary Program Induction of Binary Machine Code
                 and its Applications",
  school =       "der Universitat Dortmund am Fachereich Informatik",
  year =         "1997",
  ISBN =         "3-931546-07-1",
  keywords =     "genetic algorithms, genetic programming",
  URL =          "http://www.amazon.co.uk/Evolutionary-Program-Induction-Machine-Applications/dp/3931546071",
  size =         "290 pages",
  abstract =     "This thesis presents the Compiling Genetic Programming
                 System (CGPS) a machine learning method for automatic
                 program induction. The objective of the system is to
                 automatically produce computer programs. CGPS is the
                 marriage of two new ideas (1) a special evolutionary
                 program induction algorithm variant and (2) the use of
                 large scale meta manipulation of binary code. Both
                 ideas may have merits on their own but it is by
                 combining the two that the real benefits emerge, making
                 CGPS a powerful machine learning paradigm.",
  notes =        "(1) The evolutionary program induction method is an
                 instance of an evolutionary algorithm (EA) a class of
                 algorithms that borrow metaphors from biology,
                 evolution and natural selection. It uses a linear
                 program representation in contrast to other well used
                 methods such as Koza's genetic programming (GP)
                 approach which has a hierarchical tree-based program
                 structure. One way to view CGPS is as a large alphabet
                 genetic algorithm (GA) where each letter in the
                 alphabet corresponds to a syntactically closed computer
                 program structure. A letter in the GA could for example
                 be a line in a computer program i.e. an assignment a :=
                 a + 1. CGPS uses recombination (crossover) in between
                 letters which guarantees syntactic closure during
                 evolution. However, from a GA point of view, the letter
                 in the linear string normally does not have an internal
                 structure. A program line in a computer language the
                 program or a machine code instruction have plenty of
                 internal structure determining the operation to be
                 performed and the operands used. A better metaphor
                 could therefore be the gene concept in nature. Genes in
                 DNA are syntactically closed sequences providing the
                 recipe of a protein. As in CGPS, crossover normally
                 acts between genes/instructions preserving the
                 syntactic closure of the object. There is a mutation
                 operator, to produce variation within the internal
                 structure of the gene/instruction. In CGPS, the
                 mutation operator changes the syntactically closed
                 program object into another syntactically closed
                 program object it replaces a valid letter with another
                 valid letter in the GA alphabet. CGPS also uses two
                 significant features in the program individual (genome)
                 which is the header and the footer. The header is a
                 prefix part of the individual while the footer is the
                 suffix part. The genetic operators prevent the header
                 and the footer from being changed, and the header and
                 the footer are used to ensure syntactic closure of the
                 whole individual.",
  notes =        "(2) CGPS in this thesis is mostly applied to program
                 induction of binary machine code. Binary machine code
                 is the code in the computer that is directly executed
                 by the processor. The program individual in CGPS is a
                 binary machine code function and the genetic operators
                 operate directly on the binary machine code. This
                 implies that CGPS is a meta-manipulating program. In
                 low level programming, a meta-manipulating program is a
                 program that changes its own binary code or the binary
                 code of another program. Usually, this only means
                 changing a few bytes e.g. dynamic linking of a
                 function. However, CGPS is the first real instance of a
                 large scale meta manipulating program which constantly
                 manipulates, shuffles around and changes large chunks
                 of binary code as a part of the learning process. So,
                 the output from CGPS is a binary machine code program
                 that can be executed directly by the processor. The
                 meta manipulation of the individual (genome) is done
                 with the evolutionary algorithm, mentioned in (1)
                 above. The headers and footers are used to make sure
                 that the individual always preserves syntactic closure
                 during evolution, no matter what the genetic operators
                 do with the code in between.",
  notes =        "The marriage of these two ideas produce a system which
                 can induce turing-complete machine code programs very
                 efficiently. The system also has several other
                 attractive properties, such as constant memory usage,
                 compact representation and uncomplicated memory
                 management. In this thesis the background to CGPS is
                 presented together with other evolutionary algorithms
                 and other evolutionary program induction techniques.
                 The detailed description of CGPS and its implementation
                 is described together with several evaluations or case
                 studies in different feasible domains such as robotic
                 control, image processing and natural language
                 processing. Some of the evaluations make comparisons to
                 other machine learning algorithms; neural networks and
                 hierarchical genetic programming. The formal definition
                 of CGPS is given followed by aninvestigation of
                 generalization in variable length evolutionary
                 algorithms. Finally, several possible directions for
                 the future of CGPS research are presented.",
  notes =        "Tel/Fax +49 231 261550 email: krehl-verlag@t-online.de
                 Price: 68DM

                 * Orders within Germany can easily order in any
                 bookstore (DM 68,-), although firms/institutions or
                 (Master or Visa) credit card owners also can order
                 directly (shipping and credit card service charge

                 * International Sales are possible to Master or
                 Visa-Card owners:

                 - order should arrive via FAX (+49 2501 261550) or
                 PGP-signed (and optionally encrypted, our PGP-key is
                 available upon request) email (with PGP public key
                 trustfully available). In the moment, these ways are
                 the only ones protecting the customers sensitive
                 payment data (we are working on secure order via WWW
                 but have no final timeline for this now) The order
                 should mention credit card type, number and expiration
                 date, as well as card holders name and address.

                 - Beside the cost of the book (w/o VAT), i.e. DM 68,-
                 minus 7% the interested party has to decide about
                 {"}surface{"} or {"}air{"} mail (rates available upon
                 request for different areas, e.g. USA DM 3.50 for 3-5
                 weeks delivery or DM 24,-- for 1-2 weeks delivery). The
                 total amount is drawn from the credit card in german
  notes =        "Table of Contents

                 1 Introduction 21

                 1.1 Evolutionary Algorithms 22

                 1.2 Genetic Programming 25

                 1.3 Machine Language Genetic Programming 31

                 1.4 Introduction to CGPS 38

                 I Implementation 47

                 2 Computer Hardware 49

                 2.1 Introduction to Computer Hardware50

                 2.2 Von Neumann Computers 50

                 3 The SPARC Architecture and CGPS 55

                 3.1 Fundamentals of CGPS 56

                 3.2 Implementation 58

                 3.3 Floating Point Instructions 69

                 3.4 Self Modifying Code in C language 69

                 3.5 How to Call a Self|made Function from C 69

                 3.6 Genetic Operators 70

                 3.7 Initialisation 72

                 4 Additional Features of the System 75

                 4.1 Leaf Procedures and Primitives 77

                 4.2 Memory in Tree-Based GP 78

                 4.3 Conditionals 80

                 4.4 Automatically Defined Subroutines in CGPS 80

                 4.5 Leaf Procedure Examples 84

                 4.6 Loops and Recursion in Tree-Based GP 85

                 4.7 Loops and Recursion in CGPS 88

                 4.8 Loop Example in CGPS 88

                 4.9 External Functions 91

                 4.10 Strings and Lists 91

                 4.11 Parameters to the System 92

                 4.12 C-language Output 93

                 4.13 Platforms for CGPS 94

                 4.14 Portability Methods 94

                 4.15 Caveats 94

                 4.16 How to Get Started 95

                 4.17 Using Tree Representation 96
BibTeX entry too long. Truncated

Genetic Programming entries for Peter Nordin