Created by W.Langdon from gp-bibliography.bib Revision:1.4771
The system under study is characterised by some important control parameters which need to be available to an observer, but usually are difficult to monitor, e.g. they need to be measured in a lab, simulated or observed in real time only, or obtained at high time and computational expense. Empirical modelling attempts to express these critical control variables via other controllable variables that are easier to monitor, can be measured more accurately or in a more timely manner, are cheaper to simulate, etc. Symbolic regression provides such expressions of crucial process characteristics, or response variables, defined (symbolically) as mathematical functions of some of the easy-to-measure input variables, and calls these expressions empirical input-response models. Examples of these are (i) structure-activity relationships in pharmaceutical research, which define the activity of a drug through the physical structure of molecules of drug components, (ii) structure-property relationships in material science, which define product qualities, such as shininess, opacity, smell, or stiffness through physical properties of composites and processing conditions, or (iii) economic models, e.g. expressing return on investment through daily closes of S&P 500 quotes and inflation rates.
To discover plausible models with realistic time and computational effort, symbolic regression exploits a stochastic iterative search technique, based on artificial evolution of model expressions. This method, called genetic programming, looks for appropriate expressions of the response variable in the space of all valid formulae containing a minimal set of input variables and a proposed set of basic operators and constants.
At each step, the genetic programming system considers a sufficiently large quantity of various formulae, selects the subset of the best formulae according to certain user-defined criteria of goodness, and (re)combines the best formulae to create a rich set of potential solutions for the next step. This approach is inspired by principles of natural selection, where offspring that inherit good features from both parents have increased chances of success in survival, adaptation, and further propagation. The challenge and the rationale of performing evolutionary search is to balance the exploitation of the good solutions discovered so far with exploration of new areas of the search space, where even better solutions may be found.
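The loop described above (consider many formulae, keep the best, recombine them) can be illustrated with a minimal Python sketch. It is not the dissertation's implementation: the operator set, the depth limit `MAX_DEPTH`, subtree crossover as the recombination step, mean squared error as the goodness criterion, and every function name and parameter value are assumptions chosen for illustration.

```python
import random

# Hypothetical operator set and depth limit; the source specifies neither.
OPS = {"+": lambda a, b: a + b, "-": lambda a, b: a - b, "*": lambda a, b: a * b}
MAX_DEPTH = 6

def random_expr(rng, depth=3):
    """Grow a random formula: 'x', a float constant, or (op, left, right)."""
    if depth == 0 or rng.random() < 0.3:
        return "x" if rng.random() < 0.6 else float(rng.randint(-2, 2))
    op = rng.choice(sorted(OPS))
    return (op, random_expr(rng, depth - 1), random_expr(rng, depth - 1))

def evaluate(expr, x):
    """Evaluate a formula tree at input value x."""
    if isinstance(expr, tuple):
        op, left, right = expr
        return OPS[op](evaluate(left, x), evaluate(right, x))
    return x if expr == "x" else expr

def error(expr, data):
    """Mean squared error over (x, y) pairs (assumed goodness criterion)."""
    return sum((evaluate(expr, x) - y) ** 2 for x, y in data) / len(data)

def tree_depth(expr):
    if isinstance(expr, tuple):
        return 1 + max(tree_depth(expr[1]), tree_depth(expr[2]))
    return 0

def subtrees(expr, path=()):
    """Yield (path, subtree) for every node of the tree."""
    yield path, expr
    if isinstance(expr, tuple):
        yield from subtrees(expr[1], path + (1,))
        yield from subtrees(expr[2], path + (2,))

def replace_at(expr, path, new):
    """Return a copy of expr with the subtree at path replaced by new."""
    if not path:
        return new
    parts = list(expr)
    parts[path[0]] = replace_at(expr[path[0]], path[1:], new)
    return tuple(parts)

def crossover(rng, a, b):
    """Graft a random subtree of b into a; fall back to a if the child is too deep."""
    path, _ = rng.choice(list(subtrees(a)))
    _, donor = rng.choice(list(subtrees(b)))
    child = replace_at(a, path, donor)
    return child if tree_depth(child) <= MAX_DEPTH else a

def evolve(data, generations=20, pop_size=100, keep=20, fresh=10, seed=0):
    rng = random.Random(seed)
    population = [random_expr(rng) for _ in range(pop_size)]
    for _ in range(generations):
        population.sort(key=lambda e: error(e, data))
        best = population[:keep]  # exploitation: keep the best formulae
        children = [crossover(rng, rng.choice(best), rng.choice(best))
                    for _ in range(pop_size - keep - fresh)]
        newcomers = [random_expr(rng) for _ in range(fresh)]  # exploration
        population = best + children + newcomers
    return min(population, key=lambda e: error(e, data))

if __name__ == "__main__":
    # Made-up target y = x^2 + x sampled on a small grid.
    samples = [(x / 2, (x / 2) ** 2 + x / 2) for x in range(-4, 5)]
    winner = evolve(samples)
    print(winner, error(winner, samples))
```

Because the best formulae are carried over unchanged, the best error in this sketch never gets worse from one step to the next; the random newcomers stand in for exploration of new areas of the search space.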
A special multi-objective flavour of a genetic programming search is considered, called Pareto GP. Pareto GP used for symbolic regression has strong advantages in creating diverse sets of regression models, satisfying competing criteria of model structural simplicity and model prediction accuracy.
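The trade-off between structural simplicity and prediction accuracy can be made concrete with a small Python sketch of Pareto-front selection. This is a generic illustration of non-dominated filtering, not the Pareto GP algorithm itself: the `(complexity, error, label)` tuple layout, the function names, and all numbers below are hypothetical.

```python
def dominates(a, b):
    """True if model a is no worse than b in both objectives and strictly better in one.

    Each model is a (complexity, error, label) tuple; lower is better for both objectives.
    """
    return a[0] <= b[0] and a[1] <= b[1] and (a[0] < b[0] or a[1] < b[1])

def pareto_front(models):
    """Return the models that no other model dominates."""
    return [m for m in models if not any(dominates(other, m) for other in models)]

# Made-up candidate models: (complexity, prediction error, label).
candidates = [
    (1, 0.9, "a"),
    (3, 0.4, "b"),
    (5, 0.4, "c"),  # same error as "b" but more complex, so dominated
    (7, 0.1, "d"),
    (4, 0.5, "e"),  # more complex and less accurate than "b", so dominated
]

front = pareto_front(candidates)
```

Here `front` keeps "a", "b", and "d": each one is the most accurate model available at its level of complexity, which gives a diverse set of candidates rather than a single winner.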
In addition to the new strategies for model development and model selection, this dissertation presents a new approach for analysis, ranking, and compression of given multi-dimensional input-output data for the purpose of balancing the information content in undesigned data sets.
To present the contributions of this research in the context of real-life problem solving, the dissertation exploits a generic framework of adaptive model-based problem solving used in many industrial modelling applications. This framework consists of an iterative feedback loop over: (Part I) data generation, analysis and adaptation, (Part II) model development, and (Part III) problem analysis and reduction.
Part II of the thesis consists of Chapters 3-7 and addresses the model induction method, Pareto genetic programming. Since time to solution, or, more accurately, time-to-convincing-solution, is a major practical challenge of evolutionary search algorithms, and Pareto GP in particular, Part II focuses on algorithmic enhancements of Pareto GP that lead it to the discovery of better solutions faster (i.e. solutions of sufficient quality at a smaller computational effort, or of considerably better quality at the same computational effort).
In Chapter 3 a general description of the Pareto GP methodology is presented in a framework of evolutionary search, as an iterative loop over the stages of model generation, model evaluation, and model selection.
In Chapter 5 a novel strategy for model selection through explicit non-linearity control is presented. A new complexity measure called the order of non-linearity of symbolic models is introduced and used [BibTeX entry too long. Truncated]
Genetic Programming entries for Ekaterina (Katya) Vladislavleva