Multiobjective Optimization in Quantitative Structure-Activity Relationships: Deriving Accurate and Interpretable QSARs

Created by W.Langdon from gp-bibliography.bib Revision:1.3949

@Article{Nicolotti:2002:JMC,
  author =       "Orazio Nicolotti and Valerie J. Gillet and 
                 Peter J. Fleming and Darren V. S. Green",
  title =        "Multiobjective Optimization in Quantitative
                 Structure-Activity Relationships: Deriving Accurate and
                 Interpretable QSARs",
  journal =      "Journal of Medicinal Chemistry",
  year =         "2002",
  volume =       "45",
  number =       "23",
  pages =        "5069--5080",
  month =        nov # " 7",
  keywords =     "genetic algorithms, genetic programming, QSAR,
                 cheminformatics, MOGA, MOGP, GSK, Akaike Information
                 Criterion (AIC)",
  ISSN =         "0022-2623",
  URL =          "http://pubs3.acs.org/acs/journals/doilookup?in_doi=10.1021/jm020919o",
  DOI =          "10.1021/jm020919o",
  abstract =     "Deriving quantitative structure-activity relationship
                 (QSAR) models that are accurate, reliable, and easily
                 interpretable is a difficult task. In this study, two
                 new methods have been developed that aim to find useful
                 QSAR models that represent an appropriate balance
                 between model accuracy and complexity. Both methods are
                 based on genetic programming (GP). The first method,
                 referred to as genetic QSAR (or GPQSAR), uses a penalty
                 function to control model complexity. GPQSAR is
                 designed to derive a single linear model that
                 represents an appropriate balance between the variance
                 and the number of descriptors selected for the model.
                 The second method, referred to as multiobjective
                 genetic QSAR (MoQSAR), is based on multiobjective GP
                 and represents a new way of thinking of QSAR.
                 Specifically, QSAR is considered as a multiobjective
                 optimization problem that comprises a number of
                 competitive objectives. Typical objectives include
                 model fitting, the total number of terms, and the
                 occurrence of nonlinear terms. MoQSAR results in a
                 family of equivalent QSAR models where each QSAR
                 represents a different tradeoff in the objectives. A
                 practical consideration often overlooked in QSAR
                 studies is the need for the model to promote an
                 understanding of the biochemical response under
                 investigation. To accomplish this, chemically intuitive
                 descriptors are needed but do not always give rise to
                 statistically robust models. This problem is addressed
                 by the addition of a further objective, called chemical
                 desirability, that aims to reward models that consist
                 of descriptors that are easily interpretable by
                 chemists. GPQSAR and MoQSAR have been tested on various
                 data sets including the Selwood data set and two
                 different solubility data sets. The study demonstrates
                 that the MoQSAR method is able to find models that are
                 at least as good as models derived using standard
                 statistical approaches and also yields models that
                 allow a medicinal chemist to trade statistical
                 robustness for chemical interpretability.",
  notes =        "http://pubs.acs.org/journals/jmcmar/index.html
                 PMID: 12408718",
}

