Pattern recognition using genetic programming for classification of diabetes and modulation data

Created by W.Langdon from gp-bibliography.bib Revision:1.4208

  author =       "Muhammad Waqar Aslam",
  title =        "Pattern recognition using genetic programming for
                 classification of diabetes and modulation data",
  school =       "University of Liverpool",
  year =         "2013",
  address =      "UK",
  month =        feb,
  keywords =     "genetic algorithms, genetic programming",
  size =         "220 pages",
  abstract =     "The field of science whose goal is to assign each
                 input object to one of the given set of categories is
                 called pattern recognition. A standard pattern
                 recognition system can be divided into two main
                 components, feature extraction and pattern
                 classification. During the process of feature
                 extraction, the information relevant to the problem is
                 extracted from raw data, prepared as features and
                 passed to a classifier for assignment of a label.
                 Generally, the extracted feature vector has a fairly
                 large number of dimensions, on the order of hundreds
                 to thousands, which increases the computational
                 complexity significantly. Feature generation, which
                 filters out the unwanted features, is introduced to
                 handle this problem. The functionality of feature
                 generation has
                 become very important in modern pattern recognition
                 systems as it not only reduces the dimensions of the
                 data but also increases the classification accuracy. A
                 genetic programming (GP) based framework has been used
                 in this thesis for feature generation. GP is a process
                 based on the biological evolution of features in which
                 combinations of original features are evolved. The
                 stronger features propagate in this evolution while
                 weaker features are discarded. The process of evolution
                 is optimised in a way to improve the discriminatory
                 power of features in every new generation. The final
                 features generated have more discriminatory power than
                 the original features, making the job of classifier

                 One of the main problems in GP is a tendency towards
                 suboptimal-convergence. In this thesis, the response of
                 features for each input instance, which gives insight
                 into the strengths and weaknesses of features, is used
                 to avoid suboptimal-convergence. The strengths and
                 weaknesses are used to find the right partners during
                 the crossover operation, which not only helps to avoid
                 suboptimal-convergence but also makes the evolution
                 more effective. In order to thoroughly examine the
                 capabilities of GP for feature generation and to cover
                 different scenarios, different combinations of GP are
                 designed. Each combination of GP differs in the way
                 the capability of the features to solve the problem
                 (the fitness function) is evaluated. In this research,
                 the Fisher criterion, a Support Vector Machine and an
                 Artificial Neural Network have been used to evaluate
                 the fitness function for binary classification
                 problems, while a K-nearest neighbour classifier has
                 been used for fitness evaluation of multi-class
                 classification problems. Two real-world classification
                 problems
                 (diabetes detection and modulation classification) are
                 used to evaluate the performance of GP for feature
                 generation. These two problems belong to two different
                 categories; diabetes detection is a binary
                 classification problem while modulation classification
                 is a multi-class classification problem. The
                 application of GP to both problems helps to
                 evaluate the performance of GP for both categories. A
                 series of experiments is conducted to evaluate and
                 compare the results obtained using GP. The results
                 demonstrate the superiority of GP-generated features
                 compared to features generated by conventional
  notes =        "Supervisor: Asoke Kumar Nandi",

Genetic Programming entries for Muhammad Waqar Aslam