Evolutionary Synthesis of Lossless Compression Algorithms: the GP-zip Family

Created by W.Langdon from gp-bibliography.bib Revision:1.4549

  author =       "Ahmed Jamil Kattan",
  title =        "Evolutionary Synthesis of Lossless Compression
                 Algorithms: the GP-zip Family",
  school =       "School of Computer Science and Electronic Engineering,
                 University of Essex",
  year =         "2010",
  address =      "UK",
  month =        oct,
  keywords =     "genetic algorithms, genetic programming",
  URL =          "http://www.ahmedkattan.com/PhD.pdf",
  size =         "189 pages",
  abstract =     "Data compression algorithms have existed for almost
                 forty years. Many algorithms have been developed, each
                 of which has its own strengths and weaknesses. Each
                 works best with the data types it was designed for.
                 No compression algorithm can compress all
                 data types effectively. Nowadays files with a complex
                 internal structure that stores data of different types
                 simultaneously are in common use (e.g., Microsoft
                 Office documents, PDFs, computer games, HTML pages with
                 online images, etc.). All of these situations (and many
                 more) make lossless data compression a difficult, but
                 increasingly significant, problem.

                 The main motivation for this thesis was the realisation
                 that the development of data compression algorithms
                 capable of dealing with heterogeneous data has
                 significantly slowed down in the last few years.
                 Furthermore, there is relatively little research on
                 using Computational Intelligence paradigms to develop
                 reliable universal compression systems. The primary aim
                 of the work presented in this thesis is to make some
                 progress towards turning the idea of using artificial
                 evolution to evolve a human-competitive general-purpose
                 compression system into practice. We aim to improve
                 over current compression systems by addressing their
                 limitations in relation to heterogeneous data,
                 particularly archive files.

                 Our guiding idea is to combine existing, well-known
                 data compression schemes in order to develop an
                 intelligent universal data compression system that can
                 deal with different types of data effectively. The
                 system learns when to switch from one compression
                 algorithm to another as required by the particular
                 regularities in a file. Genetic Programming (GP) has
                 been used to automate this process.

                 This thesis contributes to the applications of GP in
                 the lossless data compression domain. In particular we
                 proposed a series of intelligent universal compression
                 systems: the GP-zip family. We presented four members
                 of this family, namely, GP-zip, GP-zip*, GP-zip2 and
                 GP-zip3. Each new version addresses the limitations of
                 previous systems and improves upon them. In addition,
                 this thesis presents a new learning technique that
                 specialises in analysing continuous streams of data,
                 detecting different patterns within them and associating
                 these patterns with different classes according to the
                 user's needs. We extended this work and explored
                 applications of our learning technique to the problems
                 of analysing human muscle EMG signals to predict
                 fatigue onset and of identifying file types.
                 This thesis includes an extensive empirical evaluation
                 of the systems developed in a variety of real world
                 situations. Results have revealed the effectiveness of
                 the systems.",
