M.Phil.
Here is the abstract of my dissertation.
Title: Uso de Meta-aprendizado para a Seleção e Ordenação de Algoritmos de Agrupamento Aplicados a Dados de Expressão Gênica
(Use of meta-learning for the selection and ranking of clustering algorithms applied to gene expression data)
Abstract
The amount of gene expression data has been growing exponentially in recent years due to new Molecular Biology technologies that allow the expression of thousands of genes to be measured at once. The computational analysis of such data is of major importance in Biology and Medicine: it allows, for example, the discovery of new biologically and clinically significant cancer classes and the identification of new gene functions. Unsupervised Machine Learning techniques are part of the experts' data analysis methodology. There is a variety of data clustering algorithms, and each one tends to cluster the data in a specific way. The choice of algorithm is fundamental to the clustering quality and, therefore, to the proper analysis of the results. We propose a meta-learning methodology for selecting clustering algorithms in the context of gene expression data from cancer cells. So far, meta-learning has been used only for supervised learning algorithms; we extend the concept to unsupervised learning. We used datasets from different cancer microarray experiments and extracted relevant characteristics from each dataset, which served as inputs for training Neural Networks, k-Nearest Neighbors and Support Vector Machines as meta-learners. These meta-learners were used to predict the performance ranking of the clustering algorithms, as well as to select the best algorithm, according to the dataset characteristics. We performed a set of experiments to validate the use of each meta-learner. We showed that, on average, the rankings suggested by the Support Vector Machines are more correlated with the ideal ranking than the default ranking is. We thus propose a novel approach, which can be extended to data from other contexts and can serve as the starting point for other work.
Keywords: Meta-learning, Unsupervised Learning, Gene Expression.
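As a rough illustration of the idea in the abstract, here is a minimal sketch of ranking prediction with a k-Nearest Neighbors meta-learner, compared against a default ranking by Spearman correlation. It is not the code from the dissertation: the meta-feature values, the rankings, the averaging of neighbor ranks, the definition of the default ranking as the average over all training datasets, and all function names are assumptions made for this example.

```python
import math

def spearman(r1, r2):
    """Spearman rank correlation between two rankings without ties."""
    n = len(r1)
    d2 = sum((a - b) ** 2 for a, b in zip(r1, r2))
    return 1 - 6 * d2 / (n * (n ** 2 - 1))

def to_ranking(scores):
    """Turn average-rank scores into ranks 1..n (lowest score gets rank 1)."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0] * len(scores)
    for position, index in enumerate(order, start=1):
        ranks[index] = position
    return ranks

def mean_ranks(rankings):
    """Average rank of each algorithm over a list of rankings."""
    n_algorithms = len(rankings[0])
    return [sum(r[j] for r in rankings) / len(rankings)
            for j in range(n_algorithms)]

def knn_ranking(meta_x, rankings, query, k=2):
    """Rank algorithms by averaging the rankings of the k most similar datasets."""
    nearest = sorted(range(len(meta_x)),
                     key=lambda i: math.dist(meta_x[i], query))[:k]
    return to_ranking(mean_ranks([rankings[i] for i in nearest]))

def default_ranking(rankings):
    """Baseline (assumed definition): average ranking over all training datasets."""
    return to_ranking(mean_ranks(rankings))

# Hypothetical meta-features for five training datasets (made-up values).
meta_x = [[0.1, 0.2], [0.2, 0.1], [0.9, 0.8], [0.8, 0.9], [0.85, 0.75]]
# Hypothetical rankings of three clustering algorithms on each dataset (1 = best).
rankings = [[1, 2, 3], [1, 2, 3], [3, 2, 1], [3, 1, 2], [2, 1, 3]]

# A new dataset and the ranking we would observe by running every algorithm on it.
query, ideal = [0.15, 0.15], [1, 2, 3]

suggested = knn_ranking(meta_x, rankings, query, k=2)
baseline = default_ranking(rankings)
print("meta-learner:", suggested, spearman(suggested, ideal))
print("default:     ", baseline, spearman(baseline, ideal))
```

On this toy data the suggested ranking agrees with the ideal one more closely than the default ranking does, which is the kind of comparison described in the abstract.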