Projects

Current research.
Very roughly, our ongoing research can be summarized as follows.

Semi-supervised learning is a relatively new field in Machine Learning. In contrast with supervised learning, which uses only labelled instances, the main idea of semi-supervised learning is to learn from both the labelled and the unlabelled data available for a given problem. By using the unlabelled data, we aim to improve the supervised learning process. Within semi-supervised learning, we focus on the semi-supervised classification task, which consists of classifying unseen instances according to the knowledge acquired by a semi-supervised training algorithm. To address this task, we propose an ensemble-based technique in which each ensemble component increases its classification accuracy by using the knowledge retrieved from the other classifiers. In this way, we intend to improve on the current results in the literature.
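To make the idea of components learning from one another more concrete, here is a minimal sketch in the spirit of co-training. It is not our actual method: the nearest-centroid base classifier, the unanimous-agreement rule, the names CentroidClassifier, train_with_peer_labels and ensemble_predict, and the toy data are all assumptions chosen only for illustration.

```python
from collections import Counter
from statistics import mean


class CentroidClassifier:
    """Nearest-centroid classifier restricted to a subset of features (illustrative)."""

    def __init__(self, features):
        self.features = features
        self.centroids = {}

    def fit(self, X, y):
        self.centroids = {}
        for label in sorted(set(y)):
            rows = [x for x, lab in zip(X, y) if lab == label]
            self.centroids[label] = [mean([r[f] for r in rows]) for f in self.features]
        return self

    def predict(self, x):
        def sq_dist(centroid):
            return sum((x[f] - c) ** 2 for f, c in zip(self.features, centroid))

        return min(self.centroids, key=lambda lab: sq_dist(self.centroids[lab]))


def train_with_peer_labels(members, X_lab, y_lab, X_unlab, rounds=3):
    """Each member is retrained on the labelled data plus the unlabelled
    points on which all of its peers agree (an assumed agreement rule)."""
    for m in members:
        m.fit(X_lab, y_lab)
    for _ in range(rounds):
        train_sets = [(list(X_lab), list(y_lab)) for _ in members]
        for x in X_unlab:
            preds = [m.predict(x) for m in members]
            for i in range(len(members)):
                peers = preds[:i] + preds[i + 1:]
                if len(set(peers)) == 1:  # peers are unanimous
                    train_sets[i][0].append(x)
                    train_sets[i][1].append(peers[0])
        for m, (X, y) in zip(members, train_sets):
            m.fit(X, y)
    return members


def ensemble_predict(members, x):
    """Majority vote over the ensemble members."""
    votes = Counter(m.predict(x) for m in members)
    return votes.most_common(1)[0][0]


# Toy data (hypothetical): two well-separated classes in two dimensions.
X_lab = [(0.0, 0.0), (1.0, 0.5), (5.0, 5.0), (4.0, 4.5)]
y_lab = [0, 0, 1, 1]
X_unlab = [(0.5, 1.0), (0.2, 0.3), (4.5, 4.0), (5.2, 4.8)]

# Members differ only in the features they see.
members = [CentroidClassifier([0]), CentroidClassifier([1]), CentroidClassifier([0, 1])]
train_with_peer_labels(members, X_lab, y_lab, X_unlab)
print(ensemble_predict(members, (0.3, 0.4)), ensemble_predict(members, (4.8, 4.9)))
```

In this sketch, a member never labels data for itself; it only receives labels its peers agree on, which is one simple way to let unlabelled data flow between the ensemble components.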

Previous work.

In the context of classifier ensembles, we developed a study that tackles the member selection problem. An ensemble of classifiers is an effective way of improving on the performance of individual classifiers. However, choosing the ensemble members can be a very difficult task, and a poor choice can lead to ensembles with no performance improvement. To avoid this situation, effective member selection methods are needed. We therefore proposed a classifier selection system based on DCS (Dynamic Classifier Selection), which takes into account both the performance and the diversity of the classifiers when choosing the ensemble members. This work resulted in three papers, each with a different perspective on the topic.

Afterwards, we worked with evolutionary RBF (Radial Basis Function) networks. The main aim of this work was to find an automatic method for generating RBF networks with high performance and low complexity (number of processing units). To this end, we used a Memetic Algorithm (MA) tailored to the optimization of RBF networks, with the k-means algorithm as the local search procedure, together with an appropriate representation and recombination operator. We sought a trade-off between performance and network complexity, and proposed the harmonic mean of these two measures as the objective function (fitness function) of the MA. In this way, we intend to obtain, at the end of the process, a trained RBF network whose architecture is appropriate to the problem considered.
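As an illustration of how a harmonic-mean fitness balances the two measures, the sketch below combines a classification accuracy with a compactness score. The normalization of complexity (one minus the fraction of a hypothetical maximum number of units) and the name harmonic_fitness are assumptions made for this example, not necessarily the exact formulation used in the work.

```python
def harmonic_fitness(accuracy: float, n_units: int, max_units: int) -> float:
    """Harmonic mean of accuracy and a compactness score, both in [0, 1].

    Compactness is an assumed normalization of network complexity:
    fewer processing units give a score closer to 1.
    """
    if not 0.0 <= accuracy <= 1.0:
        raise ValueError("accuracy must lie in [0, 1]")
    if not 1 <= n_units <= max_units:
        raise ValueError("n_units must lie in [1, max_units]")
    compactness = 1.0 - n_units / max_units
    if accuracy == 0.0 or compactness == 0.0:
        return 0.0  # the harmonic mean is zero if either measure is zero
    return 2.0 * accuracy * compactness / (accuracy + compactness)


# A small network with slightly lower accuracy can beat a large, accurate one.
small = harmonic_fitness(accuracy=0.90, n_units=10, max_units=100)
large = harmonic_fitness(accuracy=0.95, n_units=80, max_units=100)
print(small > large)
```

Because the harmonic mean is dominated by the smaller of its two arguments, a network cannot obtain a high fitness by excelling in one measure while neglecting the other.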

Still in the context of evolutionary techniques, we proposed an evolutionary method for obtaining well-formed and spatially separated clusters. The proposed algorithm uses a complete solution representation: each partition is represented by a variable-length chromosome. The variation operators were chosen to facilitate the exchange of clustering information between individuals. We combined two complementary clustering criteria in the fitness function, so that the method can find clusters with arbitrary shapes. The k-means algorithm was the basis of the local search operator, which refines the clustering solutions. Because population diversity was an important issue for the algorithm, a diversity maintenance scheme was employed. Unlike many existing clustering algorithms, our algorithm does not require the number of clusters to be set in advance.

We also worked with meta-learning during the MSc course. The title of the dissertation was "The Use of Meta-learning for Selecting and Ranking Clustering Algorithms Applied to Gene Expression Data". In this work, we presented a novel framework that applies a meta-learning approach to clustering algorithms. Given a dataset, our meta-learning approach provides a ranking of the candidate algorithms that could be used with that dataset. This ranking could, among other things, support non-expert users in the algorithm selection task. To evaluate the proposed framework, we implemented a prototype that employs regression support vector machines as the meta-learner. Our case study was developed in the context of cancer gene expression microarray datasets. A second study, also developed within the dissertation, investigated the impact of normalization procedures on the cluster analysis of gene expression datasets.
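The ranking step can be sketched as follows. Note that this example swaps the regression support vector machines of the prototype for a plain k-nearest-neighbour regressor, simply to stay dependency-free; the meta-features, the performance values, the algorithm names and the function names predict_performance and rank_algorithms are all hypothetical.

```python
from math import dist


def predict_performance(meta_X, perf, query, k=2):
    """Stand-in meta-learner: average the performance observed on the
    k known datasets whose meta-features are closest to the query."""
    nearest = sorted(range(len(meta_X)), key=lambda i: dist(meta_X[i], query))[:k]
    return sum(perf[i] for i in nearest) / len(nearest)


def rank_algorithms(meta_X, perf_by_algo, query, k=2):
    """Rank candidate algorithms by predicted performance, best first."""
    scores = {
        algo: predict_performance(meta_X, perf, query, k)
        for algo, perf in perf_by_algo.items()
    }
    return sorted(scores, key=scores.get, reverse=True)


# Hypothetical meta-features of four previously analysed datasets.
meta_X = [(0.1, 0.2), (0.2, 0.1), (0.9, 0.8), (0.8, 0.9)]

# Hypothetical performance of each candidate algorithm on those datasets.
perf_by_algo = {
    "k-means": [0.8, 0.7, 0.3, 0.4],
    "hierarchical": [0.5, 0.6, 0.6, 0.5],
    "spectral": [0.2, 0.3, 0.9, 0.8],
}

# Ranking suggested for a new dataset described only by its meta-features.
print(rank_algorithms(meta_X, perf_by_algo, query=(0.15, 0.15)))
```

One regressor per candidate algorithm keeps the predictions independent, so a new algorithm can be added to the ranking without retraining the others.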

During the undergraduate course, we worked with multi-agent systems in order to simulate human organizations using theories from psychology. At that time, we also developed a study on computer games, in which we tackled the exploration task as a multi-agent problem solved with a negotiation-based technique.