The University of Birmingham
School of Computer Science

Industrial Projects From Unilever for MSc in Natural Computation (2009/10)


Cluster discovery from network data

The project aims to implement a method for identifying clusters in network data, based on the article 'Fast unfolding of communities in large networks' (available at: http://arxiv.org/abs/0803.0476).

The method described in the above paper is a heuristic based on modularity optimization, and its output depends on the order in which the nodes are considered. Preliminary results indicate that the ordering of the nodes does not have a significant influence on the modularity obtained. The ordering can, however, influence the computation time, so the problem of choosing an order is worth studying: it could yield good heuristics for reducing the computation time. This part of the study has not yet been carried out by the original authors.
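To make the quantity being optimized concrete, the following is a minimal sketch of how the modularity of a given partition could be computed. It assumes an undirected, unweighted graph supplied as a list of edges; the function name `modularity` and its arguments are illustrative and are not taken from the paper or from any library.

```python
from collections import defaultdict


def modularity(edges, community):
    """Modularity Q of a partition of an undirected, unweighted graph.

    edges:     iterable of (u, v) pairs, each undirected edge listed once
               (assumed input format, chosen for this sketch).
    community: dict mapping every node to its community label.
    """
    m = 0                          # total number of edges
    internal = defaultdict(int)    # edges with both endpoints in community c
    degree = defaultdict(int)      # sum of node degrees in community c
    for u, v in edges:
        m += 1
        degree[community[u]] += 1
        degree[community[v]] += 1
        if community[u] == community[v]:
            internal[community[u]] += 1
    if m == 0:
        return 0.0
    # Q = sum over communities of (fraction of edges inside the community)
    #     minus (fraction of edge endpoints attached to it) squared.
    return sum(internal[c] / m - (degree[c] / (2 * m)) ** 2 for c in degree)


# Two triangles joined by a single edge.
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
two_groups = {0: "a", 1: "a", 2: "a", 3: "b", 4: "b", 5: "b"}
one_group = {v: "a" for v in range(6)}
print(modularity(edges, two_groups))   # the two-triangle split scores higher
print(modularity(edges, one_group))    # a single community scores 0
```

Splitting the example graph into its two triangles gives a higher modularity than placing all nodes in one community, which is the kind of partition the heuristic searches for.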

In this project, the student would need to understand and implement the above method and then investigate the impact of several node ordering strategies on its performance (in terms of computation time). The student is advised to carry out the project in three phases:

  1. study various techniques for cluster (or community) discovery from network data;
  2. understand the above paper, implement the proposed method and test it on benchmark data;
  3. study various metrics of node importance with respect to the network and investigate the impact of those metrics on both the node ordering and the model performance (e.g. convergence speed).
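As a starting point for phase 3, the sketch below shows one way node orderings could be compared. It is a simplified version of only the first (local moving) phase of the method, for an undirected, unweighted graph without self-loops; it omits the aggregation phase. All names (`build_adjacency`, `local_moving`, `degree_order`, `random_order`) are hypothetical, degree is used only as one example of a node importance metric, and the number of sweeps is used as a rough proxy for computation time.

```python
import random
from collections import defaultdict


def build_adjacency(edges):
    """Adjacency lists from (u, v) pairs, each undirected edge listed once."""
    adj = defaultdict(list)
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)
    return dict(adj)


def local_moving(adj, order):
    """Move nodes greedily between communities, visiting them in `order`.

    Returns (community, sweeps): the final node -> community mapping and
    the number of full passes over the nodes until no node moved.
    """
    two_m = sum(len(nbrs) for nbrs in adj.values())   # twice the edge count
    degree = {v: len(adj[v]) for v in adj}
    community = {v: v for v in adj}    # start with one community per node
    tot = dict(degree)                 # total degree of each community
    sweeps = 0
    improved = True
    while improved:
        improved = False
        sweeps += 1
        for v in order:
            old = community[v]
            links = defaultdict(int)   # edges from v into each community
            for w in adj[v]:
                links[community[w]] += 1
            tot[old] -= degree[v]      # take v out of its community
            # Integer quantity proportional to the modularity gain of
            # placing v in community c: 2m * k_in - tot[c] * k_v.
            best = old
            best_gain = two_m * links[old] - tot[old] * degree[v]
            for c, k_in in links.items():
                gain = two_m * k_in - tot[c] * degree[v]
                if gain > best_gain:
                    best, best_gain = c, gain
            tot[best] += degree[v]
            if best != old:
                community[v] = best
                improved = True
    return community, sweeps


def degree_order(adj, descending=True):
    """Nodes sorted by degree, one simple importance metric."""
    return sorted(adj, key=lambda v: len(adj[v]), reverse=descending)


def random_order(adj, seed=0):
    """Nodes in a reproducible random order."""
    nodes = list(adj)
    random.Random(seed).shuffle(nodes)
    return nodes


# Two triangles joined by a single edge.
adj = build_adjacency([(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)])
for name, order in [("natural", list(adj)),
                    ("degree", degree_order(adj)),
                    ("random", random_order(adj))]:
    community, sweeps = local_moving(adj, order)
    print(name, sweeps, community)
```

On realistic benchmark data, wall-clock time (for example via `time.perf_counter`) would be a more direct measurement than the sweep count, and other importance metrics could be substituted for degree in `degree_order`.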

Page maintained by Xin Yao.