Ensemble Learning through Diversity Management: Theory, Algorithms, and Applications

The 2011 International Joint
Conference on Neural Networks

Tutorial on

Ensemble Learning through Diversity Management: Theory, Algorithms, and Applications

Huanhuan Chen, Xin Yao and Peter Tino

Tutorial notes will be available here.

Topic

Ensemble methods have been widely used to improve the generalization performance of machine learning and data mining systems. Over the past decade, numerous studies have examined how to generate different kinds of ensemble models, and the benefits of ensemble methods have been confirmed throughout the literature. It is widely believed that the success of ensemble methods depends on combining diverse ensemble members. There is therefore a great need for systematic analysis, understanding and application of diversity in ensemble models.

In this tutorial, we will introduce diversity generation methods, theoretical analysis of the relationship between diversity and ensemble performance, and diversity management approaches for both supervised and semi-supervised learning problems. The tutorial also covers ensemble pruning methods from a diversity management point of view, diversity management via multi-objective optimization, and online/imbalanced ensemble learning by manipulating diversity.
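As a rough illustration of why diversity matters, the sketch below checks the well-known ambiguity decomposition for a simple-average regression ensemble: the squared error of the averaged prediction equals the average squared error of the members minus the average squared spread of the members around the ensemble output. It is not taken from the tutorial material; the function name and the member outputs are hypothetical and chosen only for illustration.

```python
def ambiguity_decomposition(predictions, target):
    """Return (ensemble_error, mean_member_error, mean_ambiguity).

    predictions: outputs of the ensemble members for a single input.
    target: the true value for that input.
    The ensemble output is assumed to be the unweighted mean of the members.
    """
    n = len(predictions)
    ensemble = sum(predictions) / n
    # Squared error of the averaged prediction.
    ensemble_error = (ensemble - target) ** 2
    # Average squared error of the individual members.
    mean_member_error = sum((p - target) ** 2 for p in predictions) / n
    # Average squared deviation of members from the ensemble (the "ambiguity").
    mean_ambiguity = sum((p - ensemble) ** 2 for p in predictions) / n
    return ensemble_error, mean_member_error, mean_ambiguity


# Hypothetical member outputs for one regression target (illustrative values only).
ens_err, member_err, ambiguity = ambiguity_decomposition([1.0, 2.0, 6.0], target=2.5)
print(ens_err, member_err - ambiguity)
```

Because the ambiguity term is never negative, the averaged ensemble is never worse than its average member under squared error, and more disagreement among equally accurate members gives a lower ensemble error.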

Intended Audience

The intended audience includes academics, graduate students, and industrial researchers who are interested in state-of-the-art data mining and machine learning with ensemble techniques. The tutorial is intended for the broad IJCNN community. Little prior knowledge is assumed beyond basic data mining and machine learning techniques. Since ensemble methods are already used in the IJCNN community, an in-depth understanding of them should lead to wider applications of ensembles within that community. Participants are expected to learn both practical ensemble learning algorithms and fundamental techniques for generating and maintaining diversity in ensembles. Insight into specific state-of-the-art methods will also be presented.

Detailed Outline

[total 120 minutes, outline subject to revision]
  1. Introduction to Ensemble Learning     [5 minutes]
    • Mixture of Experts, Bagging, Random Forest, AdaBoost
    • Generic Ensemble Generation
    • Tutorial Motivations & Overview
  2. Diversity Generation Methods in Ensembles     [15 minutes]
    • Diversity Encouragement by Training Data Manipulation
    • Diversity Encouragement by Architecture Manipulation
    • Diversity Encouragement by Hypothesis Space Traversal
    • Difference between Regression and Classification Ensembles for Diversity Generation
  3. Theoretical Analysis of Ensembles and Diversity     [20 minutes]
    • Bias-Variance Tradeoff
    • Bias-Variance-Covariance Tradeoff
    • Error Decomposition for Regression and Classification Ensembles
  4. Managing Diversity in Ensembles     [20 minutes]
    • Negatively Correlated Ensembles for Diversity Management
    • Various Training Algorithms
    • Regularized Negatively Correlated Ensembles with Bayesian Inference

    break

  5. Diversity Management with Semi-supervised Learning     [15 minutes]
    • Ensemble Methods for Semi-supervised Problems
    • Diversity Encouragement in both Labelled and Unlabelled Space
  6. Diversity with Ensemble Pruning     [15 minutes]
    • Selection based Ensemble Pruning
    • Weight based Ensemble Pruning
    • Empirical, Greedy and Probabilistic Ensemble Pruning Methods
  7. Diversity Management via Multi-objective Optimization     [10 minutes]
    • Diversity, Accuracy, Regularization with Generalization
    • Multi-objective Optimization to Optimize the Trade-off
  8. Further Topics and Open Discussions     [10 minutes]
    • Online Ensemble Learning
    • Imbalanced Ensemble Learning
  9. Questions from the Audience     [10 minutes]

Format

The tutorial will be presented with data-projected slides. We will also occasionally use simple MATLAB demos to illustrate ideas.

Presenters

Huanhuan Chen received the B.Sc. degree from the University of Science and Technology of China, Hefei, China, in 2004, and the Ph.D. degree in computer science, sponsored by a Dorothy Hodgkin Postgraduate Award, from the University of Birmingham, Birmingham, UK, in 2008. He is a Research Fellow with the Centre of Excellence for Research in Computational Intelligence and Applications in the School of Computer Science, University of Birmingham. His research interests include ensemble learning, data mining and evolutionary computation. His PhD thesis on ensemble learning, "Diversity and Regularization in Neural Network Ensembles", received the 2011 IEEE Computational Intelligence Society Outstanding PhD Dissertation Award (the only winner) and the 2009 CPHC/British Computer Society Distinguished Dissertations Award (the runner-up). He has published more than 20 papers in refereed journals and conferences, including the IEEE TNN, TKDE, and TEC journals.

Xin Yao received the B.Sc. degree from the University of Science and Technology of China (USTC), Hefei, Anhui, in 1982, the M.Sc. degree from the North China Institute of Computing Technology, Beijing, in 1985, and the Ph.D. degree from USTC in 1990. He is a Chair of Computer Science at the University of Birmingham, UK, the Director of the Centre of Excellence for Research in Computational Intelligence and Applications (CERCIA), and a Distinguished Visiting Professor at USTC, Hefei. He was the Editor-in-Chief of the IEEE Transactions on Evolutionary Computation (2003-08), an associate editor or editorial board member of twelve other journals, and the Editor of the World Scientific Book Series on Advances in Natural Computation. He has given more than 60 invited keynote and plenary speeches at international conferences. His major research interests include ensemble learning and evolutionary computation. He has more than 300 refereed publications. He won the 2001 IEEE Donald G. Fink Prize Paper Award, the IEEE Transactions on Evolutionary Computation Outstanding 2008 Paper Award (bestowed in 2010), and the 2010 BT Gordon Radley Award for Best Author of Innovation (2nd Prize), among others. He is a Fellow of the IEEE and a Distinguished Lecturer of the IEEE Computational Intelligence Society.