School of Computer Science

Module 06-20122 (2017)

Intelligent Data Analysis

Level 3/H

Peter Tino, Semester 2, 10 credits
Co-ordinator: Peter Tino
Reviewer: Achim Jung

The Module Description is a strict subset of this Syllabus Page.

Outline

The module introduces a range of state-of-the-art techniques in the fields of statistical pattern analysis and data mining. The 'information revolution' has generated large amounts of data, but valuable information is often hidden and hence unusable. Pattern analysis and data mining techniques seek to unveil hidden patterns in the data that can help us to refine web search, construct more robust spam filters, or uncover principal trends in the evolution of a variety of stock indexes.


Aims

The aims of this module are to:

  • introduce some of the fundamental techniques and principles of statistical pattern analysis and data mining
  • investigate some common text and high-dimensional data mining algorithms and their applications
  • present the fields of data mining and pattern analysis in the larger context of learning systems

Learning Outcomes

On successful completion of this module, the student should be able to:

  1. explain principles and algorithms for dimensionality reduction and clustering of vectorial data
  2. explain principles and techniques for mining textual data
  3. demonstrate understanding of the principles of efficient web-mining algorithms
  4. demonstrate understanding of broader issues of learning and generalisation in pattern analysis and data mining systems

Restrictions

None


Taught with

  • 06-20233 - Intelligent Data Analysis (Extended)

Cannot be taken with

  • 06-20233 - Intelligent Data Analysis (Extended)

Teaching methods

2 hours of lectures per week

Contact Hours: 23


Assessment

Sessional: 1.5-hour examination (100%).

Supplementary (where allowed): As the sessional assessment


Detailed Syllabus

  1. Overview. Various forms of pattern analysis and data mining
  2. Basics of vector and metric spaces, probability theory and statistics
  3. Analysing vectorial data
    • Dimensionality reduction techniques
    • Clustering techniques
    • Classification and regression techniques
  4. Analysing structured data
    • Mining textual data
    • Other structured data types
  5. Searching the web
