STAT462-15S2 (C) Semester Two 2015

Data Mining

15 points

Details:
Start Date: Monday, 13 July 2015
End Date: Sunday, 15 November 2015
Withdrawal Dates
Last Day to withdraw from this course:
  • Without financial penalty (full fee refund): Sunday, 26 July 2015
  • Without academic penalty (including no fee refund): Sunday, 11 October 2015

Description

Data mining refers to a collection of tools for discovering patterns and relationships in data, especially in large databases. It draws on several fields, including database management, statistics, artificial intelligence, and machine learning, and it has had a considerable impact in business, industry and science.
This course provides an introduction to the principal methods in data mining: data preparation and warehousing, supervised learning (tree classifiers, neural networks), unsupervised learning (clustering methods), association rules, and methods for dealing with high-dimensional data (PCA, ICA, multidimensional scaling). Students will work with applications from various fields, such as commerce (fraud detection, product placement, targeted marketing, assessing credit risk) and medicine (diagnostics). We will use data mining software to illustrate the methods with data sets from these fields.

Students must (i) do problems that are assigned throughout the term and (ii) research an area and write an account of it; the instructor will give suggestions for topics in class.

Learning Outcomes

  • describe and apply appropriate statistical modelling techniques for large datasets
  • interpret model results in a way that a non-statistician can understand
  • use MATLAB competently
  • write a scientific and technical report

Prerequisites

Subject to approval of the Head of School.

Course Coordinator / Lecturer

Blair Robertson

Lecturer

Marco Reale

Assessment

Assessment          Due Date    Percentage
Final Examination               40%

Assignments give you practice in analysing data and presenting results in a written report.
The project gives you the opportunity to develop presentation skills.

The lectures are complemented by computer labs where you will be guided in conducting appropriate analysis and modelling.

Textbooks / Resources

Textbook
Recommended reading:
Tan, Steinbach and Kumar 2006. Introduction to Data Mining. 769pp.

This book is on restricted loan in the Library.

Indicative Fees

Domestic fee $887.00

* All fees are inclusive of NZ GST or any equivalent overseas tax, and do not include any programme level discount or additional course-related expenses.

For further information see Mathematics and Statistics.
