CSCI 347 / CS 4206: Data Mining Module 01: Introduction Topic 03: Stages in Data Mining.



3 Module 01: Introduction - Objectives
• Understand the definition of basic data mining terms
• Understand, at a general level, structural descriptions in data mining
• Understand, at a general level, the main steps/stages in data mining
• Be aware of the biases of different basic approaches to data mining
• Be aware of fielded applications in data mining
• Understand and identify technical and ethical issues in data mining

4 Stages in Data Mining
• The overall approach your textbook uses to describe data mining is to look at it according to what goes into the system (input), what happens to it (the algorithms or processing), and what comes out (output).
• Input
  • Data Acquisition
  • Cleansing / Transformation
• Processing (Algorithms)
• Output
  • Representation
  • Evaluation
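The stages listed above can be sketched as a chain of small functions. This is only an illustrative toy, not the textbook's method: the data, the function names (acquire, cleanse, process, represent, evaluate), and the one-threshold "algorithm" are all assumptions made up for this example.

```python
from statistics import mean

# Hypothetical raw records: (temperature reading, label); None marks a missing value.
RAW = [(30.0, "hot"), (None, "hot"), (10.0, "cold"), (12.0, "cold"), (28.0, "hot")]

def acquire():
    """Input / data acquisition: here we simply return in-memory toy data."""
    return list(RAW)

def cleanse(records):
    """Input / cleansing: drop records whose reading is missing."""
    return [(x, y) for x, y in records if x is not None]

def process(records):
    """Processing: learn one threshold halfway between the two class means."""
    hot = mean(x for x, y in records if y == "hot")
    cold = mean(x for x, y in records if y == "cold")
    return (hot + cold) / 2

def represent(threshold):
    """Output / representation: state the learned pattern as a readable rule."""
    return f"IF temperature > {threshold:.1f} THEN hot ELSE cold"

def evaluate(threshold, records):
    """Output / evaluation: fraction of records the rule labels correctly."""
    correct = sum(("hot" if x > threshold else "cold") == y for x, y in records)
    return correct / len(records)

clean = cleanse(acquire())
threshold = process(clean)
print(represent(threshold))
print("accuracy on the cleaned data:", evaluate(threshold, clean))
```

Evaluating on the same records used to learn the rule is done here only to keep the sketch short; a real evaluation would use data held out from the processing step.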

5 Input
• As the text authors state, “We are overwhelmed with data.”
• We collect an incredible amount of data, and there are potentially useful patterns in that data, but the vast amount of data available makes it impossible to manually uncover these patterns.
• Input data is not only divided on the dimension of source or industry, but also by “type” of data.
  • Is the data numeric or symbolic?
  • Is it relatively error-free, or is there much error in it?
  • Is it consistent?
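The three questions above (numeric or symbolic, how much error, how consistent) can be asked of a single column of data with a few lines of code. The sketch below is a minimal illustration under simple assumptions: the function name profile is hypothetical, "error" is reduced to counting missing values, and "inconsistency" is reduced to spellings that differ only in case or surrounding whitespace.

```python
def _is_number(value):
    """Return True if the value can be read as a floating-point number."""
    try:
        float(value)
        return True
    except (TypeError, ValueError):
        return False

def profile(column):
    """Summarize one attribute: its kind, missing count, and inconsistent spellings."""
    present = [v for v in column if v not in (None, "")]
    missing = len(column) - len(present)
    numeric = bool(present) and all(_is_number(v) for v in present)
    kind = "numeric" if numeric else "symbolic"

    # For symbolic data, group spellings that differ only by case or whitespace.
    variants = {}
    if kind == "symbolic":
        for v in present:
            variants.setdefault(str(v).strip().lower(), set()).add(str(v))
    inconsistent = {k: sorted(s) for k, s in variants.items() if len(s) > 1}

    return {"kind": kind, "missing": missing, "inconsistent": inconsistent}

print(profile([1, "2.5", None, 4]))
print(profile(["yes", "Yes ", "no", ""]))
```

A profile like this is typically a first step before cleansing, since it shows which attributes need type conversion, imputation, or normalization of spellings.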

6 Processing
• Some authors divide the data mining task into two categories: predictive and descriptive (Tan, Steinbach, and Kumar, 2004).
  • Predictive systems use some variables to predict unknown or future values of other variables.
  • Descriptive systems find human-interpretable patterns in the data.
• Some predictive systems are:
  • Classification
  • Regression
  • Deviation Detection
• Some descriptive systems are:
  • Clustering
  • Association Rule Discovery
  • Sequential Pattern Discovery
• What are some examples of these?
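As one possible answer to the question above, here is a minimal sketch with one predictive and one descriptive task. The function names and the toy data are assumptions for illustration: nearest_neighbor is a bare one-nearest-neighbor classifier on a single numeric attribute, and frequent_pairs is only the counting step that association rule discovery builds on, not a full rule miner.

```python
from collections import Counter
from itertools import combinations

def nearest_neighbor(train, x):
    """Predictive (classification): copy the label of the closest training value."""
    return min(train, key=lambda pair: abs(pair[0] - x))[1]

def frequent_pairs(baskets, min_support):
    """Descriptive: item pairs that appear together in at least min_support baskets."""
    counts = Counter()
    for basket in baskets:
        # Sort and deduplicate so each pair is counted once per basket, in a fixed order.
        counts.update(combinations(sorted(set(basket)), 2))
    return {pair: n for pair, n in counts.items() if n >= min_support}

# Hypothetical labeled examples: (attribute value, class label).
train = [(1.0, "low"), (2.0, "low"), (9.0, "high")]
print(nearest_neighbor(train, 8.0))

# Hypothetical market baskets.
baskets = [["bread", "milk"], ["bread", "milk", "eggs"], ["eggs", "milk"]]
print(frequent_pairs(baskets, min_support=2))
```

The classifier needs labeled examples and answers a question about a new case; the pair counter needs no labels and simply reports a pattern that a person can read and judge.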

7 Output
• The format of the output of the system is also important.
  • Sometimes the correct answer is all that matters.
  • Sometimes it is important that the patterns discovered make sense to human users.
• For example:
  • If I’m classifying sea ice, I may not be terribly concerned about the patterns the classifier came up with in making its decisions, as long as I have faith that the decisions are correct.
  • If I’m a physician, I’m far more worried about having a traceable, human-readable decision logic if I’m going to decide whether or not to intervene in a pregnancy.
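One way to picture "traceable decision logic" is an ordered rule list that returns the rule that fired along with the decision. The sketch below is purely illustrative: the attribute names (risk_score, prior_event), the cut-offs, and the decide function are invented for this example and do not come from the course or from any clinical source.

```python
# Hypothetical ordered rule list: (readable text, test function, decision).
# Attribute names and cut-offs are invented for illustration only.
RULES = [
    ("risk_score >= 0.8",
     lambda r: r["risk_score"] >= 0.8,
     "intervene"),
    ("risk_score >= 0.5 and prior_event",
     lambda r: r["risk_score"] >= 0.5 and r["prior_event"],
     "intervene"),
]
DEFAULT = "do not intervene"

def decide(record):
    """Return (decision, explanation) using the first rule that matches."""
    for text, test, decision in RULES:
        if test(record):
            return decision, f"IF {text} THEN {decision}"
    return DEFAULT, f"no rule fired, so {DEFAULT}"

for case in ({"risk_score": 0.9, "prior_event": False},
             {"risk_score": 0.6, "prior_event": True},
             {"risk_score": 0.2, "prior_event": True}):
    decision, why = decide(case)
    print(decision, "|", why)
```

A model that only returned the decision might be acceptable for the sea ice case; returning the explanation as well is what lets a human user check the reasoning before acting on it.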

8 The Mystery Sound
• And what is the mystery sound for this section?

