Data Mining CSCI 307, Spring 2019 Lecture 6


Data Mining CSCI 307, Spring 2019 Lecture 6
Output: Tables, Linear Models, Trees

Output: Representing Structural Patterns
- Many different ways of representing patterns: decision trees, rules, instance-based, ...
- Also called “knowledge” representation
- Representation determines the inference method
- Understanding the output is the key to understanding the underlying learning methods
- Different types of output for different learning problems (e.g. classification, regression, ...)

Tables
- Simplest way of representing output: use the same format as the input! A decision table for the weather problem: simply find the row with the matching conditions and assign the class, in this case play or not.

Outlook   Temperature  Humidity  Windy  Play
Sunny     Hot          High      False  No
Sunny     Hot          High      True   No
Overcast  Hot          High      False  Yes
Rainy     Mild         High      False  Yes
Rainy     Cool         Normal    False  Yes
Rainy     Cool         Normal    True   No
Overcast  Cool         Normal    True   Yes
...

- If it is numeric prediction, the concept is the same, except that instead of a decision table it is called a regression table.
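A minimal sketch of a decision-table lookup (assuming a Python dictionary keyed by attribute-value tuples; the names are illustrative, not from the slides):

```python
# Each key is a tuple of attribute values; each value is the class.
decision_table = {
    ("Sunny",    "Hot",  "High",   False): "No",
    ("Sunny",    "Hot",  "High",   True):  "No",
    ("Overcast", "Hot",  "High",   False): "Yes",
    ("Rainy",    "Mild", "High",   False): "Yes",
    ("Rainy",    "Cool", "Normal", False): "Yes",
    ("Rainy",    "Cool", "Normal", True):  "No",
    ("Overcast", "Cool", "Normal", True):  "Yes",
    # ... remaining rows of the weather data
}

def classify(outlook, temperature, humidity, windy):
    # Find the row with the matching conditions and return its class.
    return decision_table[(outlook, temperature, humidity, windy)]

print(classify("Sunny", "Hot", "High", False))  # -> "No"
```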

Tables
- Sometimes some of the attributes aren't necessary for the decision. What if we don't need the temperature and windy attributes? A smaller, condensed table might be better:

Outlook   Humidity  Play
Sunny     High      No
Sunny     Normal    Yes
Overcast  High      Yes
Overcast  Normal    Yes
Rainy     High      No
Rainy     Normal    No

- Main problem: selecting the right attributes so as to make the right decision.

Another Simple Representation: Linear Models
- Regression model
- Used when all the inputs (attribute values) and the output are numeric
- Output is the sum of weighted attribute values
- The trick is to find good values for the weights
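In symbols, with w0 as the bias weight and a1, ..., ak as the attribute values, the model takes the standard form

    output = w0 + w1*a1 + w2*a2 + ... + wk*ak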

A Linear Regression Function for the CPU Performance Data
- Only the cache attribute (CACH) is used here to predict CPU performance; it is easier to see in two dimensions:

    PRP = 37.06 + 2.47 CACH

- The 37.06 is the "bias" term; it is a weight, just like the cache weight of 2.47.
- The least-squares linear regression method was used on the training data to come up with the weights. (We'll see how in Chapter 4.)
- Given a test instance, plug the value of the cache attribute into the expression, and the predicted performance (i.e. the output/class) will be on the line.
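A minimal sketch of applying the fitted line to a test instance (the function name and the CACH value of 64 are illustrative, not from the slides):

```python
def predict_prp(cach):
    """Predict CPU performance (PRP) from cache size (CACH)."""
    # Weights found by least-squares regression: bias 37.06, cache weight 2.47
    return 37.06 + 2.47 * cach

print(predict_prp(64))  # hypothetical test instance -> approximately 195.14
```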

Linear Models for Binary Classification
- The line separates the two classes
- Decision boundary: defines where the decision changes from one class value to the other
- Prediction is made by plugging the observed attribute values into the expression: predict one class if the output >= 0, and the other class if the output < 0
- The boundary becomes a high-dimensional plane (hyperplane) when there are multiple attributes

A Linear Decision Boundary Separating Iris Setosas from Iris Versicolors

    2.0 - 0.5 PetalLength - 0.8 PetalWidth = 0

Predict setosa if the result is >= 0; versicolor if the result is < 0.
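A minimal sketch of the resulting classifier (the function name and the sample petal measurements are illustrative):

```python
def classify_iris(petal_length, petal_width):
    # Plug the observed attribute values into the boundary expression.
    result = 2.0 - 0.5 * petal_length - 0.8 * petal_width
    return "setosa" if result >= 0 else "versicolor"

print(classify_iris(1.4, 0.2))  # typical setosa measurements     -> "setosa"
print(classify_iris(4.7, 1.4))  # typical versicolor measurements -> "versicolor"
```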

Trees
- The “divide-and-conquer” approach produces a tree
- Nodes involve testing a particular attribute; usually the attribute value is compared to a constant
- Other possibilities: comparing the values of two attributes, or using a function of one or more attributes
- Option nodes (choose more than one branch): an instance leads to two (or more) leaves, and the alternative predictions must be combined somehow (e.g. majority voting)
- Leaves assign a classification, a set of classifications, or a probability distribution to instances
- An unknown instance is routed down the tree, as in the sketch below
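A minimal sketch of routing an instance down the familiar decision tree for the weather problem (nested tests, one attribute per node; the function name is illustrative):

```python
def route(outlook, humidity, windy):
    # Each internal node tests one attribute; each leaf assigns a class.
    if outlook == "Sunny":
        return "No" if humidity == "High" else "Yes"
    elif outlook == "Overcast":
        return "Yes"
    else:  # Rainy
        return "No" if windy else "Yes"

print(route("Sunny", "Normal", False))  # -> "Yes"
```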

Nominal and Numeric Attributes
- Nominal: the number of children is usually equal to the number of values ==> the attribute won't get tested more than once
- Other possibility: division into two subsets, so the attribute may get tested more than once

Nominal and Numeric Attributes
- Numeric: test whether the value is greater or less than a constant ==> the attribute may get tested several times
- Other possibility: a three-way split (or multi-way split)
  - Integer: less than, equal to, greater than
  - Real: below, within (i.e. close enough to be equal), above

Missing Values
- Does the absence of a value have some significance?
  - Yes ==> “missing” is a separate value
  - No ==> “missing” must be treated in a special way
- Solution A: assign the instance to the most popular branch
- Solution B: split the instance into pieces
  - Pieces receive weights according to the fraction of training instances that go down each branch
  - Classifications from the leaf nodes are combined using the weights that have percolated down to them, as in the sketch below
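A minimal sketch of Solution B (the tree encoding, names, and branch fractions are all illustrative): the instance's weight is divided among the branches in proportion to the training instances that went down each one, and the leaf classifications are combined as a weighted vote.

```python
from collections import defaultdict

def classify_with_missing(instance, node, weight=1.0, votes=None):
    # Route an instance down the tree; when the tested attribute is
    # missing, split the instance into weighted pieces, one per branch.
    if votes is None:
        votes = defaultdict(float)
    if node["leaf"]:
        votes[node["label"]] += weight  # the weight percolates to the leaf
        return votes
    value = instance.get(node["attribute"])
    if value is None:
        # Missing value: follow every branch with a fraction of the weight.
        for child, fraction in node["branches"].values():
            classify_with_missing(instance, child, weight * fraction, votes)
    else:
        child, _ = node["branches"][value]
        classify_with_missing(instance, child, weight, votes)
    return votes

# Toy tree: 60% of the training instances went down the "High" branch.
tree = {
    "leaf": False, "attribute": "humidity",
    "branches": {
        "High":   ({"leaf": True, "label": "No"},  0.6),
        "Normal": ({"leaf": True, "label": "Yes"}, 0.4),
    },
}
print(dict(classify_with_missing({"humidity": None}, tree)))
# -> {'No': 0.6, 'Yes': 0.4}: predict the class with the largest total weight
```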