CSCI 347, Data Mining
Chapter 4 – Functions, Rules, Trees, and Instance-Based Learning
Functions/Linear Models
Example
Let x be the predicted value of a house in Butte, MT:
x = c + b * num_bedrooms + d * num_bathrooms + p * price_houses_in_neigh
where c, b, d, and p are coefficients "learned" from the data by the mining algorithm.
Simple Linear Regression Equation for the CPU Performance Data
PRP = 37.06 + 2.47 CACH
More Precise Linear Regression Equation for the CPU Data
PRP = -56.1 + 0.049 MYCT + 0.015 MMIN + 0.006 MMAX + 0.630 CACH - 0.270 CHMIN + 1.46 CHMAX
Linear Models
Work most naturally with numeric attributes.
The outcome is a linear combination of attributes a1, a2, ..., ak and weights w0, w1, ..., wk:
x = w0 + w1*a1 + w2*a2 + ... + wk*ak
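To make this concrete, here is a minimal sketch that fits such a linear model by ordinary least squares with NumPy. The coefficient names (c, b, d, p) follow the Butte house example above, but the data values themselves are invented for illustration:

```python
import numpy as np

# Invented training data: [num_bedrooms, num_bathrooms, avg_neighbor_price]
A = np.array([
    [3, 2, 250_000],
    [2, 1, 180_000],
    [4, 3, 320_000],
    [3, 1, 210_000],
    [5, 3, 400_000],
], dtype=float)
y = np.array([265_000, 175_000, 340_000, 205_000, 415_000], dtype=float)

# Prepend a column of ones so the first weight acts as the intercept c.
X = np.column_stack([np.ones(len(A)), A])

# Solve for w = (c, b, d, p) minimizing ||X w - y||^2.
w, *_ = np.linalg.lstsq(X, y, rcond=None)

def predict(bedrooms, bathrooms, neighbor_price):
    return w[0] + w[1] * bedrooms + w[2] * bathrooms + w[3] * neighbor_price

print("coefficients:", w)
print("predicted value:", predict(3, 2, 260_000))
```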
Rules as Covering Algorithms
Covering Algorithms
Rather than looking at which attribute to split on, start with a particular class.
Class by class, develop rules that "cover" the class.
Example: Generating a Rule
If x > 1.2 then class = a
If x > 1.2 and y > 2.6 then class = a
If ??? then class = a
Example: Generating a Rule
Possible rule set for class "b" (could add more rules to get a "perfect" rule set):
If x > 1.2 and y > 2.6 then class = a
If x ≤ 1.2 then class = b
If x > 1.2 and y ≤ 2.6 then class = b
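Rules like these can be grown greedily: keep adding the single best test until the rule covers only the target class. Below is a toy sketch of that idea over 2-D points; the data, thresholds, and names are invented, and this is a simplification rather than the book's exact algorithm:

```python
# Toy covering sketch: grow one rule for a target class by greedily adding
# the test that maximizes accuracy on the examples the rule still covers.
points = [((0.8, 1.0), "b"), ((1.0, 3.0), "b"), ((1.5, 1.9), "b"),
          ((1.4, 3.0), "a"), ((2.0, 2.8), "a"), ((2.5, 3.5), "a")]

def make_test(i, op, t):
    if op == ">":
        return (lambda p: p[i] > t), f"attr{i} > {t}"
    return (lambda p: p[i] <= t), f"attr{i} <= {t}"

def learn_rule(examples, target):
    conditions, covered = [], examples
    while any(c != target for _, c in covered):   # rule not yet pure
        best = None
        for i in (0, 1):                          # each attribute
            for op in (">", "<="):                # each direction
                for t in sorted({p[i] for p, _ in covered}):
                    test, text = make_test(i, op, t)
                    sub = [(p, c) for p, c in covered if test(p)]
                    if not sub:
                        continue
                    acc = sum(c == target for _, c in sub) / len(sub)
                    if best is None or acc > best[0]:
                        best = (acc, text, sub)
        _, text, covered = best
        conditions.append(text)
    return conditions

print("If", " and ".join(learn_rule(points, "a")), "then class = a")
```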
Decision Trees
Divide-and-conquer strategy
Can be expressed recursively
Decision Tree Algorithm
Constructing a decision tree can be expressed recursively:
Select an attribute to place at the root node.
Make one branch for each possible value, splitting the example set into subsets, one for every value of the attribute.
Repeat the process for each branch (recursion).
Base case: stop if all instances have the same class, there are no more attributes to split on, or a pre-defined tree depth has been reached.
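A compact sketch of this recursion, assuming nominal attributes and rows stored as Python dicts. All names and data here are stand-ins, and the attribute-selection heuristic is a simple placeholder; the question of which attribute to select is taken up on the next slides:

```python
from collections import Counter

# Recursive divide-and-conquer tree construction over nominal attributes.
def build_tree(rows, attrs, target, depth=0, max_depth=5):
    classes = [r[target] for r in rows]
    majority = Counter(classes).most_common(1)[0][0]
    # Base cases: pure node, no attributes left, or depth limit reached.
    if len(set(classes)) == 1 or not attrs or depth >= max_depth:
        return majority
    best = select_attribute(rows, attrs, target)
    tree = {"attr": best, "branches": {}, "default": majority}
    for value in {r[best] for r in rows}:          # one branch per value
        subset = [r for r in rows if r[best] == value]
        rest = [a for a in attrs if a != best]
        tree["branches"][value] = build_tree(subset, rest, target,
                                             depth + 1, max_depth)
    return tree

def select_attribute(rows, attrs, target):
    # Placeholder heuristic: pick the attribute whose split leaves the fewest
    # misclassified rows if each subset predicts its own majority class.
    def errors(attr):
        total = 0
        for value in {r[attr] for r in rows}:
            sub = [r[target] for r in rows if r[attr] == value]
            total += len(sub) - Counter(sub).most_common(1)[0][1]
        return total
    return min(attrs, key=errors)

rows = [
    {"outlook": "sunny", "windy": "false", "play": "no"},
    {"outlook": "overcast", "windy": "false", "play": "yes"},
    {"outlook": "rainy", "windy": "true", "play": "no"},
    {"outlook": "rainy", "windy": "false", "play": "yes"},
]
print(build_tree(rows, ["outlook", "windy"], "play"))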
Recursion
Recursion is the process of repeating items in a self-similar way.
Example: definition of a person's ancestors:
One's parents are one's ancestors (base case)
The ancestors of one's ancestors are also one's ancestors (recursion step)
Example: definition of the Fibonacci sequence:
Fib(0) = 0 and Fib(1) = 1 (base cases)
For all integers n > 1, Fib(n) = Fib(n-1) + Fib(n-2) (recursion step)
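The Fibonacci definition translates directly into code, with the base cases and the recursion step labeled:

```python
def fib(n):
    # Base cases
    if n == 0:
        return 0
    if n == 1:
        return 1
    # Recursion step: Fib(n) = Fib(n-1) + Fib(n-2)
    return fib(n - 1) + fib(n - 2)

print([fib(n) for n in range(8)])  # [0, 1, 1, 2, 3, 5, 8, 13]
```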
Which Attribute to Select?
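One standard criterion for this choice is information gain: split on the attribute that reduces class entropy the most. A self-contained sketch, using a small invented sample in the style of the weather data:

```python
from collections import Counter
from math import log2

def entropy(labels):
    # H = -sum p_i * log2(p_i) over the class distribution
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())

def information_gain(rows, attr, target):
    # Gain = H(parent) - weighted sum of H(subsets after splitting on attr)
    labels = [r[target] for r in rows]
    remainder = 0.0
    for value in {r[attr] for r in rows}:
        sub = [r[target] for r in rows if r[attr] == value]
        remainder += len(sub) / len(rows) * entropy(sub)
    return entropy(labels) - remainder

rows = [
    {"outlook": "sunny", "windy": "false", "play": "no"},
    {"outlook": "sunny", "windy": "true",  "play": "no"},
    {"outlook": "overcast", "windy": "false", "play": "yes"},
    {"outlook": "rainy", "windy": "false", "play": "yes"},
    {"outlook": "rainy", "windy": "true",  "play": "no"},
]
for a in ("outlook", "windy"):
    print(a, round(information_gain(rows, a, "play"), 3))
```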
Rules vs. Trees
The corresponding decision tree produces exactly the same predictions.
A covering algorithm concentrates on one class value at a time, whereas a decision tree learner takes all class values into account.
Instance-Based Learning
No structure is learned.
Given an instance to predict, simply predict the class of its nearest neighbor.
Alternatively, predict the class that appears most frequently among the k nearest neighbors.
Example
Predict the class value of the following:
Outlook  Temp  Humidity  Windy  Play
rainy    hot   normal    false  ?
Example
Predict the class value of the following:
Outlook  Temp  Humidity  Windy  Play
rainy    89.0  65.0      false  ?
Manhattan Distance
In two dimensions, if p = (p1, p2) and q = (q1, q2):
d(p, q) = |p1 - q1| + |p2 - q2|
Euclidean Distance
The ordinary distance one would measure with a ruler.
In two dimensions, if p = (p1, p2) and q = (q1, q2), the Pythagorean Theorem gives:
d(p, q) = sqrt((p1 - q1)^2 + (p2 - q2)^2)
Euclidean Distance
In n dimensions:
d(p, q) = sqrt((p1 - q1)^2 + (p2 - q2)^2 + ... + (pn - qn)^2)
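Putting the pieces together, here is a minimal k-nearest-neighbor sketch using Euclidean distance over numeric attributes. The training points are loosely modeled on the numeric weather data, and all names are illustrative:

```python
from collections import Counter
from math import sqrt

def euclidean(p, q):
    # d(p, q) = sqrt(sum_i (p_i - q_i)^2)
    return sqrt(sum((pi - qi) ** 2 for pi, qi in zip(p, q)))

def knn_predict(train, query, k=3):
    # Sort training instances by distance to the query and take a
    # majority vote over the k nearest class labels.
    nearest = sorted(train, key=lambda item: euclidean(item[0], query))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]

# (temp, humidity) -> play examples:
train = [((85.0, 85.0), "no"), ((80.0, 90.0), "no"), ((83.0, 86.0), "yes"),
         ((70.0, 96.0), "yes"), ((68.0, 80.0), "yes")]
print(knn_predict(train, (89.0, 65.0), k=1))  # single nearest neighbor
print(knn_predict(train, (89.0, 65.0), k=3))  # majority of 3 nearest
```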