CSCI 347, Data Mining Chapter 4 – Functions, Rules, Trees, and Instance Based Learning.



2 CSCI 347, Data Mining Chapter 4 – Functions, Rules, Trees, and Instance Based Learning

3 Audio With PowerPoints Audio can be included in PowerPoint slides when they are viewed as a slideshow. While watching a slide show, click the icon below to hear a description of the slide:

4 Functions/Linear Models

5 Example Let x be the value of a house in Butte, MT: x = c + b * num_bedrooms + d * num_bathrooms + p * price_houses_in_neigh, where c, b, d, and p are coefficients "learned" by the data mining algorithm.
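A minimal sketch of applying such a model, assuming the coefficients have already been learned (the values below are made up for illustration, not fit to real Butte housing data):

    # Hypothetical coefficients; in practice these would be learned
    # from data by a regression algorithm.
    c, b, d, p = 20000.0, 15000.0, 10000.0, 0.3

    def house_value(num_bedrooms, num_bathrooms, price_houses_in_neigh):
        """Linear model: x = c + b*bedrooms + d*bathrooms + p*neighborhood_price."""
        return c + b * num_bedrooms + d * num_bathrooms + p * price_houses_in_neigh

    print(house_value(3, 2, 250000.0))  # 160000.0 with the coefficients above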

6 Simple Linear Regression Equation for the CPU Performance Data: PRP = 37.06 + 2.47 CACH

7 More Precise Linear Regression Equation for the CPU Data: PRP = −56.1 + 0.049 MYCT + 0.015 MMIN + 0.006 MMAX + 0.630 CACH − 0.270 CHMIN + 1.46 CHMAX
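A sketch of how such coefficients can be fit by ordinary least squares with NumPy. The rows below are placeholders standing in for the CPU performance dataset, not the real data:

    import numpy as np

    # Placeholder rows: [MYCT, MMIN, MMAX, CACH, CHMIN, CHMAX] and target PRP.
    X = np.array([[125,   256,  6000, 256, 16, 128],
                  [ 29,  8000, 32000,  32,  8,  32],
                  [ 29,  8000, 16000,  32,  8,  16],
                  [ 26,  8000, 32000,  64,  8,  32],
                  [ 23, 16000, 32000,  64, 16,  32],
                  [ 23, 16000, 64000,  64, 16,  32],
                  [400,  1000,  3000,   0,  1,   2]], dtype=float)
    y = np.array([198, 269, 172, 318, 367, 489, 40], dtype=float)

    # Prepend a column of ones so the intercept w0 is learned too.
    A = np.hstack([np.ones((len(X), 1)), X])
    w, *_ = np.linalg.lstsq(A, y, rcond=None)
    print(w)  # intercept followed by one weight per attribute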

8 Linear Models
• Work most naturally with numeric attributes
• The outcome is a linear combination of attributes a1, a2, …, ak and weights w0, w1, …, wk: x = w0 + w1*a1 + w2*a2 + … + wk*ak

9 Rules as Covering Algorithms

10 Covering Algorithms
• Rather than looking at what attribute to split on, start with a particular class
• Class by class, develop rules that "cover" the class

11 Example: Generating a Rule
If x > 1.2 then class = a
If x > 1.2 and y > 2.6 then class = a
If ??? then class = a

12 Example: Generating a Rule
Possible rule set for class "b":
If x ≤ 1.2 then class = b
If x > 1.2 and y ≤ 2.6 then class = b
(For class "a": If x > 1.2 and y > 2.6 then class = a)
Could add more rules to get a "perfect" rule set
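A minimal sketch of one covering step on two numeric attributes. The toy data is invented here, and the selection criterion (precision of the covered set) is a simplification of what full covering algorithms such as PRISM use:

    # Toy 2-D instances: (x, y, class)
    data = [(1.0, 2.0, 'b'), (1.1, 3.0, 'b'), (1.5, 2.0, 'b'),
            (1.4, 3.0, 'a'), (2.0, 3.5, 'a'), (2.5, 2.8, 'a')]

    def best_test(instances, target):
        """Greedily pick the attribute/threshold test with the highest
        precision (covered-and-correct / covered) for the target class."""
        best = None
        for attr in (0, 1):                       # 0 -> x, 1 -> y
            for inst in instances:
                thresh = inst[attr]
                for op in ('>', '<='):
                    covered = [i for i in instances
                               if (i[attr] > thresh) == (op == '>')]
                    if not covered:
                        continue
                    acc = sum(1 for i in covered if i[2] == target) / len(covered)
                    if best is None or acc > best[0]:
                        best = (acc, attr, op, thresh)
        return best

    # Prints (1.0, 0, '>', 1.5): the rule "if x > 1.5 then class = a"
    print(best_test(data, 'a'))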

13 Decision Trees

14
• Divide-and-conquer strategy
• Can be expressed recursively

15 Decision Tree Algorithm
Constructing a decision tree can be expressed recursively (see the sketch below):
• Select an attribute to place at the root node
• Make one branch for each possible value, splitting the example set into subsets, one for every value of the attribute
• Repeat the process for each branch (recursion)
• Base case: stop if all instances have the same class, there are no more attributes to split on, or a pre-defined tree depth has been reached
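A minimal sketch of this recursion. It is not the course's exact algorithm: attribute selection here is a placeholder that simply takes the first remaining attribute instead of an information-based measure:

    from collections import Counter

    def build_tree(instances, attributes, depth=0, max_depth=5):
        """instances: list of (dict_of_attribute_values, class_label) pairs."""
        classes = [label for _, label in instances]
        # Base cases: pure node, no attributes left, or depth limit reached.
        if len(set(classes)) == 1 or not attributes or depth >= max_depth:
            return Counter(classes).most_common(1)[0][0]   # majority class
        attr = attributes[0]        # placeholder for a real selection measure
        tree = {}
        for value in {inst[0][attr] for inst in instances}:
            subset = [inst for inst in instances if inst[0][attr] == value]
            tree[value] = build_tree(subset, attributes[1:], depth + 1, max_depth)
        return (attr, tree)

    weather = [({'outlook': 'sunny', 'windy': False}, 'no'),
               ({'outlook': 'rainy', 'windy': True}, 'no'),
               ({'outlook': 'overcast', 'windy': False}, 'yes')]
    print(build_tree(weather, ['outlook', 'windy']))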

16 Recursion
Recursion is the process of repeating items in a self-similar way.
Example: definition of a person's ancestors:
• One's parents are one's ancestors (base case)
• The ancestors of one's ancestors are also one's ancestors (recursion step)
Example: definition of the Fibonacci sequence:
• Fib(0) = 0 and Fib(1) = 1 (base cases)
• For all integers n > 1, Fib(n) = Fib(n−1) + Fib(n−2) (recursion step)
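The Fibonacci definition translates directly into code; a short sketch:

    def fib(n):
        """Fibonacci by direct recursion."""
        if n == 0:
            return 0                         # base case
        if n == 1:
            return 1                         # base case
        return fib(n - 1) + fib(n - 2)       # recursion step

    print([fib(n) for n in range(8)])        # [0, 1, 1, 2, 3, 5, 8, 13]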

17 Recursion

18 Which Attribute to Select?

19 Rules vs. Trees
The corresponding decision tree produces exactly the same predictions. A covering algorithm concentrates on one class value at a time, whereas a decision tree learner takes all class values into account.

20 Instance-Based Learning

21
• No structure is learned
• Given an instance to predict, simply predict the class of its nearest neighbor
• Alternatively, predict the class that appears most frequently among the k nearest neighbors
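A minimal sketch of k-nearest-neighbor prediction over numeric attributes, using the Euclidean distance defined on the later slides. The training tuples below are invented for illustration, not taken from the weather data:

    import math
    from collections import Counter

    def knn_predict(train, query, k=1):
        """train: list of (feature_tuple, class_label); query: feature tuple."""
        def dist(p, q):
            return math.sqrt(sum((pi - qi) ** 2 for pi, qi in zip(p, q)))
        neighbors = sorted(train, key=lambda inst: dist(inst[0], query))[:k]
        # Majority vote over the k nearest training instances.
        return Counter(label for _, label in neighbors).most_common(1)[0][0]

    train = [((85.0, 85.0), 'no'), ((80.0, 90.0), 'no'), ((83.0, 78.0), 'yes')]
    print(knn_predict(train, (89.0, 65.0), k=1))   # 'yes'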

22 Example
Predict the class value of the following:
Outlook | Temp | Humidity | Windy | Play
rainy   | hot  | normal   | false | ?

23 Example
Predict the class value of the following:
Outlook | Temp | Humidity | Windy | Play
rainy   | 89.0 | 65.0     | false | ?

24 Manhattan Distance
In two dimensions: if p = (p1, p2) and q = (q1, q2), then d(p, q) = |p1 − q1| + |p2 − q2|

25 Euclidean Distance
The ordinary distance one would measure with a ruler. In two dimensions: if p = (p1, p2) and q = (q1, q2), then, by the Pythagorean Theorem, d(p, q) = √((p1 − q1)² + (p2 − q2)²)

26 Euclidean Distance
In n dimensions: d(p, q) = √((p1 − q1)² + (p2 − q2)² + … + (pn − qn)²)
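Both distances in n dimensions, as a short sketch:

    import math

    def manhattan(p, q):
        """Sum of absolute coordinate differences."""
        return sum(abs(pi - qi) for pi, qi in zip(p, q))

    def euclidean(p, q):
        """Square root of the sum of squared coordinate differences."""
        return math.sqrt(sum((pi - qi) ** 2 for pi, qi in zip(p, q)))

    print(manhattan((1, 2), (4, 6)))   # 7
    print(euclidean((1, 2), (4, 6)))   # 5.0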

