
1 Data Mining CSCI 307, Spring 2019 Lecture 9
Output: Rules

2 Manually Build a Decision Tree
First, install the package we need. We want to use the "User Classify" facility in WEKA; in WEKA 3.8.1 it must be installed as a package. From the package manager window that pops up, scroll down and choose "userClassifier" to install it. You should get a message that it was installed successfully.
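If you prefer the command line, WEKA also ships a package manager class. A hedged sketch, assuming weka.jar is on the classpath (the exact flags may differ by WEKA version):

    java -cp weka.jar weka.core.WekaPackageManager -install-package userClassifier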

3 Manually Build a Decision Tree
Use the "User Classify" facility in WEKA:
- Click on the Classify tab.
- Click Choose / trees / UserClassifier.
- Visualize the data.
- Two-way split: find a pair of attributes that discriminates the class well, and draw a polygon around those instances.
- Next slide: use petal-length and petal-width to "isolate" the Iris versicolor class.
- Switch to view the tree.

4 (Screenshot: petal-length vs. petal-width plot with a polygon drawn around the Iris versicolor cluster.)

5 We can construct the tree manually for a bit and then select a ML algorithm to finish.
This group contains only one "mistake": a lone virginica. We need to refine this further, but only 3 versicolors contaminate this side.

6 Interactive Decision Tree Construction (you can try this on your own)
- Load segment-challenge.arff and look at the dataset.
- Select UserClassifier (a tree classifier).
- Use the test set segment-test.arff.
- Examine the data visualizer and the tree visualizer.
- Plot region-centroid-row vs. intensity-mean.
- Use the Rectangle, Polygon, and Polyline selection tools to make several selections.
- Right-click in the Tree visualizer and Accept the tree.
Given enough time, we could produce a "perfect" tree for this dataset, but would it perform well on the test data?

7 Classification Rules
A popular alternative to decision trees:
if attributeONE and attributeTWO then CLASS is x
- Antecedent (pre-condition): a series of tests, just like the tests at the nodes of a decision tree. The tests are usually logically ANDed together (but may also be general logical expressions).
- Consequent (conclusion): the class, set of classes, or probability distribution assigned by the rule.
- Individual rules are often logically ORed together; conflicts arise if different conclusions apply (see the sketch below).
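To make this anatomy concrete, here is a minimal Python sketch of a rule as ANDed attribute tests plus a consequent. The representation and all names are invented for illustration; this is not WEKA code.

    # A rule: ANDed attribute tests (antecedent) plus a class (consequent).
    def make_rule(tests, consequent):
        """tests: dict mapping attribute -> required value."""
        def rule(instance):
            # Antecedent: every test must hold (logical AND).
            if all(instance.get(a) == v for a, v in tests.items()):
                return consequent  # the rule fires
            return None            # the rule does not apply
        return rule

    r = make_rule({"outlook": "sunny", "humidity": "high"}, "no")
    print(r({"outlook": "sunny", "humidity": "high", "windy": False}))  # no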

8 From Trees to Rules
Easy: converting a tree into a set of rules, one rule for each leaf:
- The antecedent contains a condition for every node on the path from the root to the leaf.
- The consequent is the class assigned by the leaf.
This produces rules that are unambiguous: it does not matter in which order they are executed. But the resulting rules are unnecessarily complex, so pruning is needed to remove redundant tests/rules.
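A minimal Python sketch of the conversion, assuming a toy nested-tuple tree representation invented for this example; one rule is emitted per leaf, exactly as described above.

    # A node is (attribute, {value: subtree}); a leaf is just a class label.
    def tree_to_rules(node, path=()):
        if not isinstance(node, tuple):       # leaf: emit path -> class
            return [(list(path), node)]
        attribute, branches = node
        rules = []
        for value, subtree in branches.items():
            rules += tree_to_rules(subtree, path + ((attribute, value),))
        return rules

    tree = ("outlook", {"sunny": ("humidity", {"high": "no", "normal": "yes"}),
                        "overcast": "yes"})
    for antecedent, cls in tree_to_rules(tree):
        print("if", " and ".join(f"{a} = {v}" for a, v in antecedent),
              "then class =", cls)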

9 Example

10 From Rules to Trees #1
More difficult: transforming a rule set into a tree. A tree cannot easily express disjunction between rules.
Example: rules that test different attributes:
if a and b then x
if c and d then x
The symmetry needs to be broken: we must choose a single test for the root node, and the corresponding tree then contains identical subtrees (==> the "replicated subtree problem").

11 A Decision Tree for a Simple Disjunction
if a and b then x
if c and d then x

12 From Rules to Trees #2
More difficult... but sometimes it's not. Example: the Exclusive-Or problem. Here we want the class to be a only when the two attributes have opposite values.

13 The Exclusive-Or Problem
What would the rules look like for this problem? What would the tree look like?
if x = 1 and y = 0 then class = a
if x = 0 and y = 1 then class = a
if x = 0 and y = 0 then class = b
if x = 1 and y = 1 then class = b
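Written out as a Python sketch, the four rules make the point explicit: class a fires exactly when the two attribute values differ, which a single-attribute root test cannot capture without repeating the other test in both subtrees.

    # The four XOR rules: class a only when x and y have opposite values.
    def classify(x, y):
        if x == 1 and y == 0: return "a"
        if x == 0 and y == 1: return "a"
        if x == 0 and y == 0: return "b"
        if x == 1 and y == 1: return "b"

    for x in (0, 1):
        for y in (0, 1):
            print(x, y, "->", classify(x, y))  # a exactly when x != y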

14 From Rules to Trees #3
More difficult: transforming a rule set into a tree. A tree cannot easily handle default clauses.
Example: four attributes, each of which can be 1, 2, or 3:
if x = 1 and y = 1 then class = a
if z = 1 and w = 1 then class = a
otherwise class = b
This gives the replicated subtree problem again.

15 Decision Tree with a Replicated Subtree
if x = 1 and y = 1 then class = a
if z = 1 and w = 1 then class = a
otherwise class = b

16 One Reason "Rules" Are Popular: "Nuggets" of Knowledge
Are rules independent pieces of knowledge? (It seems easy to add a rule to an existing rule base.) Here's the problem: that view ignores how rules are executed. There are two ways of executing a rule set (see the sketch below):
- Ordered set of rules ("decision list"): order is important for interpretation.
- Unordered set of rules: rules may overlap and lead to different conclusions for the same instance.
By contrast, adding to a tree may cause total reshaping. So maybe the "ease" of adding a new rule is an illusion, not fact.
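A minimal Python sketch of the two execution strategies, with an invented rule format; it shows how ordering silently resolves a conflict that an unordered reading exposes.

    # Ordered execution ("decision list"): the first matching rule wins.
    def decision_list(rules, instance):
        for antecedent, cls in rules:
            if all(instance.get(a) == v for a, v in antecedent):
                return cls
        return None  # no rule applies

    # Unordered execution: every matching rule contributes a conclusion.
    def unordered(rules, instance):
        return {cls for antecedent, cls in rules
                if all(instance.get(a) == v for a, v in antecedent)}

    rules = [([("x", 1)], "a"), ([("y", 1)], "b")]
    print(decision_list(rules, {"x": 1, "y": 1}))  # a          (order decides)
    print(unordered(rules, {"x": 1, "y": 1}))      # {'a', 'b'} (conflict)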

17 Interpreting Rules
What if two or more rules conflict, i.e. different rules lead to different conclusions for the same instance? What if no rule applies to a test instance? Neither can happen with decision trees, or with rules read off a decision tree, but both can happen with general rule sets.

18 Straightforward: A Form of Closed-World Assumption
Applies when the class is "boolean" and only one outcome is expressed.
- Assumption: if an instance does not belong to class "yes", it belongs to class "no".
- Trick: only learn rules for class "yes" and use a default rule for "no".
- The order of the rules is not important, and there are no conflicts!
- The rule set can be written in disjunctive normal form, i.e. an OR of a bunch of AND conditions:
if x = 1 and y = 1 then class = a
if z = 1 and w = 1 then class = a
otherwise class = b
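The trick as a Python sketch of the rule set above: class a is an OR of AND conditions, and everything else falls through to the default.

    # Rules for class a only, in disjunctive normal form; the default
    # rule covers class b (closed-world assumption), so no conflicts.
    def classify(x, y, z, w):
        if (x == 1 and y == 1) or (z == 1 and w == 1):
            return "a"
        return "b"  # otherwise class = b

    print(classify(1, 1, 0, 0))  # a
    print(classify(0, 0, 1, 1))  # a
    print(classify(1, 0, 1, 0))  # b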

19 Association Rules
Association rules...
- can predict any attribute, and combinations of attributes
- are not intended to be used together as a set (versus classification rules, which are intended to be used together)
Problem: there is an immense number of possible associations, so the output needs to be restricted to show only the most predictive ones ==> only those with high support and high confidence.
Coverage (AKA support): the number of instances the rule predicts correctly.
Accuracy (AKA confidence): the number of instances it predicts correctly, as a proportion of all instances it applies to.

20 Example: if temperature = cool then humidity = normal
Outlook    Temperature  Humidity  Windy  Play
Sunny      Hot          High      False  No
Sunny      Hot          High      True   No
Overcast   Hot          High      False  Yes
Rainy      Mild         High      False  Yes
Rainy      Cool         Normal    False  Yes
Rainy      Cool         Normal    True   No
Overcast   Cool         Normal    True   Yes
Sunny      Mild         High      False  No
Sunny      Cool         Normal    False  Yes
Rainy      Mild         Normal    False  Yes
Sunny      Mild         Normal    True   Yes
Overcast   Mild         High      True   Yes
Overcast   Hot          Normal    False  Yes
Rainy      Mild         High      True   No
The 4 cool days all have normal humidity ==> support = 4, confidence = 100%.
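A Python sketch that checks these numbers against the table; the row encoding and index names are for this sketch only.

    # Each row: (outlook, temperature, humidity, windy, play).
    rows = [
        ("sunny", "hot", "high", False, "no"),
        ("sunny", "hot", "high", True, "no"),
        ("overcast", "hot", "high", False, "yes"),
        ("rainy", "mild", "high", False, "yes"),
        ("rainy", "cool", "normal", False, "yes"),
        ("rainy", "cool", "normal", True, "no"),
        ("overcast", "cool", "normal", True, "yes"),
        ("sunny", "mild", "high", False, "no"),
        ("sunny", "cool", "normal", False, "yes"),
        ("rainy", "mild", "normal", False, "yes"),
        ("sunny", "mild", "normal", True, "yes"),
        ("overcast", "mild", "high", True, "yes"),
        ("overcast", "hot", "normal", False, "yes"),
        ("rainy", "mild", "high", True, "no"),
    ]
    TEMP, HUMID = 1, 2
    applies = [r for r in rows if r[TEMP] == "cool"]        # antecedent holds
    correct = [r for r in applies if r[HUMID] == "normal"]  # consequent too
    print(len(correct), len(correct) / len(applies))        # 4 1.0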

21 Example: if play = yes then windy = false ==> support = ?, confidence = ?
(Same weather table as the previous slide.) This rule might not meet the minimum support and confidence thresholds, so it would not be generated, but it is an example of support and confidence.
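Working it out from the table: 9 instances have play = yes, and 6 of those also have windy = false, so support = 6 and confidence = 6/9, roughly 67%. (The sketch from slide 20 verifies this if the antecedent and consequent tests are swapped for the play and windy columns.)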

22 Support and Confidence of a Rule
Support: the number of instances the rule predicts correctly.
Confidence: the number of correct predictions, as a proportion of all instances the rule applies to.
Example: the 4 cool days with normal humidity give support = 4, confidence = 100% for
if temperature = cool then humidity = normal
Normally, minimum support and confidence are pre-specified (e.g. 58 rules with support >= 2 and confidence >= 95% for the weather data).

23 Interpreting Association Rules
(Same weather table as slide 20.) Interpretation is not obvious:
if windy = false and play = no then outlook = sunny and humidity = high
is not shorthand for the pair of rules
if windy = false and play = no then outlook = sunny
if windy = false and play = no then humidity = high
but the first rule does mean that both of those hold.
Recall from Discrete Structures: binding more means stronger! A formula is STRONGER if it restricts the state more, and WEAKER when fewer restrictions are in place. The predicate true is the weakest (true in all states); false is the strongest (true in no states).
Thus in the rule
if windy = false and play = no then outlook = sunny and humidity = high
the consequent is stronger than the consequent of
if humidity = high and windy = false and play = no then outlook = sunny

