1 Decision Tree Algorithms
– Rule based
– Suitable for automatic generation

2 Decision Trees
(McGraw-Hill/Irwin ©2007 The McGraw-Hill Companies, Inc. All rights reserved)
– Logical branching
– Historical: ID3, an early rule-generating system
– Branches: the different possible values of a variable
– Nodes: the points from which branches emanate

3 Goal-Driven Data Mining
– Define the goal (e.g., identify fraudulent cases)
– Develop rules identifying attributes that attain the goal
– Example: IF attorney = Smith, THEN better check

4 Tree Structure
– Sorts out data via IF-THEN rules
– Loan variables:
  Age: {young, middle, old}
  Income: {low, average, high}
  Risk: {low, medium, high}
– An exhaustive tree enumerates all combinations (81 combinations classify all cases)
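As an illustrative sketch (variable names assumed), the exhaustive enumeration can be generated with Python's itertools. The three listed variables give 3 × 3 × 3 = 27 combinations; the slide's figure of 81 would correspond to including a fourth three-valued factor.

```python
from itertools import product

# Loan variables and their possible values, as listed on the slide
age = ["young", "middle", "old"]
income = ["low", "average", "high"]
risk = ["low", "medium", "high"]

# Every leaf an exhaustive tree over these variables must classify
combinations = list(product(age, income, risk))
print(len(combinations))  # 27 for the three variables shown
```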

5 Types of Trees
– Classification tree: variable values are classes; finite conditions
– Regression tree: variable values are continuous numbers; used for prediction or estimation

6 Rule Induction
– Automatically processes data
  Classification (logical, easier)
  Regression (estimation, messier)
– Searches through data for patterns and relationships
– Pure knowledge discovery: assumes no prior hypothesis, disregards human judgment

7 Example
– Three variables: Age, Income, Risk
– Outcomes: On-time (OT), Late

8 Combinations
Variable  Value    Cases  OT  Late  Pr(OT)
Age       Young    12     8   4     0.67
          Middle   5      4   1     0.80
          Old      3      3   0     1.00
Income    Low      5      3   2     0.60
          Average  9      7   2     0.78
          High     6      5   1     0.83
Risk      High     9      5   4     0.55
          Average  1      0   1     0.00
          Low      10     10  0     1.00

9 Basis for Classification
– If a category has all outcomes of one kind, it makes a good rule (e.g., IF Risk = Low, every case paid on time)
– ENTROPY: nominally a measure of information content, actually a measure of randomness

10 Entropy Formula
Information = -[p/(p+n)] log2[p/(p+n)] - [n/(p+n)] log2[n/(p+n)]
– The lower the measure, the greater the information content
– Can be used to automatically select the variable with the most productive rule potential
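As a minimal sketch, the formula translates directly to Python, with p on-time and n late cases in a category, and taking 0 · log2(0) = 0 by convention:

```python
import math

def information(p, n):
    """Entropy of a category with p positive (on-time) and n negative
    (late) cases, per the formula above; 0*log2(0) is taken as 0."""
    total = p + n
    result = 0.0
    for count in (p, n):
        if count:
            frac = count / total
            result -= frac * math.log2(frac)
    return result

print(round(information(8, 4), 3))  # Age = Young from the table: 0.918
print(information(10, 0))           # Risk = Low: 0.0 (pure category, best rule)
```

A pure category (all one outcome) scores 0, so lower values mark the most informative splits.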

11 Entropy
Age:
– Young: [-(8/12) log2(8/12) - (4/12) log2(4/12)] × 12/20 = (0.390 + 0.528) × 0.6 = 0.551
– Middle: [-(4/5) log2(4/5) - (1/5) log2(1/5)] × 5/20 = (0.258 + 0.464) × 0.25 = 0.180
– Old: [-(3/3) log2(3/3) - 0] × 3/20 = 0.000
– Sum (Age): 0.731
Income: 0.782
Risk: 0.446
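The per-variable figures can be checked with a short sketch using the (on-time, late) counts from the combinations table (names are illustrative):

```python
import math

def information(p, n):
    """Entropy of a category with p on-time and n late cases."""
    total = p + n
    return -sum((c / total) * math.log2(c / total) for c in (p, n) if c)

# (on-time, late) counts per value, from the combinations table
counts = {
    "Age":    [(8, 4), (4, 1), (3, 0)],
    "Income": [(3, 2), (7, 2), (5, 1)],
    "Risk":   [(5, 4), (0, 1), (10, 0)],
}
TOTAL = 20  # total cases

# Case-weighted entropy per variable
weighted = {
    var: round(sum(information(p, n) * (p + n) / TOTAL for p, n in splits), 3)
    for var, splits in counts.items()
}
print(weighted)  # {'Age': 0.731, 'Income': 0.782, 'Risk': 0.446}
```

Risk has the lowest weighted entropy, so it yields the most productive first split.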

12 Rule
1. IF (Risk = Low) THEN OT
2. ELSE Late

13 All Rules
1. IF Risk = Low THEN OT
2. IF Risk NOT Low AND Age = Middle THEN Late
3. IF Risk NOT Low AND Age NOT Middle AND Income = High THEN Late
4. ELSE OT

14 Sample Case
– Age 36 → Middle
– Income $70K/year → Average
– Risk (Assets $42K, Debts $40K, Wants $5K) → Average
– Rule 2 applies: Late
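The four induced rules, applied in order to the sample case, can be sketched as follows (function and value names are illustrative, not from the source):

```python
def classify(age, income, risk):
    """Apply the four induced rules in order; category values
    follow the slides ('low', 'middle', 'high', etc.)."""
    if risk == "low":
        return "OT"    # Rule 1
    if age == "middle":
        return "Late"  # Rule 2
    if income == "high":
        return "Late"  # Rule 3
    return "OT"        # Rule 4 (ELSE)

# Sample case: age 36 -> middle, $70K/year -> average, risk -> average
print(classify("middle", "average", "average"))  # Late, via Rule 2
```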

15 Fuzzy Decision Trees
– So far we have assumed distinct (crisp) outcomes; many data points are not that clear
– Fuzzy: a membership function represents belief (between 0 and 1)
– Fuzzy relationships have been incorporated into decision tree algorithms

16 Fuzzy Example
Age:    Young 0.3   Middle 0.9   Old 0.2
Income: Low 0.0     Average 0.8  High 0.3
Risk:   Low 0.1     Average 0.8  High 0.3
Definitions:
– Memberships will not necessarily sum to 1.0
– If ambiguous, select the alternative with the larger membership value
– Aggregate with the mean

17 Fuzzy Model
IF Risk = Low THEN OT
– Membership: 0.1
IF Risk NOT Low AND Age = Middle THEN Late
– Risk: MAX(0.8, 0.3) = 0.8
– Age: 0.9
– Membership: mean(0.8, 0.9) = 0.85

18 Fuzzy Model (cont.)
IF Risk NOT Low AND Age NOT Middle AND Income = High THEN Late
– Risk: MAX(0.8, 0.3) = 0.8
– Age: MAX(0.3, 0.2) = 0.3
– Income: 0.3
– Membership: mean(0.8, 0.3, 0.3) = 0.467

19 Fuzzy Model (cont.)
IF Risk NOT Low AND Age NOT Middle AND Income NOT High THEN OT
– Risk: MAX(0.8, 0.3) = 0.8
– Age: MAX(0.3, 0.2) = 0.3
– Income: MAX(0.0, 0.8) = 0.8
– Membership: mean(0.8, 0.3, 0.8) = 0.633
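The membership arithmetic above can be reproduced in a short sketch, assuming the conventions stated in the fuzzy example: NOT-conditions take the MAX over the remaining values, and antecedents are aggregated with the mean (all names are illustrative):

```python
# Fuzzy memberships from the example slide
age = {"young": 0.3, "middle": 0.9, "old": 0.2}
income = {"low": 0.0, "average": 0.8, "high": 0.3}
risk = {"low": 0.1, "average": 0.8, "high": 0.3}

def mean(values):
    return sum(values) / len(values)

# NOT-conditions: MAX over the remaining values, as on the slides
risk_not_low = max(risk["average"], risk["high"])        # 0.8
age_not_middle = max(age["young"], age["old"])           # 0.3
income_not_high = max(income["low"], income["average"])  # 0.8

rule1 = risk["low"]                                            # 0.1
rule2 = mean([risk_not_low, age["middle"]])                    # 0.85
rule3 = mean([risk_not_low, age_not_middle, income["high"]])   # ~0.467
rule4 = mean([risk_not_low, age_not_middle, income_not_high])  # ~0.633
print(rule1, round(rule2, 2), round(rule3, 3), round(rule4, 3))
```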

20 Fuzzy Model (cont.)
– Highest membership function: 0.633, for Rule 4
– Conclusion: On-time

21 Applications
– Inventory prediction
– Clinical databases
– Software development quality

22 Inventory Prediction
– Groceries: possibly over 100,000 SKUs; barcode data input
– Data mining to discover patterns
  Random sample of over 1.6 million records, 30 months, 95 outlets
  Test sample of 400,000 records
– Rule induction more workable than regression
  28,000 rules; very accurate, up to 27% improvement

23 Clinical Database
– Headache: over 60 possible causes
– Exclusive reasoning uses negative rules (applied when a symptom is absent)
– Inclusive reasoning uses positive rules
– Probabilistic rule-induction expert system
  Headache: training sample of over 50,000 cases, 45 classes, 147 attributes
  Meningitis: 1,200 samples on 41 attributes, 4 outputs

24 Clinical Database (cont.)
– AQ15 and C4.5: average accuracy 82%
– Expert system: average accuracy 92%
– Rough-set rule system: average accuracy 70%
– Using both positive and negative rules from rough sets: average accuracy over 90%

25 Software Development Quality
– Telecommunications company
– Goal: find patterns in modules under development likely to contain faults discovered by customers
  Typical module: several million lines of code
  Probability of fault averaged 0.074
– Apply greater effort (specification, testing, inspection) to those modules

26 Software Quality (cont.)
– Preprocessed and reduced the data
– Used CART (Classification and Regression Trees), which can specify prior probabilities
– First model: 9 rules, 6 variables; better at cross-validation, but variable values not available until late in development
– Second model: 4 rules, 2 variables; about the same accuracy, data available earlier

27 Decision Trees
– Very effective and useful
– Automatic machine learning, thus unbiased (but omits judgment)
– Can handle very large data sets; not affected much by missing data
– Lots of software available

