Section 2.1 Introduction to Enterprise Miner. 2 Objectives Open Enterprise Miner. Explore the workspace components of Enterprise Miner. Set up a project.


1 Section 2.1 Introduction to Enterprise Miner

2 Objectives: Open Enterprise Miner. Explore the workspace components of Enterprise Miner. Set up a project in Enterprise Miner. Conduct initial data exploration using Enterprise Miner.

3 Demonstration: This demonstration illustrates opening Enterprise Miner and exploring its workspace components.

4 The Scenario: Determine who should be approved for a home equity loan. The target variable is a binary variable that indicates whether an applicant eventually defaulted on the loan. The input variables include the amount of the loan, the amount due on the existing mortgage, the value of the property, and the number of recent credit inquiries.

5 Demonstration: This demonstration illustrates setting up a project in Enterprise Miner and conducting initial data exploration.

6 Section 2.2 Modeling Issues and Data Difficulties

7 Objectives: Discuss data difficulties inherent in data mining. Examine common pitfalls in model building.

8 Time Line [Figure: timelines labeled Projected, Actual, Dreaded, and Needed, dividing the Allotted Time among (Data Acquisition), Data Preparation, and Data Analysis]

9 Time Line [Figure: timelines labeled Projected, Actual, Dreaded, and Needed, dividing the Allotted Time among (Data Acquisition), Data Preparation, and Data Analysis]

10 Data Arrangement
   Long-Narrow:
     Acct  type
     2133  MTG
     2133  SVG
     2133  CK
     2653  CK
     2653  SVG
     3544  MTG
     3544  CK
     3544  MMF
     3544  CD
     3544  LOC
   Short-Wide:
     Acct  CK  SVG  MMF  CD  LOC  MTG
     2133   1    1    0   0    0    1
     2653   1    1    0   0    0    0
     3544   1    0    1   1    1    1
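The rearrangement on this slide can be sketched in a few lines of Python. This is only an illustration of the long-narrow to short-wide idea, not how Enterprise Miner does it; the function name `to_short_wide` and the column order in `PRODUCTS` are assumptions chosen to match the slide's wide table.

```python
from collections import defaultdict

# Long-narrow records from the slide: one row per (account, product type).
long_narrow = [
    (2133, "MTG"), (2133, "SVG"), (2133, "CK"),
    (2653, "CK"), (2653, "SVG"),
    (3544, "MTG"), (3544, "CK"), (3544, "MMF"), (3544, "CD"), (3544, "LOC"),
]

# Column order of the short-wide table (taken from the slide).
PRODUCTS = ["CK", "SVG", "MMF", "CD", "LOC", "MTG"]


def to_short_wide(rows, products):
    """Return {acct: [0/1 flag per product]}, one entry per account."""
    held = defaultdict(set)
    for acct, product in rows:
        held[acct].add(product)
    return {
        acct: [int(p in owned) for p in products]
        for acct, owned in sorted(held.items())
    }


short_wide = to_short_wide(long_narrow, PRODUCTS)
print("Acct", *PRODUCTS)
for acct, flags in short_wide.items():
    print(acct, *flags)
```

Each account ends up as a single case with one indicator input per product, which is the arrangement a predictive model expects.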

11 Derived Inputs
     Claim Date  Accident Time  Delay  Season  Dark
     11nov96     102396/12:38      19  fall       0
     22dec95     012395/01:42     333  winter     1
     26apr95     042395/03:05       3  spring     1
     02jul94     070294/06:25       0  summer     0
     08mar96     123095/18:33      69  winter     0
     15dec96     061296/18:12     186  summer     0
     09nov94     110594/22:14       4  fall       1
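A minimal sketch of how the three derived inputs might be computed from the two raw fields. The delay is the number of days between accident and claim. The season mapping (meteorological seasons) and the darkness rule (before 06:00 or from 19:00 on) are assumptions: they happen to reproduce the slide's values, but the slides do not state the rules actually used. Month abbreviations are parsed with Python's default (English) locale.

```python
from datetime import datetime

# (claim date, accident date/time) pairs exactly as shown on the slide.
RAW = [
    ("11nov96", "102396/12:38"),
    ("22dec95", "012395/01:42"),
    ("26apr95", "042395/03:05"),
    ("02jul94", "070294/06:25"),
    ("08mar96", "123095/18:33"),
    ("15dec96", "061296/18:12"),
    ("09nov94", "110594/22:14"),
]

# Assumed meteorological seasons, keyed by month number.
SEASON = {12: "winter", 1: "winter", 2: "winter",
          3: "spring", 4: "spring", 5: "spring",
          6: "summer", 7: "summer", 8: "summer",
          9: "fall", 10: "fall", 11: "fall"}


def derive(claim_text, accident_text, dark_from=19, dark_until=6):
    """Return (delay in days, season of accident, dark flag)."""
    claim = datetime.strptime(claim_text, "%d%b%y").date()
    accident = datetime.strptime(accident_text, "%m%d%y/%H:%M")
    delay = (claim - accident.date()).days
    # Hypothetical darkness rule: hour >= dark_from or hour < dark_until.
    dark = int(accident.hour >= dark_from or accident.hour < dark_until)
    return delay, SEASON[accident.month], dark


derived = [derive(claim, accident) for claim, accident in RAW]
for row in derived:
    print(*row)
```

The raw date strings carry little predictive value on their own; the derived delay, season, and darkness flag express the same information in a form a model can use.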

12 Roll Up
     HH    Acct  Sales
     4461  2133    160
     4461  2244     42
     4461  2773    212
     4461  2653    250
     4461  2801    122
     4911  3544    786
     5630  2496    458
     5630  2635    328
     6225  4244     27
     6225  4165    759
   Rolled up to one row per HH:
     HH    Acct  Sales
     4461  2133      ?
     4911  3544      ?
     5630  2496      ?
     6225  4244      ?
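One way to fill in the question marks is to compute several summaries per household rather than a single one. The sketch below is illustrative only; the choice of summaries (count, total, mean, maximum) and the name `roll_up` are assumptions, not something the slide prescribes. With the slide's data, every household total comes out to 786, so the total alone would not distinguish the four households.

```python
from collections import defaultdict
from statistics import mean

# (household, account, sales) rows from the slide.
accounts = [
    (4461, 2133, 160), (4461, 2244, 42), (4461, 2773, 212),
    (4461, 2653, 250), (4461, 2801, 122),
    (4911, 3544, 786),
    (5630, 2496, 458), (5630, 2635, 328),
    (6225, 4244, 27), (6225, 4165, 759),
]


def roll_up(rows):
    """Summarize account-level sales to one record per household."""
    sales_by_hh = defaultdict(list)
    for hh, _acct, sales in rows:
        sales_by_hh[hh].append(sales)
    return {
        hh: {
            "n_accts": len(sales),
            "total": sum(sales),
            "mean": mean(sales),
            "max": max(sales),
        }
        for hh, sales in sorted(sales_by_hh.items())
    }


households = roll_up(accounts)
for hh, summary in households.items():
    print(hh, summary)
```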

13 Rolling Up Longitudinal Data
     Flier  Month  Frequent Flying Mileage  VIP Member
     10621  Jan                        650  No
     10621  Feb                          0  No
     10621  Mar                          0  No
     10621  Apr                        250  No
     33855  Jan                        350  No
     33855  Feb                        300  No
     33855  Mar                       1200  Yes
     33855  Apr                        850  Yes

14 Hard Target Search [Figure labels: Transactions, Fraud]

15 Oversampling [Figure labels: OK, Fraud]

16 Undercoverage [Figure labels: Accepted Good, Accepted Bad, Rejected No Follow-up, Next Generation]

17 Errors, Outliers, and Missings [Table with columns cking, #cking, ADB, NSF, dirdep, SVG, bal; row values as extracted, with blank cells lost:] Y 1 468.11 1 1876 Y 1208 Y 1 68.75 0 0 Y 0 Y 1 212.04 0 6 0.. 0 0 Y 4301 y 2 585.05 0 7218 Y 234 Y 1 -47.69 2 1256 238 Y 1 4687.7 0 0 0.. 1 0 Y 1208 Y... 1598 0 1 0.00 0 0 0 Y 3 89981.12 0 0 Y 45662 Y 2 585.05 0 7218 Y 234

18 Missing Value Imputation [Figure: grid of Cases by Inputs with missing cells marked "?"]
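As a rough illustration of imputation, the sketch below fills missing values in one input with the median of the observed values and keeps an indicator of which cases were imputed. Median replacement is only one common choice and is an assumption here; the slide does not say which method the course uses. The sample values and the name `impute_median` are hypothetical.

```python
from statistics import median


def impute_median(column):
    """Replace None with the median of observed values.

    Returns (imputed values, 0/1 indicator of which cases were missing).
    """
    observed = [v for v in column if v is not None]
    fill = median(observed)
    imputed = [fill if v is None else v for v in column]
    was_missing = [int(v is None) for v in column]
    return imputed, was_missing


# Hypothetical input with two missing cases.
balances = [468.11, None, 212.04, 585.05, None]
imputed, was_missing = impute_median(balances)
print(imputed)
print(was_missing)
```

Keeping the indicator alongside the imputed input lets a model still use the fact that a value was missing.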

19 The Curse of Dimensionality [Figure panels: 1–D, 2–D, 3–D]
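The effect pictured in the 1-D, 2-D, and 3-D panels can be put in numbers with a small sketch. The case count (1,000) and the grid of 10 bins per axis are hypothetical values chosen for illustration: as inputs are added, the same cases are spread over exponentially more cells, and a neighborhood must span more of each axis to hold the same share of the data.

```python
def cell_count(bins_per_axis, dims):
    """Number of grid cells when each of `dims` inputs is cut into bins."""
    return bins_per_axis ** dims


def edge_for_fraction(fraction, dims):
    """Edge length of a sub-cube of the unit cube holding `fraction` of
    uniformly spread data."""
    return fraction ** (1.0 / dims)


N_CASES = 1000  # hypothetical sample size
BINS = 10       # hypothetical bins per input

density = {d: N_CASES / cell_count(BINS, d) for d in (1, 2, 3)}
for d in (1, 2, 3):
    print(f"{d}-D: {density[d]:g} cases per cell, "
          f"edge for 10% of data = {edge_for_fraction(0.1, d):.3f}")
```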

20 Dimension Reduction [Figure labels: Input 1, Input 2, E(Target), Irrelevancy]

21 Fool’s Gold My model fits the training data perfectly... I’ve struck it rich!

22 Data Splitting
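Data splitting can be sketched as a seeded random partition into training, validation, and test sets. The 60/20/20 proportions, the seed, and the function name `split_data` are assumptions made for illustration; the slide does not specify them.

```python
import random


def split_data(rows, percents=(60, 20, 20), seed=12345):
    """Randomly partition rows into (train, validation, test)."""
    if sum(percents) != 100:
        raise ValueError("percents must sum to 100")
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)  # seeded for repeatability
    n = len(shuffled)
    n_train = n * percents[0] // 100
    n_valid = n * percents[1] // 100
    train = shuffled[:n_train]
    valid = shuffled[n_train:n_train + n_valid]
    test = shuffled[n_train + n_valid:]
    return train, valid, test


case_ids = range(5000)  # hypothetical case identifiers
train, valid, test = split_data(case_ids)
print(len(train), len(valid), len(test))
```

The model is fit on the training set only; the held-out sets give an honest estimate of how it performs on new cases.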

23 Model Complexity [Figure panels: Too flexible, Not flexible enough]

24 Overfitting [Figure panels: Training Set, Test Set]

25 Better Fitting [Figure panels: Training Set, Test Set]

26 Section 2.3 Introduction to Decision Trees

27 Objectives: Explore the general concept of decision trees. Understand the different decision tree algorithms. Discuss the benefits and drawbacks of decision tree models.

28 Fitted Decision Tree [Figure: fitted tree for the target BAD with splits on DEBTINC, DELINQ, and NINQ; node percentages shown: 2%, 10%, 21%, 45%, 75%] New Case: DEBTINC = 20, NINQ = 2, DELINQ = 0, Income = 42K

29 Divide and Conquer: Root node: n = 5,000, 10% BAD. Split on Debt-to-Income Ratio < 45. Yes: n = 3,350, 5% BAD. No: n = 1,650, 21% BAD.

30 The Cultivation of Trees
   Split Search: Which splits are to be considered?
   Splitting Criterion: Which split is best?
   Stopping Rule: When should the splitting stop?
   Pruning Rule: Should some branches be lopped off?

31 Possible Splits to Consider [Figure: number of possible splits (vertical axis, 1 to 500,000) against Input Levels (horizontal axis, 2 to 20), with curves for Nominal Input and Ordinal Input]
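The gap between the two curves can be reproduced with standard counting formulas, assuming the chart counts binary splits (an assumption, though the axis maximum near 500,000 at 20 levels is consistent with it). An ordinal input with L levels can only be cut between adjacent levels, giving L - 1 splits, while a nominal input allows any two-group partition of its levels, giving 2^(L-1) - 1.

```python
def binary_splits_ordinal(levels):
    """Cut points between adjacent ordered levels."""
    return levels - 1


def binary_splits_nominal(levels):
    """Ways to partition unordered levels into two non-empty groups."""
    return 2 ** (levels - 1) - 1


for levels in range(2, 21, 2):
    print(levels, binary_splits_ordinal(levels), binary_splits_nominal(levels))
```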

32 Splitting Criteria
   Perfect Split:
              Left  Right  Total
     Not Bad  4500      0   4500
     Bad         0    500    500
   Debt-to-Income Ratio < 45:
              Left  Right  Total
     Not Bad  3196   1304   4500
     Bad       154    346    500
   A Competing Three-Way Split:
              Left  Center  Right  Total
     Not Bad  2521    1188    791   4500
     Bad       115     162    223    500
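To make "which split is best?" concrete, the sketch below scores the three candidate splits from this slide by reduction in Gini impurity. Gini reduction is used here only as one familiar example of a splitting criterion; the slide does not say which criterion is applied, and other criteria (for example, chi-square based ones) may rank multiway splits differently. The dictionary keys are hypothetical labels.

```python
def gini(counts):
    """Gini impurity of a node given its class counts."""
    n = sum(counts)
    return 1.0 - sum((c / n) ** 2 for c in counts)


def gini_reduction(branches):
    """Parent impurity minus the size-weighted impurity of the branches.

    `branches` is a list of (not_bad, bad) counts, one pair per branch.
    """
    parent = [sum(column) for column in zip(*branches)]
    n = sum(parent)
    weighted = sum(sum(b) / n * gini(b) for b in branches)
    return gini(parent) - weighted


# (Not Bad, Bad) counts per branch, taken from the slide.
splits = {
    "perfect": [(4500, 0), (0, 500)],
    "dti_lt_45": [(3196, 154), (1304, 346)],
    "three_way": [(2521, 115), (1188, 162), (791, 223)],
}

gains = {name: gini_reduction(branches) for name, branches in splits.items()}
for name, gain in gains.items():
    print(f"{name}: {gain:.4f}")
```

The perfect split removes all of the root node's impurity (0.18 for a 90/10 class mix), which is the upper bound any real split is measured against.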

33 The Right-Sized Tree: Stunting. Pruning.

34 A Field Guide to Tree Algorithms: CART, AID, THAID, CHAID, ID3, C4.5, C5.0

35 Benefits of Trees: Interpretability (tree-structured presentation). Mixed Measurement Scales (nominal, ordinal, interval). Regression trees. Robustness. Missing Values.

36 Benefits of Trees: Automatically detects interactions (AID), accommodates nonlinearity, and selects input variables. [Figure: Multivariate Step Function]

37 Drawbacks of Trees: Roughness. Linear, Main Effects. Instability.

38 Section 2.4 Building and Interpreting Decision Trees

39 Objectives: Explore the types of decision tree models available in Enterprise Miner. Build a decision tree model. Examine the model results and interpret these results. Choose a decision threshold theoretically and empirically.

40 Demonstration: This demonstration illustrates building a decision tree model with Enterprise Miner and examining the results.

41 Consequences of a Decision
               Decision 1      Decision 0
     Actual 1  True Positive   False Negative
     Actual 0  False Positive  True Negative

42 Example: Recall the home equity line of credit scoring example. Presume that every two dollars loaned eventually returns three dollars if the loan is paid off in full.

43 Consequences of a Decision
               Decision 1                 Decision 0
     Actual 1  True Positive              False Negative (cost=$2)
     Actual 0  False Positive (cost=$1)   True Negative

44 Bayes Rule

45 Consequences of a Decision
               Decision 1                    Decision 0
     Actual 1  True Positive (profit=$2)     False Negative
     Actual 0  False Positive (profit=$-1)   True Negative
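The cost matrix on slide 43 and the profit matrix on slide 45 lead to the same theoretical threshold, which the following sketch works out. It assumes p is the predicted probability that the actual class is 1, that cells without a stated cost or profit are zero, and that the decision with the better expected outcome is chosen; the function names are hypothetical.

```python
def threshold_from_costs(cost_fn, cost_fp):
    """Probability above which decision 1 has the lower expected cost.

    Decision 1 costs (1 - p) * cost_fp; decision 0 costs p * cost_fn.
    Setting the two equal gives p = cost_fp / (cost_fp + cost_fn).
    """
    return cost_fp / (cost_fp + cost_fn)


def threshold_from_profits(profit_tp, profit_fp):
    """Probability above which decision 1 has positive expected profit.

    Decision 1 earns p * profit_tp + (1 - p) * profit_fp; decision 0
    earns 0 (assumed). Setting the expression to 0 gives the threshold.
    """
    return -profit_fp / (profit_tp - profit_fp)


t_cost = threshold_from_costs(cost_fn=2, cost_fp=1)
t_profit = threshold_from_profits(profit_tp=2, profit_fp=-1)
print(t_cost, t_profit)
```

With the slide's values both forms give a threshold of 1/3: a case is assigned decision 1 when its predicted probability of class 1 exceeds one third, rather than the default one half.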

46 Demonstration: This demonstration illustrates using the target profile to select a decision threshold.

