Presentation on theme: "IDSL, Intelligent Database System Lab" — Presentation transcript:

1 IDSL, Intelligent Database System Lab
Learning and Making Decisions When Costs and Probabilities are Both Unknown
Authors: Bianca Zadrozny, Charles Elkan
Advisor: Dr. Hsu
Graduate: Yu-Wei Su

2 IDSL, Intelligent Database System Lab
Outline
Motivation
Objective
Introduction
MetaCost vs. direct cost-sensitive decision-making
A testbed: the KDD'98 charitable donations dataset
Probability estimation methods
Estimating donation amounts
Experimental results
Conclusion
Opinion

3 IDSL, Intelligent Database System Lab
Motivation
Misclassification costs are different for different examples, in the same way that probabilities are. Real-world datasets also present the problem of class imbalance.

4 IDSL, Intelligent Database System Lab
Objective
To make optimal decisions given costs and probabilities. To address sample selection bias using the correction due to Nobel prize-winning economist James Heckman.

5 IDSL, Intelligent Database System Lab
Introduction
Most supervised learning algorithms assume that all errors (incorrect predictions) are equally costly, which is not true in practice. Cost-sensitive learning aims for the decision with the lowest expected cost, whereas non-cost-sensitive learning only aims to classify accurately. The paper presents an alternative method called direct cost-sensitive decision-making.

6 MetaCost vs. direct cost-sensitive decision-making
Each example x is associated with a cost C(i,j,x) of predicting class i for x when the true class of x is j. The optimal decision concerning x is the class i that leads to the lowest expected cost.
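The decision rule is not written out on this slide; as a hedged reconstruction of the standard expected-cost criterion it refers to, the optimal prediction for x is

```latex
i^{*}(x) = \arg\min_{i} \sum_{j} P(j \mid x)\, C(i, j, x)
```

i.e., the costs C(i,j,x) are weighted by the estimated class membership probabilities P(j|x).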

7 MetaCost vs. direct cost-sensitive decision-making (cont.)
Direct cost-sensitive decision-making shares the same central idea, but differs from MetaCost in two ways: MetaCost assumes that costs are known in advance and are the same for all examples, and MetaCost estimates probabilities using bagging, whereas here simpler methods based on a single decision tree are used.

8 A testbed: the KDD'98 charitable donations dataset
The training set consists of records with known classes; the test set consists of records without known classes. The overall percentage of donors in the population is about 5%. The donation amount for persons who respond varies from $1 to $200.

9 A testbed: the KDD'98 charitable donations dataset (cont.)
In the donation domain it is easier to talk consistently about benefit than about cost. The optimal predicted label for example x is the class i that maximizes the expected benefit (j=1 means the person does donate; j=0 means the person does not).
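The benefit formula itself is not transcribed from the slide; as a hedged reconstruction of the expected-benefit criterion the text describes, with B(i,j,x) denoting the benefit of predicting class i for x when the true class is j:

```latex
i^{*}(x) = \arg\max_{i} \sum_{j} P(j \mid x)\, B(i, j, x)
```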

10 A testbed: the KDD'98 charitable donations dataset (cont.)
The optimal policy
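The policy appears only as an image on the original slide. As a hedged reconstruction, assuming the usual KDD'98 setup in which each solicitation costs $0.68 and y(x) denotes the estimated donation amount of x if x donates, the optimal policy is to solicit person x exactly when the expected donation exceeds the mailing cost:

```latex
P(j = 1 \mid x)\; y(x) > 0.68
```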

11 Probability estimation methods
Deficiencies of decision tree methods
Smoothing
Curtailment
Calibrating naive Bayes classifier scores
Averaging probability estimates

12 Deficiencies of decision tree methods
Standard decision tree methods assign to each leaf the raw training frequency p = k/n by default. These are not accurate conditional probability estimates, for at least two reasons: high bias and high variance. Pruning can alleviate the problem, but it is not suitable for unbalanced datasets.

13 Deficiencies of decision tree methods (cont.)
The solution: use C4.5 without pruning and without collapsing, to obtain raw scores that can then be transformed into accurate class membership probabilities.
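A minimal sketch of obtaining such raw scores, using scikit-learn's CART tree as a stand-in for C4.5 and a synthetic unbalanced dataset as a stand-in for KDD'98 (both are assumptions, not the paper's setup):

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic dataset with roughly 5% positives, mimicking the donor rate.
X, y = make_classification(n_samples=5000, weights=[0.95, 0.05], random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Grow the tree fully: no pruning and no collapsing, analogous to running
# C4.5 with pruning and collapsing disabled.
tree = DecisionTreeClassifier(criterion="entropy", ccp_alpha=0.0, random_state=0)
tree.fit(X_train, y_train)

# predict_proba returns the raw training frequency k/n of the leaf each test
# example falls into; smoothing or curtailment will later calibrate these.
raw_scores = tree.predict_proba(X_test)[:, 1]
```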

14 IDSL, Intelligent Database System Lab
Smoothing
The Laplace correction method: for a two-class problem, it replaces the conditional probability estimate p = k/n by p' = (k+1)/(n+2), which adjusts probability estimates to be closer to 1/2.
For the donation domain, the probability p = k/n is instead replaced by p' = (k + bm)/(n + m), where b is the base rate of the positive class and m is a smoothing parameter.
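A minimal sketch of this smoothing rule (the function name is hypothetical, not from the paper):

```python
def smooth_score(k, n, b=0.05, m=200):
    """m-estimate smoothing of a raw leaf frequency k/n.

    b is the base rate of the positive class and m controls how strongly
    small leaves are pulled toward b; b=0.05 and m=200 are the donation-
    domain values used in these slides.
    """
    return (k + b * m) / (n + m)

# Laplace correction is the special case b=1/2, m=2:
print(smooth_score(1, 4, b=0.5, m=2))  # (1+1)/(4+2) = 0.333...
print(smooth_score(1, 4))              # (1+10)/(4+200) ≈ 0.054
```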

15 IDSL, Intelligent Database System Lab
Smoothing (cont.)
For example, if a leaf contains four examples, one of which is positive, the raw C4.5 score of this leaf is 0.25. The smoothed score with m = 200 and b = 0.05 is p' = (1 + 0.05·200)/(4 + 200) = 11/204 ≈ 0.054.

16 IDSL, Intelligent Database System Lab
Smoothing (cont.)

17 IDSL, Intelligent Database System Lab
Curtailment
Curtailment is introduced to overcome the problem of overfitting. Curtailment is not equivalent to any type of pruning.
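The slide does not transcribe how curtailment works; the following is an illustrative sketch under the assumption that curtailment means: when scoring a test example, stop descending the tree as soon as the next node would contain fewer than v training examples, and use the frequency of the last sufficiently large node instead (the function name and the threshold v=100 are assumptions, and the positive class is assumed to be class 1):

```python
import numpy as np

def curtailed_scores(tree, X, v=100):
    """Return positive-class frequencies from a fitted sklearn decision tree,
    curtailing the descent at nodes with fewer than v training examples."""
    t = tree.tree_
    X = np.asarray(X)
    scores = np.empty(len(X))
    for i, x in enumerate(X):
        node = 0
        while t.children_left[node] != -1:  # while not at a leaf
            go_left = x[t.feature[node]] <= t.threshold[node]
            child = t.children_left[node] if go_left else t.children_right[node]
            if t.n_node_samples[child] < v:  # child too small: stop here
                break
            node = child
        counts = t.value[node][0]
        scores[i] = counts[1] / counts.sum()  # positive-class frequency
    return scores
```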

18 IDSL, Intelligent Database System Lab
Curtailment (cont.)

19 IDSL, Intelligent Database System Lab
Curtailment (cont.)

20 Calibrating naive Bayes classifier scores
A histogram method is used to obtain calibrated probability estimates from a naive Bayesian classifier. Sort the training examples according to their scores and divide the sorted set into b bins of equal size. Given a test example x, place it in a bin according to its score n(x), and estimate the corrected probability as the fraction of positive training examples in that bin.
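A minimal sketch of this binning calibration (the function and parameter names are hypothetical; labels are assumed to be 0/1):

```python
import numpy as np

def histogram_calibrate(train_scores, train_labels, test_scores, b=10):
    """Sort training examples by score, split them into b equal-size bins,
    and map each test score to the fraction of positives in its bin."""
    order = np.argsort(train_scores)
    scores = np.asarray(train_scores)[order]
    labels = np.asarray(train_labels)[order]

    bins = np.array_split(np.arange(len(scores)), b)
    uppers = np.array([scores[idx[-1]] for idx in bins])        # bin upper bounds
    fractions = np.array([labels[idx].mean() for idx in bins])  # positives per bin

    # Place each test score into the first bin whose upper bound covers it.
    positions = np.clip(np.searchsorted(uppers, test_scores), 0, b - 1)
    return fractions[positions]
```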

21 Averaging probability estimates
Combining the probability estimates given by different classifiers through averaging can reduce the variance of the probability estimates [Tumer and Ghosh, 1995]. In the slide's formula, σ² is the variance of each original classifier, N is the number of classifiers, and ρ is the correlation factor among all classifiers.
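The formula appears only as an image on the original slide; as a hedged reconstruction of the Tumer and Ghosh result it cites, the variance of the averaged estimate is

```latex
\sigma^{2}_{\text{avg}} = \frac{1 + \rho\,(N - 1)}{N}\,\sigma^{2}
```

so averaging N uncorrelated classifiers (ρ = 0) divides the variance by N, while perfectly correlated classifiers (ρ = 1) give no reduction at all.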

22 Estimating donation amounts
One option is to impute a donation amount of zero for non-donors in the training set, since their actual donation amount is zero; but then the amount estimate becomes entangled with the donation probability, which is estimated separately. It is also wrong to use the same donation estimate for all test examples, because then the decision about whom to solicit is based only on the probability.

23 Estimating donation amounts (cont.)
These costs or benefits must be estimated for each example. Least-squares multiple linear regression (MLR) is used to estimate the donation amount from two attributes: Lastgift, the dollar amount of the most recent gift, and Ampergift, the average gift amount in responses to the last 22 promotions.
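A minimal sketch of this regression step, with synthetic data standing in for the KDD'98 fields (the field values, coefficients, and donor rate below are invented for illustration only):

```python
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)
n = 1000
lastgift = rng.uniform(1, 200, size=n)    # dollar amount of most recent gift
ampergift = rng.uniform(1, 200, size=n)   # average gift over last 22 promotions
donated = rng.random(n) < 0.05            # ~5% donors
amount = np.where(donated, 0.5 * lastgift + 0.4 * ampergift, 0.0)

# Fit the regression on donors only, then predict an amount for everyone.
X = np.column_stack([lastgift, ampergift])
reg = LinearRegression().fit(X[donated], amount[donated])
y_hat = reg.predict(X)
```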

24 Estimating donation amounts (cont.)
The problem of sample selection bias: donation amounts estimated by the regression equation tend to be too low for test examples that have a low probability of donation.

25 Estimating donation amounts (cont.)
Heckman correction:
1. Learn a probit linear model to estimate the conditional probabilities P(j=1|x).
2. Estimate y(x) by linear regression, using only the training examples x for which j(x)=1 but including the value of P(j=1|x).
In this paper, the probability estimates used in the second step of Heckman's procedure are obtained from a decision tree or a naive Bayes classifier.
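A minimal two-step sketch in this spirit (simplified; the helper name is hypothetical, the inputs are assumed to be numpy arrays with j in {0,1}, and naive Bayes stands in for the first-step model as the slide allows):

```python
import numpy as np
from sklearn.naive_bayes import GaussianNB
from sklearn.linear_model import LinearRegression

def heckman_style_amounts(X, j, amount, X_test):
    """Step 1: estimate P(j=1|x) with a classifier.
    Step 2: regress the donation amount on the features plus P(j=1|x),
    using donors only, then apply the regression to all test examples."""
    clf = GaussianNB().fit(X, j)
    p_train = clf.predict_proba(X)[:, 1]
    p_test = clf.predict_proba(X_test)[:, 1]

    donors = j == 1
    reg = LinearRegression().fit(
        np.column_stack([X, p_train])[donors], amount[donors])
    return reg.predict(np.column_stack([X_test, p_test]))
```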

26 IDSL, Intelligent Database System Lab
Experimental results

27 IDSL, Intelligent Database System Lab
Conclusion
A method of cost-sensitive learning that performs systematically better than MetaCost in experiments.
It provides a solution to the fundamental problem of costs being different for different examples.
It identifies and solves the problem of sample selection bias.

28 IDSL, Intelligent Database System Lab
Opinion
Frequency is not the only metric.
Positive and negative classes are not simply 1 and 0.
Questions?

