M28- Categorical Analysis 1  Department of ISM, University of Alabama, 1992-2003 Categorical Data.

Presentation on theme: "M28- Categorical Analysis 1  Department of ISM, University of Alabama, 1992-2003 Categorical Data."— Presentation transcript:

2 M28- Categorical Analysis 1  Department of ISM, University of Alabama, 1992-2003 Categorical Data

3 Lesson Objective: Understand basic rules of probability. Calculate marginal and conditional probabilities. Determine if two categorical variables are independent.

4 Recall Rule of Thumb. Quantitative variables: averages or differences have meaning. Ex: weight, height, income, age.

5 Recall Rule of Thumb. Categorical variables: classify people or things. Ex: gender, race, occupation, political affiliation, country of origin.

6 Note: Sometimes quantitative variables are expressed as categorical. Income (Family Economic Income) classes: 1. Less than $30,000. 2. $30,000 but less than $100,000. 3. $100,000 or more.
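
The three income classes can be sketched as a small binning function. This is only an illustration: the function name `income_class`, the rejection of negative incomes, and the sample incomes are assumptions, not part of the slides.

```python
def income_class(income: float) -> int:
    """Return the class number (1, 2, or 3) for a family income in dollars."""
    if income < 0:
        # Assumption: negative incomes are treated as invalid input.
        raise ValueError("income must be non-negative")
    if income < 30_000:
        return 1  # Less than $30,000
    if income < 100_000:
        return 2  # $30,000 but less than $100,000
    return 3      # $100,000 or more


# Hypothetical incomes, chosen only to exercise each class boundary.
sample_incomes = [12_500, 30_000, 99_999.99, 100_000]
classes = [income_class(x) for x in sample_incomes]
print(classes)
```

Once binned, income is handled like any other categorical variable: averaging the class numbers 1, 2, and 3 would not be meaningful.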

7 Relationships between variables

8 Relationship between two quantitative variables? Is the relationship linear (scatterplot)? Use correlation and least squares regression. Data transformations.

9 Recall: Boxplots. Best graphical tool for examining the relationship between a quantitative variable and a categorical variable (i.e., comparing distributions). Example: Weight vs. Country of Origin. A boxplot can be used to answer: “Do the distributions of weights vary for different countries of origin?” [Figure: side-by-side boxplots of Weight for US, Far East, Europe]

10 Relationship between two categorical variables? Use two-way frequency tables: look at marginal probabilities and conditional probabilities.

11 M28- Categorical Data  Department of ISM, University of Alabama, 1995-2003 STATISTICS is the science of transforming data into information to make decisions in the face of uncertainty.

12 Probability. How do we measure "uncertainty"? Probability is a numerical measure of the likelihood that an outcome or an event occurs. P(A) = probability of event A.

13 Three Methods for Assessing Probability: Classical, Relative Frequency, Subjective.

14 Probability requirements for discrete variables (Binomial, Poisson):
1. 0 ≤ P(A) ≤ 1, where P(A) = 0 means an impossible event and P(A) = 1 means a certain event.
2. Sum of the probabilities of all possible outcomes must equal 1.
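
The two requirements can be checked mechanically. The sketch below is a minimal illustration; the function name `is_valid_distribution`, the tolerance, and both example distributions are assumptions, not part of the slides.

```python
import math


def is_valid_distribution(probs, tol=1e-9):
    """Check both requirements: each p lies in [0, 1] and the p's sum to 1."""
    each_in_range = all(0.0 <= p <= 1.0 for p in probs)
    # Compare with a tolerance because floating-point sums are rarely exact.
    sums_to_one = math.isclose(sum(probs), 1.0, abs_tol=tol)
    return each_in_range and sums_to_one


fair_die = [1 / 6] * 6          # hypothetical example: six equally likely faces
bad_probs = [0.5, 0.6, -0.1]    # sums to 1 but contains a negative "probability"

print(is_valid_distribution(fair_die))
print(is_valid_distribution(bad_probs))
```

The second list shows why both requirements are needed: a set of numbers can sum to 1 and still fail to be a probability distribution.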

15 Conditional probability: The chance one event happens, given that another event will occur.
P(A | B) = P(A and B) / P(B)
Numerator: all outcomes belonging to BOTH A AND B. Denominator: those outcomes in the restricted group, B.

16 Problem: Credit Card Manager. New credit test to determine creditworthiness. Credit test checked against 500 previous customers.

17 Credit Test A result (columns) by Credit History (rows):
Credit History   Passed (P)   Failed (F)   Total
Good (G)            350           50        400
Default (D)          20           80        100
Total               370          130        500

18 What is the probability of a customer defaulting? P(Defaults) = P(D) = 100/500 = 0.20.
What is the probability of a customer defaulting given that he fails test A? P(Defaults given failed test A) = P(D | F) = 80/130 ≈ 0.615.
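
As a rough sketch, the marginal probability P(D) and the conditional probability P(D | F) can be computed directly from the Credit Test A counts. The dictionary layout and variable names are illustrative choices, not something the slides prescribe.

```python
from fractions import Fraction

# Counts from the Credit Test A table, keyed by (credit history, test result).
test_a = {
    ("Good", "Passed"): 350, ("Good", "Failed"): 50,
    ("Default", "Passed"): 20, ("Default", "Failed"): 80,
}

n = sum(test_a.values())  # 500 customers in total
n_default = sum(c for (hist, _), c in test_a.items() if hist == "Default")
n_failed = sum(c for (_, result), c in test_a.items() if result == "Failed")

# Marginal probability: defaults out of everyone.
p_d = Fraction(n_default, n)
# Conditional probability: defaults within the restricted group "failed".
p_d_given_f = Fraction(test_a[("Default", "Failed")], n_failed)

print(float(p_d), round(float(p_d_given_f), 3))
```

The only difference between the two calculations is the denominator: all 500 customers for P(D), but only the 130 who failed the test for P(D | F).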

19 General Rules:
P(A and B) = P(A) × P(B|A) = P(B) × P(A|B)
P(A or B) = P(A) + P(B) - P(A and B)

20 P(Fails AND Defaults) = P(F) × P(D|F) = (130/500) × (80/130) = 80/500 = 0.16

21 P(Fails OR Defaults) = P(F) + P(D) - P(D AND F) = 130/500 + 100/500 - 80/500 = 150/500 = 0.30
Note: The “overlap” group would be counted twice if no subtraction.
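
A short sketch of both general rules using the Credit Test A counts, with exact fractions so the results can be cross-checked against a direct count. The variable names are illustrative assumptions.

```python
from fractions import Fraction

n = 500                               # all customers
n_f, n_d, n_d_and_f = 130, 100, 80    # Failed, Default, and both (Credit Test A)

p_f = Fraction(n_f, n)
p_d = Fraction(n_d, n)
p_d_given_f = Fraction(n_d_and_f, n_f)

# Multiplication rule: P(F and D) = P(F) * P(D | F)
p_f_and_d = p_f * p_d_given_f

# Addition rule: P(F or D) = P(F) + P(D) - P(F and D)
p_f_or_d = p_f + p_d - p_f_and_d

# Cross-check by counting: the overlap (80) is subtracted once
# so it is not counted twice.
direct_or = Fraction(n_f + n_d - n_d_and_f, n)

print(float(p_f_and_d), float(p_f_or_d), p_f_or_d == direct_or)
```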

22 Does knowledge of “test A result” help you make a better decision? P(D) = 0.20 but P(D | F) ≈ 0.615, so P(D) ≠ P(D | F). Do you want to know the test A results before you give the loan? “Credit test A results” and “defaulting” are ____________ on each other.

23 A “Newer” Credit Test. Is it even better? A different sample of 500 credit records.

24 Credit Test B result (columns) by Credit History (rows):
Credit History   Passed (P)   Failed (F)   Total
Good (G)            340           60        400
Default (D)          85           15        100
Total               425           75        500

25 What is the probability of a customer defaulting? P(Defaults) = P(D) = 100/500 = 0.20.
What is the probability of a customer defaulting given that he fails test B? P(Defaults given failed test B) = P(D | F) = 15/75 = 0.20.

26 Does knowledge of “test B result” help you make a better decision? P(D) = P(D | F) = 0.20. Test B tells me ____________. “Credit test B results” and “defaulting” are ____________ of each other.

27 Independence

28 Two events are independent if the occurrence, or non-occurrence, of one does not affect the chances of the other occurring, or not occurring. Otherwise, we say the events are dependent.

29 If A and B are independent, then
P(A and B) = P(A) × P(B)
P(A or B) = P(A) + P(B) - P(A) × P(B)
P(A|B) = P(A)
P(B|A) = P(B)
Note: The condition does NOT change the probability.
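
The product rule P(A and B) = P(A) × P(B) gives a direct check for independence in a two-way table. The sketch below applies it to the two credit test tables; the function name `is_independent` and the dictionary layout are assumptions. It tests exact equality, which is stricter than what sample data usually supports, where one would normally ask only whether the two sides are close.

```python
from fractions import Fraction


def is_independent(table):
    """table maps (row_label, col_label) -> count.

    Returns True when every cell satisfies P(row and col) == P(row) * P(col).
    """
    n = sum(table.values())
    row_total, col_total = {}, {}
    for (row, col), count in table.items():
        row_total[row] = row_total.get(row, 0) + count
        col_total[col] = col_total.get(col, 0) + count
    return all(
        Fraction(count, n) == Fraction(row_total[row], n) * Fraction(col_total[col], n)
        for (row, col), count in table.items()
    )


# Counts keyed by (credit history, test result).
test_a = {("Good", "Passed"): 350, ("Good", "Failed"): 50,
          ("Default", "Passed"): 20, ("Default", "Failed"): 80}
test_b = {("Good", "Passed"): 340, ("Good", "Failed"): 60,
          ("Default", "Passed"): 85, ("Default", "Failed"): 15}

print(is_independent(test_a), is_independent(test_b))
```

For Test B every cell matches the product of its marginals, for example 15/500 = (100/500) × (75/500), so knowing the test result does not change the chance of default.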

30 Survey of randomly selected voters in Jan. 2001: Q1: Did you vote in the 2000 election? Q2: Do you favor an amendment to require a balanced budget? Q3: To which political party do you belong?

31 Do you favor amendment for a balanced budget? (rows) by Political Party (columns):
Favor amendment   Republican   Democrat   Other   Total
Yes                   90           44       48     182
No                    82          104       32     218
Total                172          148       80     400

32 In the same table, the Total column (182, 218) holds the marginal totals for opinion, the Total row (172, 148, 80) holds the marginal totals for party, and the bottom-right cell (400) is the sample size.

33 What proportion favor the amendment? What proportion claim to be Republican? What proportion favor the amendment and are Other?
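
The three proportions asked for here are two marginal probabilities and one joint probability, all with the full sample of 400 as the denominator. A minimal sketch from the survey counts; the nested-dictionary layout is an illustrative choice.

```python
# Survey counts: survey[party][opinion]
survey = {
    "Republican": {"Yes": 90, "No": 82},
    "Democrat": {"Yes": 44, "No": 104},
    "Other": {"Yes": 48, "No": 32},
}

n = sum(sum(row.values()) for row in survey.values())  # sample size, 400

# Marginal: favor the amendment, regardless of party (182 / 400).
p_yes = sum(row["Yes"] for row in survey.values()) / n
# Marginal: claim to be Republican, regardless of opinion (172 / 400).
p_rep = sum(survey["Republican"].values()) / n
# Joint: favor the amendment AND belong to Other (48 / 400).
p_yes_and_other = survey["Other"]["Yes"] / n

print(round(p_yes, 3), round(p_rep, 3), round(p_yes_and_other, 3))
```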

34 What proportion favor the amendment, given those that claim to be Republican? Of those that claim to be Democrat, what proportion favor the amendment? Considering only those opposed, what proportion are not Republican?

35 Conditional Distribution (“given that”): Restrict subjects to only those that meet a condition. Within this restricted group, what is the distribution of some other variable? Distribution of “opinion” given those that claim to be Republican: P( Yes | Rep. ) = 90/172 = .523, P( No | Rep. ) = 82/172 = .477.
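
A conditional distribution is just the counts inside the restricted group divided by that group's total. The sketch below does this for each party and also conditions on opinion for the "opposed" question; the helper name `conditional_distribution` is an assumption.

```python
# Survey counts: survey[party][opinion]
survey = {
    "Republican": {"Yes": 90, "No": 82},
    "Democrat": {"Yes": 44, "No": 104},
    "Other": {"Yes": 48, "No": 32},
}


def conditional_distribution(counts):
    """Turn the counts of one restricted group into proportions."""
    total = sum(counts.values())
    return {label: count / total for label, count in counts.items()}


# Distribution of opinion given each party.
by_party = {party: conditional_distribution(row) for party, row in survey.items()}
for party, dist in by_party.items():
    print(party, {opinion: round(p, 3) for opinion, p in dist.items()})

# Conditioning the other way: among those opposed, the share not Republican.
no_total = sum(row["No"] for row in survey.values())            # 218
p_not_rep_given_no = (no_total - survey["Republican"]["No"]) / no_total
print(round(p_not_rep_given_no, 3))
```

Note that the denominator changes with the condition: 172, 148, or 80 when conditioning on party, and 218 when conditioning on a "No" opinion.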

36 Is there a relationship between the party and the opinion on the amendment? What would you expect to happen if no relationship existed?

37 Three Conditional Distributions: P( Yes | Rep.) = .523, P( No | Rep.) = .477; P( Yes | Demo) = .297, P( No | Demo) = .703; P( Yes | Other) = .600, P( No | Other) = .400. Marginal Distribution: P( Yes ) = .455, P( No ) = .545. Is there a relationship? Why? or Why not?

38 If there is NO relationship (i.e., independence) between the party and the opinion, then “the three conditional probabilities should be close to each other and close to the marginal probability.”

39 Three Conditional Probabilities: P( Yes | Rep.) = .523, P( Yes | Demo) = .297, P( Yes | Other) = .600. Marginal Probability: P( Yes ) = .455. Are these close to each other? AND close to the “marginal”? Not close; therefore, “party” and the “opinion” are ____________.
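
One way to make the "are these close?" comparison concrete is to compute how far each conditional probability sits from the marginal. This is only a sketch: the slides give no numeric cutoff for "close", so the code reports the gaps and leaves the judgment to the reader. Variable names are assumptions.

```python
# "Yes" counts and party totals from the survey table.
yes = {"Republican": 90, "Democrat": 44, "Other": 48}
total = {"Republican": 172, "Democrat": 148, "Other": 80}

# Marginal probability of "Yes" over the whole sample (182 / 400).
p_yes = sum(yes.values()) / sum(total.values())

# Gap between each conditional P(Yes | party) and the marginal P(Yes).
gaps = {party: yes[party] / total[party] - p_yes for party in yes}
largest_gap = max(abs(gap) for gap in gaps.values())

for party, gap in gaps.items():
    print(f"{party}: P(Yes | party) - P(Yes) = {gap:+.3f}")
print(f"largest absolute gap: {largest_gap:.3f}")
```

Under independence all three gaps would be near zero; here they range from roughly -0.16 to +0.15.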

40 Visual Displays: Create with “Pivot Tables” in Excel.

41 [Figure: Barchart, Clustered. Axis: Frequency. Categories: Rep., Demo., Other. Legend entry shown: Yes]

42 [Figure: Barchart, Stacked. Axis: Frequency. Categories: Rep., Demo., Other. Legend entry shown: Yes]

43 [Figure: Barchart, Percents. Axis: Percent. Categories: Rep., Demo., Other. Legend entry shown: Yes]

44 Summary. For two categorical variables: Must use conditional probabilities to determine if a relationship exists. Cannot use correlation. Visual display: stacked percentage bar charts.

45 Associations between TWO Variables
Variables           Numerical                                             Graphical
Quant. vs. Quant.   LS regression line, r, r-sq, std error                Scatterplot, residual plots
Quant. vs. Cat.     X-bar and s for each category                         Side-by-side box plots
Cat. vs. Cat.       Two-way table, conditional & marginal distributions   Bar chart: stacked, percent

46 The End

