1 Why this is useful
• Failure as a statistician or analyst is often a failure to communicate clearly
• Results need to be communicated to non-technical decision-makers, such as politicians and judges
• Can provide insight into data for both internal (YOU) and external uses

2 Descriptive uses of categorical data
• Describe the sample
• Check data quality
• Answer descriptive questions

3 Please pay attention …
Categorical data can be either nominal or ordinal. It is perfectly reasonable to discuss whether ordinal data are skewed (and often one of the most interesting findings is that they are). With categorical data, it is also useful to look at distributions.

4 Questions related to distributions
• What is the distribution of students’ expectations about their own likely educational attainment?
• Is the distribution of students’ expectations about their own likely educational attainment skewed?

5 Easy creation of charts & tables with SAS Enterprise Guide
• Just pointing and clicking
• Also available free to university researchers and students via SAS OnDemand
• The Characterize Data task gives frequency distributions for all categorical variables and charts for ALL variables

6 Answer the questions
The distribution of students’ expectations for educational attainment is shown above. The median expectation was to finish a Bachelor’s degree. Only 17.1% of students expected to complete less than a four-year degree. The distribution of educational expectations is very skewed.

7 Categorical data that is in order
The distribution of homework hours is somewhat positively skewed.
• Mean = 2.68
• Median = 1-2 hours (Category 3)
• Mode = 2
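Summary statistics like the ones on this slide can be requested in code once the ordered categories are stored as numeric codes. The following is only a sketch: the dataset `work.hw_demo`, the variable `hw_cat`, and the values are made up for illustration and are not the data behind the slide.

```sas
/* Hypothetical data: hw_cat is a made-up numeric code for ordered homework categories */
data work.hw_demo;
   input hw_cat @@;
   datalines;
1 2 2 2 3 3 2 4 5 2 3 1
;
run;

/* Mean, median, mode and skewness of the category codes */
proc means data=work.hw_demo n mean median mode skew maxdec=2;
   var hw_cat;
run;
```

A positive skewness value corresponds to a longer tail toward the higher categories. The mean of category codes is only meaningful to the extent that the codes are roughly evenly spaced.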

8 Getting the data (Figure 1.1)
File > Open > Data

9 Tasks > Describe > Characterize Data (Figure 1.5)
ALWAYS DO THIS!!

10 Just click through the windows and accept all of the defaults.

11

12 Some Coding
   ODS GRAPHICS ON ;   /* not needed in SAS 9.3 */
   PROC FREQ DATA = dsname ;
      TABLES varname ;
   RUN ;
Will produce histograms and one-way tables
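A self-contained version of this step might look like the sketch below. The dataset `work.expect_demo`, the variable `expect`, and its values are hypothetical. The `PLOTS=FREQPLOT` option is an addition that is not on the slide: to my understanding PROC FREQ draws frequency plots only when they are requested, so the sketch asks for one explicitly.

```sas
/* Hypothetical data: expect is a made-up code for expected education level */
data work.expect_demo;
   input expect @@;
   datalines;
1 2 2 3 3 3 4 4 4 4 5 5
;
run;

ods graphics on;   /* harmless if ODS Graphics is already enabled */

proc freq data=work.expect_demo;
   tables expect / plots=freqplot;   /* one-way table plus a bar chart of the counts */
run;
```

The one-way table lists the frequency, percent, cumulative frequency, and cumulative percent for each category, which is enough to read off the median category.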

13 Bi-variate categorical data analysis (sounds more impressive than it is)

14 Homes without computers have fewer books

15 Children of mothers with more education are less likely to fail
Notice how the further down the column you go, the smaller the column percentages become.

16 Bi-variate distributions
• Is there a relationship between school failure and mother’s education?
• Is there a relationship between the number of books in the home (this was a category) and whether a family has a computer?
• Is there a relationship between mother’s education and father’s education?

17 Answer by trend, proportion, odds
• Trend: the higher the mother’s educational level, the lower the likelihood that a student failed a grade.
• Proportion: among students whose mothers had 0-11 years of education, 73% had never failed a grade; among those whose mothers had 16 years or more, 88% had never failed.
• Odds: students whose mothers had not finished high school were more than twice as likely to fail a grade as children of college-graduate mothers.
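The "more than twice as likely" statement follows from the two percentages quoted on this slide: 27% versus 12% failing gives a ratio of 2.25, and the corresponding odds ratio is about 2.71. A short DATA step can check the arithmetic; the variable names are made up for this sketch, and only the 73% and 88% figures come from the slide.

```sas
/* Check the "more than twice as likely" claim from the two quoted percentages */
data _null_;
   p_fail_low  = 1 - 0.73;   /* mothers with 0-11 yrs of education: 27% failed a grade */
   p_fail_high = 1 - 0.88;   /* mothers with 16+ yrs of education: 12% failed a grade */
   risk_ratio  = p_fail_low / p_fail_high;            /* ratio of proportions */
   odds_low    = p_fail_low  / (1 - p_fail_low);      /* odds of failing, low education */
   odds_high   = p_fail_high / (1 - p_fail_high);     /* odds of failing, high education */
   odds_ratio  = odds_low / odds_high;
   put risk_ratio= 6.2 odds_ratio= 6.2;               /* written to the SAS log */
run;
```

The ratio of proportions and the odds ratio answer slightly different questions, so it is worth saying which one is being reported.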

18 Mothers tend to be married to fathers with similar education
Note that the highest row percentages tend to be on the diagonal, where the parents’ education is the same.

19 Some More Coding: for correlated data
   PROC FREQ DATA = dsname ;
      TABLES varname1 * varname2 / AGREE ;
   RUN ;
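As a rough illustration of the AGREE option, the sketch below enters a paired 2x2 table as cell counts and lets the WEIGHT statement expand them. The dataset `work.paired_demo`, the variables `rating1` and `rating2`, and the counts are all hypothetical.

```sas
/* Hypothetical paired data: each row is one cell of the 2x2 table with its count */
data work.paired_demo;
   input rating1 $ rating2 $ count;
   datalines;
yes yes 40
yes no  10
no  yes  5
no  no  45
;
run;

proc freq data=work.paired_demo;
   weight count;                         /* treat count as the cell frequency */
   tables rating1 * rating2 / agree;     /* McNemar's test and kappa for the paired table */
run;
```

For a 2x2 table, AGREE reports McNemar's test, which uses only the two discordant cells, along with the simple kappa coefficient.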

20 Correlated Data

21 McNemar’s Test

22 Correlated Data

23 Cohen’s Kappa
• 1.0 = perfect agreement
• A negative Kappa is not an error; it means the two agree less often than expected by chance

$$\kappa = \frac{P_{\text{observed}} - P_{\text{expected}}}{1 - P_{\text{expected}}}$$
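A worked example may make the formula concrete. Assume a hypothetical 2x2 agreement table of 100 paired ratings (these counts are made up, not taken from the slides): 40 yes/yes, 10 yes/no, 5 no/yes, and 45 no/no. The first rating is "yes" 50 times and the second is "yes" 45 times.

```latex
\begin{align*}
P_{\text{observed}} &= \frac{40 + 45}{100} = 0.85 \\
P_{\text{expected}} &= \frac{50}{100}\cdot\frac{45}{100} + \frac{50}{100}\cdot\frac{55}{100} = 0.225 + 0.275 = 0.50 \\
\kappa &= \frac{P_{\text{observed}} - P_{\text{expected}}}{1 - P_{\text{expected}}} = \frac{0.85 - 0.50}{1 - 0.50} = 0.70
\end{align*}
```

Here the raw agreement is 85%, but half of that would be expected by chance alone, so kappa is 0.70 rather than 0.85.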

24 Chi-square (wrong)
Notice that you do NOT get an identical p-value

25 Fisher (wrong)
Notice that you do NOT get an identical p-value

26 Some More Coding
   PROC FREQ DATA = dsname ;
      TABLES varname1 * varname2 / CHISQ ;
   RUN ;
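For two variables measured on independent observations, a self-contained version of this call could look like the sketch below. The dataset `work.books_demo`, the variables `computer` and `books`, and the counts are hypothetical and are not the figures behind the slides.

```sas
/* Hypothetical independent data: one row per cell of the 2x2 table with its count */
data work.books_demo;
   input computer $ books $ count;
   datalines;
no  few   30
no  many  20
yes few   25
yes many  75
;
run;

proc freq data=work.books_demo;
   weight count;                          /* treat count as the cell frequency */
   tables computer * books / chisq;       /* chi-square tests of association */
run;
```

For a 2x2 table, the CHISQ option also prints Fisher's exact test, which is the usual fallback when expected cell counts are small.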

27 Chi-square (right)

28 Right

29

