Why this is useful
• Failure as a statistician/analyst is often a failure to communicate clearly
• Need to communicate results to non-technical decision-makers (politicians, judges)
• Can provide insight into data for both internal (YOU) and external uses

Descriptive uses of categorical data
• Describe the sample
• Check data quality
• Answer descriptive questions

Please pay attention …
Categorical data can be either nominal or ordinal. It is perfectly reasonable to discuss whether ordinal data are skewed (and often, one of the most interesting findings is that they are). With categorical data, it is also useful to look at distributions.

Questions related to distributions
• What is the distribution of students’ expectations about their own likely educational attainment?
• Is the distribution of students’ expectations about their own likely educational attainment skewed?

Easy creation of charts & tables with SAS Enterprise Guide
• Just pointing and clicking
• Also available free to university researchers and students via SAS OnDemand
• The Characterize Data task gives frequency distributions for all categorical variables and charts for ALL variables

Answer the questions
The distribution of students’ expectations for educational attainment is shown above. The median expectation was to finish a Bachelor’s degree. Only 17.1% of students expected to complete less than a four-year degree. The distribution of educational expectations is very skewed.

Categorical data that is in order
The distribution of homework hours is somewhat positively skewed
Mean = 2.68
Median = 1-2 hours (Category 3)
Mode = 2

Getting the data
Figure 1.1: File > Open > Data

Tasks > Describe > Characterize Data
Figure 1.5
ALWAYS DO THIS!!

Just click through the windows and accept all of the defaults.

Some Coding
ODS GRAPHICS ON ;   /* not needed in SAS 9.3 */
PROC FREQ DATA = dsname ;
TABLES varname ;
RUN ;
Will produce histograms and one-way tables
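A slightly fuller sketch of the one-way step, for readers who want to paste something into a program window. The data set `work.students` and the variable `homework_hours` are hypothetical names, not taken from the slides. The sketch also assumes that the bar chart for a one-way table has to be requested explicitly with `PLOTS=FREQPLOT`, which appears to be the case in recent SAS releases; check this against your version.

```sas
/* Sketch only: work.students and homework_hours are hypothetical names. */
ods graphics on;                 /* harmless if ODS Graphics is already on */

proc freq data=work.students;
   /* one-way frequency table plus a bar chart of the same counts */
   tables homework_hours / plots=freqplot;
run;
```

The one-way table includes cumulative percentages, which is one way to read off the median category of an ordinal variable.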

Bi-variate categorical data analysis
(Sounds more impressive than it is)

Homes without computers have fewer books

Children of mothers with more education are less likely to fail
Notice how the column percentages get smaller the further down the column you go.

Bi-variate distributions
• Is there a relationship between school failure and mother’s education?
• Is there a relationship between the number of books in the home (this was a category) and whether a family has a computer?
• Is there a relationship between mother’s education and father’s education?

Answer by trend, proportion, odds
• Trend: the higher the mother’s educational level, the lower the likelihood that the student failed a grade.
• Proportion: among students whose mothers had 0-11 years of education, 73% had never failed a grade; among those whose mothers had 16 years or more, 88% had never failed.
• Odds: students whose mothers had not finished high school were more than twice as likely to fail a grade as children of college graduate mothers.
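The "more than twice as likely" statement can be checked from the two percentages quoted on this slide. The working below assumes the failure rate in each group is simply 100% minus the "never failed" rate; the odds ratio is added for comparison and is not a figure from the slides.

```latex
\begin{align*}
P(\text{fail} \mid \text{0--11 yrs}) &= 1 - 0.73 = 0.27 \\
P(\text{fail} \mid \text{16+ yrs})   &= 1 - 0.88 = 0.12 \\[4pt]
\text{ratio of proportions} &= \frac{0.27}{0.12} = 2.25 \\[4pt]
\text{odds ratio} &= \frac{0.27 / 0.73}{0.12 / 0.88} \approx \frac{0.370}{0.136} \approx 2.71
\end{align*}
```

Either way of expressing it supports the wording on the slide; the ratio of proportions is usually the easier one to say out loud to a non-technical audience.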

Mothers tend to be married to fathers with similar education
Note that the highest row percentages tend to be on the diagonal, where the parents’ education is the same.

Some More Coding (for correlated data)
PROC FREQ DATA = dsname ;
TABLES varname1 * varname2 / AGREE ;
RUN ;
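A runnable sketch of the `AGREE` request, assuming the paired data are available as cell counts rather than one row per subject. The data set `work.paired`, the variables `rating1` and `rating2`, and every count are made up for illustration; none of them come from the slides.

```sas
/* Hypothetical paired yes/no measurements, entered as cell counts */
data work.paired;
   input rating1 $ rating2 $ count;
   datalines;
Yes Yes 40
Yes No  10
No  Yes  5
No  No  45
;
run;

proc freq data=work.paired;
   weight count;                      /* each row stands for 'count' pairs */
   tables rating1 * rating2 / agree;  /* McNemar's test and kappa for a 2x2 table */
run;
```

With one row per subject instead of cell counts, the `WEIGHT` statement is simply dropped.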

Correlated Data

McNemar’s Test

Correlated Data

Cohen’s Kappa
1.0 = perfect agreement
Negative Kappa is not an error; it means the two agree less than chance

$$\kappa = \frac{P_{\text{observed}} - P_{\text{expected}}}{1 - P_{\text{expected}}}$$
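A worked example of the kappa formula may make the "less than chance" remark easier to see. The 2x2 counts here are hypothetical (40 yes/yes, 10 yes/no, 5 no/yes, 45 no/no, so n = 100) and are not from the slides.

```latex
\begin{align*}
P_{\text{observed}} &= \frac{40 + 45}{100} = 0.85 \\[4pt]
P_{\text{expected}} &= \frac{50}{100}\cdot\frac{45}{100} + \frac{50}{100}\cdot\frac{55}{100}
                     = 0.225 + 0.275 = 0.50 \\[4pt]
\kappa &= \frac{0.85 - 0.50}{1 - 0.50} = 0.70
\end{align*}
```

Here the row totals are 50 and 50 and the column totals are 45 and 55. If the observed agreement had been below 0.50, the numerator and therefore kappa would have been negative.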

Chi-square (wrong)
Notice you do NOT get an identical p-value

Fisher (wrong)
Notice you do NOT get an identical p-value

Some More Coding
PROC FREQ DATA = dsname ;
TABLES varname1 * varname2 / CHISQ ;
RUN ;
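A sketch of the same request on independent (not paired) data, again entered as cell counts. The data set `work.homes`, the variables `computer` and `books`, and all counts are invented for illustration and are not the figures behind the slides. `EXPECTED` is added here so the expected cell counts can be inspected alongside the test.

```sas
/* Hypothetical cell counts for two independent categorical variables */
data work.homes;
   input computer $ books $ count;
   datalines;
No  Few  30
No  Many 20
Yes Few  25
Yes Many 75
;
run;

proc freq data=work.homes;
   weight count;                              /* rows are cell counts */
   tables computer * books / chisq expected;  /* chi-square tests plus expected counts */
run;
```

For a 2x2 table like this one, `CHISQ` also reports Fisher's exact test, which is worth reading when expected counts are small.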

Chi-square (right)

Right