Pearson’s r & X 2 Correlation vs. X² (which, when & why) Qualitative/Categorical and Quantitative Variables Scatterplots for 2 Quantitative Variables Research.

Slides:

Advertisements

Similar presentations

Parametric & Nonparametric Models for Tests of Association Models we will consider X 2 Tests for qualitative variables Parametric tests Pearson’s correlation.

Advertisements

Pearson’s X 2 Correlation vs. X² (which, when & why) Qualitative/Categorical and Quantitative Variables Contingency Tables for 2 Categorical Variables.

1 Objective Investigate how two variables (x and y) are related (i.e. correlated). That is, how much they depend on each other. Section 10.2 Correlation.

Describing Relationships Using Correlation and Regression

Chapter 6: Correlational Research Examine whether variables are related to one another (whether they vary together). Correlation coefficient: statistic.

Copyright ©2011 Brooks/Cole, Cengage Learning More about Inference for Categorical Variables Chapter 15 1.

1-1 Copyright © 2015, 2010, 2007 Pearson Education, Inc. Chapter 25, Slide 1 Chapter 25 Comparing Counts.

Multiple Group X² Designs & Follow-up Analyses X² for multiple condition designs Pairwise comparisons & RH Testing Alpha inflation Effect sizes for k-group.

Lecture 3: Chi-Sqaure, correlation and your dissertation proposal Non-parametric data: the Chi-Square test Statistical correlation and regression: parametric.

Multiple Group X² Designs & Follow-up Analyses X² for multiple condition designs Pairwise comparisons & RH Testing Alpha inflation Effect sizes for k-group.

Lecture 11 Psyc 300A. Null Hypothesis Testing Null hypothesis: the statistical hypothesis that there is no relationship between the variables you are.

Chi-square Test of Independence

Multivariate Analyses & Programmatic Research Re-introduction to Multivariate research Re-introduction to Programmatic research Factorial designs  “It.

Simple Correlation Scatterplots & r Interpreting r Outcomes vs. RH:

Some Details about Bivariate Stats Tests Conceptualizing the four stats tests Conceptualizing NHST, critical values and p-values NHST and Testing RH: Distinguishing.

10-2 Correlation A correlation exists between two variables when the values of one are somehow associated with the values of the other in some way. A.

Social Research Methods

Parametric & Nonparametric Models for Univariate Tests, Tests of Association & Between Groups Comparisons Models we will consider Univariate Statistics.

PSY 307 – Statistics for the Behavioral Sciences Chapter 19 – Chi-Square Test for Qualitative Data Chapter 21 – Deciding Which Test to Use.

Summary of Quantitative Analysis Neuman and Robson Ch. 11

Pearson’s r Scatterplots for 2 Quantitative Variables Research and Null Hypotheses for r Casual Interpretation of Correlation Results (& why/why not) Computational.

Review for Exam 2 Some important themes from Chapters 6-9 Chap. 6. Significance Tests Chap. 7: Comparing Two Groups Chap. 8: Contingency Tables (Categorical.

Effect Sizes, Power Analysis and Statistical Decisions Effect sizes -- what and why?? review of statistical decisions and statistical decision errors statistical.

Week 11 Chapter 12 – Association between variables measured at the nominal level.

Statistical Inference Dr. Mona Hassan Ahmed Prof. of Biostatistics HIPH, Alexandria University.

Categorical Data Prof. Andy Field.

WG ANOVA Analysis of Variance for Within- groups or Repeated measures Between Groups and Within-Groups ANOVA WG ANOVA & WG t-tests When to use each Similarities.

Section 9.1 Introduction to Statistical Tests 9.1 / 1 Hypothesis testing is used to make decisions concerning the value of a parameter.

CENTRE FOR INNOVATION, RESEARCH AND COMPETENCE IN THE LEARNING ECONOMY Session 2: Basic techniques for innovation data analysis. Part I: Statistical inferences.

Business Statistics, A First Course (4e) © 2006 Prentice-Hall, Inc. Chap 11-1 Chapter 11 Chi-Square Tests Business Statistics, A First Course 4 th Edition.

Section Copyright © 2014, 2012, 2010 Pearson Education, Inc. Lecture Slides Elementary Statistics Twelfth Edition and the Triola Statistics Series.

Correlation and Regression

Copyright © 2010, 2007, 2004 Pearson Education, Inc. All Rights Reserved Section 10-1 Review and Preview.

HAWKES LEARNING SYSTEMS Students Matter. Success Counts. Copyright © 2013 by Hawkes Learning Systems/Quant Systems, Inc. All rights reserved. Section 10.7.

Copyright © 2013 Pearson Education, Inc. All rights reserved Chapter 10 Inferring Population Means.

ANOVA Analysis of Variance Picking the correct bivariate statistical test When, Why to Use ANOVA Summarizing/Displaying Data -- 1 qual & 1 quant var How.

Chapter SixteenChapter Sixteen. Figure 16.1 Relationship of Frequency Distribution, Hypothesis Testing and Cross-Tabulation to the Previous Chapters and.

LECTURE 19 THURSDAY, 14 April STA 291 Spring

Copyright © 2010, 2007, 2004 Pearson Education, Inc. All Rights Reserved Lecture Slides Elementary Statistics Eleventh Edition and the Triola.

Section Copyright © 2014, 2012, 2010 Pearson Education, Inc. Lecture Slides Elementary Statistics Twelfth Edition and the Triola Statistics Series.

Production Planning and Control. A correlation is a relationship between two variables. The data can be represented by the ordered pairs (x, y) where.

Chapter 16 The Chi-Square Statistic

Psych 230 Psychological Measurement and Statistics Pedro Wolf September 23, 2009.

McGraw-Hill/Irwin Copyright © 2007 by The McGraw-Hill Companies, Inc. All rights reserved. Chapter 8 Hypothesis Testing.

CHI SQUARE TESTS.

Section 10.2 Independence. Section 10.2 Objectives Use a chi-square distribution to test whether two variables are independent Use a contingency table.

Section Copyright © 2014, 2012, 2010 Pearson Education, Inc. Lecture Slides Elementary Statistics Twelfth Edition and the Triola Statistics Series.

Inferential Statistics. The Logic of Inferential Statistics Makes inferences about a population from a sample Makes inferences about a population from.

Inferential Statistics. Explore relationships between variables Test hypotheses –Research hypothesis: a statement of the relationship between variables.

Simple Correlation Scatterplots & r –scatterplots for continuous - binary relationships –H0: & RH: –Non-linearity Interpreting r Outcomes vs. RH: –Supporting.

Section Copyright © 2014, 2012, 2010 Pearson Education, Inc. Chapter 10 Correlation and Regression 10-2 Correlation 10-3 Regression.

Section 12.2: Tests for Homogeneity and Independence in a Two-Way Table.

Copyright © 2013, 2009, and 2007, Pearson Education, Inc. Chapter 11 Analyzing the Association Between Categorical Variables Section 11.2 Testing Categorical.

ContentFurther guidance  Hypothesis testing involves making a conjecture (assumption) about some facet of our world, collecting data from a sample,

Chapter 15 The Chi-Square Statistic: Tests for Goodness of Fit and Independence PowerPoint Lecture Slides Essentials of Statistics for the Behavioral.

Chapter 7 Calculation of Pearson Coefficient of Correlation, r and testing its significance.

Learning from Categorical Data

Outline of Today’s Discussion 1.The Chi-Square Test of Independence 2.The Chi-Square Test of Goodness of Fit.

Chapter 11: Categorical Data n Chi-square goodness of fit test allows us to examine a single distribution of a categorical variable in a population. n.

Copyright © 2013, 2009, and 2007, Pearson Education, Inc. 1 FINAL EXAMINATION STUDY MATERIAL III A ADDITIONAL READING MATERIAL – INTRO STATS 3 RD EDITION.

NURS 306, Nursing Research Lisa Broughton, MSN, RN, CCRN RESEARCH STATISTICS.

Section 10.2 Objectives Use a contingency table to find expected frequencies Use a chi-square distribution to test whether two variables are independent.

Comparing Observed Distributions A test comparing the distribution of counts for two or more groups on the same categorical variable is called a chi-square.

Statistical Decision Making. Almost all problems in statistics can be formulated as a problem of making a decision. That is given some data observed from.

Logic of Hypothesis Testing

Qualitative data – tests of association

M248: Analyzing data Block D.

Review for Exam 2 Some important themes from Chapters 6-9

Presentation transcript:

Pearson’s r & X 2 Correlation vs. X² (which, when & why) Qualitative/Categorical and Quantitative Variables Scatterplots for 2 Quantitative Variables Research and Null Hypotheses for r Casual Interpretation of Correlation Results (and why/why not) Contingency Tables for 2 Categorical Variables Research and Null Hypotheses for X 2 Causal Interpretation for X 2 Results

Pearson’s r Vs. X 2 n Pearson’s Correlation (r) –2 quantitative variables –LINEAR relationship –range = -1 to +1 n Pearson’s Chi Square (X 2 ) –2 qualitative variables –PATTERN of relationship –range = 0 to + infinity Turtle Type Painted Snapper Food Preference crickets “duck weed” Hours of Study Time Test Performance (%)

Practice -- would you use r or X 2 for each of the following bivariate analyses? Hint: Start by determining if each variable is qual or quant ! n GPA & GRE n Age & Shoe Size n Preferred Pet Type & Preferred Toy Type n Leg Length & Hair Length n Age and Preferred Type of Pet n Gender & Preferred Type of Car n Grade (%) & Hrs. Study r X² r r r ANOVA -- psyche!

Displaying the data for a correlation: With two quantitative variables we can display the bivariate relationship using a “scatterplot” Age of Puppy (weeks) Amount Puppy Eats (pounds) Puppy Age (x) Eats (y) Sam Ding Ralf Pit Seff … Toby

When examining a scatterplot, we look for three things... linearity linear non-linear or curvilinear direction (if linear) positive negative strength strong moderate weak linear, positive, weak linear, negative, moderate nonlinear, strong Hi Lo

Sometimes a scatterplot will show only the “envelope” of the data, not the individual data points. Describe each of these bivariate patterns... Hi Lo Hi Lo No relationship linear, positive, weak linear, negative, stronglinear, positive, moderate

The Pearson’s correlation ( r ) summarizes the direction and strength of the linear relationship shown in the scatterplot n r has a range from to a perfect positive linear relationship 0.00 no linear relationship at all a perfect negative linear relationship n r assumes that the relationship is linear if the relationship is not linear, then the r-value is an underestimate of the strength of the relationship at best and meaningless at worst For a non-linear relationship, r will be based on a “rounded out” envelope -- leading to a misrepresentative r

Stating Hypotheses with r... Every RH must specify... –the variables –the direction of the expected linear relationship –the population of interest –Generic form... There is a no/a positive/a negative linear relationship between X and Y in the population represented by the sample. Every H0: must specify... –the variables –that no linear relationship is expected –the population of interest –Generic form... There is a no linear relationship between X and Y in the population represented by the sample.

What “retaining H0:” and “Rejecting H0:” means... n When you retain H0: you’re concluding… is not –The linear relationship between these variables in the sample is not strong enough to allow me to conclude there is a relationship between them in the population represented by the sample. n When you reject H0: you’re concluding… is –The linear relationship between these variables in the sample is strong enough to allow me to conclude there is a relationship between them in the population represented by the sample.

Deciding whether to retain or reject H0: when using r... When computing statistics by hand –compute an “obtained” or “computed” r value –look up a “critical r value” –compare the two if |r-obtained| < r-critical Retain H0: if |r-obtained| > r-critical Reject H0: When using the computer –compute an “obtained” or “computed” r value –compute the associated p-value (“sig”) –examine the p-value to make the decision if p >.05Retain H0: if p <.05Reject H0:

Practice with Pearson’s Correlation (r) The RH: was that older adolescents would be more polite. obtained r =.453 critical r=.254 Retain or Reject H0: ??? Reject -- |r| > r-critical Support for RH: ??? Yep ! Correct direction !! A sample of 84 adolescents were asked their age and to complete the Politeness Quotient Questionnaire

Again... The RH: was that older professors would receive lower student course evaluations. obtained r = p =.431 Retain or Reject H0: ??? Retain -- p >.05 Support for RH: ??? No! There is no linear relationship A sample of 124 Introductory Psyc students from 12 different sections completed the Student Evaluation. Profs’ ages were obtained (with permission) from their files.

Statistical decisions & errors with correlation... In the Population - r r = 0 + r Statistical Decision - r (p <.05) r = 0 (p >.05) + r (p <.05) Correct H0: Rejection & Direction Correct H0: Retention Correct H0: Rejection & Direction Type II “Miss” Type I “False Alarm” Type I “False Alarm” Type III “Mis-specification” Type III “Mis-specification” Remember that “in the population” is “in the majority of the literature” in practice!!

About causal interpretation of correlation results... We can only give a causal interpretation of the results if the data were collected using a true experiment –random assignment of subjects to conditions of the “causal variable” (IV) -- gives initial equivalence. –manipulation of the “causal variable” (IV) by the experimenter -- gives temporal precedence –control of procedural variables -- gives ongoing eq. Most applications of Pearson’s r involve quantitative variables that are subject variables -- measured from participants In other words -- a Natural Groups Design -- with... no random assignment -- no initial equivalence no manipulation of “causal variable” (IV) -- no temporal precendence no procedural control -- no ongoing equivalence Under these conditions causal interpretation of the results is not appropriate !!

Moving on to X 2 … with two qualitative variables we can display the bivariate relationship using a “contingency table” Puppy Type (col) Play (row) Sam Ding Ralf Pit Seff … Toby work hunt work hunt.. hunt tug chase tug chase.. chase Type of Dog Hunting Working Favorite Play Sock-Tug Ball-Chase

When examining a contingency table, we look for two things... whether or not there is a pattern if so, which row tends to “go with” which column? Columns A B Rows no pattern Columns A B Rows Pattern: A&2 B&1 Columns A B Rows Pattern: A&1 B&2

Describe each of the following... Boys Girls Chips Crackers boys prefer chips & girls prefer crackers Boys Girls Chips Crackers no pattern Boys Girls Chips Crackers boys prefer crackers & girls prefer chips Boys Girls Chips Crackers girls prefer crackers & boys have no preference

The Pearson’s Chi-square ( X² ) summarizes the relationship shown in the contingency table n X² has a range from 0 to  (infinity) 0.00 absolutely no pattern of relationship “smaller” X² -- weaker pattern of relationship “larger” X² - stronger pattern of relationship n However... –The relationship between the size of X² and strength of the relationship is more complex than for r (with linear relationships) you will seldom see X² used to express the strength of the bivariate relationship

Stating Hypotheses with X 2... Every RH must specify... –the variables –the specific pattern of the expected relationship –the population of interest –Generic form... There is a pattern of relationship between X & Y, such that in the population represented by the sample. Every H0: must specify... –the variables –that no pattern of relationship is expected –the population of interest –Generic form... There is a no pattern of relationship between X and Y in the population represented by the sample.

Deciding whether to retain or reject H0: when using X 2 When computing statistics by hand –compute an “obtained” or “computed” X 2 value –look up a “critical X 2 value” –compare the two if X 2 -obtained < X 2 -critical Retain H0: if X 2 -obtained > X 2 -critical Reject H0: When using the computer –compute an “obtained” or “computed” X 2 value –compute the associated p-value (“sig”) –examine the p-value to make the decision if p >.05Retain H0: if p <.05Reject H0:

What “Retaining H0:” and “Rejecting H0:” means... n When you retain H0: you’re concluding… is not –The pattern of the relationship between these variables in the sample is not strong enough to allow me to conclude there is a relationship between them in the population represented by the sample. n When you reject H0: you’re concluding… is –The pattern of the relationship between these variables in the sample is strong enough to allow me to conclude there is a relationship between them in the population represented by the sample.

Statistical decisions & errors with X 2... In the Population that specific no any other pattern pattern pattern Statistical Decision that specific pattern (p <.05) no pattern (p >.05) any other pattern (p <.05) Correct H0: Retention Type II “Miss” Type I “False Alarm” Type I “False Alarm” Type III “Mis-specification” Type III “Mis-specification” Remember that “in the population” is “in the majority of the literature” in practice!! Correct H0: Rejection & Pattern

Testing X 2 RH: -- different “kinds” of RH: & it matters!!! “Pattern” type RH: RH: More of those who do the “on web” exam preparation assignment will pass the exam, whereas more of those who do the “on paper” version fill fail the exam. “Proportion” type RH: RH: A greater proportion of those who do the “on web” exam preparation than of those who do the “on paper” version will pass the exam. “Implied Proportion” Type of RH: RH: Those who do the “on web” exam preparation will do better than those who do the “on paper” version.

Testing X 2 RH: -- different “kinds” of RH: & it matters!!! “Pattern” type RH: RH: More girls will prefer crackers and more boys will prefer chips. “Proportion” type RH: RH: A greater proportion of girls than of boys will prefer crackers. Boys Girls Chips Crackers Both RH:s supported !! Girls 44/60 =.73 Boys 12/42 =.29 Girls 44 > 16 & Boys 12 < 3 Boys Girls Chips Crackers Only “Proportion” RH supported !! Girls 44/60 =.73 Boys 32/62 =.52 Girls 44 > 16 But.. Boys 32 = 30 X 2 =19.93, p<.001X 2 =6.12, p=.013

Testing X 2 RH: -- one to watch out for… You’ll get…  This is not a good way to express a X 2 RH: !!!! RH: More of those who do the “on web” exam preparation assignment will perform better on the exam than those who do the “on paper” version. Sometime, instead of … RH: A greater proportion of those do the “on web” exam preparation than of those who do the “on paper” version will pass the exam. You have to be careful about these kinds of “frequency” RH:!!! X 2 works in terms of proportions, not frequencies! And, because you might have more of one group than another, this can cause confusion and problems…

Testing X 2 RH: -- one to watch out for… You’ll get…  This is not a good way to express a X 2 RH: !!!! RH: More girls than boys will prefer crackers. Boys Girls Chips Crackers Instead of … RH: A greater proportion of girls than of boys will prefer crackers. X 2 =9.00, p=.003 The number of boys & girls is same 20 = 20 … But X 2 tests for differential proportion of that category not for differential number of that category… Girls 20/30 =.66 >.33 = 20/40 Boys

About causal interpretation of X²... Applications of Pearson’s X² are a mixture of the three designs you know –Natural Groups Design –Quasi-Experiment –True Experiment But only those data from a True Exp can be given a causal interpretation … –random assignment of subjects to conditions of the “causal variable” (IV) -- gives initial equivalence. –manipulation of the “causal variable” (IV) by the experimenter -- gives temporal precedence –control of procedural variables - gives ongoing eq. You must be sure that the design used in the study provides the necessary evidence to support a causal interpretation of the results !!

RH: Those who do the “on web” exam preparation assignment will perform better on the exam than those who do the “on paper” version. Practice with Statistical and Causal Interpretation of X² Results Paper Web Fail Pass X 2 obtained = 28.78, p <.001 Retain or Reject H0: ??? Support for RH: ??? Reject! Design: Before taking the test, students were asked whether they had chosen to complete the “on Web” or the “on paper” version of the exam prep. The test was graded pass/fail. Type of Design ??? Causal Interpretation? Natural Groups Design Nope! CAN What CAN we say from these data ??? There’s an association between type of prep and test performance. Yep ! 37/51 of Web folks passed versus 11/54 of Paper folks !!

RH: Those who do the “on web” exam preparation assignment will perform better on the exam than those who do the “on paper” version. Again... Paper Web Fail Pass X 2 obtained =.26, p =.612 Retain or Reject H0: ??? Support for RH: ??? Retain! Nope ! Design: Students in the morning laboratory section were randomly assigned to complete the “on Web” version of the exam prep, while those in the afternoon section completed the “on paper” version. Student’s were “monitored” to assure the completed the correct version. The test was graded pass/fail. Type of Design ??? Causal Interpretation? Quasi Experiment Nope! CAN What CAN we say from these data ??? There’s no association between type of prep and test performance.

RH: More of those who do the “on web” exam preparation assignment will pass the exam and more of those who do the “on paper” version will fail. Yet again... Paper Web Fail Pass X 2 obtained = 6.12, p =.013 Retain or Reject H0: ??? Support for RH: ??? Reject! Design: One-half of the students in the T-Th AM lecture section were randomly assigned to complete the “on Web” version of the exam prep, while the other half of that section completed the “on paper” version. Students were “monitored” to assure the completed the correct version. The test was graded pass/fail. Only data from students in the T-TH AM class were included in the analysis. Type of Design ??? Causal Interpretation? True Experiment Yep! CAN What CAN we say from these data ??? That type of prep nfluences test performance. Partial: 37 > 14, but 23 = 21