Correlation MEASURING ASSOCIATION Establishing a degree of association between two or more variables gets at the central objective of the scientific enterprise.

Presentation transcript:

Correlation MEASURING ASSOCIATION Establishing a degree of association between two or more variables gets at the central objective of the scientific enterprise. Scientists spend most of their time figuring out how one thing relates to another and structuring these relationships into explanatory theories. The question of association comes up in normal discourse as well, as in "like father, like son."

Scatterplots A. Scatter diagram: A list of 1,078 pairs of heights would be impossible to grasp, so we need a method that converts the data into a more comprehensible format. One method is to plot the two variables (father's height and son's height) in a graph called a scatter diagram.

B. The Correlation Coefficient: The scatter plot looks like a cloud of points. It gives a visual impression of the strength of the relationship and is especially useful for spotting outliers or data anomalies, but statistics is not satisfied with a gut feeling. Statistics is concerned with summarizing and interpreting masses of numerical data, so we need to summarize this relationship numerically. We do that with a correlation coefficient, which ranges from -1 to +1.

[Scatterplot panels illustrating r = 1.0, r = .85, r = .42, r = .17, r = -.94, r = -.54 and r = -.33]

Computing Pearson's r correlation coefficient. Definitional formula: convert each variable to standard units (z-scores); the average of the products of the paired z-scores gives the correlation coefficient, r = Σ(Zx·Zy) / N. But this formula requires you to calculate z-scores for each observation, which means you have to calculate the standard deviations of X and Y before you can get started. For example, look at what you have to do for only 5 cases.

Dividing the sum of ZxZy (2.50) by N (5) gives the correlation coefficient: r = .50
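The z-score procedure can be sketched in a few lines of Python. The five (x, y) pairs below are hypothetical, chosen only so that the sum of the z-score products comes to 2.50 and r to .50 as in the slide; they are not the slide's actual data. Standard deviations are computed with N in the denominator, matching the division by N above.

```python
import math


def z_scores(values):
    """Convert raw values to standard units using the population SD (divide by N)."""
    n = len(values)
    mean = sum(values) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in values) / n)
    return [(v - mean) / sd for v in values]


def sum_and_mean_of_z_products(xs, ys):
    """Return (sum of ZxZy, r), where r is the average product of paired z-scores."""
    products = [zx * zy for zx, zy in zip(z_scores(xs), z_scores(ys))]
    total = sum(products)
    return total, total / len(xs)


# Hypothetical data (not taken from the slides).
x = [1, 2, 3, 4, 5]
y = [1, 3, 5, 2, 4]

sum_zxzy, r = sum_and_mean_of_z_products(x, y)
print(f"Sum of ZxZy = {sum_zxzy:.2f}, r = {r:.2f}")
```

Even with five cases, both means and both standard deviations have to be computed before a single z-score is available, which is the tedium the slide points to.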

The above formula can also be translated into the following – which is a little easier to decipher but is still tedious to use.

Or in other words …..
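The formula images for these slides are not part of the transcript. One standard way to write the step from z-scores to deviation scores, assuming the standard deviations s_X and s_Y are computed with N in the denominator, is sketched here:

```latex
\begin{align*}
r &= \frac{\sum z_X z_Y}{N}
   = \frac{1}{N}\sum \left(\frac{X-\bar{X}}{s_X}\right)\left(\frac{Y-\bar{Y}}{s_Y}\right) \\
  &= \frac{\sum (X-\bar{X})(Y-\bar{Y})}{N\, s_X s_Y}
   = \frac{\sum (X-\bar{X})(Y-\bar{Y})}{\sqrt{\sum (X-\bar{X})^2 \, \sum (Y-\bar{Y})^2}},
\qquad s_X = \sqrt{\frac{\sum (X-\bar{X})^2}{N}}, \quad
       s_Y = \sqrt{\frac{\sum (Y-\bar{Y})^2}{N}} .
\end{align*}
```

The last equality holds because N times s_X times s_Y equals the square root of the product of the two sums of squared deviations.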

Therefore through some algebraic magic we get the computational formula, which is a bit more manageable.
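The slide's formula image is not in the transcript. The sketch below assumes the usual raw-score form, r = [NΣXY - ΣXΣY] / sqrt([NΣX² - (ΣX)²][NΣY² - (ΣY)²]), which needs only five running sums and no z-scores. The data are hypothetical.

```python
import math


def pearson_r_computational(xs, ys):
    """Pearson's r from raw sums: N, sum X, sum Y, sum XY, sum X^2, sum Y^2."""
    n = len(xs)
    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = math.sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
    return numerator / denominator


# Hypothetical data (not taken from the slides).
x = [1, 2, 3, 4, 5]
y = [1, 3, 5, 2, 4]

r = pearson_r_computational(x, y)
print(f"r = {r:.2f}")
```

For these five pairs the numerator is 5(50) - (15)(15) = 25 and the denominator is sqrt(50 × 50) = 50, so r = .50.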

Interpreting correlation coefficients. Strong association versus weak association: Strong: knowing one variable helps a lot in predicting the other. Weak: information about one variable does not help much in guessing the other. Rough guide: 0 = none; .25 = weak; .5 = moderate; .75 and above = strong. Index of association: R-squared is defined as the proportion of the variance of one variable accounted for by the other variable, a.k.a. a PRE statistic (Proportionate Reduction of Error).
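The PRE reading of R-squared can be checked directly. As a minimal sketch with hypothetical data, the code below compares the squared error made when guessing the mean of Y for every case against the squared error made when predicting Y from X. The prediction uses a least-squares line, which is an assumption of this illustration.

```python
def least_squares_line(xs, ys):
    """Return (intercept, slope) of the least-squares line predicting y from x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def proportionate_reduction_of_error(xs, ys):
    """PRE = (error ignoring X - error using X) / error ignoring X."""
    mean_y = sum(ys) / len(ys)
    intercept, slope = least_squares_line(xs, ys)
    error_without_x = sum((y - mean_y) ** 2 for y in ys)
    error_with_x = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    return (error_without_x - error_with_x) / error_without_x


# Hypothetical data with r = .50 (not taken from the slides).
x = [1, 2, 3, 4, 5]
y = [1, 3, 5, 2, 4]

pre = proportionate_reduction_of_error(x, y)
print(f"PRE = {pre:.2f}")
```

With r = .50, knowing X removes 25 percent of the squared prediction error, so a "moderate" correlation still leaves three quarters of the variance unaccounted for.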

Significance of the correlation. Null hypothesis? Formula: [formula image not in transcript]. Then look at Table C in Appendix B, or just look at Table F in Appendix B.
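The slide's formula is not in the transcript. A commonly used test of the null hypothesis that the population correlation is zero converts r to a t statistic with N - 2 degrees of freedom, t = r·sqrt(N - 2) / sqrt(1 - r²). The sketch below assumes that form; whether it matches the slide's formula or the appendix tables is not known. The critical value is the two-tailed .05 entry for 3 degrees of freedom from a standard t table.

```python
import math


def correlation_t_statistic(r, n):
    """t statistic and degrees of freedom for testing H0: population correlation = 0."""
    if n < 3 or abs(r) >= 1:
        raise ValueError("need n >= 3 and |r| < 1")
    df = n - 2
    t = r * math.sqrt(df) / math.sqrt(1 - r ** 2)
    return t, df


# Values from the five-case example: r = .50, N = 5.
t, df = correlation_t_statistic(r=0.50, n=5)

# Two-tailed critical t at alpha = .05 with 3 df, from a standard t table.
CRITICAL_T_DF3 = 3.182
reject_null = abs(t) > CRITICAL_T_DF3

print(f"t = {t:.3f}, df = {df}, reject H0 at .05: {reject_null}")
```

With only five cases, r = .50 gives a t of about 1.0, well short of the critical value, so the null hypothesis of no association cannot be rejected.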

Limitations of Pearson's r. 1) At best, one must speak of "strong" and "weak," "some" and "none": precisely the vagueness statistical work is meant to cure. 2) Assumes interval-level data: variables measured at other levels require different statistics to test for association.

3) Outliers and nonlinearity. The correlation coefficient does not always give a true indication of the clustering. There are two main exceptional cases: outliers and nonlinearity. [Scatterplot panels: r = .457 and r = .336]
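Both exceptional cases are easy to reproduce with small hypothetical data sets (none of these numbers come from the slides). In the sketch below, a perfect but curved relationship yields r = 0, and a single extreme point turns an unrelated cluster into a correlation near 1.

```python
import math


def pearson_r(xs, ys):
    """Pearson's r from deviation scores."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Nonlinearity: y is completely determined by x, but the relationship is a curve.
curve_x = [-3, -2, -1, 0, 1, 2, 3]
curve_y = [x ** 2 for x in curve_x]
r_curve = pearson_r(curve_x, curve_y)

# Outlier: four points with no association, then the same points plus one extreme case.
cluster_x = [1, 1, 3, 3]
cluster_y = [1, 3, 1, 3]
r_cluster = pearson_r(cluster_x, cluster_y)
r_with_outlier = pearson_r(cluster_x + [20], cluster_y + [20])

print(f"curve: r = {r_curve:.2f}")
print(f"cluster: r = {r_cluster:.2f}, with outlier: r = {r_with_outlier:.2f}")
```

In both cases the scatter diagram shows immediately what the coefficient hides, which is why the plot should be inspected before r is interpreted.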

4) Assumes a linear relationship

5) Christopher Achen argued in 1977 (and showed empirically) that two correlations can differ because the variances in the samples differ, not because the underlying relationship has changed. Solution? Regression analysis
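This point about sample variance can be illustrated with two hypothetical samples built from the same rule, y = x + error, with identical errors. The only difference is how widely x is spread. This is a constructed illustration, not Achen's data or analysis.

```python
import math


def slope_and_r(xs, ys):
    """Return (least-squares slope of y on x, Pearson's r)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sxx, sxy / math.sqrt(sxx * syy)


# The same errors are used in both samples, so the underlying relationship is identical.
errors = [-1, 1, -1, 1, -1, 1]
narrow_x = [4, 4, 5, 5, 6, 6]    # low variance in x
wide_x = [0, 0, 5, 5, 10, 10]    # high variance in x

narrow_y = [x + e for x, e in zip(narrow_x, errors)]
wide_y = [x + e for x, e in zip(wide_x, errors)]

narrow_slope, narrow_r = slope_and_r(narrow_x, narrow_y)
wide_slope, wide_r = slope_and_r(wide_x, wide_y)

print(f"narrow sample: slope = {narrow_slope:.2f}, r = {narrow_r:.2f}")
print(f"wide sample:   slope = {wide_slope:.2f}, r = {wide_r:.2f}")
```

The regression slope is 1 in both samples, while r rises from about .63 to about .97 simply because x varies more in the second sample. That is the sense in which regression analysis is the suggested solution.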

Three Types of Unobtrusive Research: 1. Content analysis: examine written documents such as editorials. 2. Analysis of existing statistics. 3. Historical/comparative analysis: historical records.

What is Content Analysis? The study of recorded human communication. Topics appropriate for CA: – "Who says what, to whom, how, and with what effect" – Effects of the media

Example: Investigated the media's role in framing the welfare privatization debate with a content analysis of ABC, CBS and NBC evening news and special programs from 1/1/94 to 8/22/96. Specials included Nightline, 20/20 and This Week with David Brinkley on ABC; 60 Minutes, 48 Hours and Face the Nation on CBS. Searched LexisNexis and the Vanderbilt Television Archives for all transcripts pertaining to the issue of how welfare should be administered, and found 191 stories. At the time of the study, NBC's earlier transcripts were not available on LexisNexis, so the authors searched for those stories using the Vanderbilt News Archives and then purchased pre-1997 transcripts from Burrell's Transcripts.

Coding, Counting and Record Keeping: Unit of analysis. Manifest vs. latent content coding. Analysis: – Counting – Qualitative evaluation
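Manifest coding followed by counting can be sketched as a simple term tally. The codebook, its terms, and the sample sentence below are all hypothetical and are not taken from the study described in this presentation. Latent coding would need human judgment and is not attempted here.

```python
import re
from collections import Counter

# Hypothetical codebook: category -> surface terms counted as manifest content.
CODEBOOK = {
    "dependency": ["dependency", "dependent"],
    "job loss": ["layoff", "layoffs", "job loss"],
}


def code_manifest_content(text, codebook):
    """Count whole-word occurrences of each category's terms in one unit of analysis."""
    counts = Counter()
    lowered = text.lower()
    for category, terms in codebook.items():
        for term in terms:
            pattern = r"\b" + re.escape(term) + r"\b"
            counts[category] += len(re.findall(pattern, lowered))
    return counts


# Hypothetical story text (one unit of analysis).
story = (
    "Critics warn that privatization means layoffs. "
    "Supporters say it ends dependency, and dependency is costly."
)

counts = code_manifest_content(story, CODEBOOK)
print(dict(counts))
```

Keeping the codebook as data rather than burying it in the counting logic makes the record keeping explicit and lets a second coder rerun the same scheme on the same texts.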

Coding: Pro-Privatization Frames (cause of problem / problem / solution)
9. Delivery / dependency / faith-based
10. Delivery / economic costs / faith-based
11. Delivery / dependency / non-profits
12. Delivery / econ. costs / non-profits
13. Delivery / dependency / for-profits
14. Delivery / econ. costs / for-profits
16. Gen govt / dependency / faith-based
17. Gen govt / econ. costs / faith-based
18. Gen govt / dependency / non-profits

Coding: Anti-Privatization Frames (cause of problem / problem / solution)
3. Privatization / job loss / don't privatize
4. Privatization / job loss / don't devolve
5. Privatization / accountability / don't privatize
6. Privatization / accountability / don't devolve
11. Secular / job loss / don't privatize
12. Secular / job loss / don't devolve
13. Secular / accountability / don't privatize

Hypothesis & Findings Authors hypothesized that mainstream (corporate owned) media would be biased toward privatization. Findings did not support such a hypothesis. Media coverage was remarkably balanced (with slight leaning against privatization)

Strengths of Content Analysis Economy of time and money. Easy to repeat a portion of the study if necessary. Permits study of processes over time. Researcher seldom has any effect on the subject being studied. Reliability.

Weaknesses of Content Analysis Limited to the examination of recorded communications. Problems of validity are likely.

Analyzing Existing Statistics Can be the main source of data or a supplemental source of data. Often existing data doesn't cover the exact question. Reliability is dependent on the quality of the statistics. Examples: Census data, Crime Stats

Problems with Existing Statistics. Problems with validity: – What's available vs. what is needed. Problems with reliability: – Moreno Valley example