
Crash Course in Correlation and Regression
MEASURING ASSOCIATION
Establishing a degree of association between two or more variables gets at the central objective of the scientific enterprise. Scientists spend most of their time figuring out how one thing relates to another and structuring these relationships into explanatory theories.

Scatterplots
A. Scatter diagram
A list of 1,000 data points would be impossible to grasp, so we need a method that examines the data and converts it into a more comprehensible format. One method is to plot the data for two variables (education and income; father's height and son's height; team spending in baseball and % wins) in a graph called a scatter diagram.

[Figures: seven example scatterplots, with r = 1.0, r = .85, r = .42, r = .17, r = -.94, r = -.54, and r = -.33]

Formula for the Correlation Coefficient
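
For reference, the Pearson product-moment coefficient is usually written as shown here. This is a sketch that assumes the slide means Pearson's r; the symbols x_i, y_i, and the sample means (written with bars) are notation introduced for illustration and do not come from the slides.

```latex
r = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}
         {\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2}\,\sqrt{\sum_{i=1}^{n} (y_i - \bar{y})^2}}
```

The numerator measures how x and y vary together; the denominator rescales by the spread of each variable, which is what keeps r between -1 and +1.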

Interpreting correlation coefficients
Ranges from -1 to +1; the sign gives the direction of the association. [0 = no association; .25 = weak; .5 = moderate; .75 and above = strong]
Square the correlation coefficient to create "R-squared," defined as the proportion of the variance of one variable accounted for by another variable, a.k.a. the PRE statistic (Proportionate Reduction of Error).
Which brings us to regression.
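
A minimal sketch of these ideas in Python may help. It computes Pearson's r and r-squared for a small made-up data set and attaches a rough strength label. The data values, the function names `pearson_r` and `strength_label`, and the choice to treat the slide's reference points as band cutoffs on |r| are all assumptions made for illustration.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient for two equal-length samples."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-products and sums of squares around the means.
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    s_yy = sum((y - mean_y) ** 2 for y in ys)
    return s_xy / math.sqrt(s_xx * s_yy)

def strength_label(r):
    """Rough label; the cutoffs are an assumed reading of the slide's scale."""
    size = abs(r)  # strength ignores direction
    if size >= 0.75:
        return "strong"
    if size >= 0.5:
        return "moderate"
    if size >= 0.25:
        return "weak"
    return "little or no association"

# Hypothetical data, invented for this example only.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

r = pearson_r(x, y)
r_squared = r ** 2  # proportion of variance in y accounted for by x
print(round(r, 3), round(r_squared, 3), strength_label(r))
```

Squaring r discards the sign, so r-squared describes how much variance is shared but not whether the relationship is positive or negative.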

MLB spending and performance example (Hoover & Donovan 2001):
Y [team finish] = α + β X [spending]
Expressing the model in words: values of the Y variable (team finish: 1st place, 2nd place, etc.) are a function of some constant (α), plus some amount of the X variable (spending). How much change in the Y variable (team finish) is associated with a change in the X variable (spending)? The answer lies in β (beta), a.k.a. the regression coefficient. In the baseball example, it would be the amount of improvement in team finish associated with an additional $1 million in spending on players' salaries.
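
To make the constant and the regression coefficient concrete, here is a small least-squares sketch. The four data points are invented for illustration and are not Hoover and Donovan's data; the function name `fit_line` is also hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = alpha + beta * x (one predictor)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    beta = s_xy / s_xx                # change in y per one-unit change in x
    alpha = mean_y - beta * mean_x    # value of y when x is zero
    return alpha, beta

# Hypothetical teams: payroll in $ millions and division finish (1 = first).
spending = [20, 40, 60, 80]
finish = [4, 3, 3, 2]

alpha, beta = fit_line(spending, finish)
print(f"finish = {alpha:.2f} + ({beta:.3f}) x spending")
```

A negative beta here means that more spending goes with a lower (better) finishing position.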

Hoover and Donovan, using 1999 MLB season data and a bivariate regression, found:
Team finish = 4.4 - 0.03 x spending (in $ millions)
Interpretation: The beta (a.k.a. the slope) for the relationship between spending and team finish was -0.03. That is, each additional million dollars a team spends is associated with a change of only 0.03 places (3 percent of one position) in division finish. These results imply that a team spending $70 million on players will finish close to second place (4.4 - 0.03 x 70 = 2.3). They also show that any given team would have to spend almost $34 million more to improve its finish by one position (-0.03 x $34 million = -1.02). The correlation was -0.39, which means that spending explains only 15 percent of the variation in team finish (r-squared = .15 = -0.39 x -0.39).
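
The arithmetic behind this interpretation can be checked in a few lines. The intercept, slope, and correlation below are the figures reported on the slide; the variable and function names are hypothetical.

```python
INTERCEPT = 4.4     # predicted finish at zero spending
SLOPE = -0.03       # change in finish per additional $1 million
CORRELATION = -0.39

def predicted_finish(spending_millions):
    """Predicted division finish from the fitted bivariate equation."""
    return INTERCEPT + SLOPE * spending_millions

finish_at_70 = predicted_finish(70)        # close to second place
spend_for_one_place = -1 / SLOPE           # $ millions needed to move up one position
variance_explained = CORRELATION ** 2      # r-squared

print(round(finish_at_70, 2))
print(round(spend_for_one_place, 1))
print(round(variance_explained, 2))
```

The exact spending needed for a one-position improvement is about $33.3 million, which the slide rounds up to "almost $34 million."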

Another Baseball Example
"Testing Causality Between Team Performance and Payroll: The Cases of Major League Baseball and English Soccer," by Stephen Hall, Stefan Szymanski, and Andrew S. Zimbalist, Journal of Sports Economics, 2002.

Multiple Regression
Multiple regression contains a single dependent variable and two or more independent variables. It is particularly appropriate when the causes (independent variables) are inter-correlated, which again is usually the case.

Multivariate regression is a powerful tool for examining how multiple factors (independent variables) influence a dependent variable. It differs from bivariate regression in that it can identify the independent effect a variable has on a dependent variable by holding all other variables constant. What other variables would we include in the baseball model to predict winning %?
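
The sketch below illustrates what "holding other variables constant" buys us. It fits a two-predictor regression by solving the normal equations on mean-centered data and compares the result with a bivariate fit. The data are constructed for illustration so that y = 1 + 2*x1 + 3*x2 exactly, with x1 and x2 correlated; the function name `fit_two_predictors` is hypothetical.

```python
def fit_two_predictors(x1, x2, y):
    """Least squares for y = a + b1*x1 + b2*x2 using centered sums of products."""
    n = len(y)
    m1, m2, my = sum(x1) / n, sum(x2) / n, sum(y) / n
    s11 = sum((a - m1) ** 2 for a in x1)
    s22 = sum((b - m2) ** 2 for b in x2)
    s12 = sum((a - m1) * (b - m2) for a, b in zip(x1, x2))
    s1y = sum((a - m1) * (c - my) for a, c in zip(x1, y))
    s2y = sum((b - m2) * (c - my) for b, c in zip(x2, y))
    det = s11 * s22 - s12 ** 2          # zero if x1 and x2 are perfectly collinear
    b1 = (s22 * s1y - s12 * s2y) / det
    b2 = (s11 * s2y - s12 * s1y) / det
    a = my - b1 * m1 - b2 * m2
    return a, b1, b2

# Constructed data: y depends on both predictors, and the predictors overlap.
x1 = [1, 2, 3, 4, 5]
x2 = [2, 1, 4, 3, 6]
y = [1 + 2 * a + 3 * b for a, b in zip(x1, x2)]

a, b1, b2 = fit_two_predictors(x1, x2, y)

# Bivariate slope of y on x1 alone, ignoring x2.
m1, my = sum(x1) / len(x1), sum(y) / len(y)
bivariate_b1 = (sum((p - m1) * (q - my) for p, q in zip(x1, y))
                / sum((p - m1) ** 2 for p in x1))

print(round(b1, 3), round(b2, 3), round(bivariate_b1, 3))
```

In this constructed case the bivariate slope for x1 is larger than its multivariate coefficient, because the bivariate fit credits x1 with part of the effect that belongs to the correlated x2.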

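[Figure 1: Venn diagram of Y, X1, and X2, in which X1 and X2 each overlap Y but not each other]

[Figure 2: Venn diagram of Y, X1, and X2, in which X1 and X2 also overlap each other; area c is the part of Y shared with both]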

Multivariate Regression
In figure 1, the fact that X1 and X2 do not overlap means that they are not correlated with each other, but each is correlated with Y. This is great: it means we don't need sophisticated analysis, just two separate bivariate regressions. In figure 2, X1 and X2 are correlated. Area c is created by the correlation between X1 and X2; it represents the proportion of the variance in Y that is shared jointly with X1 and X2. How do we deal with c? We can't count it twice, or the explained variance could add up to more than 100%.
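
The double-counting problem can be seen numerically. For two predictors, the multiple R-squared can be computed from the three pairwise correlations with a standard formula; the sketch below applies it to correlation values invented for illustration. The function name `r_squared_two_predictors` is hypothetical.

```python
def r_squared_two_predictors(r_y1, r_y2, r_12):
    """Multiple R-squared for Y on X1 and X2, from pairwise correlations.

    r_y1, r_y2: correlations of Y with X1 and with X2
    r_12:       correlation between X1 and X2 (must not be +1 or -1)
    """
    return (r_y1 ** 2 + r_y2 ** 2 - 2 * r_y1 * r_y2 * r_12) / (1 - r_12 ** 2)

# Hypothetical correlations with Y.
r_y1, r_y2 = 0.6, 0.5
naive_sum = r_y1 ** 2 + r_y2 ** 2   # adding the two bivariate r-squared values

uncorrelated = r_squared_two_predictors(r_y1, r_y2, 0.0)   # the figure 1 situation
correlated = r_squared_two_predictors(r_y1, r_y2, 0.5)     # the figure 2 situation

print(round(naive_sum, 3), round(uncorrelated, 3), round(correlated, 3))
```

When the predictors are uncorrelated, the two bivariate r-squared values simply add up. When they overlap, the naive sum counts the shared area twice and overstates the variance the two predictors explain together.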