Statistics for Political Science Levin and Fox Chapter 11: Regression Analysis

Presentation transcript:

Statistics for Political Science Levin and Fox Chapter 11: Regression Analysis

Regression Analysis
Regression analysis makes the importance of variance clearer.
Goal of research: explain variation.
Example: Judge A versus Judge B. Why do some defendants get longer sentences than others?

Regression Analysis: Judge Example (Sentences)
What if a specific judge handed down the following sentences (in months): 12, 13, 15, 19, 26, 27, 29, 31, 40, 48?
How do we explain the variation? What factors contributed to some defendants getting 48 months while others got only 12 months?
Mean = 26 months

Regression Analysis: Judge Example (Sentences)
The mean sentence tells us something about the judge's sentencing pattern, but it does not help explain the wide variation in sentences.
Variance (S²) is measured by calculating "the mean of the squared deviation":
S² = Σ(X − X̄)² / N
For these sentences, S² = 125.
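
A minimal sketch (plain Python, standard library only) that reproduces the mean and variance figures above from the ten sentences:

    sentences = [12, 13, 15, 19, 26, 27, 29, 31, 40, 48]

    n = len(sentences)
    mean = sum(sentences) / n                               # 26.0 months
    variance = sum((x - mean) ** 2 for x in sentences) / n  # population variance: 125.0
    print(mean, variance)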

Regression Analysis: Judge Example (Prior Convictions?)
How much of this variance is the consequence of a defendant's prior convictions?
Regression enables us to quantify the relative importance of any proposed factor or variable, in this case prior convictions.
Cause (IV): Prior Convictions → Effect (DV): Sentence Length

Regression Analysis: Regression Model
Y = a + bX + e
Y = DV: sentence length (response variable).
X = IV: prior convictions (predictor variable).
a = Y-intercept (baseline): what Y (DV: sentence length) is when X (IV: priors) = zero.
b = slope (regression coefficient) for X: the amount Y (sentence length) changes for each one-unit change in X (priors).
e = error term (what is unpredictable).

Regression Analysis: Regression Model
How much is the sentence (DV) affected by the number of a defendant's prior convictions (IV: cause)?
Y = a + bX + e
Y (DV, effect): sentence length.
a (Y-intercept, baseline): no priors (Y when X = 0).
b (slope, regression coefficient): amount Y changes for each one-unit change in X.
X (IV, cause): number of priors.
e (error term): unpredictable.

Regression Analysis: Regression Model
Y = a + bX + e: calculating each variable from the data.

Priors (IV: X):   0, 3, 1, 0, 6, 5, 3, 4, 10, 8   (N = 10, ΣX = 40, X̄ = 4)
Sentence (DV: Y): 12, 13, 15, 19, 26, 27, 29, 31, 40, 48   (ΣY = 260, Ȳ = 26 months)

Research Questions: Regression Model
Y = a + bX + e. Plug the means into the regression formula:
X̄ = 4 (mean of priors); Ȳ = 26 (mean of sentences).
Y = DV: sentence length. X = IV: prior convictions.
b [regression coefficient] = ?
a [y-intercept] = ?

Research Questions: Regression Model
Y = a + bX + e. Calculate b [regression coefficient]:
X̄ = 4 (mean of priors); Ȳ = 26 (mean of sentences).
b = Σ(X − X̄)(Y − Ȳ) / Σ(X − X̄)² = 300 / 100 = 3
a [y-intercept] = (calculated next)

Regression Model
Y = a + bX + e. Calculating a [y-intercept]:
X̄ = 4 (mean of priors); Ȳ = 26 (mean of sentences).
Y = DV: sentence length. X = IV: prior convictions.
b = Σ(X − X̄)(Y − Ȳ) / Σ(X − X̄)² = 300 / 100 = 3
a = Ȳ − bX̄ = 26 − (3)(4) = 14
So Y = a + bX becomes Ŷ = 14 + 3X (Ŷ = predicted sentence, X = prior convictions).
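
A minimal sketch (plain Python) reproducing the slope and intercept calculated above. The priors column is the reconstruction shown in the data table earlier, based on the slide's summary figures (N = 10, ΣX = 40, SP = 300, SSx = 100), so treat the exact pairing of priors to sentences as illustrative:

    priors    = [0, 3, 1, 0, 6, 5, 3, 4, 10, 8]           # X (IV), as reconstructed above
    sentences = [12, 13, 15, 19, 26, 27, 29, 31, 40, 48]  # Y (DV)

    n = len(priors)
    mean_x = sum(priors) / n        # 4.0
    mean_y = sum(sentences) / n     # 26.0

    sp  = sum((x - mean_x) * (y - mean_y) for x, y in zip(priors, sentences))  # 300.0
    ssx = sum((x - mean_x) ** 2 for x in priors)                               # 100.0

    b = sp / ssx              # 3.0  (regression coefficient)
    a = mean_y - b * mean_x   # 14.0 (Y-intercept)
    print(f"Y = {a} + {b}X")  # Y = 14.0 + 3.0X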

Regression Analysis: Alternative Method
Regression Model: Y = a + bX + e. An alternative way to calculate b [regression coefficient]:
X̄ = 4 (mean of priors); Ȳ = 26 (mean of sentences).
b = SP / SSx, where SP = Σ(X − X̄)(Y − Ȳ) is the sum of products and SSx = Σ(X − X̄)² is the sum of squares for X.

Regression Analysis
Calculating b [regression coefficient] with SP and SSx:
b = SP / SSx = 300 / 100 = 3 (this approach is implied in the long-hand calculations).

Regression Analysis
Calculating the regression coefficient: [table of deviation scores yielding the sum of products (SP) and the sum of squares for X (SSx)]

Regression Analysis
Calculating a [Y-intercept] with the SP and SSx data:
Y = a + bX, so a = Ȳ − bX̄ = 26 − (3)(4) = 14.

Regression Analysis: Regression Line
The regression line is the line that "falls closest to all the points in a scatter plot." It crosses the Y axis at the Y-intercept and traces the slope (b) for the independent variable (X: priors).
Y = a + bX + e
Ŷ = 14 + 3X …

Regression Analysis: Predicted and Actual Values
A regression line represents a "predicted rather than an actual value" (p. 262).

Regression Analysis: Defining Error (Residual)
An X (IV) value gives you a predicted value for Y (DV), which may in fact be different from the actual value of Y.
Ŷ = predicted Y (from Ŷ = a + bX)
Y = observed Y
The residual is the difference between Y and Ŷ: e = Y − Ŷ
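
A minimal sketch of this idea in Python: predicted values come from the fitted line Ŷ = 14 + 3X, and each residual is observed minus predicted (the priors column is the reconstruction used earlier, so the specific residuals are illustrative):

    priors    = [0, 3, 1, 0, 6, 5, 3, 4, 10, 8]
    sentences = [12, 13, 15, 19, 26, 27, 29, 31, 40, 48]

    predicted = [14 + 3 * x for x in priors]                          # Y-hat for each defendant
    residuals = [y - y_hat for y, y_hat in zip(sentences, predicted)]
    print(residuals)   # first defendant: e = 12 - 14 = -2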

Regression Analysis: Plotting a Regression Line
To plot a regression line you need to locate and then connect at least two points. The easiest way is to draw the line from the Y-intercept (a), that is, the point (X = 0, Y = a), through the point of the X and Y means (X̄, Ȳ).
a (Y-intercept, baseline: no priors) = 14
X̄ (mean of IV: prior convictions) = 4
Ȳ (mean of DV: sentence, in months) = 26
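
A minimal plotting sketch (assuming matplotlib is available) that draws the line exactly as described: from the Y-intercept (0, 14) through the point of means (4, 26):

    import matplotlib.pyplot as plt

    xs = [0, 4]                    # X = 0 (intercept) and the mean of X
    ys = [14 + 3 * x for x in xs]  # points on the line: 14 and 26

    plt.plot(xs, ys, marker="o")
    plt.xlabel("Prior convictions (X)")
    plt.ylabel("Sentence length in months (Y)")
    plt.title("Regression line: Y = 14 + 3X")
    plt.show()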

Regression Analysis

Regression Analysis
[Plot of the regression line: Y-intercept a = 14; mean of X (priors) = 4; mean of Y (sentence) = 26]

Regression Analysis: Plotting the Regression Line (Figure 11.2)
If the Y-intercept and the X and Y means are too close together to plot a line, you can insert a larger value for X and plug it into the equation.
Example: 10 priors
Ŷ (DV: sentence length) = ?
X (IV: prior convictions) = 10
a (Y-intercept, baseline: no priors) = 14
b (slope, regression coefficient) = 3
Ŷ = 14 + 3X
Ŷ = 14 + 3(10) = 44

Regression Analysis: Plotting the Regression Line (Figure 11.2)
Example: 13 priors
X (IV: prior convictions) = 13; a = 14; b = 3
Ŷ = 14 + 3X
Ŷ = 14 + 3(13) = 53
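
A minimal sketch of this kind of prediction in Python, matching the 10-prior and 13-prior examples above (the function name is just illustrative):

    def predicted_sentence(priors, a=14, b=3):
        """Predicted sentence length in months from the line Y-hat = a + bX."""
        return a + b * priors

    print(predicted_sentence(10))  # 44
    print(predicted_sentence(13))  # 53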

Regression Analysis
The chart itself can predict how changes in X (priors) will affect Y (sentence): 13 priors = 53 months.

Regression Analysis: Requirements of Regression
It is assumed that both variables are measured at the interval level.
Regression assumes a straight-line relationship.
Extremely deviant cases in the scatter plot are removed from the analysis.
Sample members must be chosen randomly in order to employ tests of significance.
To test the significance of the regression line, one must also assume normality for both variables or else have a large sample.

Regression Analysis: Review
Interpreting the Regression Line: Regression analysis allows us to make predictions about how one variable (IV: cause, X) will affect another (DV: effect, Y).
Example: Prior Convictions and Sentence Length
The Y-intercept tells us what Y (DV) is when X (IV) is zero: if a defendant has no priors, the predicted sentence is 14 months.
The regression coefficient b tells us how much Y (DV: sentence length) will increase or decrease for each one-unit change in X (IV: priors).
As such, we can predict what the sentence length will be for a defendant based on their number of prior convictions.

Regression Analysis
Interpreting the Regression Line: Regression analysis allows us to make predictions about how one variable (IV: cause, X) will affect another (DV: effect, Y).
Example: 5 priors
Ŷ = 14 + 3X
Ŷ = 14 + 3(5) = 29

Extra: Regression Analysis
Scatterplot (or Scatter Diagram): The scatterplot provides a visual means of "displaying the relationship between two interval-ratio variables."
Example: GNP and % willingness to pay for environmental protection.
Hypothesis: The higher a country's GNP (IV), the more willing it will be to pay higher prices for environmental protection (DV).
IV: GNP. DV: willingness to pay for EP. Direction: positive.

Extra: Regression Analysis
[Figure 8.1: GNP and willingness to pay for environmental protection]

Extra: Regression Analysis GNP and % Willingness to Pay for Environmental Protection: It appears as though there is a positive relationship between GNP and % willingness to pay for environmental protection.

Extra: Regression Analysis
[Figure 8.2: GNP and "environment is sacred": a negative relationship]

Extra: Regression Analysis
[Figure 8.7: GNP and "environment is sacred": a negative relationship]

Extra: Regression Analysis
Linear Relations and Prediction Rules: Though the relationship between GNP and willingness to pay for EP is not perfect, there is a clear trend.
Linear Relationship: It "allows us to approximate the observations displayed in a scatter diagram with a straight line."
Perfect Linear Relationship (Deterministic Relationship): It "provides a predicted value of Y (vertical axis) for any value of X (horizontal axis)."

Extra: Regression Analysis
Example of a Perfect Linear Relationship: Take the example of teachers' salaries and seniority, where seniority determines salary.
Y = a + bX + e
Y = DV: salary.
X = IV: seniority.
a = Y-intercept (baseline): starting salary (what Y is when X = zero).
b = slope (regression coefficient) for X: the amount Y changes for each one-unit change in X.

Extra: Regression Analysis
Example of a Perfect Linear Relationship: Using this formula we can determine what an individual teacher's salary (DV: effect) will be, starting with a baseline (a) of $12,000 plus an extra $2,000 for each year on the job (X: IV: cause).
Y = a + bX becomes Y = 12,000 + 2,000X (because this is a deterministic relationship, in which seniority determines salary, there is no error term).

Extra: Regression Analysis
Seniority and Salary (Figure 8.5): Y = 12,000 + 2,000(7) = $26,000
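
A minimal sketch of the deterministic relationship in Python: with no error term, seniority alone fixes the salary (the function name is just illustrative):

    def salary(seniority_years):
        """Perfect linear relationship: $12,000 baseline plus $2,000 per year."""
        return 12_000 + 2_000 * seniority_years

    print(salary(7))   # 26000, matching Figure 8.5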

STOP: Material beyond this point is NOT required for Exam 5.

Regression Analysis: Constructing Straight-Line Graphs
We can demonstrate a linear relationship by drawing a straight line on a scatterplot. How do we know where to draw the line on a scatterplot? …

Regression Analysis Example: GNP and Environmental Protection (Figure 8.4)

Regression Analysis: Best-Fitting Line
The best-fitting line is the line with the least error (p. 270).
Defining Error (Residual): An X (IV) value gives you a predicted value for Y (DV), which may in fact be different from the actual value of Y. …

Regression Analysis: Defining Error (Residual)
Ŷ = predicted Y (from Ŷ = a + bX)
Y = observed Y
The residual is the difference between Y and Ŷ: e = Y − Ŷ …

Regression Analysis: Canada (GNP and Environmental Protection)
Ŷ = 40 (predicted, from Ŷ = a + bX)
Y = 41.8 (observed)
The residual is the difference between Y and Ŷ: e = 41.8 − 40 = 1.8

Regression Analysis Canada: GNP and Environmental Protection (Figure 8.3)

Regression Analysis: Error (Residual)
How do we draw a line that minimizes e across all individual observations?
Residual Sum of Squares: ∑e² = ∑(Y − Ŷ)²
Least-Squares Line: "The best-fitting line is that line where the sum of the squared residuals, ∑e², is at a minimum."
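
A minimal sketch of the least-squares criterion in Python, using the judge example from earlier (priors column as reconstructed there): the fitted line Ŷ = 14 + 3X gives a smaller residual sum of squares than any other candidate line.

    priors    = [0, 3, 1, 0, 6, 5, 3, 4, 10, 8]
    sentences = [12, 13, 15, 19, 26, 27, 29, 31, 40, 48]

    def rss(a, b):
        """Residual sum of squares for the candidate line Y-hat = a + b*X."""
        return sum((y - (a + b * x)) ** 2 for x, y in zip(priors, sentences))

    print(rss(14, 3))    # 350   (the least-squares line: the minimum)
    print(rss(14, 2.5))  # 415.0 (any other slope/intercept gives a larger value)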