1
Unit 3 – Linear regression
2
Linear relationships
When dealing with linear relationships we need 2 variables:
Response variable (y)
Explanatory variable (x)
3
Describing a Linear Relationship
There is a [strength] [direction] [form] relationship between x and y.
Strength: weak, moderate, or strong
Direction: positive or negative
Form: linear, curved, or scattered
4
Finding the line of best fit
Equation: $\hat{y} = B_0 + B_1 x$
$\hat{y}$ is the predicted Y, $B_0$ is the y-intercept, $B_1$ is the slope, and $x$ is the X value.
Slope $= r \cdot \frac{S_y}{S_x}$
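The slope formula above can be tried out on a small data set. The following is a minimal sketch, assuming that r is the sample correlation and that S_x and S_y are the sample standard deviations. The slide does not give a formula for the y-intercept, so the sketch uses the standard least-squares result (intercept = mean of y minus slope times mean of x). The data values and the name `fit_line` are made up for illustration.

```python
from statistics import mean, stdev

def fit_line(xs, ys):
    """Return (intercept, slope, r) for the line of best fit."""
    n = len(xs)
    x_bar, y_bar = mean(xs), mean(ys)
    s_x, s_y = stdev(xs), stdev(ys)  # sample standard deviations
    # Correlation r: sum of products of z-scores, divided by n - 1.
    r = sum(((x - x_bar) / s_x) * ((y - y_bar) / s_y)
            for x, y in zip(xs, ys)) / (n - 1)
    slope = r * s_y / s_x               # slope formula from the slide
    intercept = y_bar - slope * x_bar   # standard result, not on the slide
    return intercept, slope, r

# Made-up example data.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

b0, b1, r = fit_line(xs, ys)
print(f"y-hat = {b0:.2f} + {b1:.2f} x   (r = {r:.3f})")
```

For this data the slope works out to about 0.6 and the intercept to about 2.2.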
5
CAUTION: When using a line of best fit
Extrapolation: using your equation to predict outside the range of the data you used to come up with your equation.
Lurking variables: an underlying variable that makes the relationship look different than it is in reality.
Outliers: your equation is strongly influenced by outliers.
Causation: do not claim that X causes Y. A linear relationship alone does not show that X causes Y.
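Two of these cautions can be illustrated numerically. The sketch below refits a line after adding one made-up outlier, and flags a prediction whose x value lies outside the observed range. The data, the outlier at (10, 0), and the function names `least_squares` and `predict` are all assumptions chosen for illustration.

```python
from statistics import mean

def least_squares(xs, ys):
    """Return (intercept, slope) of the least-squares line."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope

def predict(x, intercept, slope, xs):
    """Predict y and report whether x is outside the observed x range."""
    is_extrapolation = x < min(xs) or x > max(xs)
    return intercept + slope * x, is_extrapolation

# Made-up example data.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
b0, b1 = least_squares(xs, ys)

# Same data plus a single made-up outlier at (10, 0).
b0_out, b1_out = least_squares(xs + [10], ys + [0])
print(f"slope without outlier: {b1:.3f}")
print(f"slope with outlier:    {b1_out:.3f}")

# x = 20 is far outside the observed range of 1 to 5.
y_hat, flagged = predict(20, b0, b1, xs)
print(f"prediction at x = 20: {y_hat:.2f}, extrapolation: {flagged}")
```

In this example the single added point is enough to change the sign of the slope, which is why outliers deserve a close look before trusting the equation.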
6
Interpreting Linear Relationship Output: Sentences
Slope: As ___X___ increases by 1, ___Y___ is expected to increase/decrease by ___slope___.
Y-intercept: When ___X___ is 0, ___Y___ is expected to be ___Y-intercept___.
R-squared: ___$R^2$ (as a percent)___ of the variability in ___Y___ can be explained by ___X___.
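As a rough illustration of how the sentence templates get filled in, the sketch below fits a line to made-up data and prints the three sentences. The variable names "hours studied" and "quiz score" are hypothetical, as are the function names. R-squared is computed here as the squared correlation, which holds for a regression with one explanatory variable.

```python
from statistics import mean

def summarize_fit(xs, ys):
    """Return (intercept, slope, r_squared) for the least-squares line."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    r_squared = sxy ** 2 / (sxx * syy)  # squared correlation
    return intercept, slope, r_squared

def interpret(x_name, y_name, intercept, slope, r_squared):
    """Fill in the three interpretation sentences."""
    direction = "increase" if slope >= 0 else "decrease"
    return [
        f"As {x_name} increases by 1, {y_name} is expected to "
        f"{direction} by {abs(slope):.2f}.",
        f"When {x_name} is 0, {y_name} is expected to be {intercept:.2f}.",
        f"{r_squared:.1%} of the variability in {y_name} "
        f"can be explained by {x_name}.",
    ]

# Made-up example data and hypothetical variable names.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
sentences = interpret("hours studied", "quiz score", *summarize_fit(xs, ys))
for sentence in sentences:
    print(sentence)
```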
7
What is a residual?
It's the difference between the actual value and the predicted value.
Residual $= y - \hat{y}$
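A short sketch of the residual calculation, using made-up data and hypothetical function names. Each residual is the actual y minus the y predicted by the fitted line.

```python
from statistics import mean

def least_squares(xs, ys):
    """Return (intercept, slope) of the least-squares line."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope

def residuals(xs, ys, intercept, slope):
    """Residual = actual y minus predicted y, one per data point."""
    return [y - (intercept + slope * x) for x, y in zip(xs, ys)]

# Made-up example data.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

b0, b1 = least_squares(xs, ys)
res = residuals(xs, ys, b0, b1)
for x, y, e in zip(xs, ys, res):
    print(f"x = {x}, actual y = {y}, predicted y = {y - e:.2f}, residual = {e:+.2f}")
```

For a least-squares line that includes an intercept, the residuals sum to zero (up to rounding), which is a quick sanity check on the arithmetic.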
8
What do you want residuals to look like? (To be confident your line is a good fit)
Scattered, or following a pattern?
All above 0, all below 0, or half and half?
Equal variance around 0, or unequal variance around 0?
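These questions are normally answered by looking at a residual plot. As a crude numeric stand-in, the sketch below counts residuals above and below 0 and compares their spread for the smaller and larger x values. The split-half comparison is an assumption made for illustration, not a method from the slides, and with so few points it is only a rough hint. Data and function names are made up.

```python
from statistics import mean, pstdev

def least_squares(xs, ys):
    """Return (intercept, slope) of the least-squares line."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope

def residual_checks(xs, ys):
    """Return (above, below, spread_low, spread_high) for the residuals."""
    b0, b1 = least_squares(xs, ys)
    pairs = sorted(zip(xs, ys))  # order the points by x
    res = [y - (b0 + b1 * x) for x, y in pairs]
    above = sum(1 for e in res if e > 0)
    below = sum(1 for e in res if e < 0)
    half = len(res) // 2
    spread_low = pstdev(res[:half])    # spread for the smaller x values
    spread_high = pstdev(res[-half:])  # spread for the larger x values
    return above, below, spread_low, spread_high

# Made-up example data.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

above, below, spread_low, spread_high = residual_checks(xs, ys)
print(f"residuals above 0: {above}, below 0: {below}")
print(f"spread for small x: {spread_low:.2f}, spread for large x: {spread_high:.2f}")
```

A roughly even split above and below 0, with similar spread across the x range, is what you hope to see. A plot is still the better tool for spotting patterns.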
9
So, how do you know if you can do a linear regression?
The relationship between X and Y is linear (check by looking at a scatterplot of x and y or the residual plot).
There are no obvious lurking variables (you can generally assume this for this course).
The data come from a simple random sample (SRS).
The residuals have constant variance (check the plot of residuals).
The residuals vary according to a normal distribution (check the normal quantile plot of residuals).
10
Confidence Intervals for the slope
$b_1 \pm t_{\alpha/2;\, n-2} \cdot se(b_1)$
The standard error $se(b_1)$ can be obtained from the JMP output.
$b_1$ is the estimated value of the slope.
Degrees of freedom = n - 2
I am ___% confident that the true slope between __X__ and __Y__ is between _____ and ______.
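The interval formula can be worked through by hand on a small example. In this course the standard error comes from the JMP output. The sketch below instead computes it with the standard formula se(b1) = s / sqrt(Sxx), where s is the residual standard error with n - 2 degrees of freedom, which is an assumption about what the software reports. The critical value 3.182 is the t-table value for 95% confidence with 3 degrees of freedom. Data and function names are made up.

```python
from math import sqrt
from statistics import mean

def slope_and_se(xs, ys):
    """Return (b1, se_b1): estimated slope and its standard error."""
    n = len(xs)
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    sse = sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(xs, ys))
    s = sqrt(sse / (n - 2))  # residual standard error, df = n - 2
    return b1, s / sqrt(sxx)

def slope_confidence_interval(b1, se_b1, t_crit):
    """Return (lower, upper) = b1 -/+ t_crit * se(b1)."""
    margin = t_crit * se_b1
    return b1 - margin, b1 + margin

# Made-up example data: n = 5, so df = 5 - 2 = 3.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
T_CRIT_95_DF3 = 3.182  # t-table value for 95% confidence, df = 3

b1, se_b1 = slope_and_se(xs, ys)
lower, upper = slope_confidence_interval(b1, se_b1, T_CRIT_95_DF3)
print(f"b1 = {b1:.3f}, se(b1) = {se_b1:.3f}")
print(f"95% confidence interval for the slope: ({lower:.3f}, {upper:.3f})")
```

For this data the interval runs from about -0.3 to about 1.5. Because it contains 0, the data are consistent with a true slope of 0.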
11
Hypothesis testing for the slope
$H_0: B_1 = 0$
$H_A: B_1 \neq 0$
Obtain a t-stat.
Obtain a p-value.
Decide whether to reject $H_0$.
If you reject $H_0$, then there is sufficient evidence to suggest there is a linear relationship between ___explanatory___ and ___response___.
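A minimal sketch of the test statistic, assuming the standard form t = b1 / se(b1) with n - 2 degrees of freedom. The Python standard library has no t distribution, so instead of a p-value the sketch compares |t| to a t-table critical value. For a two-sided test this gives the same decision as comparing the p-value to alpha. Data, function names, and the choice of alpha = 0.05 are made up for illustration.

```python
from math import sqrt
from statistics import mean

def slope_and_se(xs, ys):
    """Return (b1, se_b1): estimated slope and its standard error."""
    n = len(xs)
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    sse = sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(xs, ys))
    s = sqrt(sse / (n - 2))  # residual standard error, df = n - 2
    return b1, s / sqrt(sxx)

def slope_t_test(b1, se_b1, t_crit):
    """Return (t_stat, reject_h0) for H0: B1 = 0 against HA: B1 != 0."""
    t_stat = b1 / se_b1
    return t_stat, abs(t_stat) > t_crit

# Made-up example data: n = 5, so df = 3.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
T_CRIT_05_DF3 = 3.182  # two-sided critical value for alpha = 0.05, df = 3

b1, se_b1 = slope_and_se(xs, ys)
t_stat, reject_h0 = slope_t_test(b1, se_b1, T_CRIT_05_DF3)
print(f"t-stat = {t_stat:.3f}, reject H0: {reject_h0}")
```

Here the t-stat is about 2.12, which is smaller than 3.182, so H0 is not rejected and there is not sufficient evidence of a linear relationship in this small made-up sample.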