Slide 1
Mahrita.Harahap@uts.edu.au
Maths Study Centre, CB01.16.11, open 11am – 5pm on semester weekdays
https://www.uts.edu.au/future-students/science/student-experience/maths-study-centre
www.khanacademy.org
www.mahritaharahap.wordpress.com/teachingareas/regression
To open SPSS: START > ALL PROGRAMS > IBM SPSS > IBM SPSS STATISTICS 19
Marking scheme: 0 if less than 50% attempted; 1 if more than 50% attempted but less than 50% correct; 2 if more than 50% correct.
Slide 2
Feedback for Lab 1
Slide 3
The key point is that a prediction interval describes the distribution of individual values, not the uncertainty in estimating the population mean. A prediction interval must account for both the uncertainty in estimating the population mean and the scatter of the data around it, so a prediction interval is always wider than the corresponding confidence interval.
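To see this numerically, here is a minimal Python sketch using statsmodels (the data are simulated and the variable names and the age value of 45 are invented for illustration; the lab itself uses SPSS):

    import numpy as np
    import statsmodels.api as sm

    # Simulated age / systolic blood pressure data (illustration only)
    rng = np.random.default_rng(0)
    age = rng.uniform(20, 70, 50)
    sbp = 100 + 0.8 * age + rng.normal(0, 8, 50)

    fit = sm.OLS(sbp, sm.add_constant(age)).fit()

    # Intervals for a new observation at age = 45
    new = sm.add_constant(np.array([45.0]), has_constant='add')
    frame = fit.get_prediction(new).summary_frame(alpha=0.05)
    print(frame[['mean_ci_lower', 'mean_ci_upper']])  # CI for the mean response
    print(frame[['obs_ci_lower', 'obs_ci_upper']])    # prediction interval (wider)

Running this, the obs_ci columns always span a wider range than the mean_ci columns, because the prediction interval adds the residual scatter on top of the uncertainty in the fitted mean.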
Slide 4
Model Assumptions: Residual Plots One of the assumptions of the model is that the random errors (residuals) are normally distributed. To assess the normality of the residuals we can examine the normal probability plot, which is constructed by plotting the expected values of the residuals under the normality assumption (the straight line) against their observed values. If the normality assumption is valid, the residuals should lie approximately on the straight line; any systematic non-linear trend indicates a violation of the normality assumption. This plot may also reveal outliers.
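A minimal sketch of this check in Python, assuming the fitted model `fit` from the earlier snippet:

    import statsmodels.api as sm
    import matplotlib.pyplot as plt

    # Normal probability (Q-Q) plot: points near the line support normality
    sm.qqplot(fit.resid, line='45', fit=True)
    plt.title('Normal Q-Q plot of residuals')
    plt.show()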
Slide 5
Model Assumptions: Residual Plots Another assumption of the model is that the random errors (residuals) have constant variance (homoscedasticity). If the variance is not constant (heteroscedasticity), then ordinary least squares is no longer the most efficient estimation method. When the variance is some function of the mean, a transformation of the response can often stabilise it.
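A sketch of the usual diagnostic, again assuming `fit`, `age` and `sbp` from the earlier snippet; the log transformation shown at the end is one common choice when the spread grows with the mean, not the only option:

    import numpy as np
    import statsmodels.api as sm
    import matplotlib.pyplot as plt

    # Residuals vs fitted values: a funnel shape suggests heteroscedasticity
    plt.scatter(fit.fittedvalues, fit.resid)
    plt.axhline(0, color='grey')
    plt.xlabel('Fitted values')
    plt.ylabel('Residuals')
    plt.show()

    # One common variance-stabilising transformation: model log(Y) instead of Y
    log_fit = sm.OLS(np.log(sbp), sm.add_constant(age)).fit()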
Slide 7
On SPSS: Graph > Legacy Dialogs > Boxplot > Summaries of Separate Variables
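For reference, the same "summaries of separate variables" boxplot can be sketched in Python (variable names carried over from the earlier illustrative snippet; the lab itself uses the SPSS menu above):

    import matplotlib.pyplot as plt

    # One box per variable, side by side
    plt.boxplot([sbp, age], labels=['Systolic BP', 'Age'])
    plt.show()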
Slide 8
Coefficient of determination R 2 R-squared gives us the proportion of the total variability in the response variable (Y) that is explained by the least squares regression line based on the predictor variable (X). It is usually stated as a percentage. Interpretation: R 2 % of the variation in the dependent variable can be explained by the independent variable through the regression model.
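A short sketch of what R-squared measures, using the `fit` and `sbp` from the earlier illustrative snippet:

    # R^2 = 1 - (unexplained variation) / (total variation)
    ss_res = (fit.resid ** 2).sum()            # residual sum of squares
    ss_tot = ((sbp - sbp.mean()) ** 2).sum()   # total sum of squares for Y
    r_squared = 1 - ss_res / ss_tot
    print(r_squared, fit.rsquared)             # the two values agree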
Slide 9
Regression Significance: t-test H 0 : β=0. There is no association between the response variable and the independent variable. (Regression is insignificant) E[y|x]= α + 0*X H 1 : β≠0. The independent variable affects the response variable. (Regression is significant) E[y|x]= α + βX Test statistic: t = β̂/se(β̂), which under H 0 follows a t distribution with n − 2 degrees of freedom. If p-value≤α: reject H 0 ; age is significantly related to systolic blood pressure, and the regression is significant. If p-value>α: do not reject H 0 ; age is not significantly related to systolic blood pressure, and the regression is not significant.
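A sketch of this test as statsmodels reports it for the earlier illustrative fit (index 1 is the slope β; index 0 is the intercept α):

    # t = estimated slope / its standard error
    t_stat = fit.params[1] / fit.bse[1]
    print(t_stat, fit.tvalues[1])   # identical by construction
    print(fit.pvalues[1])           # reject H0 if this p-value <= alpha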
Slide 10
Regression Significance: F-test / Goodness-of-fit test H 0 : β=0. There is no association between the response variable and the independent variable. (Regression is insignificant) E[y|x]= α + 0*X H 1 : β≠0. The independent variable/s affect the response variable. (Regression is significant) E[y|x]= α + βX Test statistic: F = MSR/MSE (regression mean square over residual mean square), which under H 0 follows an F distribution with 1 and n − 2 degrees of freedom; in simple regression F = t 2. If p-value≤α: reject H 0 ; age is significantly related to systolic blood pressure, and the regression is significant. If p-value>α: do not reject H 0 ; age is not significantly related to systolic blood pressure, and the regression is not significant.
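The corresponding values from the earlier illustrative fit, including the F = t² identity that holds when there is a single predictor:

    print(fit.fvalue, fit.f_pvalue)   # overall F statistic and its p-value
    print(fit.tvalues[1] ** 2)        # equals fit.fvalue in simple regression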