1
2DS00 Statistics 1 for Chemical Engineering Lecture 3
2
Week schedule
- Week 1: Measurement and statistics
- Week 2: Error propagation
- Week 3: Simple linear regression analysis
- Week 4: Multiple linear regression analysis
- Week 5: Nonlinear regression analysis
3
Detailed contents of week 3
- Least Squares Method
- simple linear regression
  - parameter estimates
  - residuals
  - confidence intervals
  - significance test
  - influential points
  - lack-of-fit
4
Least Squares
- measurements of time and distance
- estimate speed (assuming constant speed)
5
Table of measurements and squares

Time (s)   Measured distance   Computed distance   Measured - computed   Square
 1          36.754              21                  15.754                248.19
 2          71.845              32                  39.845               1587.62
 3          60.479              43                  17.479                305.52
 4         101.149              54                  47.149               2223.03
 5         103.150              65                  38.150               1455.42
 6         111.148              76                  35.148               1235.38
 7         142.170              87                  55.170               3043.73
 8         157.334              98                  59.334               3520.52
 9         161.843             109                  52.843               2792.38
10         206.030             120                  86.030               7401.16
Sum of squares                                                          23812.96
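The sum of squares in the table can be reproduced with a short sketch, and the same function can then be minimised to obtain the least squares line. The candidate line 10 + 11·t is inferred from the computed-distance column; fitting a line with an intercept (rather than a line through the origin) is an assumption, and all variable names are illustrative.

```python
# Data from the table: time in seconds and measured distance.
times = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
measured = [36.754, 71.845, 60.479, 101.149, 103.150,
            111.148, 142.170, 157.334, 161.843, 206.030]

def sum_of_squares(intercept, slope):
    """Sum of squared differences between measured and computed distance."""
    return sum((y - (intercept + slope * t)) ** 2
               for t, y in zip(times, measured))

# Candidate line inferred from the computed-distance column (21, 32, 43, ...).
ss_candidate = sum_of_squares(10.0, 11.0)

# Closed-form least squares estimates for a straight line with intercept
# (assumption: the slides may instead fit a line through the origin).
n = len(times)
t_mean = sum(times) / n
y_mean = sum(measured) / n
s_tt = sum((t - t_mean) ** 2 for t in times)
s_ty = sum((t - t_mean) * (y - y_mean) for t, y in zip(times, measured))
slope_hat = s_ty / s_tt                      # estimated speed
intercept_hat = y_mean - slope_hat * t_mean
ss_least_squares = sum_of_squares(intercept_hat, slope_hat)

print(f"candidate line:     SS = {ss_candidate:.2f}")
print(f"least squares line: speed = {slope_hat:.3f}, "
      f"intercept = {intercept_hat:.3f}, SS = {ss_least_squares:.2f}")
```

No other straight line gives a smaller sum of squares than the least squares line, so the second value printed is necessarily below the first.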
6
Visualisation of sums of squares
7
Types of regression analysis
"Linear" means linear in the coefficients, not that the fitted function itself must be linear!
- Simple linear regression
- Multiple linear regression
- Non-linear regression
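To make "linear in the coefficients" concrete, the sketch below checks numerically whether a model obeys superposition in its coefficients. The two example models, the helper name `is_linear_in_coefficients`, and the test values are illustrative assumptions, not taken from the slides.

```python
import math

def quadratic(x, coeffs):
    """b0 + b1*x + b2*x**2: curved in x, but linear in the coefficients."""
    b0, b1, b2 = coeffs
    return b0 + b1 * x + b2 * x ** 2

def exponential(x, coeffs):
    """b0 * exp(b1*x): not linear in the coefficient b1."""
    b0, b1 = coeffs
    return b0 * math.exp(b1 * x)

def is_linear_in_coefficients(model, beta, gamma, xs, a=2.0, b=-3.0):
    """Check f(x; a*beta + b*gamma) == a*f(x; beta) + b*f(x; gamma) on xs.

    This is a numerical spot check at a few points, not a proof.
    """
    combo = [a * p + b * q for p, q in zip(beta, gamma)]
    return all(
        math.isclose(model(x, combo),
                     a * model(x, beta) + b * model(x, gamma),
                     rel_tol=1e-9, abs_tol=1e-9)
        for x in xs
    )

xs = [0.0, 0.5, 1.0, 2.0]
print(is_linear_in_coefficients(quadratic, (1.0, 2.0, 0.5), (-0.3, 0.7, 1.1), xs))
print(is_linear_in_coefficients(exponential, (1.0, 0.5), (2.0, -0.2), xs))
```

The quadratic passes the check and can therefore be fitted with (multiple) linear regression, whereas the exponential model fails it and calls for non-linear regression.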
8
Surface tension of nitrobenzene
- measurements of temperature and surface tension
- temperature ranges from 40 to 200 °C
- scatter plot indicates a linear relation
9
Regression analysis of nitrobenzene example
10
Confidence intervals
- parameter estimates: estimate ± $t_{14-2;\,0.025}$ × standard error
- predicted values: extrapolation is dangerous; predictions are most accurate at the mean of the independent variable
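A minimal sketch of both points follows. The nitrobenzene measurements are not reproduced in this transcript, so the time and distance data from the earlier table are used instead; with 10 observations the degrees of freedom are 10 - 2 = 8 rather than 14 - 2. The critical value 2.306 is the tabulated $t_{8;\,0.025}$, and the function name `se_mean_prediction` is illustrative.

```python
import math

# Time/distance data from the earlier table (not the nitrobenzene data).
times = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
measured = [36.754, 71.845, 60.479, 101.149, 103.150,
            111.148, 142.170, 157.334, 161.843, 206.030]

n = len(times)
x_mean = sum(times) / n
y_mean = sum(measured) / n
s_xx = sum((x - x_mean) ** 2 for x in times)
s_xy = sum((x - x_mean) * (y - y_mean) for x, y in zip(times, measured))
slope = s_xy / s_xx
intercept = y_mean - slope * x_mean

# Residual standard deviation with n - 2 degrees of freedom.
residual_ss = sum((y - (intercept + slope * x)) ** 2
                  for x, y in zip(times, measured))
s = math.sqrt(residual_ss / (n - 2))

# 95% confidence interval for the slope: estimate +/- t * standard error.
T_CRIT = 2.306  # tabulated t value for 8 degrees of freedom, upper 2.5% point
se_slope = s / math.sqrt(s_xx)
ci_slope = (slope - T_CRIT * se_slope, slope + T_CRIT * se_slope)

def se_mean_prediction(x0):
    """Standard error of the predicted mean response at x0."""
    return s * math.sqrt(1 / n + (x0 - x_mean) ** 2 / s_xx)

print(f"slope = {slope:.3f}, 95% CI = ({ci_slope[0]:.3f}, {ci_slope[1]:.3f})")
for x0 in (x_mean, 10, 20):
    print(f"x0 = {x0:5.1f}: standard error of prediction = {se_mean_prediction(x0):.3f}")
```

The standard error is smallest at the mean of the independent variable and grows with the distance from it, which is why the interval widens rapidly once x0 leaves the measured range.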
11
Extrapolation
12
Significance testing
13
Simple linear regression: model assumptions
Model: $Y_i = \beta_0 + \beta_1 x_i + \varepsilon_i$
Assumptions:
- the model is linear (and contains enough terms)
- the $\varepsilon_i$'s are normally distributed with $\mu = 0$ and constant variance $\sigma^2$
- the $\varepsilon_i$'s are independent
14
Normality checking + independence
- check normality by considering the residuals
- apply both graphical checks and the Shapiro-Wilk test
- check independence by using the Durbin-Watson test
- also check the residuals by plotting them against time
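The Durbin-Watson statistic is simple enough to compute by hand; the sketch below uses the usual definition (sum of squared successive differences of the residuals divided by the sum of squared residuals). The function name and the two toy residual sequences are illustrative. The Shapiro-Wilk test is more involved; in Python it is available as `scipy.stats.shapiro`, which is not used here to keep the example free of dependencies.

```python
def durbin_watson(residuals):
    """Durbin-Watson statistic for residuals ordered in time.

    Values near 2 suggest no first-order autocorrelation, values near 0
    suggest positive autocorrelation, values near 4 negative autocorrelation.
    """
    numerator = sum((residuals[i] - residuals[i - 1]) ** 2
                    for i in range(1, len(residuals)))
    denominator = sum(e ** 2 for e in residuals)
    return numerator / denominator

# Toy residual sequences (illustrative, not from the lecture).
print(durbin_watson([1, -1, 1, -1]))  # alternating signs: 3.0
print(durbin_watson([1, 1, 1, 1]))    # identical neighbours: 0.0
```

The statistic only makes sense when the residuals are listed in the order in which the measurements were taken.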
15
Residuals
- use studentized residuals in order to obtain a universal scale
- e versus [symbol lost]: homogeneity of variance
- e versus [symbol lost]: linearity
- e versus time: independence of errors
- e versus $x_i$: homogeneity of variance
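As a sketch of what studentizing does, the function below divides each residual of a straight-line fit by its own estimated standard deviation, $s\sqrt{1 - h_{ii}}$, where $h_{ii}$ is the leverage of observation i. This is the internally studentized version; whether the lecture uses this or the externally studentized (deleted) variant is not stated, so treat the choice as an assumption. The data are the time and distance measurements from the earlier table.

```python
import math

def studentized_residuals(xs, ys):
    """Internally studentized residuals for a straight-line least squares fit."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    s_xx = sum((x - x_mean) ** 2 for x in xs)
    s_xy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    slope = s_xy / s_xx
    intercept = y_mean - slope * x_mean
    residuals = [y - (intercept + slope * x) for x, y in zip(xs, ys)]
    # Residual standard deviation with n - 2 degrees of freedom.
    s = math.sqrt(sum(e ** 2 for e in residuals) / (n - 2))
    # Leverage of each observation in simple linear regression.
    leverages = [1 / n + (x - x_mean) ** 2 / s_xx for x in xs]
    return [e / (s * math.sqrt(1 - h)) for e, h in zip(residuals, leverages)]

times = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
measured = [36.754, 71.845, 60.479, 101.149, 103.150,
            111.148, 142.170, 157.334, 161.843, 206.030]

for t, r in zip(times, studentized_residuals(times, measured)):
    print(f"t = {t:2d}: studentized residual = {r:6.2f}")
```

Because the result is dimensionless, a common rule of thumb (values beyond roughly ±2 or ±3 deserve a closer look) can be applied regardless of the units of the response.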
16
Lack-of-fit test
- if multiple measurements are available at the same setting, we may test whether the model can be improved significantly
- the test is based on two different ways of computing the standard deviation
- note the difference with testing whether the model is significant
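The two ways of computing the standard deviation can be sketched as follows: one estimate comes from the spread of replicates around their own group mean (pure error), the other from the distance between the group means and the fitted line (lack of fit). The data below are hypothetical values invented for illustration, and the function name `lack_of_fit` is an assumption.

```python
from collections import defaultdict

def lack_of_fit(xs, ys):
    """Lack-of-fit F statistic for a straight-line fit with replicated x values."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    s_xx = sum((x - x_mean) ** 2 for x in xs)
    s_xy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    slope = s_xy / s_xx
    intercept = y_mean - slope * x_mean

    # Group the replicate measurements by their x value.
    groups = defaultdict(list)
    for x, y in zip(xs, ys):
        groups[x].append(y)
    m = len(groups)  # number of distinct x values

    ss_resid = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    # Pure error: spread of replicates around their group mean.
    ss_pe = sum((y - sum(g) / len(g)) ** 2 for g in groups.values() for y in g)
    # Lack of fit: distance of the group means from the fitted line.
    ss_lof = sum(len(g) * (sum(g) / len(g) - (intercept + slope * x)) ** 2
                 for x, g in groups.items())

    df_lof, df_pe = m - 2, n - m
    f_stat = (ss_lof / df_lof) / (ss_pe / df_pe)
    return {"ss_resid": ss_resid, "ss_pe": ss_pe, "ss_lof": ss_lof,
            "df_lof": df_lof, "df_pe": df_pe, "f": f_stat}

# Hypothetical data: two replicate measurements at each of four settings.
xs = [1, 1, 2, 2, 3, 3, 4, 4]
ys = [2.1, 1.9, 4.2, 3.8, 5.9, 6.1, 8.3, 7.9]
result = lack_of_fit(xs, ys)
print(result)
```

The F statistic is compared with the critical value of the F distribution with `df_lof` and `df_pe` degrees of freedom; a large value indicates that a straight line does not describe the group means adequately.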
17
Influential points
- regression lines tend to be pulled towards remote points
- see http://www.stat.sc.edu/~west/javahtml/Regression.html
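The pull of a remote point can be illustrated without the applet. The sketch below fits a line with and without one remote observation and reports its leverage; the six data points are hypothetical and the helper name `fit_line` is illustrative.

```python
def fit_line(xs, ys):
    """Least squares intercept and slope for a straight line."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    s_xx = sum((x - x_mean) ** 2 for x in xs)
    s_xy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    slope = s_xy / s_xx
    return y_mean - slope * x_mean, slope

# Hypothetical data: five points close to y = x and one remote point at x = 20.
xs = [1, 2, 3, 4, 5, 20]
ys = [1.0, 2.1, 2.9, 4.2, 5.0, 5.0]

intercept_with, slope_with = fit_line(xs, ys)
intercept_without, slope_without = fit_line(xs[:-1], ys[:-1])

# Residual of the remote point under each of the two fitted lines.
resid_with = ys[-1] - (intercept_with + slope_with * xs[-1])
resid_without = ys[-1] - (intercept_without + slope_without * xs[-1])

# Leverage h_ii = 1/n + (x_i - x_mean)^2 / S_xx for the full data set.
n = len(xs)
x_mean = sum(xs) / n
s_xx = sum((x - x_mean) ** 2 for x in xs)
leverages = [1 / n + (x - x_mean) ** 2 / s_xx for x in xs]

print(f"slope with remote point:    {slope_with:.3f}")
print(f"slope without remote point: {slope_without:.3f}")
print(f"residual of remote point:   {resid_with:.2f} (with), {resid_without:.2f} (without)")
print(f"leverage of remote point:   {leverages[-1]:.3f}")
```

The remote point has a small residual under the line it helped to determine, so it can go unnoticed in a residual plot; its leverage close to 1 is what exposes it.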
18
Check-list
1. apply regression analysis
2. check whether the regression is significant; if applicable, apply the lack-of-fit test
3. study residual plots for constant variance
4. check for outliers
5. check normality of residuals (graphical checks, Shapiro-Wilk)
6. check independence of residuals (residual plots, Durbin-Watson)
7. check for influential points
19
Causality and regression
- Significant regression results do not imply a causal relation!
- Statistical results must be explained (afterwards) by chemical theory.