Published by Solomon Lenard Hopkins. Modified over 9 years ago.
1
Stats for Engineers, Lecture 8
2
1. If the sample mean was larger
2. If you increased your confidence level
3. If you increased your sample size
4. If the population standard deviation was larger
3
Recap: Confidence intervals for the mean
- Normal data, variance known or large data sample: use normal tables
- Normal data, variance unknown: use t-distribution tables
4
Normal t-distribution
5
Linear regression
6
Sample means Equation of the fitted line is
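The equations on this slide did not survive extraction. As a hedged reconstruction, the standard least-squares results for a straight-line model are sketched below. The symbols S_xx, S_xy, \hat{a} and \hat{b} are assumed notation, not necessarily the lecturer's.

```latex
\begin{align*}
\bar{x} &= \frac{1}{n}\sum_{i=1}^{n} x_i, &
\bar{y} &= \frac{1}{n}\sum_{i=1}^{n} y_i \\
S_{xx} &= \sum_{i=1}^{n} (x_i - \bar{x})^2, &
S_{xy} &= \sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y}) \\
\hat{b} &= \frac{S_{xy}}{S_{xx}}, &
\hat{a} &= \bar{y} - \hat{b}\,\bar{x} \\
\hat{y} &= \hat{a} + \hat{b}\,x
\end{align*}
```

With these definitions the fitted line always passes through the point of sample means (\bar{x}, \bar{y}).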
7
Quantifying the goodness of the fit Residual sum of squares
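As a minimal sketch of the residual sum of squares, the code below fits a straight line by least squares and sums the squared vertical distances of the points from it. It uses the nine (x, y) pairs from the lecture's worked example. The function name and the use of RSS/(n - 2) as the variance estimate are assumptions based on the standard textbook treatment, since the slide's own formulas are missing.

```python
# Data from the lecture's worked example (y observed at various x).
x = [1.6, 9.4, 15.5, 20.0, 22.0, 35.5, 43.0, 40.5, 33.0]
y = [240, 181, 193, 155, 172, 110, 113, 75, 94]


def residual_sum_of_squares(x, y):
    """Fit y = a + b*x by least squares and return the sum of squared residuals."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b = s_xy / s_xx          # slope
    a = y_bar - b * x_bar    # intercept
    return sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))


rss = residual_sum_of_squares(x, y)
# Two parameters (a and b) were estimated, so n - 2 degrees of freedom remain.
sigma2_hat = rss / (len(x) - 2)
print(f"RSS = {rss:.1f}, estimated variance = {sigma2_hat:.1f}")
```

For this data set the residual sum of squares comes out at roughly 2788 and the estimated variance at roughly 398.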
8
Predictions Confidence interval for mean y at given x What is the error bar?
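One way to put an error bar on the fitted mean is sketched below, assuming the standard result that the half-width is t × σ̂ × sqrt(1/n + (x0 - x̄)²/S_xx) with n - 2 degrees of freedom. The function name and the choice x0 = 30 are illustrative assumptions. The critical value 2.365 is the tabulated two-sided 95% t value for 7 degrees of freedom, passed in by hand in the spirit of "use t-distribution tables". The data are the nine points from the lecture's worked example.

```python
import math

# Data from the lecture's worked example.
x = [1.6, 9.4, 15.5, 20.0, 22.0, 35.5, 43.0, 40.5, 33.0]
y = [240, 181, 193, 155, 172, 110, 113, 75, 94]


def mean_response_interval(x, y, x0, t_crit):
    """Confidence interval for the mean of y at x0 under a straight-line model.

    t_crit is the t-table critical value for n - 2 degrees of freedom.
    """
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b = s_xy / s_xx
    a = y_bar - b * x_bar
    rss = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))
    sigma_hat = math.sqrt(rss / (n - 2))
    y0 = a + b * x0
    half_width = t_crit * sigma_hat * math.sqrt(1 / n + (x0 - x_bar) ** 2 / s_xx)
    return y0 - half_width, y0 + half_width


low, high = mean_response_interval(x, y, x0=30.0, t_crit=2.365)
print(f"95% interval for mean y at x = 30: ({low:.1f}, {high:.1f})")
```

The (x0 - x̄)² term makes the interval narrowest at the mean of the observed x values and wider further away from it.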
9
Example: The data y has been observed for various values of x, as follows:
y   240   181   193   155   172   110   113   75    94
x   1.6   9.4   15.5  20.0  22.0  35.5  43.0  40.5  33.0
Fit the simple linear regression model using least squares.
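The fit asked for in this example can be sketched in a few lines of plain Python. The function name fit_line and the parameterisation y = a + b*x are assumptions. The estimators are the usual least-squares ones, b = S_xy / S_xx and a = ȳ - b x̄.

```python
# Example data from the slide (y observed at various x).
x = [1.6, 9.4, 15.5, 20.0, 22.0, 35.5, 43.0, 40.5, 33.0]
y = [240, 181, 193, 155, 172, 110, 113, 75, 94]


def fit_line(x, y):
    """Return least-squares estimates (a, b) for the model y = a + b*x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b = s_xy / s_xx          # slope
    a = y_bar - b * x_bar    # intercept
    return a, b


a, b = fit_line(x, y)
print(f"intercept a = {a:.1f}, slope b = {b:.3f}")
```

For this data the intercept is about 234.1 and the slope about -3.509, so y falls by roughly 3.5 units for each unit increase in x.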
10
Recall fit was y = 234.1 - 3.509x (least-squares fit to the example data)
11
Extrapolation: predictions outside the range of the original data
12
Looks OK!
13
Extrapolation: predictions outside the range of the original data
Quite wrong!
Extrapolation is often unreliable unless you are sure a straight line is a good model.
14
Confidence interval for a prediction
What about the distribution of future data points themselves?
Two effects:
- Variance of individual points about the mean
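A hedged sketch of the difference between the two intervals follows. It assumes the standard result that a future observation has variance σ²(1 + 1/n + (x0 - x̄)²/S_xx): the extra 1 is the scatter of individual points about the mean, and the remaining terms are the uncertainty in the fitted mean itself. The function name, the choice x0 = 30 and the tabulated 95% t value 2.365 (7 degrees of freedom) are illustrative assumptions. The data are the lecture's worked example.

```python
import math

# Data from the lecture's worked example.
x = [1.6, 9.4, 15.5, 20.0, 22.0, 35.5, 43.0, 40.5, 33.0]
y = [240, 181, 193, 155, 172, 110, 113, 75, 94]


def interval_half_widths(x, y, x0, t_crit):
    """Return (half-width for the mean of y, half-width for a new point) at x0."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b = s_xy / s_xx
    a = y_bar - b * x_bar
    rss = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))
    sigma_hat = math.sqrt(rss / (n - 2))
    line_term = 1 / n + (x0 - x_bar) ** 2 / s_xx   # uncertainty in the fitted mean
    mean_hw = t_crit * sigma_hat * math.sqrt(line_term)
    pred_hw = t_crit * sigma_hat * math.sqrt(1 + line_term)  # plus point scatter
    return mean_hw, pred_hw


mean_hw, pred_hw = interval_half_widths(x, y, x0=30.0, t_crit=2.365)
print(f"mean: +/- {mean_hw:.1f}, single future point: +/- {pred_hw:.1f}")
```

The interval for a single future point is always the wider of the two, and it does not shrink to zero however many data points are collected.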
16
Confidence interval for mean y at given x
- Extrapolation often unreliable, e.g. linear model may well not hold at below-freezing temperatures. Confidence interval unreliable at T=-20.
Answer
17
Correlation
Regression tries to model the linear relation between mean y and x.
Correlation measures the strength of the linear association between y and x.
Weak correlation / Strong correlation: same linear regression fit (with different confidence intervals)
18
If x and y are negatively correlated:
19
More convenient if the result is independent of units (dimensionless number).
Define the Pearson product-moment correlation coefficient r, with the following interpretation:
r = 1: there is a line with positive slope going through all the points;
r = -1: there is a line with negative slope going through all the points;
r = 0: there is no linear association between y and x.
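The defining formula is missing from the extracted slide. A minimal sketch, assuming the usual definition r = S_xy / sqrt(S_xx S_yy), is given below and applied to the nine (x, y) pairs from the lecture's regression example. The function name pearson_r is an assumption.

```python
import math

# Data from the lecture's regression example.
x = [1.6, 9.4, 15.5, 20.0, 22.0, 35.5, 43.0, 40.5, 33.0]
y = [240, 181, 193, 155, 172, 110, 113, 75, 94]


def pearson_r(x, y):
    """Pearson product-moment correlation coefficient of two equal-length lists."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    s_yy = sum((yi - y_bar) ** 2 for yi in y)
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    return s_xy / math.sqrt(s_xx * s_yy)


r = pearson_r(x, y)
print(f"r = {r:.3f}")
```

For this data r is about -0.94, a strong negative linear association, consistent with the negative slope of the fitted line.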
20
Notes: - magnitude of r measures how noisy the data is, but not the slope
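The point that r says nothing about the slope can be checked numerically. The sketch below rescales x by a factor of 1000 (as if its units had been changed, a hypothetical choice) using the lecture's regression data: the fitted slope changes by that factor while r stays the same. The helper name slope_and_r is an assumption.

```python
import math

# Data from the lecture's regression example.
x = [1.6, 9.4, 15.5, 20.0, 22.0, 35.5, 43.0, 40.5, 33.0]
y = [240, 181, 193, 155, 172, 110, 113, 75, 94]


def slope_and_r(x, y):
    """Return (least-squares slope, Pearson correlation coefficient)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    s_yy = sum((yi - y_bar) ** 2 for yi in y)
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    return s_xy / s_xx, s_xy / math.sqrt(s_xx * s_yy)


b, r = slope_and_r(x, y)

# Express x in units 1000 times smaller, so every x value is 1000 times larger.
x_rescaled = [1000.0 * xi for xi in x]
b_rescaled, r_rescaled = slope_and_r(x_rescaled, y)

print(f"original: slope = {b:.4f}, r = {r:.4f}")
print(f"rescaled: slope = {b_rescaled:.7f}, r = {r_rescaled:.4f}")
```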
21
Correlation
A researcher found that r = +0.92 between the high temperature of the day and the number of ice cream cones sold in Brighton. What does this information tell us?
1. Higher temperatures cause people to buy more ice cream.
2. Buying ice cream causes the temperature to go up.
3. Some extraneous variable causes both high temperatures and high ice cream sales.
4. Temperature and ice cream sales have a strong positive linear relationship.
Question from Murphy et al.
23
Correlation
Error on the estimated correlation coefficient?
- not easy; possibilities include subdividing the points and assessing the spread in r values.
J Polit Econ. 2008; 116(3): 499–532. http://www.journals.uchicago.edu/doi/abs/10.1086/589524
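The subdividing idea can be sketched as follows: shuffle the points, split them into equal groups, compute r in each group and look at the spread. Everything here is an illustrative assumption, including the synthetic data, the number of groups, the seeds and the function names. None of it comes from the lecture or the cited paper.

```python
import math
import random
import statistics


def pearson_r(x, y):
    """Pearson product-moment correlation coefficient."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    s_yy = sum((yi - y_bar) ** 2 for yi in y)
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    return s_xy / math.sqrt(s_xx * s_yy)


def r_by_subdivision(x, y, n_groups, seed=0):
    """Randomly split the points into n_groups subsets and return r for each."""
    idx = list(range(len(x)))
    random.Random(seed).shuffle(idx)
    groups = [idx[g::n_groups] for g in range(n_groups)]
    return [pearson_r([x[i] for i in g], [y[i] for i in g]) for g in groups]


# Hypothetical synthetic data: a linear trend plus Gaussian noise.
rng = random.Random(1)
x = [rng.uniform(0.0, 10.0) for _ in range(200)]
y = [2.0 * xi + rng.gauss(0.0, 5.0) for xi in x]

rs = r_by_subdivision(x, y, n_groups=10)
print(f"r (all points)       = {pearson_r(x, y):.3f}")
print(f"mean of subset r     = {statistics.mean(rs):.3f}")
print(f"spread of subset r   = {statistics.stdev(rs):.3f}")
```

The spread here describes r estimated from subsets of 20 points, so it overstates the uncertainty of r from all 200 points. Dividing the spread by the square root of the number of groups gives a rough standard error for the mean of the subset values.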
24
Strong evidence for a 2-3% correlation.
- this doesn’t mean being tall causes you to earn more (though it could)