Published by Homer Davis. Modified over 8 years ago.
1
CIVE2602 - Engineering Mathematics 2.2 (20 credits)
Statistics and Probability
Lecture 6: Confidence intervals
- Confidence intervals for the sample mean
- Confidence intervals (using t-distribution)
- Examples
©Claudio Nunez 2010, sourced from http://commons.wikimedia.org/wiki/File:2010_Chile_earthquake_-_Building_destroyed_in_Concepci%C3%B3n.jpg?uselang=en-gb Available under Creative Commons license
2
Sampling (from last lecture)
Population: μ, σ², N. Samples 1, 2, 3 and 4 are drawn from it.
Each sample will have a sample mean and a sample variance.
3
Sampling Distributions
The frequency distribution of a given sample statistic (e.g. the mean, standard deviation, range, etc.) which would result if a large number of random samples of the same size (n) were drawn from the same population is known as the sampling distribution.
e.g. the sampling distribution of the mean.
Example: heights of students in the lecture theatre, sample of size n = 10.
4
Sampling Distributions: why important?
They are needed when working out whether the results of a trial/experiment are Statistically Significant:
- How reliable/good is a component I’m using on a Civil Eng project (concrete, steel truss, beam, new design/material, etc.)?
- Is one component/material/design “Significantly” different from another type?
More general areas: any sort of medical or scientific experimental work, election polls, all experimental papers published by a university or company, etc.
- anywhere you might want to make predictions about general cases (populations) from samples.
5
Sampling Distributions
Take the sample mean $\bar{X}$. For a sample of size n it can be shown that:
- The expected value of the sample mean = population mean: $E(\bar{X}) = \mu$
- The standard deviation of the sampling distribution of the mean is called the standard error of the mean: $\mathrm{SD}(\bar{X}) = \sigma/\sqrt{n}$
- The spread of the sampling distribution of the mean decreases as the sample size increases (figure: sampling distributions for n = 2, n = 4 and n = 8).
Very important to note the difference between the distribution of X (the data) and the sampling distribution of the mean.
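The shrinking spread can be checked with a small simulation. This is only a sketch: the population values (mean 170, standard deviation 10) and the helper names are made up for illustration and are not from the lecture.

```python
import math
import random
import statistics

def standard_error(sigma, n):
    """Standard error of the mean: sigma / sqrt(n)."""
    return sigma / math.sqrt(n)

def simulate_sample_means(mu, sigma, n, trials, seed=0):
    """Draw `trials` samples of size n from Normal(mu, sigma) and return their means."""
    rng = random.Random(seed)
    return [statistics.fmean([rng.gauss(mu, sigma) for _ in range(n)])
            for _ in range(trials)]

# Hypothetical population (an assumption, not lecture data): mu = 170, sigma = 10.
for n in (2, 4, 8):
    means = simulate_sample_means(170.0, 10.0, n, trials=5000)
    simulated = statistics.pstdev(means)   # spread of the simulated sample means
    theory = standard_error(10.0, n)       # sigma / sqrt(n)
    print(f"n={n}: simulated SD of means = {simulated:.2f}, sigma/sqrt(n) = {theory:.2f}")
```

The simulated spread should sit close to σ/√n for each n, and both fall as n grows.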
6
Sampling Distributions
7
Confidence Intervals
A confidence interval allows us to be 90%, 95% or 99% sure that a population mean lies between certain values, so we can take information about a sample and make inferences about the population.
e.g. for a sample of students’ heights we want, with 95% certainty, to find a range that the population mean falls within. This range is the CONFIDENCE INTERVAL.
Confidence Limits are the values at either end of this range.
9
Confidence Intervals
A 95% Confidence Interval (C.I.) for a parameter is an interval which is assessed to contain the parameter on 95% of occasions if repeated samples are taken. We usually wish to choose the smallest such interval (with a symmetric distribution this will usually mean that we take a given distance either side of a given estimate).
- We want 95% to fall between the confidence limits.
- Significance level α (the amount of error we are willing to accept, e.g. 0.05 or 5%): 5% to fall outside the confidence interval.
- A(z) = α/2 = 0.05/2 = 0.025 in each tail: find Z using reverse Normal tables. The value found for 95% is 1.96.
Confidence Interval given by: $\bar{X} \pm 1.96\,\mathrm{SD}(\bar{X})$
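The "95% of occasions if repeated samples are taken" statement can be illustrated by simulation. A minimal sketch, assuming a Normal population with known σ; the population values and the function name `coverage` are hypothetical.

```python
import math
import random
import statistics

def coverage(mu, sigma, n, trials, z=1.96, seed=1):
    """Fraction of simulated intervals x_bar +/- z*sigma/sqrt(n) that contain mu."""
    rng = random.Random(seed)
    half_width = z * sigma / math.sqrt(n)
    hits = 0
    for _ in range(trials):
        x_bar = statistics.fmean([rng.gauss(mu, sigma) for _ in range(n)])
        if x_bar - half_width <= mu <= x_bar + half_width:
            hits += 1
    return hits / trials

# Hypothetical population (an assumption): mu = 170, sigma = 10, samples of size 10.
print(coverage(mu=170.0, sigma=10.0, n=10, trials=2000))
```

The printed fraction should be close to 0.95: each individual interval either contains μ or it does not, but about 95% of them do.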
11
Confidence Intervals
In the particular case of the sample mean we can demonstrate that 95% of the time the mean of a sample of size n will be within $1.96\,\sigma/\sqrt{n}$ of the population mean.
Hence we define the following conventional confidence intervals for the “population” mean (critical values found from Normal tables, in reverse):
50% C.I. for μ: $\bar{X} \pm 0.674\,\mathrm{SD}(\bar{X})$
90% C.I. for μ: $\bar{X} \pm 1.645\,\mathrm{SD}(\bar{X})$
95% C.I. for μ: $\bar{X} \pm 1.96\,\mathrm{SD}(\bar{X})$
where $\mathrm{SD}(\bar{X}) = \sigma/\sqrt{n}$.
TRUE WHEN WE KNOW σ AND FOR LARGE SAMPLES
NOTE: This assumes we know the POPULATION standard deviation σ. If we don’t, we have to use s (the sample standard deviation) as an estimate of the POPULATION standard deviation and use the t-distribution for small samples.
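The reverse table look-up can be reproduced in code. A short sketch using Python's standard-library `statistics.NormalDist`; the helper name `z_critical` is made up for illustration.

```python
from statistics import NormalDist

def z_critical(confidence):
    """Two-sided critical value: the z with area alpha/2 in the upper tail."""
    alpha = 1.0 - confidence
    return NormalDist().inv_cdf(1.0 - alpha / 2.0)

for level in (0.50, 0.90, 0.95, 0.99):
    print(f"{level:.0%} C.I.: z = {z_critical(level):.3f}")
```

This is the same "reverse" look-up as the tables: give the tail area, get the z value back.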
12
Confidence Intervals
4 cases to consider:
1) A sample (large or small) taken from a normally distributed population, with a known variance (σ² is known)
2) A large sample taken from a population, with a known variance (σ² known)
3) A large sample taken from a population, with an unknown variance (σ² NOT known)
4) A small sample taken from a population, with an unknown variance
Typically “large” means n > 20-30.
13
Confidence Intervals: Example
Assume car speeds are Normally distributed with a standard deviation of 6 mph. If from a sample of 1000 cars we obtain a mean speed of 30.5 mph, what are the 95% confidence limits for μ?
For n = 1000 the standard deviation of $\bar{X}$ is $\mathrm{SD}(\bar{X}) = \sigma/\sqrt{n} = 6/\sqrt{1000} \approx 0.19$.
95% C.I. for μ: $\bar{X} \pm 1.96\,\mathrm{SD}(\bar{X})$
We would thus anticipate that μ lies in the range 30.5 ± 1.96 × 0.19, i.e. 30.5 ± 0.372 or (30.128, 30.872) mph.
We have estimated the population mean μ with a 95% CI, i.e. we are 95% confident that the population mean lies in the range calculated.
i) If the sample was of 100 cars what are the 95% confidence limits for µ?
ii) If the sample was of 100 cars what are the 90% confidence limits for µ?
Reminder: 50% C.I. for μ: $\bar{X} \pm 0.674\,\mathrm{SD}(\bar{X})$; 90% C.I. for μ: $\bar{X} \pm 1.645\,\mathrm{SD}(\bar{X})$; 95% C.I. for μ: $\bar{X} \pm 1.96\,\mathrm{SD}(\bar{X})$
©Ian Fuller 2009, sourced from http://www.flickr.com/photos/ianfuller/3234155056/ Available under Creative Commons license
14
Example - solution
For n = 100, $\mathrm{SD}(\bar{X}) = \sigma/\sqrt{n} = 6/\sqrt{100} = 0.6$.
i) If the sample was of 100 cars what are the 95% confidence limits for µ?
We would thus anticipate that μ lies in the range 30.5 ± 1.96 × 0.6, i.e. 30.5 ± 1.176 or (29.324, 31.676) mph. Larger range with smaller n: for n = 1000 it was (30.128, 30.872) mph.
ii) If the sample was of 100 cars what are the 90% confidence limits for µ?
We would thus anticipate that μ lies in the range 30.5 ± 1.645 × 0.6, i.e. 30.5 ± 0.987 or (29.51, 31.49) mph. Range becomes smaller for 90% CI.
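The car speed calculations can be checked with a few lines of code. A sketch assuming a known σ and the Normal critical values; `z_interval` is a hypothetical helper name, not part of the lecture.

```python
import math
from statistics import NormalDist

def z_interval(x_bar, sigma, n, confidence):
    """Confidence interval x_bar +/- z * sigma / sqrt(n) for a known sigma."""
    z = NormalDist().inv_cdf(1.0 - (1.0 - confidence) / 2.0)
    half_width = z * sigma / math.sqrt(n)
    return x_bar - half_width, x_bar + half_width

# Car speeds: mean 30.5 mph, sigma = 6 mph.
for n, level in ((1000, 0.95), (100, 0.95), (100, 0.90)):
    low, high = z_interval(30.5, 6.0, n, level)
    print(f"n={n}, {level:.0%} C.I.: ({low:.3f}, {high:.3f}) mph")
```

Reducing n widens the interval, and dropping from 95% to 90% confidence narrows it.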
15
99% C.I. for μ: $\bar{X} \pm 2.576\,\mathrm{SD}(\bar{X})$
95% C.I. for μ: $\bar{X} \pm 1.96\,\mathrm{SD}(\bar{X})$
17
Multiple choice (SAMPLES). Choose A, B, C or D for each of these:
1) In Statistics what does µ stand for?
A: It’s the variance of a sample
B: It’s the mean of a sample
C: It’s the mean of a population
D: It’s the standard deviation of a sample
18
Multiple choice (SAMPLES). Choose A, B, C or D for each of these:
2) In Statistics what is the name for σ²?
A: It’s the variance of a population
B: It’s the standard deviation squared of a sample
C: It’s the standard deviation of a population
D: It’s the mean of a sample
19
Multiple choice (SAMPLES). Choose A, B, C or D for each of these:
3) In Statistics what does n stand for?
A: It’s the number of samples taken
B: It’s the size of the sample
C: It’s the nth member of a sample
D: It’s the 1st element in a sample
20
Multiple choice (SAMPLES). Choose A, B, C or D for each of these:
4) In Statistics what does s stand for?
A: It’s the variance of a sample
B: It’s the variance of a population
C: It’s the mean of a population
D: It’s the standard deviation of a sample
21
Multiple choice (SAMPLING DISTRIBUTIONS). Choose A, B, C or D for each of these:
5) Which of these is the standard deviation of a sample mean distribution (also called the standard error)?
A B C D
22
Multiple choice (SAMPLING DISTRIBUTIONS). Choose A, B, C or D for each of these:
6) When would a t-distribution be more suitable to represent a Sampling Distribution (as opposed to a NORMAL distribution)?
A: For a large sample with known population standard deviation, σ
B: For any large sample
C: For a small sample with a known population variance
D: For a small sample with an unknown population standard deviation, σ
23
Use of the t-Distribution
Often we don’t know the POPULATION standard deviation σ (or variance σ²). We can use the sample standard deviation s as an estimate, but there are consequences: $(\bar{X} - \mu)/(s/\sqrt{n})$ is no longer closely approximated by the standard Normal distribution (Z).
Instead we use t-tables (extra parameter, v, the degrees of freedom).
For small samples the t-distribution is distinctly flatter than the Normal distribution (for large samples, n > 30, it approximates to the Normal distribution).
When to use t and when to use z:
Sample size small, σ² unknown: use t
Sample size small, σ² known: use z
Sample size large, σ² unknown: use z
Sample size large, σ² known: use z
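The t-or-z decision table can be written as a small function. This is only a sketch of the rule of thumb on the slide: the function name and the default threshold of 30 are assumptions for illustration, and the small-sample z case still requires a Normally distributed population.

```python
def choose_distribution(n, sigma_known, large_threshold=30):
    """Return "z" or "t" following the lecture's rule of thumb.

    n               -- sample size
    sigma_known     -- True if the population variance is known
    large_threshold -- assumed cut-off for a "large" sample (n > 30 here)
    """
    if sigma_known or n > large_threshold:
        return "z"   # known sigma, or a large sample where s estimates sigma well
    return "t"       # small sample with sigma estimated from the sample

for n, known in ((8, False), (8, True), (100, False), (100, True)):
    label = "known" if known else "unknown"
    print(f"n={n}, sigma {label}: use {choose_distribution(n, known)}")
```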
24
The t-distribution uses another parameter, v, the number of degrees of freedom (typically this is equal to n-1).
Figure: t-distribution curves for v = 1, v = 2, v = 5, v = 10 and v = infinite.
The t-distribution approaches the NORMAL distribution when n > 30 (ish); that’s why we can use the Normal distribution even when we don’t know σ but have a large sample.
Degrees of freedom v = n-1 (it changes in more complex examples).
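To see the "flatter" shape numerically, the peak height of the t density can be compared with the Normal one. A minimal sketch using the standard formula for the t density; the function names are chosen here for illustration.

```python
import math

def t_pdf(x, v):
    """Density of the t-distribution with v degrees of freedom."""
    log_coeff = (math.lgamma((v + 1) / 2) - math.lgamma(v / 2)
                 - 0.5 * math.log(v * math.pi))
    return math.exp(log_coeff - (v + 1) / 2 * math.log1p(x * x / v))

def normal_pdf(x):
    """Density of the standard Normal distribution."""
    return math.exp(-x * x / 2) / math.sqrt(2 * math.pi)

for v in (1, 2, 5, 10, 30):
    print(f"v={v}: peak {t_pdf(0.0, v):.4f}, density at 3: {t_pdf(3.0, v):.4f}")
print(f"Normal: peak {normal_pdf(0.0):.4f}, density at 3: {normal_pdf(3.0):.4f}")
```

For small v the peak is lower and the tails are heavier than the Normal curve; as v grows both columns move towards the Normal values.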
25
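Using t-tables
Let’s say we want 95% Confidence Limits and n = 20.
Degrees of freedom v = 20 - 1 = 19.
A 95% CI means 0.025 in either tail, which gives t = 2.093 from the tables (compare with 1.96 for the Normal distribution).
Confidence Interval = $\bar{X} \pm t\,\mathrm{SD}(\bar{X})$, with $\mathrm{SD}(\bar{X})$ estimated by $s/\sqrt{n}$.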
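If no t-tables are to hand, the critical value can be found numerically. This is a sketch only: it integrates the t density with Simpson's rule and bisects for the tail area using just the standard library. It is not how the printed tables were produced, and the function names are assumptions.

```python
import math

def t_pdf(x, v):
    """Density of the t-distribution with v degrees of freedom."""
    log_coeff = (math.lgamma((v + 1) / 2) - math.lgamma(v / 2)
                 - 0.5 * math.log(v * math.pi))
    return math.exp(log_coeff - (v + 1) / 2 * math.log1p(x * x / v))

def t_upper_tail(x, v, steps=2000):
    """P(T > x) for x >= 0, using Simpson's rule on [0, x] (steps must be even)."""
    h = x / steps
    total = t_pdf(0.0, v) + t_pdf(x, v)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, v)
    return 0.5 - total * h / 3

def t_critical(tail_area, v):
    """Find t such that P(T > t) = tail_area, by bisection on [0, 100]."""
    lo, hi = 0.0, 100.0
    for _ in range(50):
        mid = (lo + hi) / 2
        if t_upper_tail(mid, v) > tail_area:
            lo = mid   # tail still too big: move right
        else:
            hi = mid   # tail too small: move left
    return (lo + hi) / 2

print(round(t_critical(0.025, 19), 3))  # n = 20, 95% C.I.
print(round(t_critical(0.025, 7), 4))   # n = 8, 95% C.I.
```

The results should agree with the t-table entries for v = 19 and v = 7 to the precision printed.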
26
Worked Example: confidence interval using t-distribution
Q: A survey is made of the output from a factory on eight randomly selected days in November, with the results as shown below. What is the 95% confidence interval for the true mean output in November, assuming output is approximately Normally distributed?
Answer: What we are being asked here is to make an estimate of the population mean, knowing only the sample mean and variance of a sample of size 8, i.e. a small sample.
- We estimate the true mean output (population mean) by the mean of the sample
- and we estimate the variance of the output population from the variance of the sample.
©Paul Hows 2008, sourced from http://www.flickr.com/photos/howzey/5233868613/ Available under Creative Commons license
29
What is the 95% confidence interval for the true mean output in November, assuming output is approximately Normally distributed?
- We want a 95% confidence interval, so we want 2½% in each tail and we look up A(t) = 0.025.
- Calculate the variance of the sample mean as follows: $\mathrm{Var}(\bar{X}) = s^2/n$, where s² = 483.4 is the sample variance and n = 8.
- Since the sample size is n = 8, the number of degrees of freedom, v, is 7 (i.e. n-1). The tables give the critical t-value as 2.3646.
Hence the 95% confidence interval for μ is $\bar{X} \pm 2.3646\,\sqrt{s^2/n}$, rounded to give an integer range.
If we had known that the underlying population variance was σ² = 483.4 (instead of just our sample estimate) we could have used the Normal distribution and critical values from Z tables. Work out the 95% CI for µ if this were the case.
30
N.B. If we had known that the underlying population variance was σ² = 483.4 (instead of just our sample estimate) we could have used the Normal distribution and critical values from Z tables, i.e. $\bar{X} \pm 1.96\,\sqrt{\sigma^2/n}$.
The wider confidence interval found using the t-distribution (the result when σ is not known) reflects our greater degree of uncertainty in having estimated σ² from the small sample.
When to use t and when to use z:
Sample size small, σ² unknown: use t
Sample size small, σ² known: use z
Sample size large, σ² unknown: use z
Sample size large, σ² known: use z
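The difference in width between the t and z intervals can be put in numbers using only the values quoted in the example (variance 483.4, n = 8, critical values 2.3646 and 1.96). A small sketch; `half_width` is a hypothetical helper, and only the half-widths are computed.

```python
import math

def half_width(critical_value, variance, n):
    """Half-width of a confidence interval: critical value times sqrt(variance / n)."""
    return critical_value * math.sqrt(variance / n)

variance = 483.4   # sample variance s^2 (also used as sigma^2 in the N.B.)
n = 8

t_half = half_width(2.3646, variance, n)   # sigma unknown: t with v = 7
z_half = half_width(1.96, variance, n)     # sigma known: Normal

print(f"t interval: sample mean +/- {t_half:.1f}")
print(f"z interval: sample mean +/- {z_half:.1f}")
```

The t interval is roughly a fifth wider than the z interval here, which is the extra uncertainty from estimating σ² with only eight observations.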
31
https://vlebb.leeds.ac.uk/webapps/portal/frameset.jsp
VLE: demonstration of the sampling distribution and the central limit theorem
32
To test the strength of each batch of concrete, a sample of 9 small blocks is produced in moulds and left to set. After a week the site engineer tests the strength of each block by measuring the force per unit area (N/mm²) required to crush each one.
Sample strengths: 3 5 4 6 4 4 3 3 4
Question
i) Assuming the population is Normally distributed and the population variance = 4, find the sample mean, $\bar{X}$, and $\mathrm{SD}(\bar{X})$.
ii) A batch of concrete is passed for site if, from analysing the strength of the sample, the mean cube strength can be shown to be significantly above the national minimum standard of 2 N/mm². Test for this by finding the probability that we could obtain our sample mean IF the population mean, µ, was 2.
33
Sample strengths: 3 5 4 6 4 4 3 3 4; σ² = 4, n = 9
ii) A batch of concrete is passed for site if, from analysing the strength of the sample, the mean cube strength can be shown to be significantly above the national minimum standard of 2 N/mm². Test for this by finding the probability that we could obtain our sample mean IF the population mean, µ, was 2.
From part i): $\bar{X} = 36/9 = 4$ and $\mathrm{SD}(\bar{X}) = \sigma/\sqrt{n} = 2/3 \approx 0.667$.
We want $P(\bar{X} > 4)$ {if the population mean strength was 2}.
$Z = (4 - 2)/(2/3) = 3$, so we want P(Z > 3) = 0.00135.
So it’s very unlikely (about 0.135%) that a sample mean of 4 would have come from a population with a mean µ of 2. We can conclude that the underlying population mean must be higher than 2.
34
Sample strengths: 3 5 4 6 4 4 3 3 4; σ² = 4, n = 9
iii) A site which requires a higher performance concrete only passes a batch if, from analysing the strength of the sample, the mean cube strength can be shown to be significantly above a standard of 2.5 N/mm². Would the concrete still pass? Test for this by finding the probability that we could obtain our sample mean IF the population mean, µ, was 2.5.
We want $P(\bar{X} > 4)$ {if the population mean strength was 2.5}.
$Z = (4 - 2.5)/(2/3) = 2.25$, so we want P(Z > 2.25) = 0.0122.
So it’s very unlikely (1.2%) that a sample mean of 4 would have come from a population with a mean µ of 2.5 (i.e. 98.8% of samples from such a population would give a smaller mean). We can conclude that the underlying population mean must be higher than 2.5.
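Both concrete tests follow the same steps, so they can be sketched as one function. This assumes a Normal population with known variance, as in the question; the function name `upper_tail_p` is made up for illustration.

```python
import math
from statistics import NormalDist, fmean

strengths = [3, 5, 4, 6, 4, 4, 3, 3, 4]   # sample cube strengths, N/mm^2

def upper_tail_p(sample, sigma_squared, mu0):
    """Return (z, P(Z > z)) for the sample mean if the population mean were mu0."""
    n = len(sample)
    x_bar = fmean(sample)                   # sample mean
    sd_mean = math.sqrt(sigma_squared / n)  # SD of the sample mean, sigma / sqrt(n)
    z = (x_bar - mu0) / sd_mean
    return z, 1.0 - NormalDist().cdf(z)

for mu0 in (2.0, 2.5):
    z, p = upper_tail_p(strengths, 4.0, mu0)
    print(f"mu = {mu0}: Z = {z:.2f}, P(Z > z) = {p:.4f}")
```

A small tail probability means a sample mean this high would rarely arise if the population mean were only at the standard being tested.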
35
CIVE2602 - Engineering Mathematics 2.2 Lecture 6- Summary Distribution of Sample Mean Confidence Intervals (e.g. 99%, 95%, 90% CI) Using t-distribution NEXT hypothesis testing – very useful