1
Point and Interval Estimates
Example: As part of the budgeting process for next year, the manager of the Far Point electric generating plant must estimate the coal he will need for this year. Last year the plant almost ran out, so he is reluctant to budget for the same amount again. The plant manager does feel, however, that past usage data will help him estimate the amount of coal to order. A random sample of 10 plant operating weeks chosen over the last 5 years yielded a mean usage of 11,400 tons a week and a sample standard deviation of 700 tons a week. With these data the plant manager can make a sensible estimate of the amount to order this year, including some idea of the accuracy of that estimate.
2
Point and Interval Estimates
Inferences from a Sample
Estimation and Confidence Interval
Statistical Significance
t-statistics
Sample Size
Finite Population Multiplier
3
Point Estimate: A point estimate of the parameter $\theta$ is a single number that can be regarded as a sensible value for $\theta$. A point estimate is obtained by selecting a suitable statistic and computing its value from the given sample data. The selected statistic is called the point estimator of $\theta$.
Properties of a Good Estimator: unbiasedness and minimum variance.
Unbiasedness: An estimator is said to be unbiased if its expected value is equal to the population parameter being estimated. If $\theta$ is the parameter being estimated and $\hat{\theta}$ is an unbiased estimator of $\theta$, then $E(\hat{\theta}) = \theta$.
4
Example: The sample mean $\bar{X}$ is an unbiased estimator of the population mean $\mu$, since $E(\bar{X}) = \mu$.
Let $X_1, X_2, \ldots, X_n$ be a random sample from a distribution with mean $\mu$ and variance $\sigma^2$. Then the estimator $S^2 = \frac{1}{n-1}\sum_{i=1}^{n}(X_i - \bar{X})^2$ is an unbiased estimator of $\sigma^2$.
Estimators with Minimum Variance: Suppose $\hat{\theta}_1$ and $\hat{\theta}_2$ are two estimators of $\theta$ that are both unbiased. Then, although the distribution of each estimator is centered at the true value of $\theta$, the spreads of the distributions about the true value may be different.
5
Principle of Minimum Variance Unbiased Estimation: Among all estimators of $\theta$ that are unbiased, choose the one that has minimum variance. The resulting $\hat{\theta}$ is called the minimum variance unbiased estimator (MVUE) of $\theta$.
The MVUE is, in a certain sense, the most likely among all unbiased estimators to produce an estimate close to the true $\theta$.
Theorem: Let $X_1, X_2, \ldots, X_n$ be a random sample from a normal distribution with parameters $\mu$ and $\sigma$. Then the estimator $\hat{\mu} = \bar{X}$ is the MVUE for $\mu$.
For a symmetrical distribution, both the sample mean and the sample median are unbiased estimators of the population mean. Under the criterion of minimum variance, however, it can be shown that for a normal population the sample mean is the better estimator of the population mean.
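The comparison between the sample mean and the sample median can be checked with a small simulation. The sketch below assumes a standard normal population. The sample size, number of trials and seed are illustrative choices, not values from the slides.

```python
import random
import statistics


def estimator_variances(n=25, trials=4000, mu=0.0, sigma=1.0, seed=1):
    """Simulate the sampling variance of the sample mean and sample median.

    All defaults are hypothetical values chosen only for illustration.
    """
    rng = random.Random(seed)
    means, medians = [], []
    for _ in range(trials):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        means.append(statistics.fmean(sample))
        medians.append(statistics.median(sample))
    # Spread of each estimator around its own average over all trials
    return statistics.pvariance(means), statistics.pvariance(medians)


var_mean, var_median = estimator_variances()
print(f"variance of sample mean:   {var_mean:.4f}")
print(f"variance of sample median: {var_median:.4f}")
```

For normal data the first figure should sit near $\sigma^2/n$ and the second should be noticeably larger, which is the sense in which the mean is the better estimator here.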
6
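Maximum Likelihood Estimation
Let $X_1, X_2, \ldots, X_n$ have joint pmf or pdf
$f(x_1, x_2, \ldots, x_n; \theta_1, \ldots, \theta_m)$ (1)
where the parameters $\theta_1, \ldots, \theta_m$ have unknown values. When $x_1, x_2, \ldots, x_n$ are the observed sample values and (1) is regarded as a function of $\theta_1, \ldots, \theta_m$, it is called the likelihood function. The maximum likelihood estimates (mle's) $\hat{\theta}_1, \ldots, \hat{\theta}_m$ are those values of the $\theta_i$ that maximize the likelihood function, so that
$f(x_1, \ldots, x_n; \hat{\theta}_1, \ldots, \hat{\theta}_m) \ge f(x_1, \ldots, x_n; \theta_1, \ldots, \theta_m)$ for all $\theta_1, \ldots, \theta_m$.
When the $X_i$'s are substituted in place of the $x_i$'s, the maximum likelihood estimators result.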
7
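Example: Suppose $X_1, X_2, \ldots, X_n$ is a random sample from an exponential distribution with parameter $\lambda$. Because of independence, the likelihood function is a product of the individual pdf's:
$f(x_1, \ldots, x_n; \lambda) = (\lambda e^{-\lambda x_1}) \cdots (\lambda e^{-\lambda x_n}) = \lambda^n e^{-\lambda \sum x_i}$
The ln(likelihood) is
$\ln[f(x_1, \ldots, x_n; \lambda)] = n \ln(\lambda) - \lambda \sum x_i$
Equating $\frac{d}{d\lambda}[\ln(\text{likelihood})]$ to zero results in $\frac{n}{\lambda} - \sum x_i = 0$, or $\lambda = n / \sum x_i = 1/\bar{x}$. Thus the mle is $\hat{\lambda} = 1/\bar{X}$. It is not an unbiased estimator, since $E(1/\bar{X}) \ne 1/E(\bar{X})$.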
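A short numerical check of the exponential example, offered as a sketch: the data are simulated with a hypothetical true rate of 2.0, and the function names are illustrative. The closed-form estimate $n / \sum x_i$ should give a log-likelihood at least as large as nearby values of $\lambda$.

```python
import math
import random


def exponential_log_likelihood(lam, xs):
    """ln(likelihood) = n ln(lambda) - lambda * sum(x_i)."""
    return len(xs) * math.log(lam) - lam * sum(xs)


def exponential_mle(xs):
    """Closed-form maximum likelihood estimate: n / sum(x_i) = 1 / x-bar."""
    return len(xs) / sum(xs)


rng = random.Random(0)
true_rate = 2.0  # hypothetical value used only to generate example data
sample = [rng.expovariate(true_rate) for _ in range(500)]

lam_hat = exponential_mle(sample)
print(f"mle of lambda: {lam_hat:.3f}")

# The log-likelihood at the mle should not be smaller than at nearby values
for factor in (0.9, 1.0, 1.1):
    lam = lam_hat * factor
    print(f"lambda = {lam:.3f}, ln L = {exponential_log_likelihood(lam, sample):.3f}")
```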
8
Interval Estimation
Let $\theta$ be a population parameter and $\alpha$ ($0 < \alpha < 1$) a given number. If there exist two statistics $A(X_1, X_2, \ldots, X_n)$ and $B(X_1, X_2, \ldots, X_n)$, with observed values $a(x_1, x_2, \ldots, x_n)$ and $b(x_1, x_2, \ldots, x_n)$, such that
$P(A < \theta < B) = 1 - \alpha$
then the interval $(a, b)$ is called the interval estimate or the confidence interval of the parameter $\theta$ with confidence coefficient $1 - \alpha$, where $a$ and $b$ are called the lower and upper confidence limits for $\theta$.
Calculating the interval estimate of the mean from large samples when $\sigma$ is known
9
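We can conveniently choose the statistic
$z = \frac{\bar{x} - \mu}{\sigma / \sqrt{n}}$
The statistic $z$ depends on $\mu$, the parameter to be estimated, while its sampling distribution is $N(0,1)$. Take two points $\pm z_{\alpha/2}$ symmetrically about the origin such that
$P(-z_{\alpha/2} < z < z_{\alpha/2}) = 1 - \alpha$
or
$P\left(-z_{\alpha/2} < \frac{\bar{x} - \mu}{\sigma / \sqrt{n}} < z_{\alpha/2}\right) = 1 - \alpha$
or
$P\left(\bar{x} - z_{\alpha/2}\frac{\sigma}{\sqrt{n}} < \mu < \bar{x} + z_{\alpha/2}\frac{\sigma}{\sqrt{n}}\right) = 1 - \alpha$ ..... (1)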
10
The above formula is used when the sample size is large, in which case the population distribution may be of any shape. When the population distribution is normal, the sample size may be smaller.
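The large-sample interval $\bar{x} \pm z_{\alpha/2}\,\sigma/\sqrt{n}$ can be sketched in a few lines. The function name and the input numbers (mean 50, $\sigma$ = 10, n = 100) are hypothetical and serve only as an illustration. `statistics.NormalDist` from the Python standard library supplies the z value.

```python
import math
from statistics import NormalDist


def z_confidence_interval(sample_mean, sigma, n, confidence=0.95):
    """Interval x-bar +/- z_{alpha/2} * sigma / sqrt(n) for a known sigma."""
    alpha = 1.0 - confidence
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)  # upper alpha/2 point of N(0,1)
    margin = z * sigma / math.sqrt(n)
    return sample_mean - margin, sample_mean + margin


# Hypothetical inputs, not taken from the slides
low, high = z_confidence_interval(sample_mean=50.0, sigma=10.0, n=100)
print(f"95% confidence interval: ({low:.2f}, {high:.2f})")
```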
11
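To interpret the above equation (1), think of a random interval
$\left(\bar{X} - z_{\alpha/2}\frac{\sigma}{\sqrt{n}},\ \bar{X} + z_{\alpha/2}\frac{\sigma}{\sqrt{n}}\right)$ ..... (2)
This interval is random because both of its endpoints involve the random variable $\bar{X}$. Before the experiment is performed and any data are gathered, the probability is $1 - \alpha$ that $\mu$ will lie inside the interval (2). If after observing $X_1 = x_1, X_2 = x_2, \ldots, X_n = x_n$ we compute the observed sample mean $\bar{x}$ and substitute it into (2) in place of $\bar{X}$, the resulting fixed interval is called a $100(1-\alpha)\%$ confidence interval for $\mu$. This confidence interval can be expressed as
$\bar{x} - z_{\alpha/2}\frac{\sigma}{\sqrt{n}} < \mu < \bar{x} + z_{\alpha/2}\frac{\sigma}{\sqrt{n}}$ with $100(1-\alpha)\%$ confidence.
Interpretation: the confidence level rests on the long-run frequency interpretation of probability. It is incorrect to write the statement $P(\mu \text{ lies in } (79.3, 80.7)) = 0.95$, because once the data are observed both $\mu$ and the interval endpoints are fixed numbers and nothing random remains.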
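The long-run frequency interpretation can be illustrated by simulation. This is a sketch under assumed values (a normal population with $\mu$ = 80 and $\sigma$ = 2, samples of size 30, and 5,000 repetitions), none of which come from the slides. Over many repeated samples, roughly 95% of the computed intervals should contain $\mu$.

```python
import math
import random
from statistics import NormalDist, fmean


def coverage_rate(mu=80.0, sigma=2.0, n=30, confidence=0.95,
                  trials=5000, seed=42):
    """Fraction of simulated z-intervals that contain the true mean mu.

    All default values are hypothetical and chosen for illustration only.
    """
    rng = random.Random(seed)
    z = NormalDist().inv_cdf(1.0 - (1.0 - confidence) / 2.0)
    half_width = z * sigma / math.sqrt(n)
    hits = 0
    for _ in range(trials):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        xbar = fmean(sample)
        # Each trial produces a different fixed interval; mu stays the same
        if xbar - half_width < mu < xbar + half_width:
            hits += 1
    return hits / trials


rate = coverage_rate()
print(f"share of intervals containing mu: {rate:.3f}")
```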
12
Z Values for Some of the More Common Levels of Confidence
Confidence Level    Z Value
90%                 1.645
95%                 1.96
98%                 2.33
99%                 2.575
If we want to construct a 95% confidence interval, the level of confidence is 95% or 0.95.
13
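When the Population SD is unknown and n is large
Use the estimate of the population standard deviation, $s = \sqrt{\frac{\sum (x - \bar{x})^2}{n - 1}}$, so replace $\sigma$ by $s$.
Confidence interval to estimate $\mu$ when the population standard deviation is unknown and n is large:
$\bar{x} - z_{\alpha/2}\frac{s}{\sqrt{n}} \le \mu \le \bar{x} + z_{\alpha/2}\frac{s}{\sqrt{n}}$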
14
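Finite population multiplier
When the population is finite and the sample is more than 5% of the population, the standard error is multiplied by the finite population multiplier $\sqrt{\frac{N - n}{N - 1}}$, where $N$ is the population size and $n$ is the sample size.
Confidence interval to estimate $\mu$ using the finite correction factor:
$\bar{x} - z_{\alpha/2}\frac{\sigma}{\sqrt{n}}\sqrt{\frac{N - n}{N - 1}} \le \mu \le \bar{x} + z_{\alpha/2}\frac{\sigma}{\sqrt{n}}\sqrt{\frac{N - n}{N - 1}}$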
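The effect of the finite population multiplier can be seen in a small sketch. The population size N = 500 and the other inputs are hypothetical, and the function name is illustrative. The multiplier is applied only when the sample exceeds 5% of the population, following the rule stated above.

```python
import math
from statistics import NormalDist


def finite_population_interval(sample_mean, sigma, n, N, confidence=0.95):
    """z-interval whose standard error is scaled by sqrt((N - n) / (N - 1))."""
    z = NormalDist().inv_cdf(1.0 - (1.0 - confidence) / 2.0)
    standard_error = sigma / math.sqrt(n)
    if n / N > 0.05:  # apply the multiplier only when n exceeds 5% of N
        standard_error *= math.sqrt((N - n) / (N - 1))
    margin = z * standard_error
    return sample_mean - margin, sample_mean + margin


# Hypothetical inputs: a sample of 100 drawn from a population of 500
low_fpc, high_fpc = finite_population_interval(50.0, 10.0, n=100, N=500)
print(f"with multiplier: ({low_fpc:.3f}, {high_fpc:.3f})")
```

With these inputs the multiplier is about 0.895, so the interval is narrower than the uncorrected one of roughly (48.04, 51.96).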
15
Applications
Exercise 8.9 A community health association is interested in estimating the average number of maternity days women stay in the local hospital. A random sample is taken of 36 women who had babies in the hospital during the past year. The following numbers of maternity days each woman was in the hospital are rounded to the nearest day. Use these data to construct a 98% confidence interval to estimate the average maternity stay in the hospital for all women who have babies in this hospital.
16
Estimating the Mean of a Normal Population: Small n and Unknown $\sigma$
The population has a normal distribution.
The value of the population standard deviation is unknown.
The sample size is small, n < 30.
The z distribution is not appropriate for these conditions; the t distribution is appropriate.
17
Properties of the t Distributions:
A t distribution is governed by only one parameter, the number of degrees of freedom (df) $\nu$. The possible values of $\nu$ are the positive integers 1, 2, …, and each different value of $\nu$ corresponds to a different t distribution. Let $t_\nu$ denote the density function curve for $\nu$ df.
1. Each $t_\nu$ curve is bell shaped and centered at zero.
2. Each $t_\nu$ curve is more spread out than the standard normal (z) curve.
3. As $\nu$ increases, the spread of the corresponding $t_\nu$ curve decreases.
4. As $\nu \to \infty$, the sequence of $t_\nu$ curves approaches the standard normal curve (so the z curve is often called the t curve with df = $\infty$).
There is a separate t distribution for every sample size or, in statistical language, "there is a t distribution for every number of degrees of freedom". Degrees of freedom is defined as the number of values we can choose freely.
18
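Summary of the Student's t definition:
Let $x_i$ ($i = 1, 2, \ldots, n$) be a random sample of size n from a normal population with mean $\mu$ and variance $\sigma^2$. Then Student's t is defined by the statistic
$t = \frac{\bar{x} - \mu}{S / \sqrt{n}}$
where $\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i$ is the sample mean and $S^2 = \frac{1}{n-1}\sum_{i=1}^{n}(x_i - \bar{x})^2$ is an unbiased estimate of the population variance $\sigma^2$. The statistic follows Student's t distribution with $\nu = (n - 1)$ df, with probability density function
$f(t) = \frac{1}{\sqrt{\nu}\, B\!\left(\frac{1}{2}, \frac{\nu}{2}\right)} \left(1 + \frac{t^2}{\nu}\right)^{-(\nu + 1)/2}, \quad -\infty < t < \infty$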
19
Table of Critical Values of t
df     t0.100   t0.050   t0.025   t0.010   t0.005
1      3.078    6.314    12.706   31.821   63.656
2      1.886    2.920    4.303    6.965    9.925
3      1.638    2.353    3.182    4.541    5.841
4      1.533    2.132    2.776    3.747    4.604
5      1.476    2.015    2.571    3.365    4.032
23     1.319    1.714    2.069    2.500    2.807
24     1.318    1.711    2.064    2.492    2.797
25     1.316    1.708    2.060    2.485    2.787
29     1.311    1.699    2.045    2.462    2.756
30     1.310    1.697    2.042    2.457    2.750
40     1.303    1.684    2.021    2.423    2.704
60     1.296    1.671    2.000    2.390    2.660
120    1.289    1.658    1.980    2.358    2.617
∞      1.282    1.645    1.960    2.327    2.576
20
Example-1: A generating plant manager wanted to estimate the coal needed for this year and took a sample by measuring coal usage for 10 weeks. The sample data are n = 10 weeks, $\bar{x}$ = 11,400 tons and s = 700 tons. The plant manager wants an interval estimate of the mean coal consumption at the 95% confidence level.
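A worked sketch of this interval, using the sample mean of 11,400 tons and sample standard deviation of 700 tons from the opening example. The critical value $t_{0.025,\,9} = 2.262$ is taken from a standard t table for 9 degrees of freedom. That row is not among those listed in the table excerpt here, so treat it as an assumed input.

```latex
\begin{aligned}
\bar{x} \pm t_{\alpha/2,\,n-1}\,\frac{s}{\sqrt{n}}
  &= 11{,}400 \pm 2.262 \cdot \frac{700}{\sqrt{10}} \\
  &\approx 11{,}400 \pm 2.262 \cdot 221.36 \\
  &\approx 11{,}400 \pm 500.7
\end{aligned}
```

Under these assumptions the 95% interval for mean weekly coal usage runs from about 10,899 to 11,901 tons.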
21
Exercise 8.21 The marketing director of a large department store wants to estimate the average number of customers who enter the store every 5 minutes. She randomly selects 5-minute intervals and counts the number of arrivals at the store. She obtains the figures 58, 32, 41, 47, 56, 80, 45, 29, 32 and 78. The analyst assumes the number of arrivals is normally distributed. Using these data, the analyst computes a 95% confidence interval to estimate the mean value for all 5-minute intervals. What interval values does she get?
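One way to set up this calculation is sketched below. The arrival counts are those listed in the exercise. The critical value 2.262 for 9 degrees of freedom is an assumed input taken from a standard t table, and the function name is illustrative.

```python
import math
import statistics

# Arrival counts from Exercise 8.21
arrivals = [58, 32, 41, 47, 56, 80, 45, 29, 32, 78]

# Assumed input: t_{0.025} with n - 1 = 9 df, read from a standard t table
T_CRIT_95_DF9 = 2.262


def t_confidence_interval(data, t_crit):
    """Interval x-bar +/- t * s / sqrt(n), with s the sample standard deviation."""
    n = len(data)
    mean = statistics.fmean(data)
    s = statistics.stdev(data)  # divides by n - 1
    margin = t_crit * s / math.sqrt(n)
    return mean - margin, mean + margin


t_low, t_high = t_confidence_interval(arrivals, T_CRIT_95_DF9)
print(f"sample mean: {statistics.fmean(arrivals):.1f}")
print(f"95% confidence interval: ({t_low:.2f}, {t_high:.2f})")
```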
22
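Confidence Interval to Estimate the Population Proportion
$\hat{p} - z_{\alpha/2}\sqrt{\frac{\hat{p}\hat{q}}{n}} \le p \le \hat{p} + z_{\alpha/2}\sqrt{\frac{\hat{p}\hat{q}}{n}}$
where $\hat{p}$ is the sample proportion, $\hat{q} = 1 - \hat{p}$, $p$ is the population proportion and $n$ is the sample size.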
23
Exercise 8.29. The highway department wants to estimate the proportion of vehicles on Interstate 25 between the hours of midnight and 5 am that are 18-wheel tractor trailers. The estimate will be used to determine highway repair and construction considerations and in highway patrol planning. Suppose the researcher for the highway department counted vehicles at different locations on the interstate for several nights during this time period. Of the 3,481 vehicles counted, 927 were 18-wheelers.
a. Determine the point estimate for the proportion of vehicles traveling Interstate 25 during this time period that are 18-wheelers.
b. Construct a 99 percent confidence interval for the proportion of vehicles on Interstate 25 during this time period that are 18-wheelers.
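A sketch of how both parts might be computed, using the counts given in the exercise and the usual large-sample interval $\hat{p} \pm z_{\alpha/2}\sqrt{\hat{p}\hat{q}/n}$. The function name is illustrative.

```python
import math
from statistics import NormalDist


def proportion_confidence_interval(successes, n, confidence=0.99):
    """Point estimate and interval p-hat +/- z * sqrt(p-hat * q-hat / n)."""
    p_hat = successes / n
    q_hat = 1.0 - p_hat
    z = NormalDist().inv_cdf(1.0 - (1.0 - confidence) / 2.0)
    margin = z * math.sqrt(p_hat * q_hat / n)
    return p_hat, p_hat - margin, p_hat + margin


# Counts from Exercise 8.29: 927 eighteen-wheelers among 3,481 vehicles
p_hat, p_low, p_high = proportion_confidence_interval(927, 3481)
print(f"point estimate: {p_hat:.4f}")
print(f"99% confidence interval: ({p_low:.4f}, {p_high:.4f})")
```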
24
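Determining Sample Size when Estimating $\mu$
z formula: $z = \frac{\bar{x} - \mu}{\sigma / \sqrt{n}}$
Error of estimation (tolerable error): $E = \bar{x} - \mu$
Estimated sample size: $n = \frac{z_{\alpha/2}^2 \sigma^2}{E^2} = \left(\frac{z_{\alpha/2}\,\sigma}{E}\right)^2$
Estimated $\sigma$: $\sigma \approx \frac{1}{4}\,\text{range}$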
25
Applications: Exercise 8.39 A bank officer wants to determine the average total monthly deposits per customer at the bank. He believes an estimate of this average amount using a confidence interval is sufficient. How large a sample should he take to be within $200 of the actual average with 99% confidence? He assumes the standard deviation of total monthly deposits for all customers is about $1000.
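The sample-size formula $n = (z_{\alpha/2}\,\sigma / E)^2$ can be sketched with the figures from this exercise ($\sigma$ = 1000, E = 200, 99% confidence). The function name is illustrative, and the result is rounded up because a sample size must be a whole number.

```python
import math
from statistics import NormalDist


def sample_size_for_mean(sigma, error, confidence):
    """Smallest whole n satisfying n >= (z * sigma / E)^2."""
    z = NormalDist().inv_cdf(1.0 - (1.0 - confidence) / 2.0)
    return math.ceil((z * sigma / error) ** 2)


# Figures from Exercise 8.39
n_deposits = sample_size_for_mean(sigma=1000.0, error=200.0, confidence=0.99)
print(f"required sample size: {n_deposits}")
```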
26
Example What proportion of secretaries of Fortune 500 companies has a personal computer at his or her workstation? You want to answer this question by conducting a random survey. How large a sample should you take if you want to be 95% confident of the results and you want the error of the confidence interval to be no more than .05? Assume no one has any idea of what the proportion actually is.
27
Determining Sample Size when Estimating $p$
z formula: $z = \frac{\hat{p} - p}{\sqrt{pq / n}}$
Error of estimation (tolerable error): $E = \hat{p} - p$
Estimated sample size: $n = \frac{z_{\alpha/2}^2\, p\, q}{E^2}$
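A sketch of the sample-size calculation for a proportion, applied to the Fortune 500 secretaries example (95% confidence, error 0.05). Since nothing is known about the proportion, the code assumes p = 0.5, the value that maximizes $pq$ and so gives the most conservative sample size. The function name is illustrative.

```python
import math
from statistics import NormalDist


def sample_size_for_proportion(error, confidence, p=0.5):
    """Smallest whole n satisfying n >= z^2 * p * q / E^2.

    p = 0.5 is an assumed worst case used when no prior estimate exists.
    """
    q = 1.0 - p
    z = NormalDist().inv_cdf(1.0 - (1.0 - confidence) / 2.0)
    return math.ceil(z ** 2 * p * q / error ** 2)


n_survey = sample_size_for_proportion(error=0.05, confidence=0.95)
print(f"required sample size: {n_survey}")
```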
28
Estimating the population variance
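Confidence interval to estimate the population variance:
$\frac{(n - 1)s^2}{\chi^2_{\alpha/2}} \le \sigma^2 \le \frac{(n - 1)s^2}{\chi^2_{1 - \alpha/2}}$, with $df = n - 1$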
29
Properties of the Chi-square distribution
(1) The $\chi^2$ distribution is a continuous probability distribution.
(2) The exact shape of the distribution depends upon the number of degrees of freedom $\nu$. For different values of $\nu$, we shall have different shapes of the distribution. In general, when $\nu$ is small the curve is skewed to the right, and as $\nu$ gets larger the distribution becomes more and more symmetrical and can be approximated by the normal distribution.
(3) The mean of the chi-square distribution is $\nu$ and its variance is $2\nu$.
(4) The sum of independent chi-square variates is also a chi-square variate.
(5) A chi-square variate is the sum of the squares of k independent standard normal random variables and therefore can never be less than zero.
30
[Figure: distribution curve for df = 5 with tail area 0.10]
31
Exercise 8.34. The Interstate Conference of Employment Security Agencies says the average workweek in the United States is down to only 35 hours, largely because of a rise in part-time workers. Suppose this figure was obtained from a random sample of 20 workers and that the standard deviation of the sample was 4.3 hours. Assume hours worked per week are normally distributed in the population. Use this sample information to develop a 98% confidence interval for the population variance of the number of hours worked per week for a worker. What is the point estimate?
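A sketch of the variance interval $(n-1)s^2/\chi^2_{\alpha/2} \le \sigma^2 \le (n-1)s^2/\chi^2_{1-\alpha/2}$ for this exercise. The Python standard library has no chi-square quantile function, so the two critical values for 19 degrees of freedom are assumed inputs read from a standard chi-square table. The function name is illustrative.

```python
# Assumed inputs from a standard chi-square table, df = 19
CHI2_UPPER_0_01_DF19 = 36.1908  # upper-tail area 0.01
CHI2_UPPER_0_99_DF19 = 7.63270  # upper-tail area 0.99


def variance_confidence_interval(s, n, chi2_alpha_half, chi2_one_minus_alpha_half):
    """Point estimate s^2 and interval ((n-1)s^2/chi2_hi, (n-1)s^2/chi2_lo)."""
    df = n - 1
    s2 = s ** 2
    return s2, df * s2 / chi2_alpha_half, df * s2 / chi2_one_minus_alpha_half


# Figures from Exercise 8.34: n = 20 workers, sample SD = 4.3 hours
s2, var_low, var_high = variance_confidence_interval(
    4.3, 20, CHI2_UPPER_0_01_DF19, CHI2_UPPER_0_99_DF19
)
print(f"point estimate of the variance: {s2:.2f}")
print(f"98% confidence interval: ({var_low:.2f}, {var_high:.2f})")
```

The interval is not symmetric about the point estimate because the chi-square distribution is skewed to the right.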
32
8.36 Suppose a random sample of 14 people years of age produced the household incomes shown here. Use these data to determine a point estimate for the population variance of household income for people years of age and construct a 95% confidence interval. Assume household income is normally distributed.
$ 37,500 44,800 33,500 36,900 42,300 32,400 28,000
41,200 46,600 38,500 40,200 32,00 35,000 36,800