ESTIMATION kamala8086@gmail.com
Introduction
We study a sample to learn about the population.
E.g. the mean value or the proportion of some measurement.
Statistics and parameters
A sample gives a statistic.
A population gives a parameter.
Very rarely is the statistic exactly equal to the parameter.

Quantity      Sample (statistic)     Population (parameter)
Mean          x̄ (‘x-bar’)            µ (‘mu’)
Variance      s²                     σ² (‘sigma squared’)
Proportion    p or p̂ (‘p-hat’)       P
Error
The statistic almost always has some error.
Two types of error:
Sampling error
Non-sampling error
Sampling Errors
Arise because we have observed only part of the whole population.
Become less important as the sample size increases.
E.g. in a population of 1000 people, a study of 100 individuals gives a much more accurate estimate than a study of 20 individuals.
Non-Sampling Errors (systematic errors)
Are due mainly to faults in the sampling process.
Are created by bias.
Increasing the size of the sample will not necessarily reduce them.
Can also occur through equipment faults.
Standard Error of the Arithmetic Mean
Consider the variable X.
Suppose we take a sample of n units and measure this variable.
The sample mean x̄ will differ from the population mean µ.
Several samples of size n will give several x-bars, which are not the same but are similar (small sampling error).
If the differences between them are large, the sampling error is large.
Mathematically
SE(x̄) = σ/√n
The larger the sample size, the better the precision in estimating the population mean.
If the variability of the observations in the parent (study) population is small, we would expect the error to be small.
σ may not be known, so it may be replaced by the sample standard deviation s.
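The claim that repeated samples give similar but not identical means, with a spread close to σ/√n, can be checked with a small simulation. The sketch below is illustrative only: the population (10,000 values with mean 16 and standard deviation 2), the sample sizes, and the function name `sd_of_sample_means` are assumptions made for the example, not part of the slides.

```python
import random
import statistics

random.seed(0)  # fixed seed so the run is repeatable

# Hypothetical population (illustrative values only): mean 16, SD 2.
population = [random.gauss(16, 2) for _ in range(10_000)]
sigma = statistics.pstdev(population)  # population standard deviation


def sd_of_sample_means(n, repeats=2000):
    """Draw `repeats` random samples of size n; return the SD of their means."""
    means = [statistics.fmean(random.sample(population, n)) for _ in range(repeats)]
    return statistics.stdev(means)


for n in (20, 100):
    observed = sd_of_sample_means(n)
    predicted = sigma / n ** 0.5  # standard error formula: sigma / sqrt(n)
    print(f"n={n:3d}  spread of x-bars={observed:.3f}  sigma/sqrt(n)={predicted:.3f}")
```

The observed spread of the sample means should sit close to σ/√n for each n, and the spread for n = 100 should be clearly smaller than for n = 20.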
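Recall
Population: for a normally distributed variable, about 95% of values lie between µ − 1.96σ and µ + 1.96σ.
For a sample mean, σ is replaced by the standard error SE(x̄).
Sample 95% CI: x̄ − 1.96 × SE(x̄) to x̄ + 1.96 × SE(x̄)
SE(x̄) = σ/√n
For a sample, σ is estimated by s, so SE(x̄) ≈ s/√n.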
Example 1
The weights of a random sample of 11 three-year-old children were taken in a village. The sample mean was 16 kg and the standard deviation of the sample was 2 kg.
Standard Error: SE = 2/√11 ≈ 0.6 kg
95% Confidence Interval (using 2 as an approximation to 1.96): 16 ± (2 × 0.6) = 14.8 to 17.2 kg
Example 2
The weights of a random sample of 20 three-year-old children were taken in a village. The sample mean was 16 kg and the standard deviation of the sample was 2 kg.
Standard Error: SE = 2/√20 ≈ 0.45 kg
95% Confidence Interval: 16 ± (2 × 0.45) = 15.1 to 16.9 kg
This means that we are approximately 95% certain that the mean weight of all three-year-old children in this population lies between 15.1 and 16.9 kg.
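The two weight examples can be reproduced with a few lines of Python. This is a minimal sketch: the function names `standard_error` and `confidence_interval` are chosen here for illustration, and the multiplier 1.96 is used instead of the rounded value 2 from the slides.

```python
import math


def standard_error(sd, n):
    """Standard error of the mean: sample SD divided by sqrt(n)."""
    return sd / math.sqrt(n)


def confidence_interval(mean, sd, n, z=1.96):
    """Approximate 95% CI for the mean: mean +/- z * SE."""
    se = standard_error(sd, n)
    return mean - z * se, mean + z * se


# Values from the two examples: mean 16 kg, SD 2 kg, n = 11 and n = 20.
for n in (11, 20):
    se = standard_error(2, n)
    low, high = confidence_interval(16, 2, n)
    print(f"n={n}: SE={se:.2f} kg, 95% CI={low:.1f} to {high:.1f} kg")
```

Rounded to one decimal place, the intervals agree with the hand calculations (14.8 to 17.2 kg for n = 11 and 15.1 to 16.9 kg for n = 20), so using 2 in place of 1.96 makes no visible difference here.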
Standard Error of the Proportion
Recall (for a normal distribution): the SE is directly proportional to the variability and inversely proportional to the square root of the sample size.
Formula: SE(p̂) = √(p̂(1 − p̂)/n)
With percentages: SE = √(p(100 − p)/n)
Example
Among a sample of 120 TB patients, drawn from the total population of TB patients in the country, 72 were found to have complied with their out-patient treatment.
Calculate the standard error of non-compliance.
Calculate the confidence interval of non-compliance.
Calculate the confidence interval of compliance.
Note: work first with percentages and then with proportions.
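One way to check the TB exercise is sketched below. It assumes the usual normal-approximation formula for the standard error of a proportion, √(p(1 − p)/n), and a multiplier of 1.96; the function names `se_proportion` and `ci_proportion` are illustrative.

```python
import math


def se_proportion(p, n):
    """Standard error of a proportion p (between 0 and 1) from a sample of n."""
    return math.sqrt(p * (1 - p) / n)


def ci_proportion(p, n, z=1.96):
    """Approximate 95% CI for a proportion: p +/- z * SE."""
    se = se_proportion(p, n)
    return p - z * se, p + z * se


n = 120
complied = 72
p_comply = complied / n   # proportion who complied
p_non = 1 - p_comply      # proportion who did not comply

for label, p in (("non-compliance", p_non), ("compliance", p_comply)):
    se = se_proportion(p, n)
    low, high = ci_proportion(p, n)
    # Print as proportions and as percentages (multiply by 100).
    print(f"{label}: p={p:.2f}, SE={se:.4f} ({100 * se:.2f}%), "
          f"95% CI={low:.3f} to {high:.3f} ({100 * low:.1f}% to {100 * high:.1f}%)")
```

Because p(1 − p) is the same for p and 1 − p, the standard error is identical for compliance and non-compliance; only the centre of the interval moves.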
Question 1
The hemoglobin levels of a random sample of 40 five-year-old children were taken in a village. The sample mean was 13 gm/dl and the standard deviation of the sample was 3 gm/dl.
Calculate the standard error of the mean and the 95% confidence interval.
Key Points
The larger the sample size, the smaller the standard error and the narrower the confidence interval will be.
The advantage of having a sufficiently large sample is that the sample mean will be a better estimate of the population mean.
At a certain point, increases in sample size demand vast investments in time and money, whereas the confidence interval narrows only marginally.
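The diminishing return from larger samples follows from the √n in the standard error: quadrupling the sample size only halves the width of the confidence interval. The sketch below illustrates this with a standard deviation of 2 kg, as in the weight examples; the sample sizes and the function name `ci_width` are assumptions chosen for the illustration.

```python
import math


def ci_width(sd, n, z=1.96):
    """Full width of an approximate 95% CI for the mean: 2 * z * sd / sqrt(n)."""
    return 2 * z * sd / math.sqrt(n)


sd = 2  # kg, as in the weight examples
previous = None
for n in (10, 40, 160, 640):
    width = ci_width(sd, n)
    note = "" if previous is None else f"  (ratio to previous: {width / previous:.2f})"
    print(f"n={n:4d}  CI width={width:.2f} kg{note}")
    previous = width
```

Each step multiplies the sample size by four, yet the interval only halves, so the gain in precision per extra participant keeps shrinking.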