Estimation Bias, Standard Error and Sampling Distribution (Topic 9)


From sample to population
Inductive (inferential) statistical methods make inferences about a population based on information from a sample drawn from that population.

Statistical Concepts of Sampling
Suppose we want to estimate the mean birthweight of Malay male live births in Singapore in 1992. Due to logistical constraints, we decide to take a random sample of 50 live births from the records of all Malay male live births for that year.

Sampling from Target Population
Target population: all Malay male live births in Singapore, 1992.
Sample: a random sample of 50 Malay male live births in Singapore, 1992.
Suppose the sample mean = 3.55 kg and the sample SD (S) = 0.92 kg. What can we say about the population mean?

Statistical Modeling
Assume the population values follow a normal or some other appropriate distribution. This means that a relative frequency histogram of the population values will look like a normal (or that other appropriate) distribution.
Assume also that we have a random sample, i.e., we sample n values (n = 50 in the example) independently from the population.

Notation
Sample data: $X_1, X_2, \dots, X_n$. Assume $X_1, \dots, X_n$ are independent and each is distributed according to, say, a normal distribution.
Population parameters:
Population mean $\mu$ = mean of the normal population
Population variance $\sigma^2$ = variance of the normal population
Population standard deviation $\sigma$

Statistical Inference
Two general areas:
(a) Statistical estimation, i.e. estimating population parameters based on sample statistics.
(b) Hypothesis testing, i.e. testing certain assumptions about the population. Also called tests of statistical significance.

Statistical Estimation
There are two ways in which a population parameter can be estimated from a sample:
(1) Point estimate
(2) Interval estimate

Point Estimate
Estimate the population parameter by a single value:
Sample mean estimates the population mean
Sample median estimates the population median
Sample variance estimates the population variance
Sample SD estimates the population SD
Sample proportion estimates the population proportion

Point Estimate
If the average birthweight for a random sample of Malay male births was 3.55 kg and we use it to estimate $\mu$, the mean birthweight of all Malay male births in the population, we would be making a point estimate of $\mu$.
It is poor practice to report just the point estimate, because readers cannot judge how good the estimate is. The accuracy of the estimate should also be reported.
Remember that the quality of an estimator is judged by its performance over REPEATED SAMPLING, although we have just one sample in hand.
Inference for a population parameter should make allowance for sampling error.

Accuracy of statistical estimation
Two types of error:
(a) Sampling error or fluctuation: “random” error that is due entirely to chance in the process of sampling. Minimizing the sampling error maximizes the precision of a statistical estimate.
(b) Systematic error or bias: non-random error that is either a property of the estimator itself or due to bias in the sampling or measurement process. Minimizing the systematic error maximizes the validity of a statistical estimate. Systematic errors can be minimized by making efforts to reduce bias in sampling and measurement (e.g. non-random sampling, non-response and non-coverage, untruthful answers, unreliable calibration, errors in data recording and coding).

Unbiased estimation of the mean
$E(\bar{X}) = \mu$, i.e., the sample mean equals the population mean when averaged over repeated samples.

Hypothetical results of repeated sampling
Unbiasedness means the sample mean equals the population mean when averaged over repeated samples. However, there is fluctuation from sample to sample. Variance = ?
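The idea of "hypothetical repeated sampling" can be imitated on a computer. The following is a minimal sketch, not part of the original slides: the population values MU, SIGMA and the helper simulate_sample_means are illustrative assumptions. It draws many samples of size 50 from a normal population and shows that the sample means average out close to the population mean while still varying from sample to sample.

```python
import random
import statistics

# Hypothetical population values (assumptions for illustration only).
MU, SIGMA, N = 3.5, 0.9, 50

def simulate_sample_means(n_repeats, seed=0):
    """Draw n_repeats independent samples of size N and return their means."""
    rng = random.Random(seed)
    return [
        statistics.fmean(rng.gauss(MU, SIGMA) for _ in range(N))
        for _ in range(n_repeats)
    ]

means = simulate_sample_means(2000)

# Averaged over repeated samples, the sample mean is close to MU (unbiasedness).
print("average of the sample means:", round(statistics.fmean(means), 3))

# Individual sample means still fluctuate around MU.
print("smallest and largest sample mean:", round(min(means), 3), round(max(means), 3))
```

The spread of these simulated means is exactly what the question "Variance = ?" is asking about.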

Standard Error (SE) of an estimator
The SE of an estimator (e.g., the sample mean) is just the standard deviation (SD) of the estimator. It measures the variability of the estimator under “repeated” sampling.
SE is just a special case of SD.
The standard deviation of an estimator is called the standard error because it measures the magnitude of the estimation error due to sampling fluctuation.

Standard Deviation vs Standard Error
The population standard deviation (SD) measures the amount of variation among the individual measurements that make up the population. It can be estimated from a sample using the sample standard deviation.
The standard error (e.g. of the sample mean), on the other hand, measures how much the value of the estimator changes from sample to sample under repeated sampling.
As we take only one sample rather than repeated samples in practice, it seems impossible at first to estimate the standard error, which is defined with reference to repeated sampling. Fortunately, the standard error of the sample mean is a function of the population SD. As the latter is estimable from a single sample, so is the standard error.

Estimated standard error of the sample mean
Let $\sigma$ denote the population SD.
It was shown earlier that SE = SD(sample mean) = $\sigma/\sqrt{n}$, where n is the sample size.
Since $\sigma$ can be estimated by the sample standard deviation S, we can estimate the standard error by SE = $S/\sqrt{n}$.
Note that SE decreases with n at the rate $1/\sqrt{n}$, i.e., the precision of the sample mean improves as the sample size increases.
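As a small illustration of the formula, the sketch below computes the estimated standard error both from raw data and from the summary statistics of the birthweight example (S = 0.92 kg, n = 50). The function name estimated_standard_error is an assumption made for this example.

```python
import math
import statistics

def estimated_standard_error(sample):
    """Return S / sqrt(n), where S is the sample SD (n - 1 in the denominator)."""
    n = len(sample)
    return statistics.stdev(sample) / math.sqrt(n)

# Birthweight example, using the summary statistics only.
se_birthweight = 0.92 / math.sqrt(50)
print(round(se_birthweight, 2))  # 0.13

# The 1/sqrt(n) rate: quadrupling the sample size halves the standard error.
se_n200 = 0.92 / math.sqrt(200)
print(round(se_birthweight / se_n200, 2))  # 2.0
```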

Knowing the mean and standard error of an estimator still doesn’t tell us the whole story. The whole story is told by the sampling distribution, since it allows us to calculate probabilities.

Sampling distribution of the sample mean
The distribution of the sample mean under “repeated” sampling from the population. It is the distribution of the sample mean rather than of individual measurements.
In practice, we take only one sample, not repeated samples, so the sampling distribution is unobserved. Fortunately, it can often be derived theoretically.

Exact result when sampling from a normal population
If the population is normal with mean $\mu$ and variance $\sigma^2$, then the sample mean based on a random sample of size n is also normal, with mean $\mu$ and variance $\sigma^2/n$.
Note how we can derive theoretically the distribution of the sample mean under repeated sampling without actually drawing repeated samples. This is important because we usually have only one sample at our disposal in practice.
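The exact result can be checked numerically. The sketch below is illustrative only and assumes hypothetical values for the population mean, SD and sample size. It compares the theoretical sampling distribution, a normal with SD equal to the population SD divided by the square root of n, against the means of many simulated samples, including one tail probability.

```python
import math
import random
import statistics

# Hypothetical population and sample size (assumptions for illustration only).
MU, SIGMA, N = 3.5, 0.9, 50

# Theoretical sampling distribution of the sample mean: normal, mean MU, SD SIGMA / sqrt(N).
sampling_dist = statistics.NormalDist(MU, SIGMA / math.sqrt(N))
theoretical_tail = 1 - sampling_dist.cdf(MU + 0.2)  # P(sample mean > MU + 0.2)

# Empirical check by actually drawing repeated samples.
rng = random.Random(1)
means = [
    statistics.fmean(rng.gauss(MU, SIGMA) for _ in range(N))
    for _ in range(5000)
]
empirical_sd = statistics.stdev(means)
empirical_tail = sum(m > MU + 0.2 for m in means) / len(means)

print("theoretical SE:", round(sampling_dist.stdev, 4), "empirical:", round(empirical_sd, 4))
print("theoretical tail:", round(theoretical_tail, 4), "empirical:", round(empirical_tail, 4))
```

In practice only the theoretical calculation is available, since only one sample is drawn; the simulation merely confirms that the two agree.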

Topic 10: Interval Estimate
An interval estimate provides an estimate of the population parameter by defining an interval or range of plausible values within which the population parameter could be found with a given confidence. This interval is called a confidence interval.
The sampling distribution is used in constructing confidence intervals.

Confidence interval for the mean of a normal population
Fact: With probability 0.95, a normally distributed variable is within 1.96 standard deviations of its mean.
Now the sample mean is normally distributed: $\bar{X} \sim N(\mu, \sigma^2/n)$.
It follows that the sample mean must be within 1.96 standard errors of the population mean with probability 0.95.
Equivalently, the population mean is within 1.96 standard errors of the sample mean with probability 0.95.

We call $\bar{X} \pm 1.96\,\sigma/\sqrt{n}$ a 95% confidence interval for the population mean.
If $\sigma$ is unknown, replace it by the sample SD S and replace 1.96 by $t_{n-1,\,0.025}$, the upper 2.5-percentile of a t-distribution with n-1 degrees of freedom, to yield $\bar{X} \pm t_{n-1,\,0.025}\, S/\sqrt{n}$ as a 95% confidence interval for the population mean.

The t densities
t densities are symmetric and similar in appearance to the N(0,1) density, but with heavier tails.
Tables for t distributions are widely available.
As the d.f. increases, the t distribution converges to the standard normal distribution.

95% confidence interval for the population mean
Birthweight data revisited: n = 50, sample mean = 3.55 kg, S = 0.92 kg.
SE = 0.92/sqrt(50) = 0.13 kg
d.f. = 49, upper 2.5-percentile of t = 2.01
The 95% C.I. for the mean Malay male birthweight is 3.55 +/- 2.01 (0.13) = (3.29 kg, 3.81 kg).
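The arithmetic of this slide can be reproduced in a few lines. This is a sketch under stated assumptions: the function name t_confidence_interval is made up for the example, and the critical value 2.0096 for 49 degrees of freedom is taken from t tables rather than computed, since the Python standard library has no t-distribution quantile function.

```python
import math

def t_confidence_interval(mean, sd, n, t_crit):
    """Return (lower, upper) for mean +/- t_crit * sd / sqrt(n)."""
    half_width = t_crit * sd / math.sqrt(n)
    return mean - half_width, mean + half_width

# Upper 2.5-percentile of t with 49 d.f., read from tables (assumed value).
T_CRIT_49 = 2.0096

low, high = t_confidence_interval(mean=3.55, sd=0.92, n=50, t_crit=T_CRIT_49)
print(round(low, 2), round(high, 2))  # 3.29 3.81
```

With a statistics package available, the tabulated constant could be replaced by a computed t quantile; the rest of the calculation is unchanged.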

The meaning of confidence interval
Under repeated sampling, the 95% confidence interval computed from each sample will contain the true mean 95% of the time.
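This repeated-sampling interpretation can be illustrated by simulation. The sketch below makes several simplifying assumptions: the population values are hypothetical, the population SD is treated as known so that the 1.96-based interval applies, and the helper name coverage is invented for the example. It counts how often the interval built from each simulated sample contains the true mean.

```python
import math
import random
import statistics

# Hypothetical population and sample size (assumptions for illustration only).
MU, SIGMA, N = 3.5, 0.9, 50

# Upper 2.5-percentile of the standard normal, about 1.96.
Z = statistics.NormalDist().inv_cdf(0.975)

def coverage(n_repeats, seed=2):
    """Fraction of known-sigma 95% intervals that contain the true mean MU."""
    rng = random.Random(seed)
    half_width = Z * SIGMA / math.sqrt(N)
    hits = 0
    for _ in range(n_repeats):
        xbar = statistics.fmean(rng.gauss(MU, SIGMA) for _ in range(N))
        if xbar - half_width <= MU <= xbar + half_width:
            hits += 1
    return hits / n_repeats

print(coverage(4000))  # should be close to 0.95
```

Any single interval either contains the true mean or it does not; the 95% refers to the long-run proportion over repeated samples.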
