Chapter 3 INTERVAL ESTIMATES


Chapter 3 INTERVAL ESTIMATES
BAE 6520 Applied Environmental Statistics
Biosystems and Agricultural Engineering Department, Division of Agricultural Sciences and Natural Resources, Oklahoma State University
Source: Dr. Dennis R. Helsel & Dr. Edward J. Gilroy, 2006 Applied Environmental Statistics Workshop and Statistical Methods in Water Resources

Population vs. Sample We measure characteristics of a sample and infer that they apply to the population.

Intervals An interval computed from sample data provides information on how certain we are of where the true population parameter is.

Interval Estimates
Confidence Interval – contains an unknown parameter (mean, median) of the population with a specified probability
Prediction Interval – contains one or more future observations with a specified probability
Tolerance Interval – contains a proportion (percentile) of future observations with a specified probability

Your Interval May Not Contain the True Value!

Meaning of a Confidence Interval
Suppose you compute ten 90% confidence intervals, each from a different sample of data collected under identical conditions with identical methods, so that each sample is equally valid. On average, nine of the 10 intervals (90%) will contain the true mean and one will not. You never know whether yours is that one!
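
The repeated-sampling idea above can be checked with a small simulation. This is a minimal sketch, not part of the original slides: the population (normal, with a known mean and standard deviation), the sample size, and the function name `coverage_rate` are all assumptions chosen for illustration.

```python
import random
from statistics import NormalDist, fmean

def coverage_rate(true_mean=10.0, sigma=2.0, n=20, conf=0.90,
                  trials=2000, seed=1):
    """Fraction of simulated confidence intervals that contain true_mean.

    Assumes a normal population with KNOWN sigma (hypothetical values),
    so the interval is xbar +/- z * sigma / sqrt(n).
    """
    rng = random.Random(seed)
    z = NormalDist().inv_cdf(0.5 + conf / 2)   # two-sided multiplier
    half_width = z * sigma / n ** 0.5
    hits = 0
    for _ in range(trials):
        sample = [rng.gauss(true_mean, sigma) for _ in range(n)]
        xbar = fmean(sample)
        if xbar - half_width <= true_mean <= xbar + half_width:
            hits += 1
    return hits / trials

rate = coverage_rate()
print(f"fraction of intervals containing the true mean: {rate:.3f}")
```

The printed fraction should be close to 0.90, but any single interval either contains the true mean or it does not.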

Meaning of a Confidence Interval Ten 90% Confidence Intervals

Meaning of a Confidence Interval
Example: 90% Confidence Interval about the Mean
We are 90% confident that the true mean turbidity in the Poteau River is between 5 and 200 NTU (Nephelometric Turbidity Units).

Computing Confidence Intervals
Parametric Intervals
$\bar{X} - z\,\sigma_{\bar{X}} \le \mu \le \bar{X} + z\,\sigma_{\bar{X}}$
where μ = population mean, $\bar{X}$ = sample mean, z = a multiplier that depends on the confidence level, and $\sigma_{\bar{X}}$ = standard error of the mean.
The interval is symmetric around the sample mean. Confidence levels are valid if the data are normally distributed or the sample is large.
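
A minimal sketch of a parametric interval of this kind follows. Several things here are assumptions rather than content from the slides: the function name `mean_ci`, the data values (hypothetical), and the use of the sample standard deviation with a normal multiplier z, which is a large-sample approximation (for small samples a Student's t multiplier is the usual choice).

```python
from statistics import NormalDist, fmean, stdev

def mean_ci(data, conf=0.90):
    """Symmetric interval: sample mean +/- z * (standard error of the mean)."""
    n = len(data)
    xbar = fmean(data)
    se = stdev(data) / n ** 0.5                # estimated standard error
    z = NormalDist().inv_cdf(0.5 + conf / 2)   # depends on confidence level
    return xbar - z * se, xbar + z * se

# Hypothetical measurements, for illustration only.
values = [4.1, 5.3, 3.8, 6.0, 4.7, 5.1, 4.4, 5.6]
lo90, hi90 = mean_ci(values, 0.90)
lo99, hi99 = mean_ci(values, 0.99)
print(f"90%: ({lo90:.2f}, {hi90:.2f})   99%: ({lo99:.2f}, {hi99:.2f})")
```

Raising the confidence level increases z and therefore widens the interval, while the midpoint stays at the sample mean.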

Computing Confidence Intervals
Nonparametric Intervals
Usually computed on the median or another percentile. Endpoints are data values: count in the same number of observations from each end of the ranked dataset. Does not depend on the assumption that the data are normally distributed.
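
One common way to decide how many observations to count in from each end, for an interval on the median, uses the binomial distribution with p = 0.5. The sketch below assumes that approach; the slides do not give the rule explicitly, and the function name `median_ci` and the example data are hypothetical.

```python
from math import comb

def median_ci(data, conf=0.95):
    """Distribution-free interval for the median from ranked data.

    Returns (lower, upper, achieved_confidence). The rank r is the largest
    count such that P(Binomial(n, 0.5) <= r - 1) <= alpha / 2; the interval
    runs from the r-th smallest to the r-th largest observation.
    """
    x = sorted(data)
    n = len(x)
    alpha = 1 - conf
    cum = 0.0   # P(X <= r - 1) for the rank r accepted so far
    r = 0
    for k in range(n + 1):
        nxt = cum + comb(n, k) * 0.5 ** n
        if nxt > alpha / 2:
            break
        cum, r = nxt, k + 1
    if r == 0:
        raise ValueError("too few observations for this confidence level")
    return x[r - 1], x[n - r], 1 - 2 * cum

# Hypothetical ranked data: the integers 1..25.
lower, upper, achieved = median_ci(range(1, 26), conf=0.95)
print(lower, upper, round(achieved, 4))
```

Because the endpoints must be data values, the achieved confidence is usually somewhat higher than the level requested.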

Confidence Intervals on Skewed Data
Parametric intervals assume that the data, or at least the sample mean, follow a normal distribution. If this assumption is incorrect, the intervals will not include the true value as often as the stated confidence level suggests.

Confidence Intervals on Skewed Data
First Approach: Transform the data to approximate normality, then compute the confidence interval.
Problem: When the confidence interval is retransformed to the original units, it is no longer a confidence interval on the mean. With logs, it is a confidence interval on the geometric mean, which is an estimate of the median.
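
The retransformation problem can be seen in a short sketch. The data values below are hypothetical, the function name `log_ci` is illustrative, and the normal multiplier z is a large-sample simplification.

```python
from math import exp, log
from statistics import NormalDist, fmean, geometric_mean, stdev

def log_ci(data, conf=0.95):
    """Interval computed on log(data), then transformed back with exp()."""
    logs = [log(v) for v in data]
    n = len(logs)
    z = NormalDist().inv_cdf(0.5 + conf / 2)
    half = z * stdev(logs) / n ** 0.5
    centre = fmean(logs)
    return exp(centre - half), exp(centre + half)

# Hypothetical right-skewed concentrations, for illustration only.
skewed = [1, 1, 2, 2, 3, 4, 6, 9, 15, 40]
lo, hi = log_ci(skewed)
gm = geometric_mean(skewed)   # what the back-transformed interval is centred on
am = fmean(skewed)            # the arithmetic mean is larger for skewed data
print(f"interval ({lo:.2f}, {hi:.2f}), geometric mean {gm:.2f}, mean {am:.2f}")
```

The back-transformed interval brackets the geometric mean, which sits well below the arithmetic mean when the data are right-skewed.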

Confidence Intervals on Skewed Data Example Arsenic Concentrations New Hampshire Groundwater

Confidence Intervals on Skewed Data
Second Approach: Hope that the Central Limit Theorem applies. Whether it does is a function of the data skewness and the sample size. See Chapter 2 for the Central Limit Theorem discussion.

Bootstrapping
Currently the best way to compute a confidence interval from skewed data or a small sample. Does not require an assumption of normality.

Confidence Intervals on Skewed Data
Third Approach - Bootstrapping
Sample from the data set with replacement, so that any data point can be sampled multiple times or not at all. Compute the statistic of interest on this resample. Do this many times. The confidence endpoints are determined from the ranked estimates of the statistic. Because the method is based on the data set itself, it works best with more data.

Confidence Intervals on Skewed Data
Bootstrapping Example: Arsenic Data Set
Randomly pick 25 values, with replacement, from a 25-point arsenic data set. Compute the mean of these 25 values. Repeat 1000 times. A two-sided 95% confidence interval for the mean runs from the 25th (0.025 × 1000) to the 975th (0.975 × 1000) ranked value of the 1000 means.
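
The procedure above can be sketched as follows. The arsenic data set itself is not reproduced in this transcript, so the 25 values used here are synthetic lognormal draws standing in for it; the function name `bootstrap_mean_ci` and the seeds are likewise assumptions.

```python
import random
from statistics import fmean

def bootstrap_mean_ci(data, conf=0.95, reps=1000, seed=0):
    """Percentile bootstrap interval for the mean."""
    rng = random.Random(seed)
    n = len(data)
    # Resample n values WITH replacement, reps times, and rank the means.
    means = sorted(fmean(rng.choices(data, k=n)) for _ in range(reps))
    alpha = 1 - conf
    lo_rank = max(1, round(alpha / 2 * reps))   # 25th for reps=1000, conf=0.95
    hi_rank = round((1 - alpha / 2) * reps)     # 975th
    return means[lo_rank - 1], means[hi_rank - 1]

# Synthetic, right-skewed stand-in for a 25-point data set (NOT real data).
gen = random.Random(42)
sample = [gen.lognormvariate(1.0, 1.0) for _ in range(25)]

lo, hi = bootstrap_mean_ci(sample)
print(f"95% bootstrap interval for the mean: ({lo:.2f}, {hi:.2f})")
```

With skewed data the interval is typically not symmetric around the sample mean, which is one reason the approach suits this setting.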

Confidence Intervals on Skewed Data Bootstrapping Example: Arsenic Data Set

Confidence Intervals on Skewed Data Bootstrapping Example: Arsenic Data Set

Other Confidence Intervals
Confidence intervals can be computed for other parameters: variance, standard deviation, median, and other percentiles. A confidence interval for a percentile is often called a “tolerance interval”.

Prediction Intervals (contain one or more future observations with a specified probability)
The simplest prediction interval (nonparametric) uses the percentiles of the data. For a two-sided 90% prediction interval, use the 5th and 95th percentiles: 90% of the observed data fall within this interval, so we expect 90% of future observations to fall within it as well. Requires ample data.
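
A minimal sketch of the percentile-based interval, using the standard library's `statistics.quantiles`. The data (the integers 1 to 100) and the function name `percentile_prediction_interval` are hypothetical, and percentile conventions differ slightly between software packages; the "inclusive" method is used here as one reasonable choice.

```python
from statistics import quantiles

def percentile_prediction_interval(data, conf=0.90):
    """Two-sided interval from the empirical percentiles of the data.

    Supports confidence levels whose tails are multiples of 1% (e.g. 0.90).
    """
    tail = round((1 - conf) / 2 * 100)           # 5 for a 90% interval
    cuts = quantiles(data, n=100, method="inclusive")
    return cuts[tail - 1], cuts[100 - tail - 1]  # 5th and 95th percentiles

# Hypothetical data set, for illustration only.
observed = list(range(1, 101))
lo, hi = percentile_prediction_interval(observed, 0.90)
inside = sum(lo <= v <= hi for v in observed) / len(observed)
print(lo, hi, inside)
```

The fraction of observed values inside the interval matches the nominal 90% here because the sample is large enough for the 5th and 95th percentiles to be well defined.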

Prediction Intervals
A parametric prediction interval will be shorter than a nonparametric interval if the data follow the distribution assumed by the interval calculation.
Easy method for a prediction interval: transform the data to look normal, compute the interval, then transform the interval back to the original units.
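
The three steps of the easy method can be sketched as below, assuming a log transform makes the data roughly normal. The data values and the function name `log_prediction_interval` are hypothetical, and the normal multiplier z stands in for the Student's t multiplier that would usually be used with a small sample.

```python
from math import exp, log, sqrt
from statistics import NormalDist, fmean, geometric_mean, stdev

def log_prediction_interval(data, conf=0.90):
    """Prediction interval for one future observation via a log transform."""
    logs = [log(v) for v in data]                  # 1. transform
    n = len(logs)
    z = NormalDist().inv_cdf(0.5 + conf / 2)
    half = z * stdev(logs) * sqrt(1 + 1 / n)       # 2. compute in log units
    centre = fmean(logs)
    return exp(centre - half), exp(centre + half)  # 3. back-transform

# Hypothetical right-skewed data, for illustration only.
skewed = [1, 1, 2, 2, 3, 4, 6, 9, 15, 40]
lo, hi = log_prediction_interval(skewed)
gm = geometric_mean(skewed)
print(f"90% prediction interval: ({lo:.2f}, {hi:.2f})")
```

After back-transformation the interval is asymmetric in the original units, with a longer upper arm, and its lower end can never fall below zero.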

Confidence vs. Prediction Intervals
A prediction interval will always be wider than the confidence interval for the same alpha. Why? The mean of 10 observations, for example, is always less variable than the individual observations themselves.
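
The reason can be written out as a short worked equation. This is a sketch under the assumption, not stated on the slide, that the observations are independent with a common variance $\sigma^2$; $X_{\text{new}}$ denotes a single future observation.

```latex
\operatorname{Var}(\bar{X}) = \frac{\sigma^2}{n},
\qquad
\operatorname{Var}(X_{\text{new}} - \bar{X})
  = \sigma^2 + \frac{\sigma^2}{n}
  = \sigma^2\left(1 + \frac{1}{n}\right)
```

For n = 10, the confidence interval half-width scales with $\sqrt{1/10} \approx 0.32\,\sigma$, while the prediction interval half-width scales with $\sqrt{1.1} \approx 1.05\,\sigma$. Collecting more data shrinks the first term toward zero but leaves the $\sigma^2$ of the future observation untouched.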

Tolerance Intervals (contain a proportion of future observations with a specified probability)
An interval around a proportion of the distribution. The proportion is called the “coverage”. Example question: what cutoff(s) will cover 95% of all future observations, with 90% confidence?
Easy method for a tolerance interval: transform the data to look normal, compute the interval, then transform the interval back to the original units. This works for prediction and tolerance intervals, but not for confidence intervals.
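
As a complement to the transform-based method, the coverage and confidence of a tolerance interval can be illustrated with a distribution-free variant that uses the sample minimum and maximum as cutoffs. This is a different technique from the one named on the slide: it relies on the standard order-statistics result for the confidence that [min, max] covers a given proportion, and it assumes independent observations from a continuous distribution. The function names are hypothetical.

```python
def minmax_tolerance_confidence(n, coverage):
    """Confidence that [sample min, sample max] covers at least `coverage`
    of the population, for n independent continuous observations."""
    return 1 - n * coverage ** (n - 1) + (n - 1) * coverage ** n

def smallest_n(coverage, conf):
    """Smallest sample size for which [min, max] reaches the confidence."""
    n = 2
    while minmax_tolerance_confidence(n, coverage) < conf:
        n += 1
    return n

n_95_90 = smallest_n(coverage=0.95, conf=0.90)   # the question on the slide
n_95_95 = smallest_n(coverage=0.95, conf=0.95)
print(n_95_90, n_95_95)
```

The required sample sizes show why tolerance intervals need ample data: asking for more coverage or more confidence raises the number of observations needed.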