Uncertainty and confidence If you picked different samples from a population, you would probably get different sample means ( x ̅ ) and virtually none.

Slides:



Advertisements
Similar presentations
Our goal is to assess the evidence provided by the data in favor of some claim about the population. Section 6.2Tests of Significance.
Advertisements

Objectives (BPS chapter 18) Inference about a Population Mean  Conditions for inference  The t distribution  The one-sample t confidence interval 
Objectives (BPS chapter 15)
Introduction to Statistics
Chapter 10: Hypothesis Testing
EPIDEMIOLOGY AND BIOSTATISTICS DEPT Esimating Population Value with Hypothesis Testing.
Chapter 9 Hypothesis Tests. The logic behind a confidence interval is that if we build an interval around a sample value there is a high likelihood that.
Introduction to Inference Estimating with Confidence Chapter 6.1.
Introduction to Inference Estimating with Confidence IPS Chapter 6.1 © 2009 W.H. Freeman and Company.
IPS Chapter 6 © 2012 W.H. Freeman and Company  6.1: Estimating with Confidence  6.2: Tests of Significance  6.3: Use and Abuse of Tests  6.4: Power.
Chapter 9 Hypothesis Testing.
BCOR 1020 Business Statistics
Tests of significance: The basics BPS chapter 15 © 2006 W.H. Freeman and Company.
Objectives (BPS chapter 14)
Overview Definition Hypothesis
Inference in practice BPS chapter 16 © 2006 W.H. Freeman and Company.
Tests of significance: The basics BPS chapter 15 © 2006 W.H. Freeman and Company.
Objectives 6.2 Tests of significance
14. Introduction to inference
CHAPTER 16: Inference in Practice. Chapter 16 Concepts 2  Conditions for Inference in Practice  Cautions About Confidence Intervals  Cautions About.
Chapter 8 Introduction to Hypothesis Testing
Section 9.2 Testing the Mean  9.2 / 1. Testing the Mean  When  is Known Let x be the appropriate random variable. Obtain a simple random sample (of.
Introduction to inference Tests of significance IPS chapter 6.2 © 2006 W.H. Freeman and Company.
Essential Statistics Chapter 131 Introduction to Inference.
CHAPTER 14 Introduction to Inference BPS - 5TH ED.CHAPTER 14 1.
Lesson Significance Tests: The Basics. Vocabulary Hypothesis – a statement or claim regarding a characteristic of one or more populations Hypothesis.
Confidence intervals are one of the two most common types of statistical inference. Use a confidence interval when your goal is to estimate a population.
Chapter 10 AP Statistics St. Francis High School Fr. Chris.
1 Chapter 10: Introduction to Inference. 2 Inference Inference is the statistical process by which we use information collected from a sample to infer.
Hypothesis and Test Procedures A statistical test of hypothesis consist of : 1. The Null hypothesis, 2. The Alternative hypothesis, 3. The test statistic.
Lecture 16 Section 8.1 Objectives: Testing Statistical Hypotheses − Stating hypotheses statements − Type I and II errors − Conducting a hypothesis test.
Introduction to Inferece BPS chapter 14 © 2010 W.H. Freeman and Company.
10.1: Confidence Intervals Falls under the topic of “Inference.” Inference means we are attempting to answer the question, “How good is our answer?” Mathematically:
1 Chapter 9 Hypothesis Testing. 2 Chapter Outline  Developing Null and Alternative Hypothesis  Type I and Type II Errors  Population Mean: Known 
Week 8 October Three Mini-Lectures QMM 510 Fall 2014.
Ch 10 – Intro To Inference 10.1: Estimating with Confidence 10.2 Tests of Significance 10.3 Making Sense of Statistical Significance 10.4 Inference as.
Confidence intervals: The basics BPS chapter 14 © 2006 W.H. Freeman and Company.
Hypothesis Testing An understanding of the method of hypothesis testing is essential for understanding how both the natural and social sciences advance.
Chapter 6.1 Estimating with Confidence 1. Point estimation  Sample mean is the natural estimator of the unknown population mean.  Is the point estimation.
Slide Slide 1 Section 8-4 Testing a Claim About a Mean:  Known.
Fall 2002Biostat Statistical Inference - Confidence Intervals General (1 -  ) Confidence Intervals: a random interval that will include a fixed.
Tests of significance: The basics BPS chapter 14 © 2006 W.H. Freeman and Company.
AP Statistics Section 11.1 B More on Significance Tests.
IPS Chapter 6 © 2012 W.H. Freeman and Company  6.1: Estimating with Confidence  6.2: Tests of Significance  6.3: Use and Abuse of Tests  6.4: Power.
IPS Chapter 6 DAL-AC FALL 2015  6.1: Estimating with Confidence  6.2: Tests of Significance  6.3: Use and Abuse of Tests  6.4: Power and Inference.
Introduction to inference Tests of significance IPS chapter 6.2 © 2006 W.H. Freeman and Company.
© Copyright McGraw-Hill 2004
Introduction to Inference Estimating with Confidence PBS Chapter 6.1 © 2009 W.H. Freeman and Company.
Statistical Inference Statistical inference is concerned with the use of sample data to make inferences about unknown population parameters. For example,
Chapter 13 Sampling distributions
Tests of significance: The basics BPS chapter 14 © 2006 W.H. Freeman and Company.
Introduction to inference Estimating with confidence IPS chapter 6.1 © 2006 W.H. Freeman and Company.
The Practice of Statistics, 5th Edition Starnes, Tabor, Yates, Moore Bedford Freeman Worth Publishers CHAPTER 9 Testing a Claim 9.2 Tests About a Population.
+ Unit 6: Comparing Two Populations or Groups Section 10.2 Comparing Two Means.
Testing a Single Mean Module 16. Tests of Significance Confidence intervals are used to estimate a population parameter. Tests of Significance or Hypothesis.
Sampling and Inference September 23, Web Address
Chapter 12 Tests of Hypotheses Means 12.1 Tests of Hypotheses 12.2 Significance of Tests 12.3 Tests concerning Means 12.4 Tests concerning Means(unknown.
Uncertainty and confidence Although the sample mean,, is a unique number for any particular sample, if you pick a different sample you will probably get.
Statistics for Business and Economics Module 1:Probability Theory and Statistical Inference Spring 2010 Lecture 6: Tests of significance / tests of hypothesis.
Introduction to inference Estimating with confidence
14. Introduction to inference
Introduction to Inference
Tests of significance: The basics
Objectives 6.2 Tests of significance
Confidence intervals: The basics
The Practice of Statistics in the Life Sciences Fourth Edition
Lecture 10/24/ Tests of Significance
Confidence intervals: The basics
Objectives 6.1 Estimating with confidence Statistical confidence
Objectives 6.1 Estimating with confidence Statistical confidence
Presentation transcript:

Uncertainty and confidence If you picked different samples from a population, you would probably get different sample means ( x ̅ ) and virtually none of them would actually equal the true population mean, . Std of sample mean is 10/sqrt(217) =.679 ≈.7

n Sample means, n subjects  Population, x individual subjects If the population is N(μ,σ), σ, known, the sampling dist’n of sample mean is N(μ,σ/√n). If not, the sampling dist’n of sample mean is ~N(μ,σ/√n) if n is large enough. Use of sampling distributions  We take one random sample of size n, and rely on the known properties of the sampling distribution.

~95% of all intervals computed with this method capture the parameter μ. Based on the ~ % rule, we can expect that: When we take a random sample, we can compute the sample mean and an interval of size plus-or- minus 2σ/√n around the mean.

Confidence intervals A confidence interval is a range of values with an associated confidence level, C. This value quantifies the chance that the interval contains the unknown population parameter. We have confidence C that  falls within the interval computed.

The margin of error, m A confidence interval (“CI”) can be expressed as:  estimate ± a margin of error m:  within x ̅ ± m  an interval:  within ( x ̅ − m ) to ( x ̅ + m)

A 95% confidence interval for the mean body temperature (in °F) was computed as (98.1, 98.4), based on body temperatures from a sample of 130 healthy adults. The correct interpretation of this interval is ??: A) 95% of observations in the sample have body temperature between 98.1 and 98.4°F. B) 95% of the individuals in the population should have body temperature between 98.1 and 98.4°F. C) We are 95% confident that the population mean body temperature is between 98.1 and 98.4°F. The margin of error for this interval is ??: A) 95%B) 98.25°F. C) 0.3°F.D) 0.15°F.

When taking a random sample from a Normal population with known standard deviation σ, a level C confidence interval for µ is: CI for a Normal population mean (σ known) 80% confidence level C C z* -z*  σ/√n is the standard deviation of the sampling distribution  C is the area under the N(0,1) between −z* and z*

How do we find z * -values? For 95% confidence level, z* = 1.96 (almost 2). (…) We can use a table of z- and t- values (Table C). For a given confidence level C, the appropriate z*-value is listed in the same column.

Density of bacteria in solution  99% confidence interval for the true density, z* =2.576 = 28 ± 2.576*1/sqrt(3) = (26.5,29.5) million bacteria/ml  90% confidence interval for the true density, z* = = 28 ± 1.645*1/sqrt(3) =(27.1,28.9) million bacteria/ml Measurement equipment has Normal distribution with standard deviation σ = 1 million bacteria/ml of fluid. Three measurements made: 24, 29, and 31 million bacteria/ml. Mean: = 28 million bacteria/ml. Find the 99% and 90% CI.

Link between confidence level and margin of error The confidence level C determines the value of z* (in Table C). The margin of error also depends on z*. C Z*Z*−Z*−Z* m Higher confidence C implies a larger margin of error m (less precision more accuracy). A lower confidence level C produces a smaller margin of error m (more precision less accuracy).  win/lose situation

Someone makes a claim about the unknown value of a population parameter. We check whether or not this claim makes sense in light of the “evidence” gathered (sample data). A test of statistical significance tests a specific hypothesis using sample data to decide on the validity of the hypothesis. Significance tests Blood levels of inorganic phosphorus are known to vary Normally among adults, with mean 1.2 and standard deviation 0.1 mmol/l. But is the true mean inorganic phosphorus level lower among the elderly? The average inorganic phosphorus level of a random sample of 12 healthy elderly subjects is mmol/l.  Is this smaller mean phosphorus level simply due to chance variation?  Is it evidence that the true mean phosphorus level in elderly individuals is lower than 1.2 mmol/l?

The null hypothesis, H 0, is a very specific statement about a parameter of the population(s). The alternative hypothesis, H a, is a more general statement that complements yet is mutually exclusive with the null hypothesis. Phosphorus levels in the elderly: H 0 : µ = 1.2 mmol/l H a : µ < 1.2 mmol/l (µ is smaller due to changing physiology) Null and alternative hypotheses

One-sided versus two-sided alternatives  A two-tail or two-sided alternative is symmetric: H a : µ  [a specific value or another parameter]  A one-tail or one-sided alternative is asymmetric and specific: H a : µ < [a specific value or another parameter] OR H a : µ > [a specific value or another parameter] What determines the choice of a one-sided versus two-sided test is the question we are asking and what we know about the problem before performing the test. If the question or problem is asymmetric, then H a should be one-sided. If not, H a should be two-sided.

Is the active ingredient concentration as stated on the label (325 mg/ tablet)? H 0 : µ = 325 H a : µ ≠ 325 Is nicotine content greater than the written 1 mg/cigarette, on average? H 0 : µ = 1 H a : µ > 1 Does a drug create a change in blood pressure, on average? H 0 : µ = 0 H a : µ ≠ 0 Does a particular stream have an unhealthy mean oxygen content (a level below 5 mg per liter)? Ecologists collect a liter of water from each of 45 random locations along a stream and measure the amount of dissolved oxygen in each. They find a mean of 4.62 mg per liter. H 0 :  = 5 H a :  < 5

The P-value P-value: The probability, if H 0 was true, of obtaining a sample statistic at least as extreme (in the direction of H a ) as the one obtained. Phosphorus levels vary Normally with standard deviation  = 0.1 mmol/l. H 0 : µ = 1.2 mmol/l versus H a : µ < 1.2 mmol/l The mean phosphorus level from the 12 elderly subjects is mmol/l. What is the probability of drawing a random sample with a mean as small as this one or even smaller, if H 0 is true?

Phosphorus levels vary Normally with standard deviation  = 0.1 mmol/l. H 0 : µ = 1.2 mmol/l versus H a : µ < 1.2 mmol/l The mean phosphorus level from the 12 elderly subjects is mmol/l.

Interpreting a P-value Could random variation alone account for the difference between H 0 and observations from a random sample?  Small P-values are strong evidence AGAINST H 0 and we reject H 0. The findings are “statistically significant.”  P-values that are not small don’t give enough evidence against H 0 and we fail to reject H 0. Beware: We can never “prove H 0.”

01 Highly significant Very significant Significant Somewhat significant  Not significant  Possible P-values Range of P-values P-values are probabilities, so they are always a number between 0 and 1. The order of magnitude of the P-value matters more than its exact numerical value.

The significance level α The significance level, α, is the largest P-value tolerated for rejecting H 0 (how much evidence against H 0 we require). This value is decided arbitrarily before conducting the test.  When P-value ≤ α, we reject H 0.  When P-value > α, we fail to reject H 0. Industry standards require a significance level α of 5%. Does the packaging machine need revision? Two-sided test. The P-value is 4.56%. P-value < 5%, the results are statistically significant at significance level 0.05.

Test for a population mean (σ known) To test H 0 : µ = µ 0 using a random sample of size n from a Normal population with known standard deviation σ, we use the null sampling distribution N(µ 0, σ/√n). The P-value is the area under N(µ 0, σ/√n) for values of x ̅ at least as extreme in the direction of H a as that of our random sample. Calculate the z-value then use Table B or C. Or use technology.

P-value in one-sided and two-sided tests To calculate the P-value for a two-sided test, use the symmetry of the normal curve. Find the P-value for a one-sided test and double it. One-sided (one-tailed) test Two-sided (two-tailed) test

Do the elderly have a mean phosphorus level below 1.2 mmol/l?  H 0 : µ = 1.2 versus H a : µ < 1.2  What would be the probability of drawing a random sample such as this or worse if H 0 was true? Table B: P-value = P(Z ≤ ) =.0064 The probability of getting a sample average so different from µ 0 = 1.2 is so low that we reject H 0.  The mean phosphorus level among the elderly is significantly less than 1.2 mmol/l. P-value

Confidence intervals to test hypotheses Because a two-sided test is symmetric, you can easily use a confidence interval to test a two-sided hypothesis. α / 2 In a two-sided test, C = 1 – α. C confidence level α significance level

Logic of confidence interval test We found a 99% confidence interval for the true bacterial density of, or 26.5 to 29.5 million bacteria/ per ml. With 99% confidence, could the population mean be µ =25 million/ml? µ = 29? Cannot reject H 0 :  = 29 Reject H 0 :  = 25 A confidence interval gives a black and white answer: Reject or don’t reject H 0. But it also estimates a range of likely values for the true population mean µ. A P-value quantifies how strong the evidence is against the H 0. But if you reject H 0, it doesn’t provide any information about the true population mean µ % CI 29.5