Chapter 14: Introduction to Inference
Statistical Inference
What is statistical inference?
“For everyone who does habitually attempt the difficult task of making sense of figures is, in fact, assaying a logical process of the kind we call induction, in that he is attempting to draw inferences from the particular to the general; or, as we more usually say in statistics, from the sample to the population.”
R.A. Fisher (1890 – 1962), father of modern statistics
Statistical Inference
Two forms of statistical inference:
– Estimation (Confidence Intervals)
– Hypothesis Tests (Significance)
Statistical Inference
Objective: to infer parameters
Parameter ≡ a numerical characteristic of a population or probability function
Examples of parameters:
– μ (population mean; expected value)
– σ (population standard deviation; standard deviation parameter)
– p (probability of “success”; population proportion)
Chs 14 & 15 introduce concepts about inference
Chs 15–20 introduce inferential techniques
“Simple Conditions” for Chapter 14
– Data acquired by simple random sample (SRS), i.e., all potential observations have the same probability of entering the sample
– No major departures from Normality in the population
– Value of σ is known or assumed before collecting data
Objective: to infer μ
Example “Female BMI”
Statement: What is the mean BMI µ in females between ages 20 and 29?
Body Mass Index ≡ BMI = weight / height²
Assume “simple conditions”:
1. SRS
2. Population approx. Normal
3. σ = 7.5 (assumed before data collected)
Plan: Estimate µ with 95% confidence
Reasoning behind estimation
If I took multiple SRSs, the sample means (x-bars) would be different in each one.
We do not expect x-bar to be exactly equal to µ; any given x-bar is just an estimate of µ.
The variability of the x-bars is predictable in the form of a sampling distribution of means.
Fact: Under the “simple conditions” in this chapter, the sampling distribution of means will be Normal with mean µ and standard deviation $\sigma_{\bar{x}} = \sigma / \sqrt{n}$ ← standard deviation of the mean (also referred to as the standard error of the mean)
Example (Female BMI)
In our example, n = 654 and σ = 7.5. Therefore: $\sigma_{\bar{x}} = \sigma / \sqrt{n} = 7.5 / \sqrt{654} \approx 0.3$
$\sigma_{\bar{x}}$ tells us how close x-bar is likely to be to µ
The 68-95-99.7 rule tells us that x-bar will be within two $\sigma_{\bar{x}}$ units (that’s 0.6) of µ in 95% of samples
If we say that µ lies in the interval (x-bar − 0.6) to (x-bar + 0.6), we’ll be right 95% of the time
Therefore, we can be 95% confident that an interval “x-bar ± 0.6” will capture µ
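The arithmetic on this slide can be checked with a few lines of Python. This is only a sketch: the function name `standard_error` is made up for illustration, and the factor of 2 is the rough "two standard deviations" rule used on the slide, not an exact critical value.

```python
import math

def standard_error(sigma: float, n: int) -> float:
    """Standard deviation of the sample mean: sigma / sqrt(n)."""
    return sigma / math.sqrt(n)

# Values from the Female BMI example: sigma = 7.5, n = 654
se = standard_error(sigma=7.5, n=654)

# Rough 95% margin of error: two standard errors (rule-of-thumb multiplier)
margin_95 = 2 * se

print(round(se, 2), round(margin_95, 1))  # prints: 0.29 0.6
```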
Basis of Confidence Intervals (CIs)
Confidence Interval (CI)
The CI has two parts: point estimate ± margin of error
Suppose in our particular sample, the mean is x-bar = 26.8. This is the point estimate for µ.
Recall from the Female BMI example that the margin of error for our data is 0.6 (with 95% confidence)
Therefore, the 95% confidence interval (for this particular sample) = 26.8 ± 0.6 = (26.2, 27.4).
Confidence Level C
CIs can be calculated at different levels of confidence.
Let C represent the probability the interval will capture the parameter.
In our example, C = 95%.
Other common levels of confidence are 90% and 99%.
In this chapter we adjust the C level by changing the z* critical value.
Confidence Levels & z Critical Values
In this chapter we adjust the confidence level by altering critical value z*
Common levels of confidence & z critical values:
Confidence level C             90%      95%      99%
Critical value z* (Table C)    1.645    1.960    2.576
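If Table C is not at hand, the same critical values can be recovered from the standard Normal distribution. The sketch below uses `statistics.NormalDist` from the Python standard library; the helper name `z_critical` is an assumption made for this example.

```python
from statistics import NormalDist

def z_critical(conf_level: float) -> float:
    """Critical value z* such that the central area under the standard
    Normal curve between -z* and +z* equals conf_level."""
    tail = (1 - conf_level) / 2          # area left over in each tail
    return NormalDist().inv_cdf(1 - tail)

for c in (0.90, 0.95, 0.99):
    print(f"C = {c:.0%}  z* = {z_critical(c):.3f}")
# prints:
# C = 90%  z* = 1.645
# C = 95%  z* = 1.960
# C = 99%  z* = 2.576
```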
C level CI for μ, σ known (“z procedure”)
To estimate µ with confidence level C, use $\bar{x} \pm z^{*} \dfrac{\sigma}{\sqrt{n}}$
Use Table C to determine the value of z*
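As a rough sketch, the z procedure can be written as one small function. The function name and its parameter names are illustrative only, and z* is computed from the Normal distribution instead of being looked up in Table C. The call uses the Female BMI values (x-bar = 26.8, σ = 7.5, n = 654).

```python
import math
from statistics import NormalDist

def z_confidence_interval(xbar, sigma, n, conf_level=0.95):
    """Confidence interval for mu when sigma is known: xbar +/- z* * sigma/sqrt(n)."""
    z_star = NormalDist().inv_cdf(1 - (1 - conf_level) / 2)
    margin = z_star * sigma / math.sqrt(n)
    return xbar - margin, xbar + margin

# Female BMI example at 95% confidence
lcl, ucl = z_confidence_interval(xbar=26.8, sigma=7.5, n=654)
print(f"95% CI: ({lcl:.1f}, {ucl:.1f})")  # prints: 95% CI: (26.2, 27.4)
```

Passing a different `conf_level` changes only z*, which is the one thing that differs between confidence levels in this chapter.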
Example (95% CI): Solve & Conclude
Solve: $\bar{x} \pm z^{*} \dfrac{\sigma}{\sqrt{n}} = 26.8 \pm 1.960 \cdot \dfrac{7.5}{\sqrt{654}} = 26.8 \pm 0.6 = (26.2, 27.4)$
Conclude: We are 95% confident population mean BMI µ is between 26.2 and 27.4
Exercise: calculate a 99% CI with the same data
Conclude: We are 99% confident population mean BMI µ is between “lower confidence limit (LCL) here” and “upper confidence limit (UCL) here.”
Hint: The only thing that changes is the z* critical value.
Interpreting a CI
Confidence level C is the success rate of the method that produced the interval.
We know with C level of confidence that the CI will capture µ.
We don’t know with certainty whether any given CI captured µ or missed it.
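The "success rate of the method" idea can be illustrated with a small simulation. This is a sketch under stated assumptions: the true mean `mu=27.0`, the sample size `n=50`, the number of trials, and the seed are all hypothetical values chosen only for the demonstration, and `coverage_rate` is a made-up helper name.

```python
import math
import random
from statistics import NormalDist, fmean

def coverage_rate(mu, sigma, n, conf_level=0.95, trials=2000, seed=1):
    """Draw many SRSs from a Normal population with KNOWN mu, build a z
    interval from each, and return the fraction of intervals that capture mu."""
    rng = random.Random(seed)
    z_star = NormalDist().inv_cdf(1 - (1 - conf_level) / 2)
    margin = z_star * sigma / math.sqrt(n)
    hits = 0
    for _ in range(trials):
        xbar = fmean(rng.gauss(mu, sigma) for _ in range(n))
        if xbar - margin <= mu <= xbar + margin:
            hits += 1
    return hits / trials

# Hypothetical population: mu = 27.0 (assumed for the simulation), sigma = 7.5
rate = coverage_rate(mu=27.0, sigma=7.5, n=50)
print(rate)  # should land close to 0.95
```

In practice µ is unknown, so any single interval either captured it or did not. The 95% describes how often the procedure succeeds over many samples.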
Four-Step Procedure for CIs
Stopping Point for Exam 2
Slides after this point could be edited after Exam 2
Hypothesis (“Significance”) Tests
Objective: test a claim about a parameter
Uses an elaborate vocabulary
Hypothesis (Significance) Testing: 4-step Process
State and Plan
Example “Population Weight Gain?”
State: Is there good evidence that the population is gaining weight?
Plan:
– Parameter is population mean weight gain µ
– Null hypothesis H0: statement of “no difference”; population not gaining weight; H0: μ = 0
– Alternative hypothesis Ha: population gaining weight; Ha: μ > 0
– Type of test: z test if the “simple conditions” for Chapter 14 are met
Notes on Statistical Hypotheses
H0 is key to understanding
Ha contradicts H0
Ha can be stated in one-sided or two-sided ways
– One-sided Ha specifies the direction of the difference: weight GAIN in population, Ha: μ > 0
– Two-sided Ha does not specify the direction of the difference: weight CHANGE in the population, Ha: μ ≠ 0
Example “Weight Gain”: “Solve” Sub-steps
(a) Check conditions
– SRS
– No major departures from Normality
– σ known before collecting data
(b) Calculate statistics (see “z Statistic” slide)
(c) Find P-value
Reasoning of Significance Testing
If H0 and the conditions are true, then the sampling distribution of x-bar would be Normal with µ = 0 and $\sigma_{\bar{x}} = \sigma / \sqrt{n} = 1 / \sqrt{10} \approx 0.316$ (using σ = 1 and n = 10 from the weight gain example)
If a study produced an x-bar of 0.3, this would be poor evidence against H0
If a different study produced an x-bar of 1.02, this would be good evidence against H0
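Test Statistic
Standardize the sample mean: $z_{\text{stat}} = \dfrac{\bar{x} - \mu_0}{\sigma / \sqrt{n}}$
Suppose: x-bar = 1.02, n = 10, and σ = 1. Then $z_{\text{stat}} = \dfrac{1.02 - 0}{1 / \sqrt{10}} \approx 3.23$
x-bar is more than 3 standard deviations (of the sampling distribution) greater than expected if H0 were true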
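The standardization step is a one-line calculation. A minimal sketch follows, in which the helper name `z_statistic` is chosen for illustration and the inputs are the slide's weight gain values.

```python
import math

def z_statistic(xbar, mu0, sigma, n):
    """Standardize the sample mean under H0: (xbar - mu0) / (sigma / sqrt(n))."""
    return (xbar - mu0) / (sigma / math.sqrt(n))

# Weight gain example: x-bar = 1.02, H0: mu = 0, sigma = 1, n = 10
z_stat = z_statistic(xbar=1.02, mu0=0, sigma=1, n=10)
print(round(z_stat, 2))  # prints: 3.23
```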
P-Value from Z Table
For Ha: μ > μ0, P-value = Pr(Z > z stat) = right tail beyond z stat
For Ha: μ < μ0, P-value = Pr(Z < z stat) = left tail beyond z stat
For Ha: μ ≠ μ0, P-value = 2 × one-tailed P-value
P-value from Z Table
Draw the curve [figure]
One-sided P-value = Pr(Z > 3.23) = 1 − .9994 = .0006
Two-sided P-value = 2 × one-sided P = 2 × .0006 = .0012
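The three tail rules can be sketched in Python with the standard Normal CDF standing in for the Z table. The function name `z_p_value` and the `alternative` labels are assumptions made for this example, not notation from the slides.

```python
from statistics import NormalDist

def z_p_value(z_stat, alternative="greater"):
    """P-value for a z test.
    alternative: "greater" (Ha: mu > mu0), "less" (Ha: mu < mu0),
    or "two-sided" (Ha: mu != mu0)."""
    std_normal = NormalDist()  # mean 0, standard deviation 1
    if alternative == "greater":
        return 1 - std_normal.cdf(z_stat)             # right tail
    if alternative == "less":
        return std_normal.cdf(z_stat)                 # left tail
    if alternative == "two-sided":
        return 2 * (1 - std_normal.cdf(abs(z_stat)))  # both tails
    raise ValueError(f"unknown alternative: {alternative!r}")

one_sided = z_p_value(3.23, "greater")
two_sided = z_p_value(3.23, "two-sided")
print(round(one_sided, 4), round(two_sided, 4))  # prints: 0.0006 0.0012
```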
P-Value from Table C
Wedge z stat between z* critical values (last rows of Table C)
Does not give exact P-value
For example, z stat = 3.23
– One-sided P-value between .001 and .0005
– Two-sided P-value between .002 and .001
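The "wedge" idea can be mimicked in code. The sketch below assumes a list of one-sided tail probabilities of the kind usually printed along the bottom of such a table; that list is an assumption here, since the slides do not reproduce Table C. The matching z* values are computed from the Normal distribution instead of being copied from the table, and the helper handles the right-tail case only.

```python
from bisect import bisect_right
from statistics import NormalDist

# ASSUMED one-sided tail probabilities (table columns are not shown in the slides)
ONE_SIDED_P = [0.25, 0.20, 0.15, 0.10, 0.05, 0.025,
               0.02, 0.01, 0.005, 0.0025, 0.001, 0.0005]
# Matching critical values z*, in increasing order
Z_STARS = [NormalDist().inv_cdf(1 - p) for p in ONE_SIDED_P]

def bracket_one_sided_p(z_stat):
    """Return (lower, upper) bounds on the right-tail P-value by wedging
    z_stat between neighbouring critical values."""
    i = bisect_right(Z_STARS, z_stat)
    upper = ONE_SIDED_P[i - 1] if i > 0 else 1.0
    lower = ONE_SIDED_P[i] if i < len(ONE_SIDED_P) else 0.0
    return lower, upper

low, high = bracket_one_sided_p(3.23)
print(low, high)          # prints: 0.0005 0.001
print(2 * low, 2 * high)  # two-sided bounds: 0.001 0.002
```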
P-value: Interpretation
P-value ≡ the probability the data would take a value as extreme or more extreme than observed if H0 were true
Smaller-and-smaller P-values → stronger-and-stronger evidence against H0
Conventions:
– .10 < P < 1.0: insignificant evidence against H0
– .05 < P ≤ .10: marginally significant evidence against H0
– .01 < P ≤ .05: significant evidence against H0
– 0 < P ≤ .01: highly significant evidence against H0
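These conventions amount to a lookup on the P-value. A small sketch follows, with a made-up function name and labels copied from the slide; the cutoffs are conventions, not hard rules.

```python
def evidence_label(p_value: float) -> str:
    """Map a P-value to the conventional description of evidence against H0."""
    if not 0 <= p_value <= 1:
        raise ValueError("a P-value must lie between 0 and 1")
    if p_value <= 0.01:
        return "highly significant"
    if p_value <= 0.05:
        return "significant"
    if p_value <= 0.10:
        return "marginally significant"
    return "insignificant"

for p in (0.0012, 0.03, 0.07, 0.40):
    print(p, evidence_label(p))
# prints:
# 0.0012 highly significant
# 0.03 significant
# 0.07 marginally significant
# 0.4 insignificant
```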
“Significance Level”
α (alpha) ≡ threshold for “significance”
If we choose α = 0.05, we require evidence so strong that it would occur no more than 5% of the time when H0 is true
Decision rule:
– P-value ≤ α: evidence is significant
– P-value > α: evidence is not significant
For example, let α = .01. The two-sided P-value = .0012 is less than .01, so the data are significant at the α = .01 level.
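The decision rule is a single comparison. The sketch below applies it to the two-sided P-value of .0012 from the weight gain example at a few illustrative α levels; the function name `is_significant` is hypothetical.

```python
def is_significant(p_value: float, alpha: float = 0.05) -> bool:
    """Decision rule: the evidence is significant when P-value <= alpha."""
    return p_value <= alpha

p = 0.0012  # two-sided P-value from the weight gain example
for alpha in (0.05, 0.01, 0.001):
    print(f"alpha = {alpha}: significant = {is_significant(p, alpha)}")
# prints:
# alpha = 0.05: significant = True
# alpha = 0.01: significant = True
# alpha = 0.001: significant = False
```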
Example “Weight Gain” Conclusion
The P-value of .0012 provides highly significant evidence against H0: µ = 0
We rule in favor of Ha: µ ≠ 0
Conclude: the population’s mean weight is changing
Our sample mean weight gain of 1.02 pounds per person is statistically significant at the α = .002 level but not at the α = .001 level