Inferences Regarding Population Central Values

Chapter 5 Inferences Regarding Population Central Values

Inferential Methods for Parameters
Parameter: Numeric description of a population
Statistic: Numeric description of a sample
Statistical Inference: Use of observed statistics to make statements regarding parameters
Estimation: Predicting the unknown parameter based on sample data. Can be either a single number (point estimate) or a range (interval estimate)
Testing: Using sample data to see whether we can rule out specific values of an unknown parameter with a certain level of confidence

Estimating with Confidence
Goal: Estimate a population mean based on the sample mean
Unknown: Parameter (μ)
Known: Approximate sampling distribution of the statistic
Recall: For a normally distributed random variable, the probability that it falls within 2 standard deviations of its mean is approximately 0.95

Estimating with Confidence
Although the parameter is unknown, it’s highly likely that our sample mean (estimate) will lie within 2 standard deviations (aka standard errors) of the population mean (parameter)
Margin of Error: Upper bound on the sampling error at a fixed level of confidence (we will typically use 95%). That corresponds to 2 standard errors: $E = 2\sigma_{\bar{y}} = 2\sigma/\sqrt{n}$

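Confidence Interval for a Mean μ
Confidence Coefficient (1-α): Probability (based on repeated samples and construction of intervals) that a confidence interval will contain the true mean μ
General form: $\bar{y} \pm z_{\alpha/2}\,\sigma/\sqrt{n}$
Common choices of 1-α and resulting intervals:
90%: $\bar{y} \pm 1.645\,\sigma/\sqrt{n}$
95%: $\bar{y} \pm 1.96\,\sigma/\sqrt{n}$
99%: $\bar{y} \pm 2.576\,\sigma/\sqrt{n}$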
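
A minimal Python sketch of the confidence interval for a mean, assuming the population standard deviation is treated as known and the normal approximation holds. The function name `z_interval` and the numbers in the usage line are illustrative assumptions, not values from the slides.

```python
from math import sqrt
from statistics import NormalDist


def z_interval(ybar, sigma, n, conf=0.95):
    """Confidence interval for a mean, with sigma treated as known."""
    alpha = 1.0 - conf
    # z_{alpha/2}: upper alpha/2 critical value of the standard normal
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    margin = z * sigma / sqrt(n)  # margin of error
    return ybar - margin, ybar + margin


# Hypothetical numbers, chosen only for illustration
low, high = z_interval(ybar=10.0, sigma=2.0, n=100, conf=0.95)
print(round(low, 3), round(high, 3))
```

Passing `conf=0.90` or `conf=0.99` changes only the critical value, which is why higher confidence always produces a wider interval for the same data.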

Philadelphia Monthly Rainfall (1825-1869)

4 Random Samples of Size n=20, 95% CI’s

Factors Affecting Confidence Interval Width
Goal: Have precise (narrow) confidence intervals
Confidence Level (1-α): Increasing 1-α raises the probability that an interval contains the parameter, which requires a wider confidence interval. Reducing 1-α will shorten the interval (at a cost in confidence)
Sample size (n): Increasing n decreases the standard error of the estimate, the margin of error, and the width of the interval (quadrupling n cuts the width in half)
Standard Deviation (σ): The more variable the individual measurements, the wider the interval. Potential ways to reduce σ are to focus on a more precisely defined target population or to use a more precise measuring instrument. Often nothing can be done, as nature determines σ
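
The effect of sample size on interval width can be checked numerically. This is a small sketch that assumes the z-based interval with a known standard deviation; `ci_width` is a hypothetical helper and the value 8.0 is an arbitrary illustration.

```python
from math import sqrt
from statistics import NormalDist


def ci_width(sigma, n, conf=0.95):
    """Full width of the z-based interval: 2 * z_{alpha/2} * sigma / sqrt(n)."""
    z = NormalDist().inv_cdf(1.0 - (1.0 - conf) / 2.0)
    return 2.0 * z * sigma / sqrt(n)


sigma = 8.0  # hypothetical population standard deviation
for n in (25, 100, 400):
    # Each step quadruples n, so each printed width is half the previous one
    print(n, round(ci_width(sigma, n), 3))
```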

Precautions
Data should be a simple random sample from the population (or at least be treatable as independent observations)
More complex sampling designs require adjustments to the formulas (see texts such as Elementary Survey Sampling by Scheaffer, Mendenhall, Ott)
Biased sampling designs give meaningless results
Small sample sizes from nonnormal distributions will have coverage probabilities typically below the nominal level (1-α)
Typically σ is unknown. Replacing it with the sample standard deviation s works as a good approximation in large samples

Selecting the Sample Size
Before collecting sample data, we usually have a goal for how large the margin of error should be for the estimate of the unknown parameter to be useful (particularly when comparing two populations)
Let E be the desired margin of error and σ be the standard deviation of the population of measurements (typically unknown, so it must be estimated from previous research or a pilot study)
The sample size giving this margin of error is: $n = (z_{\alpha/2})^2 \sigma^2 / E^2$
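
A short sketch of the sample size calculation for a target margin of error, assuming the usual z-based interval and rounding up to the next whole subject. The function name and the inputs (a planning value of 8 for the standard deviation and a target margin of 2) are illustrative assumptions.

```python
from math import ceil
from statistics import NormalDist


def sample_size_for_margin(sigma, margin, conf=0.95):
    """Smallest n with z_{alpha/2} * sigma / sqrt(n) <= margin."""
    z = NormalDist().inv_cdf(1.0 - (1.0 - conf) / 2.0)
    return ceil((z * sigma / margin) ** 2)


# Hypothetical planning values: sigma guessed as 8, target margin of error 2
n_needed = sample_size_for_margin(sigma=8.0, margin=2.0)
print(n_needed)
```

Because the required n grows with the square of sigma/E, halving the target margin of error roughly quadruples the sample size.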

Hypothesis Tests
Method of using sample (observed) data to challenge a hypothesis regarding a state of nature (represented as particular parameter value(s))
Begin by stating a research hypothesis that challenges a statement of “status quo” (or equality of 2 populations)
State the current state or “status quo” as a statement regarding population parameter(s)
Obtain sample data and see to what extent it agrees/disagrees with the “status quo”
Conclude that the “status quo” is not true if the observed data would be highly unlikely (low probability) were it true

Elements of a Hypothesis Test (I)
Null hypothesis (H0): Statement or theory being tested. Stated in terms of parameter(s) and contains an equality. Test is set up under the assumption of its truth.
Alternative Hypothesis (Ha): Statement contradicting H0. Stated in terms of parameter(s) and contains an inequality. Will only be accepted if strong evidence refutes H0 based on sample data. May be 1-sided or 2-sided, depending on theory being tested.
Test Statistic (T.S.): Quantity measuring discrepancy between sample statistic (estimate) and parameter value under H0
Rejection Region (R.R.): Values of test statistic for which we reject H0 in favor of Ha
P-value: Probability (assuming H0 true) that we would observe sample data (test statistic) this extreme or more extreme in favor of the alternative hypothesis (Ha)

Example: Interference Effect
Does the way items are presented affect task time?
Subjects shown lists of color names in 2 colors: different/black
yi is the difference in times to read the lists for subject i: diff-blk
H0: No interference effect: mean difference is 0 (μ = 0)
Ha: Interference effect exists: mean difference > 0 (μ > 0)
Assume standard deviation in differences is σ = 8 (unrealistic*)
Experiment to be based on n=70 subjects
How likely is it to observe a sample mean difference ≥ 2.39 if μ = 0?

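P-value
[Figure: sampling distribution of the sample mean difference under H0, with the observed value 2.39 marked]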
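
The question posed in the interference example can be answered numerically. The sketch below uses the values stated in the example (a sample mean difference of 2.39, a standard deviation of 8, n = 70, null mean 0) and assumes the normal approximation for the sample mean; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

ybar, mu0, sigma, n = 2.39, 0.0, 8.0, 70

std_error = sigma / sqrt(n)           # standard error of the sample mean
z_obs = (ybar - mu0) / std_error      # z-score of the observed mean under H0
p_value = 1.0 - NormalDist().cdf(z_obs)  # upper-tail area, since Ha is mean > 0

print(round(z_obs, 2), round(p_value, 4))
```

The observed mean sits about 2.5 standard errors above 0, and the upper-tail probability is roughly 0.006, so a mean difference this large would be unusual if there were no interference effect.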

Elements of a Hypothesis Test (II)
Type I Error: Test resulting in rejection of H0 in favor of Ha when H0 is in fact true
P(Type I error) = α (typically .10, .05, or .01)
Type II Error: Test resulting in failure to reject H0 in favor of Ha when in fact Ha is true (H0 is false)
P(Type II error) = β (depends on true parameter value)
1-Tailed Test: Test where the alternative hypothesis states specifically that the parameter is strictly above (below) the null value
2-Tailed Test: Test where the alternative hypothesis is that the parameter is not equal to the null value (simultaneously tests “greater than” and “less than”)

Test Statistic
Parameter: Population mean (μ) under H0 is μ0
Statistic (Estimator): Sample mean obtained from sample measurements is $\bar{y}$
Standard Error of Estimator: $\sigma_{\bar{y}} = \sigma/\sqrt{n}$
Sampling Distribution of Estimator:
  Normal if shape of distribution of individual measurements is normal
  Approximately normal regardless of shape for large samples
Test Statistic: $z_{obs} = (\bar{y} - \mu_0)/(\sigma/\sqrt{n})$ (labeled simply as z in text)
Note: Typically σ is unknown and is replaced by s in large samples

Decision Rules and Rejection Regions
Once a significance level (α) has been chosen, a decision rule can be stated, based on a critical value:
2-sided tests: H0: μ = μ0 Ha: μ ≠ μ0
  If test statistic (zobs) > zα/2: Reject H0 and conclude μ > μ0
  If test statistic (zobs) < -zα/2: Reject H0 and conclude μ < μ0
  If -zα/2 < zobs < zα/2: Do not reject H0: μ = μ0
1-sided tests (Upper Tail): H0: μ ≤ μ0 Ha: μ > μ0
  If test statistic (zobs) > zα: Reject H0 and conclude μ > μ0
  If zobs < zα: Do not reject H0: μ ≤ μ0
1-sided tests (Lower Tail): H0: μ ≥ μ0 Ha: μ < μ0
  If test statistic (zobs) < -zα: Reject H0 and conclude μ < μ0
  If zobs > -zα: Do not reject H0: μ ≥ μ0
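
The decision rules for the three kinds of alternative can be sketched as one small function. The name `z_test_decision` and the `alternative` labels are assumptions made for this illustration; the critical values come from the standard normal distribution.

```python
from statistics import NormalDist


def z_test_decision(z_obs, alpha=0.05, alternative="two-sided"):
    """Return True when H0 is rejected at significance level alpha."""
    std_normal = NormalDist()
    if alternative == "two-sided":
        # Reject when z_obs falls beyond +/- z_{alpha/2}
        return abs(z_obs) > std_normal.inv_cdf(1.0 - alpha / 2.0)
    z_alpha = std_normal.inv_cdf(1.0 - alpha)
    if alternative == "greater":   # upper-tail test
        return z_obs > z_alpha
    if alternative == "less":      # lower-tail test
        return z_obs < -z_alpha
    raise ValueError("alternative must be 'two-sided', 'greater', or 'less'")


# A z-score of 1.8 is significant for the upper-tail test but not the 2-sided test
print(z_test_decision(1.8), z_test_decision(1.8, alternative="greater"))
```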

Computing the P-Value
2-sided Tests: How likely is it to observe a sample mean as far or farther from the value of the parameter under the null hypothesis? (H0: μ = μ0 Ha: μ ≠ μ0) After obtaining the sample data, compute the mean, convert it to a z-score (zobs), and find the combined area above |zobs| and below -|zobs| from the standard normal (z) table
1-sided Tests: Obtain the area above zobs for upper tail tests (Ha: μ > μ0) or below zobs for lower tail tests (Ha: μ < μ0)
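
In place of a table lookup, the tail areas can be computed directly. This is a sketch using the standard normal CDF from the Python standard library; `z_p_value` and its `alternative` labels are hypothetical names for this example.

```python
from statistics import NormalDist


def z_p_value(z_obs, alternative="two-sided"):
    """P-value of a z-test for the given observed z-score."""
    cdf = NormalDist().cdf
    if alternative == "two-sided":
        # Area above |z_obs| plus the equal area below -|z_obs|
        return 2.0 * (1.0 - cdf(abs(z_obs)))
    if alternative == "greater":
        return 1.0 - cdf(z_obs)   # area above z_obs
    if alternative == "less":
        return cdf(z_obs)         # area below z_obs
    raise ValueError("alternative must be 'two-sided', 'greater', or 'less'")


print(round(z_p_value(1.96), 3), round(z_p_value(1.645, "greater"), 3))
```

For a z-score in the direction of the alternative, the 2-sided P-value is twice the 1-sided one, which is why the same data can be significant in a 1-sided test and not in a 2-sided test.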

Interference Effect (1-sided Test)
Testing whether population mean time to read list of colors is higher when color is written in different color
Data: yi: difference score for subject i (Different-Black)
Null hypothesis (H0): No interference effect (H0: μ ≤ 0)
Alternative hypothesis (Ha): Interference effect (Ha: μ > 0)
n = 70 subjects in experiment, reasonably large sample
Conclude there is evidence of an interference effect (μ > 0)

Interference Effect (2-sided Test)
Testing whether population mean time to read list of colors is affected (higher or lower) when color is written in different color
Data: yi: difference score for subject i (Different-Black)
Null hypothesis (H0): No interference effect (H0: μ = 0)
Alternative hypothesis (Ha): Interference effect (+ or -) (Ha: μ ≠ 0)
Again, evidence of an interference effect (μ > 0)

Equivalence of 2-sided Tests and CIs
For a given α, a 2-sided test conducted at the α significance level will give results equivalent to a (1-α) level confidence interval:
If entire interval > μ0: P-value < α, zobs > zα/2 (conclude μ > μ0)
If entire interval < μ0: P-value < α, zobs < -zα/2 (conclude μ < μ0)
If interval contains μ0: P-value > α, -zα/2 < zobs < zα/2 (don’t conclude μ ≠ μ0)
The confidence interval is the set of parameter values for which we would fail to reject the null hypothesis (based on a 2-sided test)
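
The equivalence can be illustrated by computing the 2-sided test and the matching confidence interval side by side. This sketch assumes a known standard deviation; the function name is hypothetical, and the first call reuses the interference example's stated values (2.39, 8, 70) while the second uses a made-up sample mean.

```python
from math import sqrt
from statistics import NormalDist


def two_sided_test_and_ci(ybar, mu0, sigma, n, alpha=0.05):
    """Return (reject H0?, confidence interval, is mu0 outside the interval?)."""
    z_crit = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    se = sigma / sqrt(n)
    z_obs = (ybar - mu0) / se
    reject = abs(z_obs) > z_crit
    ci = (ybar - z_crit * se, ybar + z_crit * se)
    mu0_outside = not (ci[0] <= mu0 <= ci[1])
    return reject, ci, mu0_outside


# Interference example values: the test rejects and the interval excludes 0
print(two_sided_test_and_ci(ybar=2.39, mu0=0.0, sigma=8.0, n=70))
# Hypothetical smaller mean: the test does not reject and the interval contains 0
print(two_sided_test_and_ci(ybar=1.0, mu0=0.0, sigma=8.0, n=70))
```

In both calls the first and third entries agree, which is the equivalence in action: rejecting H0 at level α and finding μ0 outside the (1-α) interval are the same event.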

Power of a Test
Power: Probability a test rejects H0 (depends on μ)
H0 True: Power = P(Type I error) = α
H0 False: Power = 1-P(Type II error) = 1-β
Example (using context of interference data): H0: μ = 0 HA: μ > 0 σ² = 64 n = 16
Decision Rule: Reject H0 (at α = 0.05 significance level) if: $z_{obs} = \bar{y}/(8/\sqrt{16}) = \bar{y}/2 \ge 1.645$, that is, if $\bar{y} \ge 3.29$

Power of a Test
Now suppose in reality that μ = 3.0 (HA is true)
Power now refers to the probability we (correctly) reject the null hypothesis. Note that the sampling distribution of the sample mean is approximately normal, with mean 3.0 and standard deviation (standard error) 2.0.
Decision Rule (from last slide): Conclude population mean interference effect is positive (greater than 0) if the sample mean difference score is above 3.29
Power for this case can be computed as: $P(\bar{y} \ge 3.29 \mid \mu = 3.0) = P(Z \ge (3.29 - 3.0)/2.0) = P(Z \ge 0.145) = 1 - 0.5576 = 0.4424$
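
The power calculation for this case can be reproduced in a few lines. The sketch assumes an upper-tail z-test with known standard deviation, using the values from the example (true mean 3.0, null mean 0, standard deviation 8, n = 16, α = 0.05); the function name is an illustrative choice.

```python
from math import sqrt
from statistics import NormalDist


def upper_tail_power(mu_true, mu0, sigma, n, alpha=0.05):
    """P(reject H0) for an upper-tail z-test when the true mean is mu_true."""
    se = sigma / sqrt(n)
    # Reject when the sample mean exceeds this cutoff (about 3.29 in the example)
    cutoff = mu0 + NormalDist().inv_cdf(1.0 - alpha) * se
    # The sample mean is approximately Normal(mu_true, se)
    return 1.0 - NormalDist(mu=mu_true, sigma=se).cdf(cutoff)


power = upper_tail_power(mu_true=3.0, mu0=0.0, sigma=8.0, n=16)
print(round(power, 4))
```

Setting `mu_true` equal to `mu0` returns α, matching the statement that power equals P(Type I error) when H0 is true.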

Power of a Test
All else being equal:
As sample size increases, power increases
As population variance decreases, power increases
As the true mean gets further from μ0, power increases

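Power of a Test
[Figure: sampling distributions of the sample mean under H0 and under HA. Under H0: P(fail to reject H0) = .95, P(reject H0) = .05. Under HA: P(fail to reject H0) = .5576, P(reject H0) = .4424]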

Power Curves
Power curves for sample sizes of 16, 32, 64, 80 and varying true values of μ from 0 to 5 with σ = 8.
For given μ, power increases with sample size
For given sample size, power increases with μ
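
The values behind such power curves can be tabulated with a short script. This sketch assumes an upper-tail z-test at α = 0.05 with null mean 0, which is an assumption carried over from the interference example and not stated for the plot itself; `power_upper_tail` is a hypothetical helper.

```python
from math import sqrt
from statistics import NormalDist


def power_upper_tail(mu_true, n, sigma=8.0, mu0=0.0, alpha=0.05):
    """Power of an upper-tail z-test: P(Z >= z_alpha - (mu_true - mu0)/se)."""
    z_alpha = NormalDist().inv_cdf(1.0 - alpha)
    shift = (mu_true - mu0) / (sigma / sqrt(n))
    return 1.0 - NormalDist().cdf(z_alpha - shift)


sample_sizes = (16, 32, 64, 80)
true_means = (0, 1, 2, 3, 4, 5)

# One row per sample size, one column per true mean
power_table = {
    n: [round(power_upper_tail(mu, n), 3) for mu in true_means]
    for n in sample_sizes
}
for n, row in power_table.items():
    print(n, row)
```

Reading across a row shows power rising with the true mean; reading down a column shows power rising with sample size.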

Sample Size Calculations for Fixed Power
Goal: Choose sample size to have a favorable chance of detecting an important difference from μ0 in a 2-sided test: H0: μ = μ0 vs Ha: μ ≠ μ0
Step 1: Define an important difference to be detected (Δ):
  Case 1: σ approximated from prior experience or pilot study. Difference can be stated in units of the data
  Case 2: σ unknown. Difference must be stated in units of standard deviations of the data
Step 2: Choose the desired power to detect the desired important difference (1-β, typically at least .80). For 2-sided test: $n = \sigma^2 (z_{\alpha/2} + z_{\beta})^2 / \Delta^2$

Example - Interference Data
2-Sided Test: H0: μ = 0 vs Ha: μ ≠ 0
Set α = P(Type I Error) = 0.05
Choose important difference of |μ-μ0| = Δ = 2.0
Choose Power = P(Reject H0 | Δ = 2.0) = .90
Set β = P(Type II Error) = 1-Power = 1-.90 = .10
From study, we know σ ≈ 8
$n = 8^2 (1.96 + 1.282)^2 / 2.0^2 \approx 168.2$, rounded up to 169
Would need 169 subjects to have a .90 probability of detecting the effect
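
The sample size in this example can be checked with a short calculation. The sketch assumes the common normal-approximation formula for a 2-sided test, n = σ²(z for α/2 + z for β)²/Δ², rounded up; the function name is an illustrative choice, and the inputs are the values stated in the example.

```python
from math import ceil
from statistics import NormalDist


def sample_size_for_power(delta, sigma, alpha=0.05, power=0.90):
    """Approximate n for a 2-sided z-test to detect a difference delta."""
    z = NormalDist().inv_cdf
    z_half_alpha = z(1.0 - alpha / 2.0)  # about 1.96 for alpha = 0.05
    z_beta = z(power)                    # about 1.282 for power = 0.90
    return ceil(sigma ** 2 * (z_half_alpha + z_beta) ** 2 / delta ** 2)


n_needed = sample_size_for_power(delta=2.0, sigma=8.0)
print(n_needed)
```

The result matches the 169 subjects quoted in the example. Lowering the target power to .80 or enlarging the important difference both reduce the required n.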

Potential for Abuse of Tests
Should choose a significance level (α) in advance and report the test conclusion (significant/nonsignificant) as well as the P-value. A significance level of 0.05 is widely used in the academic literature
Very large sample sizes can detect very small differences for a parameter value. A clinically meaningful effect should be determined, and a confidence interval reported when possible
A nonsignificant test result does not imply no effect (that H0 is true).
Many studies test many variables simultaneously. This can increase overall type I error rates
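
To see how testing many variables inflates the overall type I error rate, consider the simplified case where all null hypotheses are true and the tests are independent. Independence is an assumption made only for this sketch, and `familywise_error_rate` is a hypothetical name.

```python
def familywise_error_rate(alpha, k):
    """P(at least one Type I error) across k independent tests, all nulls true."""
    return 1.0 - (1.0 - alpha) ** k


for k in (1, 5, 10, 20):
    print(k, round(familywise_error_rate(0.05, k), 3))
```

With 10 independent tests at the 0.05 level, the chance of at least one false rejection is already about 0.40.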

Family of t-distributions
Symmetric, mound-shaped, centered at 0 (like the standard normal (z) distribution)
Indexed by degrees of freedom (df), the number of independent observations (deviations) comprising the estimated standard deviation. For one-sample problems df = n-1
Have heavier tails (more probability over extreme ranges) than the z-distribution
Converge to the z-distribution as df gets large
Tables of critical values for certain upper tail probabilities are available (Table 2, p. 1088)

Inference for Population Mean
Practical Problem: Sample mean has a sampling distribution that is Normal with mean μ and standard deviation $\sigma/\sqrt{n}$ (when the data are normal, and approximately so for large samples). σ is unknown.
Have an estimate of σ, s, obtained from sample data. Estimated standard error of the sample mean is: $s/\sqrt{n}$
When the sample is an SRS from N(μ, σ), the t-statistic (same as z, with the estimated standard deviation) $t = (\bar{y} - \mu)/(s/\sqrt{n})$ is distributed t with n-1 degrees of freedom

[Figure: t-distribution critical values, with labels Probability, Degrees of Freedom, Critical Values]

One-Sample Confidence Interval for m
SRS from a population with mean μ is obtained.
Sample mean and sample standard deviation are obtained
Degrees of freedom (df = n-1) and confidence level (1-α) are selected
Level (1-α) confidence interval of form: $\bar{y} \pm t_{\alpha/2}\, s/\sqrt{n}$
Procedure is theoretically derived based on normally distributed data, but has been found to work well regardless for large n
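
A minimal sketch of the one-sample t confidence interval, which is the sample mean plus or minus a t critical value times the estimated standard error. The Python standard library has no t-distribution, so the critical value is passed in from a t table; the data below are made up for illustration and `t_interval` is a hypothetical name.

```python
from math import sqrt
from statistics import mean, stdev


def t_interval(sample, t_crit):
    """Interval ybar +/- t_crit * s / sqrt(n); t_crit comes from a t table with n-1 df."""
    n = len(sample)
    ybar = mean(sample)
    s = stdev(sample)  # sample standard deviation (divisor n - 1)
    margin = t_crit * s / sqrt(n)
    return ybar - margin, ybar + margin


# Hypothetical sample of n = 5 measurements; 2.776 is the tabled
# upper .025 critical value of the t-distribution with 4 df
sample = [9.0, 10.0, 11.0, 12.0, 13.0]
low, high = t_interval(sample, t_crit=2.776)
print(round(low, 3), round(high, 3))
```

With only 4 degrees of freedom the critical value 2.776 is far larger than the normal value 1.96, which reflects the heavier tails of the t-distribution in small samples.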

1-Sample t-test (2-tailed alternative)
2-sided Test: H0: μ = μ0 Ha: μ ≠ μ0
Decision Rule (tα/2 such that P(t(n-1) ≥ tα/2) = α/2):
  Conclude μ > μ0 if Test Statistic (tobs) is greater than tα/2
  Conclude μ < μ0 if Test Statistic (tobs) is less than -tα/2
  Do not conclude μ ≠ μ0 otherwise
P-value: 2P(t(n-1) ≥ |tobs|)
Test Statistic: $t_{obs} = (\bar{y} - \mu_0)/(s/\sqrt{n})$

P-value (2-tailed test)
[Figure: t-distribution with -|tobs| and |tobs| marked on the horizontal axis]

1-Sample t-test (1-tailed (upper) alternative)
1-sided Test: H0: μ = μ0 Ha: μ > μ0
Decision Rule (tα such that P(t(n-1) ≥ tα) = α):
  Conclude μ > μ0 if Test Statistic (tobs) is greater than tα
  Do not conclude μ > μ0 otherwise
P-value: P(t(n-1) ≥ tobs)
Test Statistic: $t_{obs} = (\bar{y} - \mu_0)/(s/\sqrt{n})$

P-value (Upper Tail Test)

1-Sample t-test (1-tailed (lower) alternative)
1-sided Test: H0: μ = μ0 Ha: μ < μ0
Decision Rule (tα obtained such that P(t(n-1) ≥ tα) = α):
  Conclude μ < μ0 if Test Statistic (tobs) is less than -tα
  Do not conclude μ < μ0 otherwise
P-value: P(t(n-1) ≤ tobs)
Test Statistic: $t_{obs} = (\bar{y} - \mu_0)/(s/\sqrt{n})$
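
The three one-sample t-tests differ only in which tail area is reported. The sketch below computes the t statistic and its P-value using only the Python standard library; because the library has no t-distribution, the tail area is obtained by numerically integrating the t density with Simpson's rule, which is a substitute technique chosen for this example. All function names and the sample data are hypothetical.

```python
from math import exp, lgamma, log, pi, sqrt
from statistics import mean, stdev


def t_pdf(x, df):
    """Density of the t-distribution with df degrees of freedom."""
    log_const = lgamma((df + 1) / 2.0) - lgamma(df / 2.0) - 0.5 * log(df * pi)
    return exp(log_const - (df + 1) / 2.0 * log(1.0 + x * x / df))


def t_upper_tail(t, df, steps=2000):
    """P(T >= t) for T ~ t(df), via Simpson's rule on [0, |t|] (steps must be even)."""
    a = abs(t)
    h = a / steps
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3.0          # P(0 <= T <= |t|)
    upper = 0.5 - area              # P(T >= |t|), by symmetry about 0
    return upper if t >= 0 else 1.0 - upper


def one_sample_t(sample, mu0, alternative="two-sided"):
    """Return (t_obs, df, p_value) for a one-sample t-test."""
    n = len(sample)
    t_obs = (mean(sample) - mu0) / (stdev(sample) / sqrt(n))
    df = n - 1
    if alternative == "two-sided":
        p = 2.0 * t_upper_tail(abs(t_obs), df)
    elif alternative == "greater":
        p = t_upper_tail(t_obs, df)
    elif alternative == "less":
        p = 1.0 - t_upper_tail(t_obs, df)
    else:
        raise ValueError("alternative must be 'two-sided', 'greater', or 'less'")
    return t_obs, df, p


# Hypothetical data, for illustration only
t_obs, df, p_value = one_sample_t([9.0, 10.0, 11.0, 12.0, 13.0], mu0=10.0)
print(round(t_obs, 3), df, round(p_value, 3))
```

As a sanity check on the numerical integration, `t_upper_tail(2.042, 30)` comes out close to 0.025, in line with the tabled critical value for 30 degrees of freedom.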

P-value (Lower Tail Test)

Example: Mean Flight Time ATL/Honolulu
Scheduled flight time: 580 minutes
Sample: n=31 flights 10/2004 (treating as SRS from all possible flights)
Test whether population mean flight time differs from scheduled time
H0: μ = 580 Ha: μ ≠ 580
Critical value (2-sided test, α = 0.05, n-1 = 30 df): t.025 = 2.042
Sample data, Test Statistic, P-value:

Inference on a Population Median
Median: “Middle” of a distribution (50th Percentile)
  Equal to Mean for symmetric distribution
  Below Mean for Right-skewed distribution
  Above Mean for Left-skewed distribution
Confidence Interval for Population Median:
  Sort observations from smallest to largest (y(1) ≤ ... ≤ y(n))
  Obtain Lower (Lα/2) and Upper (Uα/2) “Bounds of Ranks”
  Small Samples: Obtain Cα(2),n from Table 4 (p. 1091)
  Large Samples:

Example - ATL/HNL Flight Times
Small-Sample: C.05(2),31 = 9
Large-Sample:
