1
Ph.D. COURSE IN BIOSTATISTICS DAY 2

SOME RESULTS ABOUT MEANS AND VARIANCES

The sample mean and the sample variance were used to describe a typical value and the variation in the sample. We may similarly use the population mean, the expected value, and the population variance to describe the typical value and the variation in a population. These values are often referred to as the theoretical values, and the sample mean and the sample variance are considered as estimates of the analogous population quantities. If X represents a random variable, e.g. birth weight or blood pressure, the mean and variance are often denoted

$E(X) = \mu$ and $\mathrm{var}(X) = \sigma^2$

The notation is also used when the distribution is not normal.
2
The random variation in a series of observations is transferred to uncertainty, i.e. sampling error or sampling variation, in estimates computed from the observations. The average, or sample mean, is an important example of such an estimate. Let $X_1, \dots, X_n$ denote a random sample of size n from a population with mean $\mu$ and variance $\sigma^2$; then the average $\bar{X} = \frac{1}{n}\sum_{i=1}^{n} X_i$ is itself a random variable. If several samples of size n are drawn from the population, the average value will vary between samples.

Terminology: A "random sample" implies that the observations are mutually independent replicates of the experiment "take a unit at random from the population and measure the value on this unit".

For the average (sample mean) we have

$E(\bar{X}) = \mu$ and $\mathrm{var}(\bar{X}) = \sigma^2/n$
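As an illustration, the sampling variation of the mean can be simulated. The following sketch uses hypothetical values ($\mu$ = 100, $\sigma$ = 10, n = 25): it draws 1000 samples and summarizes the 1000 averages, whose mean should be close to 100 and whose standard deviation should be close to $10/\sqrt{25}$ = 2.

* Simulation sketch: sampling distribution of the mean (assumed values)
capture program drop onemean
program onemean, rclass
    drop _all
    set obs 25                       // sample size n = 25
    generate x = 100 + 10*rnormal()  // one sample from N(100, 10^2)
    summarize x
    return scalar xbar = r(mean)     // store the sample mean
end
set seed 12345
simulate xbar=r(xbar), reps(1000) nodots: onemean
summarize xbar                       // mean approx. 100, s.d. approx. 2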
3
The sample mean is an unbiased estimate of the population mean.
The variance of the sample mean is proportional to the variance of a single observation and inversely proportional to the sample size. The standard deviation of the sample mean = standard error of the mean = s.e.m. $= \sigma/\sqrt{n}$.

Interpretation: The expected value, the variance, and the standard error of the mean are the values of these quantities that one would expect to find if we generated a large sample of averages, each obtained from independent random samples of size n from the same population. The result shows that the precision of the sample mean increases with the sample size. Moreover, if the variation in the population follows a normal distribution, the sampling variation of the average also follows a normal distribution.
4
Consider a random sample of size n from a population with mean $\mu_1$ and variance $\sigma_1^2$, and an independent random sample of size m from a population with mean $\mu_2$ and variance $\sigma_2^2$. For the difference between the sample means we have

$E(\bar{X} - \bar{Y}) = \mu_1 - \mu_2$ and $\mathrm{var}(\bar{X} - \bar{Y}) = \sigma_1^2/n + \sigma_2^2/m$

These results are a consequence of the following general results:

Linear transformations of random variables (change of scale): $E(aX + b) = aE(X) + b$ and $\mathrm{var}(aX + b) = a^2\,\mathrm{var}(X)$

The expected value of a sum of random variables: $E(X + Y) = E(X) + E(Y)$

The variance of a sum of independent random variables: $\mathrm{var}(X + Y) = \mathrm{var}(X) + \mathrm{var}(Y)$
5
For a random sample of size n from a normal distribution the result above can be reformulated as

$\frac{\bar{X} - \mu}{\sigma/\sqrt{n}} \sim N(0, 1)$, the standard normal distribution

The standard normal distribution is tabulated, so for given values of $\mu$ and $\sigma$ this relation can be used to derive probability statements about the sample mean.

The sampling distribution of the variance

The sample variance $s^2$ is also a statistic derived from the observations and therefore subject to sampling variation. For a random sample from a normal distribution one may show that $E(s^2) = \sigma^2$, so the sample variance is an unbiased estimate of the population variance.
6
For a random sample of size n from a normal distribution the sampling error of the sample variance can also be described. We have

$\frac{(n-1)s^2}{\sigma^2} \sim \chi^2_f$, the $\chi^2$-distribution with f = n − 1 degrees of freedom

The $\chi^2$-distributions (chi-square distributions) are tabulated, so for a given value of $\sigma^2$ this relation can be used to derive probability statements about the sample variance. A $\chi^2$-distribution is the distribution of a sum of independent, squared standard normal variates.

[Figure: the distribution of the sample variance for various n.]
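As a sketch of such a probability statement, with assumed values n = 10 and $\sigma^2$ = 1 (so that $9s^2$ follows a $\chi^2$-distribution with 9 degrees of freedom), the probability that the sample variance exceeds 2 can be computed with Stata's chi2tail function:

* P(s^2 > 2) when n = 10 and sigma^2 = 1, since 9*s^2 ~ chi-square(9)
display chi2tail(9, 9*2)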
7
INTRODUCTION TO STATISTICAL INFERENCE
Statistical inference: The use of a statistical analysis of data to draw conclusions from observations subject to random variation. Data are considered as a sample from a population (real or hypothetical). The purpose of the statistical analysis is to make statements about certain aspects of this population.

The basic components of a statistical analysis:

Specification of a relevant statistical model (the scientific problem is "translated" to a statistical problem)

Estimation of the population characteristics (the model parameters)

Validation of the underlying assumptions

Test of hypotheses about the model parameters

A statistical analysis is always based on a statistical model, which formalizes the assumptions made about the sampling procedure and the random and systematic variation in the population from which the sample is drawn.
8
The validity of the conclusions depends on the degree to which the statistical model gives an adequate description of the sampling procedure and the random and systematic variation. Consequently, checking the appropriateness of the underlying assumptions (i.e. the statistical model) is an important part of a statistical analysis. The statistical model should be seen as an approximation to the real world. The choice of a suitable model is always a balance between complex models, which are close approximations but very difficult to use in practice, and simple models, which are crude approximations but easy to apply.
9
Example: Comparing the efficacy of two treatments
Design: Experimental units (e.g. patients) are allocated to two treatments. For each experimental unit in both treatment groups an outcome is measured. The outcome reflects the efficacy of the treatment.

Purpose: To compare the efficacy of the two treatments.

Analysis: To summarize the results, the average outcome is computed in each group and the two averages are compared.

Possible explanations for a discrepancy between the average outcome in the two groups:

The treatments have different efficacy; one is better than the other

Random variation

Bias originating from other differences between the groups. Other factors which influence the outcome may differ between the groups and lead to apparent differences between the efficacy of the two treatments (confounding).
10
A proper design of the study (randomization, blinding, etc.) can eliminate or reduce the bias and therefore make this explanation unlikely. Bias correction (control of confounding) is also possible in the statistical analysis. The statistical analysis is performed to estimate the size of the treatment difference and evaluate whether random variation is a plausible explanation for this difference. If the study is well designed and the statistical analysis indicates that random variation is not a plausible explanation for the difference, we may conclude that a real difference between the efficacy of the two treatments is the most likely explanation of the findings. The statistical analysis can also identify a range of plausible values, a so-called confidence interval, for the difference in efficacy.
11
STATISTICAL ANALYSIS OF A SAMPLE FROM A NORMAL DISTRIBUTION
Example. Fish oil supplement and blood pressure in pregnant women

Purpose: To evaluate the effect of fish oil supplement on diastolic blood pressure in pregnant women.

Design: Randomised controlled clinical trial on 430 pregnant women, enrolled at week 30 and randomised to either fish oil supplement or control.

Data: Diastolic and systolic blood pressure at week 30 and 37 (source: Sjurdur Olsen). The Stata file fishoil.dta contains the following variables:

grp      treatment group, 1 for control, 2 for fish oil
group    a string variable with the name of the group allocation
difsys   increase in systolic blood pressure from week 30 to week 37
difdia   increase in diastolic blood pressure from week 30 to week 37
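A minimal sketch for loading and inspecting the data, assuming fishoil.dta is in the current working directory:

use fishoil.dta, clear
describe                         // list variables and their types
bysort grp: summarize difdia     // summary statistics by treatment group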
12
We shall here consider the change in diastolic blood pressure from week 30 to week 37.
Stata: histogram difdia, by(group)
13
Stata:

qnorm difdia if grp==1, title("Control") ///
    saving(q1, replace)
qnorm difdia if grp==2, title("Fish oil") ///
    saving(q2, replace)
graph combine q1.gph q2.gph
14
For both groups the histogram and the probability plot correspond closely to the expected behavior of normally distributed data. Hence our statistical model is: The observations in each group can be considered as a random sample from a normal distribution with unknown parameters as below:

control group: $N(\mu_1, \sigma_1^2)$, fish oil group: $N(\mu_2, \sigma_2^2)$

The two sets of observations are independent. The ultimate goal of the analysis is to compare the two treatments with respect to the expected change in blood pressure. We shall return to this analysis later. First we want to examine the change in the diastolic blood pressure in women in the control group.
15
We now consider the control group and focus on the increase in diastolic blood pressure.

Problem: Do the data suggest that the diastolic blood pressure in pregnant women increases from week 30 to week 37?

Data: The observed values of the change in diastolic blood pressure in the 213 women who participated in the study.

Statistical model: The data are considered as a random sample of size 213 from a normal distribution with mean $\mu$ and variance $\sigma^2$. The parameter $\mu$ describes the expected change and the parameter $\sigma$ describes the random variation caused by biological factors and measurement errors.
16
Assumptions: The assumptions of the statistical model are

1. The observations are independent
2. The observations have the same mean and the same variance
3. A normal distribution describes the variation

Checking the validity of the assumptions is usually done by various plots and diagrams. Knowledge of the measurement process can often help in identifying points which need special attention.

Re 1. Checking independence often involves going through the sampling procedure. Here the assumption would e.g. be violated if the same woman contributed more than one pregnancy.

Re 2. Do we have "independent replications of the same experiment"? Factors that are known to be associated with changes in blood pressure are not accounted for in the model; they contribute to the random variation.
17
Re 3. The plots above indicate that a normal distribution gives an adequate description of the data.

Estimation

The estimation problem: Find the normal distribution that best fits the data. Solution: Use the normal distribution with mean equal to the sample mean and variance equal to the sample variance.

sum difdia if grp==1

    Variable |  Obs    Mean    Std. Dev.   Min    Max
      difdia |  213    1.90    7.53        ...    ...

i.e. $\hat{\mu} = 1.90$ and $\hat{\sigma} = 7.53$

Note: A normal distribution is completely determined by the values of the mean and the variance. Convenient notation: A "^" on top of a population parameter is used to identify the estimate of the parameter.
18
Question: Do the data suggest a systematic change in the diastolic pressure? No systematic change means that the expected change is 0, i.e. $\mu = 0$.

Hypothesis: The data are consistent with the value of $\mu$ being 0. This hypothesis is usually written as $H_0: \mu = 0$.

We have observed an average value of $\bar{x} = 1.90$. Is sampling variation a possible explanation for the difference between the observed value of 1.90 and the expected value of 0?

Statistical test: To evaluate if the random variation can account for the difference, we assume that the hypothesis is true and compute the probability that the average value in a random sample of size 213 differs from 0 by at least as much as the observed value. From the model assumptions we conclude that the average can be considered as an observation from a normal distribution with mean 0 and standard deviation equal to $\sigma/\sqrt{213}$.
19
Consequently, the standardized value

$z = \frac{\bar{x} - 0}{\sigma/\sqrt{213}}$

is an observation from a standard normal distribution.

Problem: The population standard deviation $\sigma$ is unknown, but in large samples we may use the sample standard deviation and still rely on the normal distribution. Small samples are considered later. Replacing $\sigma$ with the estimate s we therefore get

$t = \frac{1.90}{7.53/\sqrt{213}} = 3.69$

For a normal distribution a value more than 3 standard deviations from the mean is very unlikely to occur. Using a table of the standard normal distribution function we find that a value that deviates more than 3.69 in either direction occurs with a probability of about 0.0002.
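The same tail probability can be computed in Stata; normal() is the current name of the cumulative distribution function (the older alias norm() appears later in these notes):

* Two-sided tail probability beyond |z| = 3.69 under the standard normal
display 2*(1 - normal(3.69))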
20
p-value: The probability computed above is called the p-value.

p-value = the probability of obtaining a value of the test statistic at least as extreme as the one actually observed, if the hypothesis is true. Usually extreme values in both tails of the distribution are included (two-sided test), so in the present case values below −3.69 and above 3.69 are counted as extreme.

The calculation indicates that sampling variation is a highly implausible explanation for the observed change in blood pressure. The observed deviation from the hypothesized value is statistically significant. Usually a hypothesis is rejected if the p-value is less than 0.05.
21
SMALL SAMPLES – use of the t-distribution

To compute the p-value above we replaced the unknown population standard deviation with the sample standard deviation and referred the value of the test statistic to a normal distribution. For large samples this approach is unproblematic, but for small samples the p-value becomes too small, since the sampling error of the sample standard deviation is ignored. Statistical theory shows that the correct distribution of the test statistic is a so-called t-distribution with f = n − 1 degrees of freedom. The t-distribution has been tabulated, so we are still able to compute a p-value. Note that the t-distribution does not depend on the parameters $\mu$ and $\sigma$, so the same table applies in all situations. As the sample size increases, the t-distribution approaches a standard normal distribution. Usually the approximation is acceptable for samples larger than 60, say.
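A quick illustration of this convergence in Stata, comparing two-sided tail probabilities beyond 2 for small and moderate degrees of freedom with the normal limit:

display 2*ttail(4, 2)        // t with 4 degrees of freedom
display 2*ttail(60, 2)       // t with 60 degrees of freedom
display 2*(1 - normal(2))    // standard normal limit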
22
If we again compute t = 3.69, but this time look up the value in a table of a t-distribution with f = 212 degrees of freedom, we get a p-value that is almost identical to the one above, since the sample is relatively large.

[Figure: a comparison of t-distributions with 4, 19, and 99 degrees of freedom and a standard normal distribution (the black curve).]
23
STATA: PROBABILITY CALCULATIONS

Output from statistical programs like Stata usually also includes p-values, so statistical tables are rarely needed. Moreover, Stata has a lot of built-in functions that can compute almost any kind of probability. Write help probfun to see the full list. Some examples:

display norm(3.6858) returns the value of the cumulative probability function of a standard normal distribution at 3.6858, i.e. the probability that a standard normal variate is less than or equal to 3.6858.

display ttail(212,3.6858) returns the probability that a t-statistic with 212 degrees of freedom is larger than 3.6858.

display Binomial(224,130,0.5134) returns the probability of getting at least 130 successes from a binomial distribution with n = 224 and p = 0.5134.
24
ONE SAMPLE t-TEST: THE GENERAL CASE

Above we derived the t-test of the hypothesis $H_0: \mu = 0$. The same approach can be used to test if any specified value $\mu_0$ is consistent with the data, using the test statistic

$t = \frac{\bar{x} - \mu_0}{s/\sqrt{n}}$

If we e.g. want to test the hypothesis $H_0: \mu = 2$ we compute $t = (1.90 - 2)/(7.53/\sqrt{213}) = -0.19$, and display 2*ttail(212,0.1911) returns the p-value 0.85, so an expected change of 2 is compatible with the data and cannot be rejected.

Note: The function ttail gives a probability in the upper tail of the distribution. A negative t-value should therefore be replaced by the corresponding positive value when computing the p-value.
25
CONFIDENCE INTERVALS

In the example the observed average change in blood pressure is 1.901, and this value was used as an estimate of the expected change $\mu$. Values close to 1.901 are also compatible with the data; we saw e.g. that the value 2 could not be rejected.

Problem: Find the range of values for the expected change that is supported by the data. A confidence interval is the solution to this problem.

Formally: A 95% confidence interval identifies the values of the unknown parameter which would not be significantly contradicted by a (two-sided) test at the 5% level, because the p-value associated with the test statistic for each of these values is larger than 5%.
26
Frequency interpretation: If the experiment is repeated a large number of times and a 95% confidence interval is computed for each replication, then 95% of these confidence intervals will contain the true value of the unknown parameter.

How to calculate the 95% confidence interval: The limits of the confidence interval are the values of $\mu_0$ for which the test statistic t equals the 2.5 or the 97.5 percentile of a t-distribution with n − 1 degrees of freedom. The t-distribution is symmetric around 0, so $t_{0.025}(n-1) = -t_{0.975}(n-1)$, and the confidence limits are therefore given by the values of $\mu_0$ satisfying

$\left| \frac{\bar{x} - \mu_0}{s/\sqrt{n}} \right| = t_{0.975}(n-1)$

i.e.

$\bar{x} \pm t_{0.975}(n-1) \cdot \frac{s}{\sqrt{n}}$

The formula shows that the confidence interval becomes narrower as the sample size increases.
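As a sketch, these limits can be computed directly from summary statistics in Stata; the values below are those quoted in the example ($\bar{x}$ = 1.901, n = 213, with s ≈ 7.53 as implied by the test statistics above):

scalar xbar  = 1.901
scalar s     = 7.53                     // sample s.d. implied by the quoted statistics
scalar n     = 213
scalar tcrit = invttail(n-1, 0.025)     // 97.5 percentile of t(212)
display xbar - tcrit*s/sqrt(n), xbar + tcrit*s/sqrt(n)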
27
Example continued: In Stata the command invttail gives the upper percentiles, and display invttail(212,0.025) returns 1.971. The 95% confidence limits for the expected change in diastolic blood pressure therefore become $1.901 \pm 1.971 \cdot s/\sqrt{213}$, and the 95% confidence interval becomes approximately (0.9, 2.9).

99% confidence intervals are derived from the upper 0.5 percentile in a similar way. Also, one-sided confidence intervals can be defined and computed from one-sided statistical tests (statistical tests are called one-sided if large deviations in only one direction are considered extreme).
28
STATA: ONE SAMPLE t-TEST

A single command in Stata gives all the results derived so far:

ttest difdia=0 if grp==1

One-sample t test
------------------------------------------------------------------------------
    Variable |  Obs    Mean     Std. Err.   Std. Dev.   [95% Conf. Interval]
      difdia |  213    1.901    ...         ...         ...
------------------------------------------------------------------------------
    t = 3.6858                              degrees of freedom = 212
Ho: mean(difdia) = 0    <- hypothesis tested
Ha: mean < 0            Ha: mean != 0                   Ha: mean > 0
Pr(T < t)               Pr(|T| > |t|)  <- (two-sided) p-value   Pr(T > t)

To test the hypothesis $H_0: \mu = 2$ use ttest difdia=2 if grp==1 instead.
29
Statistical inference about the variance

So far we have looked at statistical inference about the mean of a normal population based on a random sample. In the same setting we can also derive a test statistic for hypotheses about the variance (or the standard deviation) and obtain confidence intervals for this parameter. The arguments are based on the result about the sampling distribution of the sample variance (see p. 6):

$\frac{(n-1)s^2}{\sigma^2} \sim \chi^2_f$, the $\chi^2$-distribution with f = n − 1 degrees of freedom

Inference problems involving a hypothesis about the variance are much less common, but may e.g. arise in studies of methods of measurement.

Example continued: Suppose we for some reason want to see if the change in diastolic blood pressure has a standard deviation of 7, or equivalently a variance of 49.
30
To test the hypothesis $H_0: \sigma = 7$ we could compute

$x = \frac{(n-1)s^2}{\sigma_0^2} = \frac{212 \cdot s^2}{49} = 245.24$

and see if this value is extreme when referred to a $\chi^2$-distribution on 212 degrees of freedom. Using Stata's probability calculator, display chi2(212,245.24), we get the probability of a value less than or equal to 245.24; the probability of getting a value larger than 245.24 is 1 minus this probability. Stata can also give this result directly from the command display chi2tail(212,245.24). The p-value is 2 times the smallest tail probability, here approximately 0.12. A standard deviation of 7 cannot be rejected.

Rule: If the test statistic, x, is smaller than the degrees of freedom, f, use display 2*chi2(f,x), else use display 2*chi2tail(f,x).
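The whole calculation, including the two-tail rule, can be sketched from the summary statistics ($s^2$ ≈ 56.7 as implied by the quoted test statistic):

scalar n  = 213
scalar s2 = 56.7                        // sample variance of difdia, control group
scalar x  = (n-1)*s2/49                 // test statistic under Ho: sigma = 7
* p-value: twice the smaller of the two tail probabilities
display 2*min(chi2(n-1, x), chi2tail(n-1, x))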
31
Confidence intervals for variances and standard deviations

A 95% confidence interval for the population variance is given by

$\left( \frac{f \cdot s^2}{\chi^2_{0.975}} \; , \; \frac{f \cdot s^2}{\chi^2_{0.025}} \right)$

where f is the degrees of freedom and $\chi^2_{0.025}$ and $\chi^2_{0.975}$ are the 2.5 and the 97.5 percentiles of a $\chi^2$-distribution with f degrees of freedom. A 95% confidence interval for the standard deviation therefore becomes

$\left( s\sqrt{f/\chi^2_{0.975}} \; , \; s\sqrt{f/\chi^2_{0.025}} \right)$

Example – diastolic blood pressure continued: Stata's probability calculator has a function invchi2 that computes percentiles of $\chi^2$-distributions. We find that

display invchi2(212,0.025) gives approximately 173.5
display invchi2(212,0.975) gives approximately 254.3
32
A 95% confidence interval for the standard deviation is therefore approximately (6.9, 8.3).

More Stata: A test of a hypothesis about the standard deviation is carried out by the command

sdtest difdia=7 if grp==1

One-sample test of variance
------------------------------------------------------------------------------
    Variable |  Obs    Mean     Std. Err.   Std. Dev.   [95% Conf. Interval]
      difdia |  213    1.901    ...         ...         ...
------------------------------------------------------------------------------
    sd = sd(difdia)         c = chi2 = 245.24           degrees of freedom = 212
Ho: sd = 7    <- hypothesized value
Ha: sd < 7              Ha: sd != 7                     Ha: sd > 7
Pr(C < c)               2*Pr(C > c)  <- (two-sided) p-value     Pr(C > c)

Note that the 95% confidence interval in the output is the confidence interval for the population mean and not for the standard deviation.
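The confidence limits for the standard deviation can likewise be sketched from the summary statistics (same assumed $s^2$ as above):

scalar f  = 212
scalar s2 = 56.7
* lower and upper 95% confidence limits for sigma
display sqrt(f*s2/invchi2(f, 0.975)), sqrt(f*s2/invchi2(f, 0.025))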
33
STATISTICAL ANALYSIS OF TWO INDEPENDENT SAMPLES FROM NORMAL DISTRIBUTIONS

Example. Fish oil supplement and blood pressure in pregnant women

The study was a randomized trial carried out to evaluate the effect of fish oil supplement on diastolic blood pressure in pregnant women. Pregnant women were assigned at random to one of two treatment groups. One group received fish oil supplement, the other was a control group. Here we shall compare the two treatments using difdia, the change in diastolic blood pressure, as outcome, or response. We have already seen histograms and Q-Q plots of the distribution of difdia in each of the two groups (see pp. 12-13), and these plots suggest that the random variation may be adequately described by normal distributions.
34
The standard analysis of this problem is based on the following statistical model: The observations in each group can be considered as a random sample from a normal distribution with unknown parameters as below:

control group: $N(\mu_1, \sigma^2)$, fish oil group: $N(\mu_2, \sigma^2)$

The two sets of observations are independent. Note that the size of the random variation is assumed to be the same in the two groups, so this assumption should also be checked. The purpose of the analysis is to quantify the difference between the expected change in the two groups and assess if this difference is statistically different from 0.
35
Model assumptions:

1. Independence within and between samples
2. Random samples from populations with the same variance
3. The random variation in each population can be described by a normal distribution

Note: The model assumptions imply that if the difference $\mu_1 - \mu_2$ is not statistically different from 0, we may conclude that the distributions are not significantly different, since a normal distribution is completely determined by the parameters $\mu$ and $\sigma^2$.

Re 1. Inspect the design and the data. Repeated observations on the same individual usually imply violation of the independence assumption.

Re 2. A formal test of the hypothesis of identical variances of normal distributions is described below.

Re 3. Histograms and Q-Q plots, see pages 12-13.
36
Estimation. Basic idea: population values are estimated by the corresponding sample values. This gives two estimates of the variance, which should be pooled to a single estimate. Stata performs the basic calculations with

bysort grp: summarize difdia
_________________________________________________________________
-> grp = control
    Variable |  Obs    Mean    Std. Dev.   Min    Max
      difdia |  213    1.90    7.53        ...    ...
-> grp = fish oil
      difdia |  217    2.19    8.37        ...    ...

i.e. control group: mean = 1.90, fish oil group: mean = 2.19.
37
The standard deviations are rather similar, so let us assume for a moment that it is reasonable to derive a pooled estimate. How should this be done? Statistical theory shows that the best approach is to compute a pooled estimate of the variance as a weighted average of the sample variances and use the corresponding standard deviation as the pooled estimate. The weighted average uses weights proportional to the degrees of freedom, i.e. f = n − 1. Hence

$s_p^2 = \frac{f_1 s_1^2 + f_2 s_2^2}{f_1 + f_2} = \frac{212 \cdot s_1^2 + 216 \cdot s_2^2}{428}$ and $s_p = \sqrt{s_p^2}$

Stata does not include this estimate in the output above, but the result is produced by the commands

quietly regress difdia grp
display e(rmse)

giving the output 7.962, i.e. $s_p$ = 7.962. Writing quietly in front suppresses output from the command; the string variable group cannot be used here.
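The same pooled estimate can be sketched by hand from the group summaries ($s_1$ ≈ 7.53 and $s_2$ ≈ 8.37 as implied by the output quoted above):

scalar s1 = 7.53                  // control group s.d.
scalar s2 = 8.37                  // fish oil group s.d.
scalar sp = sqrt((212*s1^2 + 216*s2^2)/428)
display sp                        // approx. 7.96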
38
Statistical test comparing means of two independent samples

The expected change in diastolic blood pressure is slightly higher in the fish oil group. Does this reflect a systematic effect? To see if random variation can explain the difference we test the hypothesis of identical population means in the two samples, $H_0: \mu_1 = \mu_2$. The line of argument is similar to the one that was used in the one-sample case. Assume that the hypothesis is true. The observed difference between the two means must then be caused by sampling variation. The plausibility of this explanation is assessed by computing a p-value, the probability of obtaining a result at least as extreme as the observed.
39
From the model assumptions we conclude that if the hypothesis is true, then the difference between the sample means can be considered as an observation from a normal distribution with mean 0 and variance

$\sigma^2\left(\frac{1}{n} + \frac{1}{m}\right)$

Consequently, the standardized value

$z = \frac{\bar{x}_1 - \bar{x}_2}{\sigma\sqrt{1/n + 1/m}}$

is an observation from a standard normal distribution. If the standard deviation $\sigma$ is replaced by the pooled estimate $s_p$, we arrive at the test statistic

$t = \frac{\bar{x}_1 - \bar{x}_2}{s_p\sqrt{1/n + 1/m}}$
40
To derive the p-value this test statistic should be referred to a t-distribution with f = n + m − 2 degrees of freedom, since we may show that the sampling distribution of the pooled variance estimate is identical to the sampling distribution of a variance estimate with n + m − 2 degrees of freedom (see page 6). We get

$t = \frac{1.90 - 2.19}{7.962\sqrt{1/213 + 1/217}} = -0.38$

and the p-value becomes approximately 0.71. The difference is not statistically significantly different from 0.
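A sketch of the computation in Stata, using the group means and pooled s.d. quoted above:

scalar se = 7.962*sqrt(1/213 + 1/217)    // standard error of the difference
scalar t  = (1.90 - 2.19)/se
display t, 2*ttail(428, abs(t))          // test statistic and two-sided p-value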
41
Confidence intervals for the parameters of the model

The model has three unknown parameters: $\mu_1$, $\mu_2$, and $\sigma$. A 95% confidence interval for the expected value $\mu_1$ becomes

$\bar{x}_1 \pm t_{0.975}(n + m - 2) \cdot \frac{s_p}{\sqrt{n}}$

and similarly for $\mu_2$. Note that the pooled standard deviation is used, and $t_{0.975}$ is therefore the 97.5 percentile of a t-distribution with n + m − 2 degrees of freedom. For the change in diastolic blood pressure we get approximately (0.8, 3.0) in the control group and (1.1, 3.3) in the fish oil group.

Note: some programs, e.g. Stata, use the separate sample standard deviations when computing these confidence intervals. A 95% confidence interval for the standard deviation is based on the pooled estimate with f = 428 degrees of freedom (see page 31).
42
Confidence intervals for the difference between means

In a two-sample problem the parameter of interest is usually the difference between the expected values. From the results above (page 39) we get the 95% confidence interval

$(\bar{x}_1 - \bar{x}_2) \pm t_{0.975}(n + m - 2) \cdot s_p\sqrt{1/n + 1/m}$

where the t-percentile refers to a t-distribution with n + m − 2 degrees of freedom. In the example the interval becomes approximately $-0.29 \pm 1.97 \cdot 0.77$, i.e. (−1.8, 1.2).
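And the corresponding Stata sketch, with the same assumed summary values:

scalar se    = 7.962*sqrt(1/213 + 1/217)
scalar diff  = 1.90 - 2.19
scalar tcrit = invttail(428, 0.025)      // 97.5 percentile of t(428)
display diff - tcrit*se, diff + tcrit*se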
43
STATA: TWO SAMPLE t-TEST (equal variances)

A single command in Stata gives all the results derived so far except an estimate of the pooled variance (see page 37):

ttest difdia, by(grp)

Two-sample t test with equal variances
------------------------------------------------------------------------------
   Group |  Obs    Mean     Std. Err.   Std. Dev.   [95% Conf. Interval]
 control |  213    1.90     ...         ...         ...
fish oil |  217    2.19     ...         ...         ...
combined |  430    ...      ...         ...  <- s.d. in combined sample, not pooled s.d.
    diff |         -0.29    ...                     ...
------------------------------------------------------------------------------
diff = mean(control) - mean(fish oil)       t = -0.38
Ho: diff = 0    <- hypothesis tested        degrees of freedom = 428
Ha: diff < 0            Ha: diff != 0                   Ha: diff > 0
Pr(T < t)               Pr(|T| > |t|)  <- (two-sided) p-value   Pr(T > t)
44
Comparing the variances: The F-distribution

In the statistical model we assumed the same variance in the two populations. To assess this assumption we consider a statistical test of the hypothesis $H_0: \sigma_1^2 = \sigma_2^2$. An obvious test statistic is the ratio of sample variances, $s_1^2/s_2^2$. A value close to 1 is expected if the hypothesis is true; both small and large values would suggest that the variances differ. From statistical theory it follows that the distribution of the ratio of two independent variance estimates is a so-called F-distribution if the corresponding population variances are identical (i.e. if $H_0$ is true). The F-distribution is characterized by a pair of degrees of freedom (the degrees of freedom of the two variance estimates). Like the normal, t-, and chi-square distributions, the F-distributions are extensively tabulated.
45
Comparing the variances

In practice the hypothesis of equal variances is tested by computing

$F = \frac{\text{larger } s^2}{\text{smaller } s^2}$

and the p-value is then obtained as $2 \cdot P(F_{f_1, f_2} \geq F)$, where the pair of degrees of freedom are those of the numerator and the denominator.

Example: For the change in diastolic blood pressure the ratio of the fish oil variance to the control variance is F = 1.2344. Stata's command display 2*Ftail(216,212,1.2344) returns 0.125, which is the p-value. The difference between the two standard deviations is not statistically significant.
46
STATA: COMPARISON OF TWO VARIANCES

Stata's command sdtest can also be used to compare two variances. Write

sdtest difdia, by(grp)

Variance ratio test
------------------------------------------------------------------------------
   Group |  Obs    Mean     Std. Err.   Std. Dev.   [95% Conf. Interval]
 control |  213    1.90     ...         ...         ...
fish oil |  217    2.19     ...         ...         ...
combined |  430    ...      ...         ...         ...
------------------------------------------------------------------------------
ratio = sd(control) / sd(fish oil)          f = ...
Ho: ratio = 1    <- hypothesis tested       degrees of freedom = 212, 216
Ha: ratio < 1           Ha: ratio != 1                  Ha: ratio > 1
Pr(F < f)               2*Pr(F < f)  <- (two-sided) p-value     Pr(F > f)
47
Comparing the means when variances are unequal

Problem: What if the assumption of equal variances is unreasonable? Some solutions:

1. Try to obtain homogeneity of variances by transforming the observations in a suitable way, e.g. by working with log-transformed data.

2. Use an approximate t-test that does not rely on equal variances. The approximate t-test has the form

$t = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{s_1^2/n + s_2^2/m}}$

Under the hypothesis of equal means the distribution of this test statistic is approximately equal to a t-distribution. To compute the degrees of freedom for the approximate t-distribution, first compute

$w_1 = \frac{s_1^2}{n} \quad \text{and} \quad w_2 = \frac{s_2^2}{m}$
48
The degrees of freedom are then obtained as

$f = \frac{(w_1 + w_2)^2}{w_1^2/(n-1) + w_2^2/(m-1)}$

3. Use a non-parametric test, e.g. a Wilcoxon-Mann-Whitney test. (A sketch of the degrees-of-freedom calculation follows below.)

We shall consider solution 1 next time and solution 3 later in the course. The Stata command ttest computes solution 2 if the option unequal is added.

Note: When the variances of the two normal distributions differ, the hypothesis of equal means is no longer equivalent to the hypothesis of equal distributions.
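A sketch of the degrees-of-freedom calculation with the summary values used above:

scalar w1 = 7.53^2/213            // s1^2/n, control group
scalar w2 = 8.37^2/217            // s2^2/m, fish oil group
scalar df = (w1 + w2)^2/(w1^2/212 + w2^2/216)
display df                        // Satterthwaite's degrees of freedom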
49
STATA: TWO SAMPLE t-TEST (unequal variances)

To compute the approximate t-test (solution 2 above) with Stata, write

ttest difdia, by(grp) unequal

Two-sample t test with unequal variances
------------------------------------------------------------------------------
   Group |  Obs    Mean     Std. Err.   Std. Dev.   [95% Conf. Interval]
 control |  213    1.90     ...         ...         ...
fish oil |  217    2.19     ...         ...         ...
combined |  430    ...      ...         ...         ...
    diff |         -0.29    ...         <- approximate confidence limits
------------------------------------------------------------------------------
diff = mean(control) - mean(fish oil)       t = ...  <- approximate t-test
Ho: diff = 0        Satterthwaite's degrees of freedom = ...  <- degrees of freedom of the approximate t-test
Ha: diff < 0            Ha: diff != 0                   Ha: diff > 0
Pr(T < t)               Pr(|T| > |t|)  <- (two-sided) p-value   Pr(T > t)
50
SOME GENERAL COMMENTS ON STATISTICAL TESTS

To test a hypothesis we compute a test statistic, which follows a known distribution if the hypothesis is true. We can therefore compute the probability of obtaining a value of the test statistic at least as extreme as the one observed. This probability is called the p-value. The p-value describes the degree of support of the hypothesis found in the data. The result of the statistical test is often classified as "statistically significant" or "non-significant" depending on whether or not the p-value is smaller than a level of significance, often called $\alpha$, and usually equal to 0.05. The hypothesis being tested is often called the null hypothesis. A null hypothesis always represents a simplification of the statistical model. Hypothesis testing is sometimes given a decision-theoretic formulation: The null hypothesis is either true or false and a decision is made based on the data.
51
When hypothesis testing is viewed as decisions, two types of error are possible:

Type 1 error: Rejecting a true null hypothesis
Type 2 error: Accepting (i.e. not rejecting) a false null hypothesis

The level of significance specifies the risk of a type 1 error. In the usual setting the null hypothesis is tested against an alternative hypothesis which includes different values of the parameter, e.g. $H_0: \mu = 0$ against $H_A: \mu \neq 0$. The risk of a type 2 error depends on which of the alternative values is the true value. The power of a statistical test is 1 minus the risk of a type 2 error. When planning an experiment, power considerations are sometimes used to determine the sample size. We return to this in the last lecture. Once the data are collected, confidence intervals are the appropriate way to summarize the uncertainty in the conclusions.
52
Relation between p-values and confidence intervals

In a two-sample problem it is tempting to compare the 95% confidence intervals of the two means and conclude that the hypothesis $H_0: \mu_1 = \mu_2$ is non-significant if the 95% confidence intervals overlap. This is not correct. Overlapping 95% confidence intervals do not imply that the difference is not significant at a 5% level. On the other hand, if the 95% confidence intervals do not overlap, the difference is statistically significant at a 5% level (actually, the p-value is 1% or smaller). This may at first seem surprising, but it is a simple consequence of the fact that for independent samples

$\mathrm{s.e.}(\bar{x}_1 - \bar{x}_2) = \sqrt{\mathrm{s.e.}(\bar{x}_1)^2 + \mathrm{s.e.}(\bar{x}_2)^2} \leq \mathrm{s.e.}(\bar{x}_1) + \mathrm{s.e.}(\bar{x}_2)$

so non-overlapping confidence intervals imply that the observed difference exceeds $1.96 \cdot (\mathrm{s.e.}(\bar{x}_1) + \mathrm{s.e.}(\bar{x}_2))$, which is at least $1.96 \cdot \mathrm{s.e.}(\bar{x}_1 - \bar{x}_2)$.