Course overview:

Descriptive Statistics
  Ch 1. The Mean, the Number of Observations, & the Standard Deviation: N, populations, and parameters; measures of central tendency (median, mode, mean); measures of variability (range, sum of squares, variance, standard deviation)
  Ch 2. Frequency Distributions and Histograms: frequency distributions; bar graphs and histograms; continuous vs. discrete variables
  Ch 3. The Normal Curve: Z scores & percentiles; least squares; unbiased estimates
  Ch 4. Translating To and From Z Scores: normal scores, scale scores, raw scores, percentiles

Inferential Statistics (Parametric)
  Ch 5. Inferential Statistics: random samples estimate population statistics; correlation of two variables; experimental methods; standard error of the mean
  Ch 6. t Scores / t Curves: estimates of Z scores; computing t scores; critical values; degrees of freedom; other topics to come
  Ch 7. Correlation: variable relationships (linearity, direction, strength); the correlation coefficient; scatter plots; best fitting lines
  Ch 8. Regression: predicting using the regression equation; generalizing (the null hypothesis); degrees of freedom and statistical significance
  Ch 9. Experimental Studies: independent and dependent variables; the experimental hypothesis; the F test and the t test
  Ch 10. Two Way Factorial Analysis of Variance: three null hypotheses; graphing the means; factorial designs
  Ch 11. A variety of t tests
  Ch 12. Tukey's Significant Difference: testing differences in group means; alpha for the whole experiment; HSD (Honestly Significant Difference)
  Ch 12. Power Analysis: Type 1 error and alpha; Type 2 error and beta; how many subjects do you need?
  Ch 13. Assumptions Underlying Parametric Statistics: sample means form a normal curve; subjects are randomly selected from the population; homogeneity of variance; experimental error is random across samples

Non-Parametric
  Ch 14. Chi Square: nominal data
Chapter 11, Lecture 1 (11/16/2018): t tests: single sample, repeated measures, and two independent samples.
Conceptual overview
t and F tests: 2 approaches to measuring the distance between means. There are two ways to tell how far apart things are. When there are only two things, you can directly determine their distance from each other. If they are two scores, as they usually are in Psychology, you simply subtract one from the other to find their difference. That is the approach used in the t test and its variants.

F tests: Alternatively, when you want to describe the distance of three or more things from each other, the best way to index their distance from each other is to find a central point and talk about their average squared distance (or average unsquared distance) from that point. The further apart things are from each other, the further they will be, on the average, from that central point. That is the approach you have used in the F test (and when treating the t test as a special case of the F test: t for one or two, F for more).
One way or another: the two methods will yield identical results. We can use either method, or a combination of the two methods, to ask the key question in this part of the course: "Are two or more means further apart than they are likely to be when the null hypothesis is true?"

H0: It's just sampling fluctuation. If the only thing that makes the two means different is random sampling fluctuation, the means will be fairly close to the population mean and to each other. If an independent variable is pushing the means apart, their distance from each other, or from some central point, will tend to be too great to be explained by the null hypothesis.
Generic formula for the t test. These ideas lead to a generic formula for the t test:

t (dfW) = (actual difference between the 2 means) / (estimated average difference between the two means that should exist if H0 is correct)
Calculation and theory. As usual, we must work on calculation and theory. Again we'll do calculation first.
The first of three types of simple t tests is the single or one sample t test: a t test in which a sample mean is compared to a population mean. The population mean is almost always the value postulated by the null hypothesis. Since it is a mean obtained from a theory (H0 is a theory), we call that mean "muT". To do the single sample t test, we divide the actual difference between the sample mean and muT by the estimated standard error of the mean, sX-bar.
Let's do a problem: You may recognize this problem. We used it to set up confidence intervals in Ch. 6.
For example, let's say that we had a new antidepressant drug we wanted to peddle. Before we can do that we must show that the drug is safe. Drugs like ours can cause problems with body temperature. People can get chills or fever. We want to show that body temperature is not affected by our new drug.
Testing a theory. "Everyone knows" that normal body temperature for healthy adults is 98.6°F. Therefore, it would be nice if we could show that after taking our drug, healthy adults still had an average body temperature of 98.6°F. So we might test a sample of 16 healthy adults, first giving them a standard dose of our drug and, when enough time had passed, taking their temperature to see whether it was 98.6°F on the average.
Here's the formula for the single sample t test:

t (dfW) = (X-bar - muT) / sX-bar

where sX-bar = s divided by the square root of n.
Data for the one sample t test. We randomly select a group of 16 healthy individuals from the population. We administer a standard clinical dose of our new drug for 3 days. We carefully measure body temperature. RESULTS: We find that the average body temperature in our sample is 99.5°F with an estimated standard deviation of 1.40° (s = 1.40). In Ch. 6 we asked whether 99.5°F was in the 95% CI around muT. It wasn't. We should get the same result with a t test.
Here's the computation:

t (15) = (99.5 - 98.6) / (1.40 / the square root of 16) = 0.90 / 0.35 = 2.57

Notice that the critical value of t changes with the number of degrees of freedom for s, our estimate of sigma, and must be taken from the t table. If n = 16 in a single sample, dfW = n - k = 15.
df     1       2       3       4       5       6       7       8
.05    12.706  4.303   3.182   2.776   2.571   2.447   2.365   2.306
.01    63.657  9.925   5.841   4.604   4.032   3.707   3.499   3.355

df     9       10      11      12      13      14      15      16
.05    2.262   2.228   2.201   2.179   2.160   2.145   2.131   2.120
.01    3.250   3.169   3.106   3.055   3.012   2.977   2.947   2.921

df     17      18      19      20      21      22      23      24
.05    2.110   2.101   2.093   2.086   2.080   2.074   2.069   2.064
.01    2.898   2.878   2.861   2.845   2.831   2.819   2.807   2.797

df     25      26      27      28      29      30      40      60
.05    2.060   2.056   2.052   2.048   2.045   2.042   2.021   2.000
.01    2.787   2.779   2.771   2.763   2.756   2.750   2.704   2.660

df     100     200     500     1000    2000    10000
.05    1.984   1.972   1.965   1.962   1.961   1.960
.01    2.626   2.601   2.586   2.581   2.578   2.576
We have falsified the null. We would write the results as follows: t (15) = 2.57, p < .05. Since we have falsified the null, we reject 98.6°F as the population mean for people who have taken the drug. Instead, we would predict that the average person, drawn from the same population as our sample, would respond as our sample did: they would have an average body temperature of 99.5°F after taking the drug. That is, they would have a slight fever.
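As a check on the arithmetic, here is a minimal Python sketch of the same computation; the numbers are the ones from this example, and only the standard library is used:

```python
import math

# Single sample t test: the body temperature example above
mu_T = 98.6    # mean predicted by the null hypothesis (muT)
x_bar = 99.5   # sample mean
s = 1.40       # estimated standard deviation (square root of MSW)
n = 16         # sample size

s_x_bar = s / math.sqrt(n)       # estimated standard error of the mean: 1.40 / 4 = 0.35
t = (x_bar - mu_T) / s_x_bar     # (99.5 - 98.6) / 0.35 = 2.57
df = n - 1                       # dfW = n - k = 15

print(f"t({df}) = {t:.2f}")      # t(15) = 2.57, which exceeds the .05 critical value of 2.131
```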
An obvious problem with the one sample experimental design: no control group.
So, we can use a single random sample of participants as their own controls if we measure them two or more times. If they are measured twice, we can use the repeated measures t test.
Computation of the repeated measures t test. Let's say we measured 5 moderately depressed inpatients and rated their depression with the Hamilton rating scale for depression. Then we treated them with CBT for 10 sessions and again got Hamilton scores. Lower scores = less depression.
Here are pre-treatment, post-treatment, and difference scores, with each post-treatment score subtracted from the pre-treatment score. Al scored 28 before treatment and 18 after. Bill scored 22 before and 14 after. Carol scored 23 before and 14 after. Dora scored 38 before and 27 after. Ed scored 33 before and 21 after.

         Before   After   Difference
Al         28      18        10
Bill       22      14         8
Carol      23      14         9
Dora       38      27        11
Ed         33      21        12

Mean difference = 10.00
In this case, there are 5 - 1 = 4 degrees of freedom. Now we can compute the estimated standard error of the difference scores:

sD-bar = sD / the square root of nD = 1.58 / 2.24 = 0.707
Now we are ready for the formula for the repeated measures t test: t equals the actual average difference between the means, minus their difference under H0, divided by the estimated standard error of the difference scores:

t (nD - 1) = (D-bar - muT) / sD-bar

Here is the computation in this case:

t (4) = (10.00 - 0.00) / 0.707 = 14.14
Here is how we would write the result: t (4) = 14.14, p < .01. In this case the means are 14.14 estimated standard errors apart. We wrote the result as p < .01, but a result this strong far exceeds any value in the table, even with just a few degrees of freedom. This treatment works!!!!
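The same computation as a short Python sketch, using the five patients' scores from the table above (standard library only):

```python
import math
import statistics

# Repeated measures t test: the five Hamilton-scale patients above
before = [28, 22, 23, 38, 33]
after = [18, 14, 14, 27, 21]

# From this point on we work only with the difference scores
diffs = [b - a for b, a in zip(before, after)]   # [10, 8, 9, 11, 12]

d_bar = statistics.mean(diffs)            # mean difference = 10.00
s_d = statistics.stdev(diffs)             # estimated standard deviation (n - 1 denominator) = 1.58
se_d = s_d / math.sqrt(len(diffs))        # estimated standard error of the difference scores = 0.707

mu_T = 0.0                                # H0: no change from pre- to post-treatment
t = (d_bar - mu_T) / se_d                 # 10.00 / 0.707 = 14.14
df = len(diffs) - 1                       # nD - 1 = 4

print(f"t({df}) = {t:.2f}")               # t(4) = 14.14, p < .01
```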
There are times a repeated measures design is appropriate and times when it is not. When it is not, we use a two sample, independent groups t test.
The t test for two independent groups. You already know a formula for the two sample t test: t (n-k) = sB/s. But now we want an alternative formula that allows us to directly compare two means by subtracting one from the other. It takes a little algebra, but the formula is pretty straightforward.
Three steps to computing the t test for two independent groups. First, we need to compute the actual difference between the two means. (That's easy: we just subtract one from the other.)
Step 2: Then we compare that difference to the difference predicted by H0. That's also easy because the null, as usual, predicts no difference between the means: H0: mu1 - mu2 = 0.00. That is, the null says that there is actually no average difference between the means of the two populations represented by the two groups.
Step 3 is a little harder. Here we compute the standard error of the difference between two means. Although the population means may be identical, sample means will vary because of random sampling fluctuation. The amount of fluctuation is determined by MSW and the sizes of the two groups.
So we take the actual difference between the two group means, subtract the theoretical difference (which is 0.00 according to the null), and then divide by the estimated standard error of the difference between the means of two independent groups.
Let's learn the conceptual basis and computation of the estimated standard error of the difference between 2 sample means.
The estimated average squared distance between a sample mean and the population mean, due solely to sampling fluctuation, is MSW/n, where n is the size of the sample. The estimated average squared distance between two sample means is their two squared distances from mu added together: MSW/n1 + MSW/n2.
So, if the samples are the same size, their average squared distance from each other equals: MSW/n1 + MSW/n2 = 2MSW/n. But if the samples have different numbers of scores, we have to use the average size of the two groups.
The problem is that we can't use the usual arithmetic average; we need to use a different kind of average called the harmonic mean, nH. Then the average squared distance between two independent sample means equals 2MSW/nH. The square root of that is the average unsquared difference between the means of the two samples, the denominator in the t test.
Here is the formula for the estimated standard error of the difference between the means of two independent samples:

estimated standard error of the difference = the square root of (2MSW / nH)

Here's the formula for the independent groups t test:

t (dfW) = ((X-bar1 - X-bar2) - 0.00) / the square root of (2MSW / nH)

where nH is the harmonic mean of the two sample sizes.
So, to do that computation we need to learn to compute nH.
Calculating the harmonic mean. Notice that this technique allows different numbers of subjects in each group. (Oh no!! My rat died! What is going to happen to my experiment?)
If the groups are the same size, the harmonic and ordinary mean number of participants is the same. For example, with 3 groups of 4 subjects each, nH = 3 / (1/4 + 1/4 + 1/4) = 3 / 0.75 = 4.00, the same as the ordinary mean.
When groups do not have equal numbers, the harmonic mean is smaller than the ordinary mean. With 4 groups of 6, 4, 8 and 4 participants, the ordinary mean = 22/4 = 5.50 participants each, while nH = 4 / (1/6 + 1/4 + 1/8 + 1/4) = 4 / 0.792 = 5.05 participants each.
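Here is a small Python sketch of these computations. The group sizes are the ones from the slides above; the two sample means and MSW in the second half are hypothetical numbers, included only to show the steps:

```python
import math
import statistics

# Harmonic mean of group sizes (statistics.harmonic_mean is in the standard library)
print(statistics.harmonic_mean([4, 4, 4]))      # 4.0: equal groups, same as the ordinary mean
print(statistics.harmonic_mean([6, 4, 8, 4]))   # about 5.05: smaller than the ordinary mean of 5.50

# Independent groups t test. The means and MSW below are made-up numbers,
# used only to show the steps.
x_bar_1, x_bar_2 = 12.0, 9.0                    # the two sample means
ms_w = 4.5                                      # mean square within
n_1, n_2 = 6, 4                                 # group sizes
n_h = statistics.harmonic_mean([n_1, n_2])      # 4.8

se_diff = math.sqrt(2 * ms_w / n_h)             # estimated standard error of the difference between the means
t = ((x_bar_1 - x_bar_2) - 0.0) / se_diff       # H0 predicts a difference of 0.00
df_w = n_1 + n_2 - 2                            # dfW = n - k

print(f"t({df_w}) = {t:.2f}")                   # t(8) = 2.19 with these made-up numbers
```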
The theory part:
ZX-bar scores. As you know from Chapter 4, the Z score of a sample mean is the number of standard errors of the mean that the sample mean is from mu. Here is the formula:

ZX-bar = (X-bar - mu) / sigmaX-bar
Confidence intervals with Z. As you learned in Chapter 4, if a sample mean differs from its population mean solely because of sampling fluctuation, 95% of the time it will fall somewhere in a symmetrical interval that goes 1.960 standard errors in both directions from mu. That interval is, of course, the CI.95.

CI.95 = mu ± 1.960 sigmaX-bar

Or, for theoretical population means:

CI.95 = muT ± 1.960 sigmaX-bar
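For example, here is a small Python sketch of the CI.95 around muT, reusing the body temperature numbers from earlier and treating sigma as known (that last part is an assumption made just for this illustration):

```python
import math

# CI.95 around a theoretical mean, with sigma treated as known
mu_T = 98.6
sigma = 1.40
n = 16

sigma_x_bar = sigma / math.sqrt(n)        # standard error of the mean = 0.35
lower = mu_T - 1.960 * sigma_x_bar        # 97.91
upper = mu_T + 1.960 * sigma_x_bar        # 99.29

print(f"CI.95 = {lower:.2f} to {upper:.2f}")   # a sample mean of 99.5 falls outside this interval
```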
muT, the CI.95 and H0. Most of the time we don't know mu, so we are really talking about muT. Most of the time, muT will be the value of mu suggested by the null hypothesis. If a sample mean falls outside the 95% confidence interval around muT, we have to assume that it has been pushed away from muT by some factor other than sampling fluctuation.
ZX-bar and the null hypothesis. If H0 says that the only reason a sample mean differs from muT is sampling fluctuation, as H0 usually does, then the value of ZX-bar can be used as a test of the null hypothesis. If H0 is correct, the sample mean should fall within the CI.95, within 1.960 standard errors of muT. If ZX-bar has an absolute value greater than 1.960, the sample mean falls outside the 95% confidence interval around muT and falsifies the null hypothesis.
The underlying logic of the Z test. Here is the formula for ZX-bar again:

ZX-bar = (X-bar - muT) / sigmaX-bar

When used as a test of the null, most textbooks identify ZX-bar simply as Z. We will follow that lead and, when we use it in a test of the null, call ZX-bar simply "Z." Here is the formula for the Z test:

Z = (X-bar - muT) / sigmaX-bar

If the absolute value of Z equals or exceeds 1.960, Z is significant at .05. If the absolute value of Z equals or exceeds 2.576, Z is significant at .01.
In the Z test, you start with a random sample and then expose it to an IV. You determine muT, the mean predicted if the null hypothesis is true. If the absolute value of Z exceeds 1.960, X-bar falls outside the CI.95 around muT and the null hypothesis is probably not correct. Since you have falsified the null, you must turn to H1, the experimental hypothesis. Also, since Z was significant, you conclude that were other individuals from the population treated the same way, they would respond similarly to the sample you studied.
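A minimal Python sketch of the Z test, again assuming sigma is known and reusing the body temperature numbers:

```python
import math

# Z test of the null hypothesis (sigma assumed known for this illustration)
x_bar = 99.5
mu_T = 98.6
sigma = 1.40
n = 16

z = (x_bar - mu_T) / (sigma / math.sqrt(n))   # 0.90 / 0.35 = 2.57

if abs(z) >= 2.576:
    print(f"Z = {z:.2f}, significant at .01")
elif abs(z) >= 1.960:
    print(f"Z = {z:.2f}, significant at .05")
else:
    print(f"Z = {z:.2f}, not significant; the null survives")
```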
There are two problems: we seldom know sigma, and it would be nice to have a control group. Let's deal with those problems one at a time. First, we'll deal with the fact that we don't know sigma and therefore can't compute sigmaX-bar.
The first problem: Since we don't know sigma, we must use our best estimate of sigma, s, the square root of MSW, and then estimate sigmaX-bar by dividing s by the square root of n, the size of the sample. We therefore must use the critical values of the t distribution to determine the CI.95 and CI.99 around muT in which the null hypothesis predicts that X-bar will fall. The exact value will depend on the degrees of freedom for s. Since s is the square root of MSW, dfW = n - k.
t curves and degrees of freedom revisited. [Figure: t curves with 1 df and 5 df plotted against the Z curve, frequency vs. standard deviation score.] To get 95% of the population in the body of the curve when there is 1 df, you have to go out over 12 standard deviations from the mean. With 5 df, you only have to go out about 2.6 standard deviations.
Critical values of the t curves. The following table defines t curves with 1 through 10,000 degrees of freedom. Each curve is defined by how many estimated standard deviations you must go from the mean to define a symmetrical interval that contains a proportion of .9500 or .9900 of the curve, leaving a proportion of .0500 or .0100 in the two tails of the curve (combined). Values for .9500/.0500 are shown in the .05 rows; values for .9900/.0100 are shown in the .01 rows.
df     1       2       3       4       5       6       7       8
.05    12.706  4.303   3.182   2.776   2.571   2.447   2.365   2.306
.01    63.657  9.925   5.841   4.604   4.032   3.707   3.499   3.355

df     9       10      11      12      13      14      15      16
.05    2.262   2.228   2.201   2.179   2.160   2.145   2.131   2.120
.01    3.250   3.169   3.106   3.055   3.012   2.977   2.947   2.921

df     17      18      19      20      21      22      23      24
.05    2.110   2.101   2.093   2.086   2.080   2.074   2.069   2.064
.01    2.898   2.878   2.861   2.845   2.831   2.819   2.807   2.797

df     25      26      27      28      29      30      40      60
.05    2.060   2.056   2.052   2.048   2.045   2.042   2.021   2.000
.01    2.787   2.779   2.771   2.763   2.756   2.750   2.704   2.660

df     100     200     500     1000    2000    10000
.05    1.984   1.972   1.965   1.962   1.961   1.960
.01    2.626   2.601   2.586   2.581   2.578   2.576
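The values in this table can be reproduced from the t distribution's percent point function if scipy happens to be installed (not something the course requires; shown only as a cross-check):

```python
from scipy import stats

# Two-tailed critical values of t: put .025 (or .005) in each tail
for df in (1, 5, 15, 60, 10000):
    crit_05 = stats.t.ppf(0.975, df)   # e.g. 2.131 for df = 15
    crit_01 = stats.t.ppf(0.995, df)   # e.g. 2.947 for df = 15
    print(f"df = {df}: .05 -> {crit_05:.3f}, .01 -> {crit_01:.3f}")
```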
Estimated distance of sample means from mu: the estimated standard error of the mean. We can compute the standard error of the mean when we know sigma: we just divide sigma by the square root of n, the size of the sample. Similarly, we can estimate the standard error of the mean, the estimated average unsquared distance of sample means from mu: we just divide s by the square root of n, the size of the sample in which we are interested.
Here's the formula:

sigmaX-bar = sigma / the square root of n, and its estimate, sX-bar = s / the square root of n
The one sample t test. If the absolute value of t exceeds the critical value at .05 in the t table, you have falsified the null and must accept the experimental hypothesis.
The second problem: no control group. Participants as their own controls: the repeated measures t test.
3 experimental designs: first = unrelated groups. There are three basic ways to run experiments. The first is to create different groups, each of which contains different individuals randomly selected from the population. You then measure the groups once to determine whether the differences among their means exceed what is expected from sampling fluctuation. That's what we've done until now.
Second type of design: repeated measures. The second is to create one random sample from the population. You then treat the group in different ways and measure that group two or more times, once for each different way the group is treated. Again, you want to determine whether the differences among the group's means, taken at different times, exceed what is expected from sampling fluctuation.
Baseline vs. post-treatment. If the first measurement is done before the start of the experiment, the result will be a baseline measurement. This allows participants to function as their own controls. In any event, the question is always whether the change between conditions is larger than you would expect from sampling fluctuation alone.
From this point on, we look only at the difference scores. That is, we ignore the original pre and post scores altogether and look only at the differences between time 1 and time 2. Of course, our first computation is the mean and estimated standard deviation of the difference scores.
Here is the example we used in learning the computation of the repeated measures t test:

                   A       B       C       D       E
X (difference)    10       8       9      11      12
(X - X-bar)        0.00   -2.00   -1.00    1.00    2.00
(X - X-bar)^2      0.00    4.00    1.00    1.00    4.00

Sum of X = 50, n = 5, X-bar = 50/5 = 10.00
Sum of (X - X-bar) = 0.00
Sum of (X - X-bar)^2 = 10.00 = SSW
MSW = SSW/(n-k) = 10.00/4 = 2.50
s = square root of MSW = 1.58
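The same chain of computations (SSW, then MSW, then s) in a brief Python sketch, standard library only:

```python
import math

# The SSW -> MSW -> s chain from the table above
diffs = [10, 8, 9, 11, 12]     # the difference scores
n = len(diffs)                 # 5
k = 1                          # one group of difference scores

x_bar = sum(diffs) / n                          # 10.00
ss_w = sum((x - x_bar) ** 2 for x in diffs)     # sum of squared deviations = 10.00
ms_w = ss_w / (n - k)                           # 10.00 / 4 = 2.50
s = math.sqrt(ms_w)                             # 1.58

print(x_bar, ss_w, ms_w, round(s, 2))           # 10.0 10.0 2.5 1.58
```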
The null hypothesis in our repeated measures t test. Theoretically, the null can predict any difference. Pragmatically, the null almost always predicts that there will be no change at all from the first to the second measurement, that is, that the average difference between time 1 and time 2 will be 0.00. Mathematically, H0: muD = 0.00, where muD is the average difference score.
Does this look familiar? We have a single set of difference scores to compare to muT. In the single sample t test, we compared a set of scores to muT. So the repeated measures t test is just like the single sample t test, only this time our scores are difference scores.
To do a t test, we need the expected mean under the null. We have that: muT = 0.00.
We also need the expected amount of difference between the two means given random sampling fluctuation.
The expected fluctuation of the difference scores is called the estimated standard error of the difference scores, sD-bar. The estimated standard error of the difference scores equals the estimated standard deviation of the difference scores divided by the square root of the number of difference scores. It has nD - k = nD - 1 degrees of freedom, where nD is the number of difference scores. Here is the formula for sD-bar:

sD-bar = sD / the square root of nD
The repeated measures t is a version of the single sample t test: t equals the actual average difference between the means, minus their difference under H0, divided by the estimated standard error of the difference scores.
By the way: repeated measures designs are the simplest form of related measures designs, in which each participant in each group is related to one participant in each of the other groups. The simplest way for participants in the groups to be related is to use the same participants in each group. But there are other ways. For example, each mouse in a four-condition experiment could have one litter-mate in each of the other conditions. But the commonest design is repeated measures, and that is what we will study.