Introduction to the t-statistic Introduction to Statistics Chapter 9 Oct 13-15, 2009 Classes #16-17.

The Problem with Z-scores  Z-scores have a shortcoming as an inferential statistic: computing the standard error requires knowing the population standard deviation (σ).  In reality, we rarely know the value of σ.  When we don’t know σ, we can’t compute the standard error.  Therefore, we use the t statistic, rather than the z-score, for hypothesis testing.

How to test?  We still need information about the standard deviation of the population.  To estimate it, use the formula learned back in chapter 4: $s = \sqrt{SS/(n-1)}$  Use this to calculate the estimated standard error ($s_M$).  We calculate the t statistic much as we calculated the z statistic, but now we use the estimated standard error of the mean (rather than the standard error of the mean).

The t Statistic  Use the sample variance (s²) to estimate the population variance: $s^2 = SS/df = SS/(n-1)$  Use the variance s² in the formula to get the estimated standard error: $s_M = \frac{s}{\sqrt{n}} = \sqrt{\frac{s^2}{n}}$  This provides an estimate of the standard distance between M and μ when σ is unknown.

The t Statistic  Finally, we replace the standard error in the z-score formula with the estimated standard error to get the t statistic formula: $t = \frac{M - \mu}{s_M}$
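The three calculations on these slides (sample variance, estimated standard error, t) can be sketched in Python. This is only an illustration: the function name `t_statistic` and the numbers in the usage line are hypothetical and do not come from the slides.

```python
import math

def t_statistic(ss, n, sample_mean, mu):
    """One-sample t computed from SS, n, M, and the hypothesized mu."""
    df = n - 1
    variance = ss / df                         # s^2 = SS / (n - 1)
    est_std_error = math.sqrt(variance / n)    # s_M = sqrt(s^2 / n)
    t = (sample_mean - mu) / est_std_error     # t = (M - mu) / s_M
    return variance, est_std_error, t

# Hypothetical numbers chosen so the arithmetic is easy to follow:
# SS = 72, n = 9, M = 13, mu = 10
variance, s_m, t = t_statistic(ss=72, n=9, sample_mean=13, mu=10)
print(variance, s_m, t)
```

With these made-up inputs the variance is 72/8 = 9, the estimated standard error is the square root of 9/9, and t is the mean difference of 3 divided by that standard error.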

Illustration  In chapter 8, for z-scores, we used the population SD: $\sigma_M = \frac{\sigma}{\sqrt{n}}$, plugging it into the z-score formula: $z = \frac{M - \mu}{\sigma_M}$  Now we are using an estimate of the standard error based on the sample SD: $s_M = \frac{s}{\sqrt{n}}$, plugging it into the t statistic formula: $t = \frac{M - \mu}{s_M}$

Another change…  Up until this chapter, we have been using formulas that used the standard deviation as part of the standard error formula.  Now we shift our focus to the formula based on variance.  On page 234, the book gives the reasoning for this.  Main reason: sample variance (s²) provides an accurate and unbiased estimate of the population variance (σ²).

One more change…  Although the definitional formula for sum of squares is the most direct way to compute SS, it has its problems: when the mean is not a whole number, the deviations contain decimals, so the calculations become more difficult and prone to rounding error.  Therefore, from now on we will use the SS computational formula, which can be found on page 93 in the textbook.  “From now on” means all future tests and the final exam.
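As a rough sketch of the difference between the two SS formulas, the Python below computes SS both ways. It assumes the computational formula referred to is the standard one, SS = ΣX² − (ΣX)²/n; the scores are hypothetical and were picked so that the mean is not a whole number.

```python
def ss_definitional(scores):
    """SS as the sum of squared deviations from the mean."""
    mean = sum(scores) / len(scores)
    return sum((x - mean) ** 2 for x in scores)

def ss_computational(scores):
    """SS = sum of X^2 minus (sum of X)^2 / n; no deviations needed."""
    n = len(scores)
    return sum(x * x for x in scores) - sum(scores) ** 2 / n

scores = [1, 4, 6, 8]   # hypothetical scores; the mean is 4.75
ss_def = ss_definitional(scores)
ss_comp = ss_computational(scores)
print(ss_def, ss_comp)
```

Both functions return the same SS. By hand, the computational version only needs ΣX = 19 and ΣX² = 117, which avoids squaring four decimal deviations.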

Sample Variance  Therefore, we will use the sample variance rather than the sample standard deviation to compute $s_M$.  The sample standard deviation is a descriptive statistic rather than an inferential statistic.  The sample variance provides the most accurate way to estimate the standard error.

We are now using the variance-based formula in these equations: $s_M = \frac{s}{\sqrt{n}} = \sqrt{\frac{s^2}{n}}$  Why? The purpose is inferential rather than descriptive: we are drawing inferences about the population.

t statistic  Definition: Used to test hypotheses about an unknown population mean μ when the value of σ is unknown.  The formula for the t statistic has the same structure as the z-score formula, except the t statistic uses the estimated standard error in the denominator: $t = \frac{M - \mu}{s_M}$

Z-score vs. T-Score  Z-distribution stays the same, regardless of sample size  T-distribution changes, depending on how many pieces of information you have: degrees of freedom here, df = n-1

Everything else stays the same  Have an alpha level  Have one-tailed and two-tailed tests  Determine boundaries of critical region  Determine whether t-statistic falls in critical region  If it does, reject null and know that p<alpha

Degrees of Freedom  How well does s approximate σ? It depends on the size of the sample: the larger the sample, the better the approximation.  Degrees of freedom (df) = n-1  Measures the number of scores that are free to vary when computing SS for sample data.  The value of df also describes how well a t distribution approximates a normal curve.

Degrees of Freedom  Degrees of Freedom = df = n-1  As df (sample size) gets larger, 3 things result: 1) s² (sample variance) better represents σ² (population variance). 2) t better approximates z. 3) In general, the sample better represents the population.

The t-distribution  t distribution: the set of all possible t statistics obtained by selecting all possible samples of size n from a given population.  How well the t distribution approximates a normal distribution is determined by the df.  In general, the greater n (and df), the more normal the t distribution becomes.  The t distribution is more variable and flatter than the normal z-score distribution. Why is this the case? In a t statistic, both the sample mean and the estimated standard error vary from sample to sample; in a z-score, only the sample mean varies.
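The claim that t is more variable than z can be checked with a small simulation. The sketch below is illustrative only: the population (μ = 50, σ = 10), the sample size n = 5, and the number of repetitions are all hypothetical choices, not values from the slides.

```python
import math
import random
import statistics

random.seed(0)                    # fixed seed so the run is repeatable
MU, SIGMA, N = 50.0, 10.0, 5      # hypothetical population and sample size

z_values, t_values = [], []
for _ in range(20000):
    sample = [random.gauss(MU, SIGMA) for _ in range(N)]
    m = statistics.mean(sample)
    s = statistics.stdev(sample)  # sample SD, with n - 1 in the denominator
    z_values.append((m - MU) / (SIGMA / math.sqrt(N)))  # known standard error
    t_values.append((m - MU) / (s / math.sqrt(N)))      # estimated standard error

var_z = statistics.variance(z_values)
var_t = statistics.variance(t_values)
print(round(var_z, 2), round(var_t, 2))
```

The z values should have a variance close to 1, while the t values (df = 4 here) should be noticeably more spread out, because the denominator changes from sample to sample.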

Distributions of the t statistic

The Versatility of the t test  You do not need to know σ when testing with t.  The t test permits hypothesis testing in situations in which σ is unknown.  All you really need to compute t is a sensible null hypothesis and a sample drawn from the unknown population.

Hypothesis Testing with t (two tails)  Same four steps, with a few differences: Now estimating the standard error, so compute t rather than z Consult t-distribution table rather than Unit Normal Table to find critical value for t (this will involve the calculation of the df)

Hypothesis Testing w/ t-statistic  Instead of the Unit Normal Table, we now have the t-table p. 531-532 Similar in form to the Unit Normal Table Pay attention to the df column!!  Let’s think about this table for a minute Looking at the two-tail, p=0.05 column:  What is value at 10 df?  What is value at 20 df?  What is value at 30 df?  What is value at 120 df?
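For readers without the table at hand, the two-tailed critical values can be approximated numerically. The sketch below is one possible approach using only the standard library (Simpson's rule on the t density plus bisection); the function names are hypothetical, and it is meant for typical table entries rather than as a general-purpose routine.

```python
import math

def upper_tail(t, df, steps=2000):
    """Approximate P(T > t) for t >= 0 using Simpson's rule on [0, t]."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))

    def pdf(x):
        return c * (1 + x * x / df) ** (-(df + 1) / 2)

    h = t / steps
    total = pdf(0.0) + pdf(t)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * pdf(i * h)
    return 0.5 - total * h / 3       # half the area lies above zero

def t_critical(df, alpha=0.05, tails=2):
    """Find the positive t that leaves alpha/tails in the upper tail."""
    target = alpha / tails
    lo, hi = 0.0, 50.0               # assumes the answer is below 50
    for _ in range(60):              # bisection
        mid = (lo + hi) / 2
        if upper_tail(mid, df) > target:
            lo = mid                 # tail still too large: move right
        else:
            hi = mid
    return (lo + hi) / 2

for df in (10, 20, 30, 120):
    print(df, round(t_critical(df), 3))
```

The results should agree with the usual table entries for the two-tail, p = 0.05 column (2.228, 2.086, 2.042, and 1.980), showing how the critical value shrinks toward the z value of 1.96 as df grows.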

A portion of the t-distribution table

Hypothesis Testing with t (two tails)  Step 1: State the hypotheses.  Step 2: Set α and locate the critical region. You will need to calculate the df to do this, and use the t distribution table.  Step 3: Graph (shade) the critical region.  Step 4: Collect sample data and compute t. This will involve 3 calculations, given SS, n, μ, and M:  a) the sample variance (s²)  b) the estimated standard error ($s_M$)  c) the t statistic

Hypothesis Testing with t (two tails)  Step 5: Go back to the graph and see whether the calculated t falls in the critical region.  Step 6: Make a decision. Compare the t computed in Step 4 ($t_{CALC}$) with the $t_{CRIT}$ found in the t table:  If $|t_{CALC}| > t_{CRIT}$ (ignoring signs), reject $H_0$.  If $|t_{CALC}| < t_{CRIT}$ (ignoring signs), fail to reject $H_0$.

One-Tailed Hypothesis Testing with t  Same as with z; only steps 1 and 2 change.  Step 1: Now use directional hypotheses.  $H_0$: μ = ? and $H_1$: μ < ? (predicts decrease) OR  $H_0$: μ = ? and $H_1$: μ > ? (predicts increase).  Step 2: Now the critical region is located in only one tail of the distribution (the sign of $t_{CRIT}$ represents the direction of the predicted effect). You will have to use a different column on the t distribution table.
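The decision rule for two-tailed and one-tailed tests can be written as a short function. This is a minimal sketch: the function name `decide` and the direction labels are hypothetical, and the critical value is assumed to have been looked up in the t table already.

```python
def decide(t_calc, t_crit, direction="two-tailed"):
    """Return the test decision.

    direction is 'two-tailed', 'increase', or 'decrease'
    (hypothetical labels used only in this sketch).
    """
    t_crit = abs(t_crit)                      # table values are listed unsigned
    if direction == "two-tailed":
        in_region = abs(t_calc) > t_crit      # ignore the sign of t
    elif direction == "increase":
        in_region = t_calc > t_crit           # critical region in upper tail
    elif direction == "decrease":
        in_region = t_calc < -t_crit          # critical region in lower tail
    else:
        raise ValueError("unknown direction: " + direction)
    return "reject H0" if in_region else "fail to reject H0"

print(decide(2.50, 2.131))                    # two-tailed
print(decide(-2.50, 1.753, "increase"))       # wrong direction for the prediction
```

Note that in a one-tailed test a large t in the unpredicted direction does not fall in the critical region, which is why the second call fails to reject.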

Example 1  Do eye-spot patterns affect behavior?  If eye-spots do affect behavior, birds should spend more or less time in the chamber with eye-spots painted on the walls.  Sample of n = 16 birds, allowed to wander between the 2 chambers for 60 minutes.  If eye-spots do not affect behavior, we’d expect the birds to spend about 30 minutes in each chamber.  We’re told the sample mean M = 39 and SS = 540.

Example 1  Step 1: State the hypotheses.  $H_0$: µ plain side = 30 min.  $H_1$: µ plain side ≠ 30 min.  Two-tailed, alpha = 0.05  Step 2: Locate the critical region, based on df. What are df here? What is the critical value?

Step 3: Shade in critical region

Example 1  Step 4: Calculate the t-statistic.  First calculate the sample variance: $s^2 = SS/(n-1) = 540/15 = 36$  Next use the sample variance (s²) to calculate the estimated standard error: $s_M = \sqrt{s^2/n} = \sqrt{36/16} = 1.5$  Finally, compute the t-statistic: $t = \frac{M - \mu}{s_M} = \frac{39 - 30}{1.5} = 6.00$

Example 1  Step 5: Make a decision.  t-calculated = 6.00, t-critical = ±2.131  We observe that our t-value is in the region of rejection, so we reject the null hypothesis.  We conclude that eye-spots have an effect on predatory behavior.
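The arithmetic in Example 1 can be re-traced in a few lines of Python. The inputs (n = 16, M = 39, SS = 540, μ = 30) and the critical value 2.131 are taken from the slides; the variable names are just for this sketch.

```python
import math

n, sample_mean, ss, mu = 16, 39, 540, 30   # values given in Example 1
t_crit = 2.131                             # two-tailed, alpha = .05, df = 15

df = n - 1                                 # 15
variance = ss / df                         # s^2 = 540 / 15
est_std_error = math.sqrt(variance / n)    # s_M = sqrt(36 / 16)
t = (sample_mean - mu) / est_std_error     # (39 - 30) / s_M

decision = "reject H0" if abs(t) > t_crit else "fail to reject H0"
print(df, variance, est_std_error, t, decision)
```

The script reproduces the values on the slides: a variance of 36, an estimated standard error of 1.5, and t = 6.00, which lies well beyond ±2.131.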

Example 2  A teacher was trying to see whether a new teaching method would increase the Test of English as a Foreign Language (TOEFL) scores of students. She received a report which included a partial list of previous scores on the exam. Unfortunately, most of the records were burned in a fire that occurred in the school’s Records Department. From the available data, students taught by the old methods had μ = 580. She tested her method in a class of 20 students and got a mean of 595 and a variance of 225. Is this increase statistically significant at the 0.05 level in a 2-tailed test?

Example 2  Step 1:  Step 2:

Example 2: Step 3

Example 2  Step 4:  Step 5:
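The steps for Example 2 are left blank on the slides. One way to check your own work is sketched below, using the values from the problem statement (n = 20, M = 595, s² = 225, μ = 580). The critical value 2.093 is the standard table entry for df = 19, two tails, α = .05, and is assumed to match the textbook's table.

```python
import math

n, sample_mean, variance, mu = 20, 595, 225, 580   # from the problem statement
t_crit = 2.093          # assumed table value: df = 19, two-tailed, alpha = .05

df = n - 1
est_std_error = math.sqrt(variance / n)            # s_M = sqrt(225 / 20)
t = (sample_mean - mu) / est_std_error             # (595 - 580) / s_M

decision = "reject H0" if abs(t) > t_crit else "fail to reject H0"
print(df, round(est_std_error, 3), round(t, 3), decision)
```

Here the variance is given directly, so the first calculation (s² from SS) is skipped; t works out to roughly 4.47, which exceeds the critical value.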

Example 3  A researcher believes that children in poverty-stricken regions are undernourished and underweight. Past studies show the weight of 6-year-olds is normally distributed with μ = 20.9 kg. However, the exact mean and standard deviation of the population are not available. The researcher collects a sample of 9 children from these poverty-stricken regions, with a sample mean of 17.3 kg and s = 2.51 kg.  Using a one-tailed test and a 0.01 level of significance, determine whether this sample is significantly different from what would be expected for the population of 6-year-olds.

Example 3  Step 1  Step 2

Example 3: Step 3
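Example 3 can be checked the same way. The sketch below uses the values from the problem (n = 9, M = 17.3, s = 2.51, μ = 20.9) and treats the test as one-tailed in the "decrease" direction, since the researcher predicts underweight children. The critical value 2.896 is the standard table entry for df = 8, one tail, α = .01, and is assumed to match the textbook's table.

```python
import math

n, sample_mean, s, mu = 9, 17.3, 2.51, 20.9   # from the problem statement
t_crit = 2.896          # assumed table value: df = 8, one-tailed, alpha = .01

df = n - 1
est_std_error = s / math.sqrt(n)              # s_M = s / sqrt(n)
t = (sample_mean - mu) / est_std_error        # negative: sample mean is lower

# One-tailed test predicting a decrease: critical region is the lower tail.
decision = "reject H0" if t < -t_crit else "fail to reject H0"
print(df, round(est_std_error, 3), round(t, 3), decision)
```

Because the sample standard deviation is given rather than the variance, the estimated standard error uses the s/√n form; t comes out near −4.30, which falls in the lower-tail critical region.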

Example 4  A researcher has developed a new formula (Sunblock Extra) that she claims will help protect against the harmful rays of the sun. In a recent promotion for the new formula, she is quoted as saying she is sure her new formula is better than the old one (Sunscreen).  Her prediction: the “improved” Sunblock Extra will score higher than the previous Sunscreen score of 12.  She decides to use the .05 significance level to test for differences. To the right are the Sunblock Extra scores for participants in her study.  In notation form:  $H_0$:  $H_A$:  Determine if there is a significant difference between the new product and the old one (make your decision and interpret).

Example 4  Step 2:

Example 4: Step 3

Example 5  Scientists believe that the “Monstro Motors” new model will get the highest gas mileage of any car on their lot. Although not much data is available on the older cars, from a review of previous models they estimate that the best of the rest of their cars achieved 67 m.p.g. They are using an alpha level of α = .01.  $H_0$:  $H_A$:  Determine if there is a significant difference between the MPG of the new car and the best old model on their lot (make your decision and interpret).

Example 5  Step 2:

Example 5: Step 3

Steps  Step 2?  Step 3?  Step 4?

Effect Size  estimated d  r squared

Effect Size  Example 4 Small effect (small to medium)  Example 5 Large effect
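The two effect-size measures named above can be sketched as follows. The formulas on the original slide did not survive, so this assumes the standard one-sample definitions: estimated d = (M − μ)/s and r² = t²/(t² + df). The numbers reuse Example 1 (M = 39, μ = 30, s² = 36, t = 6.00, df = 15), since the data for Examples 4 and 5 are not available here.

```python
import math

def estimated_d(sample_mean, mu, variance):
    """Estimated Cohen's d: mean difference divided by the sample SD."""
    return (sample_mean - mu) / math.sqrt(variance)

def r_squared(t, df):
    """Proportion of variance accounted for: t^2 / (t^2 + df)."""
    return t ** 2 / (t ** 2 + df)

# Values from Example 1
d = estimated_d(sample_mean=39, mu=30, variance=36)
r2 = r_squared(t=6.0, df=15)
print(round(d, 2), round(r2, 3))
```

For Example 1 this gives d = 9/6 = 1.5 and r² = 36/51, about 0.706, both of which would conventionally be read as large effects.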

Credits  http://myweb.liu.edu/~nfrye/psy53/ch9.ppt#9  http://homepages.wmich.edu/~malavosi/Chapt9PPT_S_05.ppt#2  http://faculty.plattsburgh.edu/alan.marks/Stat%20206/Introduction%20to%20the%20t%20Statistic.ppt#4  http://home.autotutor.org/hiteh/Stats%20S04/Statistics04onesamplet-test1.ppt#7
