Stats Lunch: Day 4 Intro to the General Linear Model and Its Many, Many Wonders, Including: T-Tests.


Steps of Hypothesis Testing
1. Restate your question as a research hypothesis and a null hypothesis about the populations
2. Determine the characteristics of the comparison distribution (mean and standard error of the sampling distribution of the means)
3. Determine the cutoff sample score on the comparison distribution at which the null hypothesis could be rejected
4. Determine your sample's score on the comparison distribution
5. Decide whether to reject the null hypothesis

The General Linear Model
A mathematical statement expressing that the score of any participant in a treatment condition is the linear sum of the population parameters:

Y_ij = μ_T + τ_i + ε_ij

where Y_ij is the individual score, μ_T the grand mean, τ_i the treatment effect, and ε_ij the experimental error.

Basic Assumptions of the GLM
1) Scores are independent.
2) Scores in treatment populations are (approximately) normally distributed.
3) Variances of scores in treatment populations are (approximately) equal.

The General Linear Model
- Based on the structure and assumptions of the GLM, we can estimate expected values for our models... which turns out to be useful for a whole bunch of analyses:
  - Canonical analyses
  - Multiple regression
  - ANOVA
  - t-tests
- Most statistics we regularly use are derivations (special cases) of the GLM
- t-tests and ANOVA methods became part of the scientific culture because they were easier to calculate by hand...

The General Linear Model
Basic point of the calculations: a ratio of...

What We Can Explain / What We Can't = t score, F score, etc.

- So what can we explain?
- All we can explain is what we manipulate: the impact of our IV (ex: what our drug does)
- Explained Variance (what our IV does) vs. Unexplained Variance (error)

Null hypothesis: the means of all groups being studied are the same:
Mean Old Drug = Mean New Drug = Mean Placebo

Explained Variance: the effect of our treatment, i.e., the reason why subjects in different conditions score differently:

τ_i = μ_i − μ_T

Unexplained Variance (Error): the reason why people in the same condition (ex: all the subjects in the New Drug group) don't have the same score:
- Can be due to anything (we hope it's not systematic)
- Not affected by the IV (it's the same whether the null is true or not)

ε_ij = Y_ij − μ_i

Putting the pieces together:

Y_ij = μ_T + (μ_i − μ_T) + (Y_ij − μ_i)
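The GLM decomposition above can be checked numerically. This is a minimal sketch in pure Python; the three groups and their scores are made up for illustration, not taken from the slides.

```python
# Sketch of the GLM decomposition Y_ij = mu_T + (mu_i - mu_T) + (Y_ij - mu_i),
# using invented group scores (hypothetical data, for illustration only).
from statistics import mean

groups = {
    "old_drug": [4, 5, 6],
    "new_drug": [8, 9, 10],
    "placebo":  [2, 3, 4],
}

all_scores = [y for scores in groups.values() for y in scores]
grand_mean = mean(all_scores)                       # mu_T

effects = {}
for name, scores in groups.items():
    group_mean = mean(scores)                       # mu_i
    effects[name] = group_mean - grand_mean         # tau_i = mu_i - mu_T
    for y in scores:
        error = y - group_mean                      # eps_ij = Y_ij - mu_i
        # Each score is exactly the sum of the three components:
        assert abs((grand_mean + effects[name] + error) - y) < 1e-9

# With equal group sizes, the treatment effects sum to zero:
assert abs(sum(effects.values())) < 1e-9
```

The final assertion makes the "treatment effect" interpretation concrete: the τ_i are deviations of group means from the grand mean, so they must balance out.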

W.S. Gosset and the t-test
- Gosset was employed by the Guinness brewery to scientifically examine beer making.
- Due to costs, he only had access to small samples of ingredients.
- He needed to determine the probability of events occurring in a population based on these small samples.
- He developed a method to estimate PARAMETERS based on SAMPLE STATISTICS.
- These estimates varied according to sample size (the t curves).

Estimating σ² from SD²
The variance in a sample should be similar to the variation in the population:
- The variance of a sample is smaller than that of the population
- If we just use SD² as the estimate, we will tend to UNDERESTIMATE the true population variance
- So SD² is a biased estimate of σ²
- How far off our guess is depends on the number of subjects we have...

Getting an Unbiased Estimate of Population Variance
- Remember, the larger the denominator (N), the smaller our estimate of population variance would be
- Thus we can correct the underestimate, and get an unbiased estimate of variance, by mathematically reducing N in the denominator

Biased (regular) formula: SD² = Σ(X − M)² / N
Unbiased formula: S² = Σ(X − M)² / (N − 1)

S² = unbiased estimate of the population variance (same probability of being over or under...)
S = estimate of the population SD = √S²
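The two formulas can be compared directly. A minimal sketch using Python's standard `statistics` module, whose `pvariance` and `variance` functions implement exactly the divide-by-N and divide-by-(N−1) formulas; the sample values are made up for illustration.

```python
# Biased (divide by N) vs. unbiased (divide by N-1) variance estimates.
# The sample is hypothetical illustration data.
from statistics import mean, pvariance, variance

sample = [2, 4, 4, 4, 5, 5, 7, 9]
N = len(sample)
M = mean(sample)                        # M = 5
ss = sum((x - M) ** 2 for x in sample)  # sum of squared deviations = 32

biased = ss / N          # SD^2: tends to underestimate sigma^2
unbiased = ss / (N - 1)  # S^2:  unbiased estimate of sigma^2

assert abs(biased - pvariance(sample)) < 1e-12   # stdlib "population" variance
assert abs(unbiased - variance(sample)) < 1e-12  # stdlib "sample" variance
assert unbiased > biased  # dividing by N-1 enlarges the estimate, removing the bias
```

Note that the gap between the two estimates shrinks as N grows, which is why the bias matters most for the small samples Gosset worked with.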

_________  (X-M) 2 N-1 Degrees of Freedom (df) The denominator of the equation for getting S 2 is called the “Degrees of Freedom” # of scores in a sample that are free to vary when estimating a population parameter… Ex: If we are trying to figure out the mean from 5 scores -We know the mean is 5, we know that  X = 25 (25/5 = M = 5) 5 + X + Y + Z + Q = X + Y + Z = X = 25 -So, in this last case X cannot vary (only one possible solution for X)

Characteristics of the Sampling Distribution of Means when using the estimated population variance (S²)
The logic is the same as when we knew the population variance...
- But we need to use S² when we calculate the variance of the sampling distribution (and the standard error)
- This will in turn influence the shape of the comparison distribution (it won't be quite normal)

Variance and standard error:

S² = Σ(X − M)² / (N − 1)
S²_M = S² / N
S_M = √(S²_M)

When finding the variance of the sampling distribution, ALWAYS divide by N...

Shape of the Sampling Distribution of Means when using estimated population variance (S²)
- The shape of the comparison distribution changes when using S²
- It differs slightly from the normal curve
- This effect is greater the fewer subjects there are (less information about variance)
- It is called the "t distribution"
- Similar to the normal curve, but the t distribution:
  - Has heavier tails
  - Hence, a larger percentage of scores occurs in the tails
  - Requires a higher sample mean to reject the null

_________  (X-M) 2 N-1 S 2 = _________  (X-M) 2 df -Thus, the shape of the t Distribution is effected by df (N-1) -Smaller N means the t dist. is less like normal curve -Instead of one comparison distribution that we used before (the normal curve), there a whole bunch of t distributions -Different distribution for each df -Thus for a mean of a given sample size, we need to compare that score to the appropriate comparison (t) distribution: -Has same df as your sample

Determining the t Score Cutoff Point (to reject the null)
- Need to use a t table (page 307 and Appendix A2)
- Has t cutoffs for each df at different alpha levels
- Both one-tailed and two-tailed tests

Comparing the Sample Mean to the Comparison Distribution
- Need to calculate a "t score" instead of a "z score":

t = (Treatment Effect) / S_M

- Lower N (df) means a greater percentage of scores in the tails
- Need a higher sample mean to reject the null
- For N > 30, the t distribution closely resembles the normal curve

Finally, We Do Something Useful...
Within-Subjects vs. Between-Subjects Designs
- Within-Subjects Design (Repeated Measures): research strategy in which the same subjects are tested more than once. Ex: measuring cold symptoms in the same people before and after they took medicine.
- Between-Subjects Design: two independent groups of subjects. One group gets a treatment (experimental group) and the other group does not (control group). Ex: a Baby Mozart group vs. a group that didn't get it.

We do a different kind of t test for each type of research design...

t Tests for Dependent and Independent Means
- Within-subjects design → use the t test for dependent means
  - Scores are dependent because they're from the same people...
- Between-subjects design → use the t test for independent means

t Tests for Dependent Means
- AKA: paired-samples, matched-samples, within-groups
- We have two sets of scores (1 and 2) from the same subjects
  - Ex: before and after treatment...
- Works exactly the same as a t test for a single sample
  - Except now we have two scores to deal with instead of just one...

Variations of the Basic Formula for Specific t-test Varieties

Within-subjects: t = (Treatment Effect) / S_M
- The treatment effect equals the mean of the change scores (Time 2 − Time 1)
- If the null hypothesis is true, the mean of the change scores = 0
- The S_M of the change scores is used

Between-subjects:
- The treatment effect equals the mean of the experimental group − the mean of the control group
- S_M is derived from a weighted estimate (based on N) of the two groups' variances
- In other words, we are controlling for the proportion of the TOTAL df contributed by each sample:

df_total = df_sample1 + df_sample2
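Both varieties can be computed by hand from these formulas. A minimal sketch in pure Python (all data values are invented for illustration; SPSS would report the same t statistics):

```python
# Dependent- and independent-samples t statistics computed from the slide
# formulas, using hypothetical data.
from math import sqrt
from statistics import mean, variance  # variance() divides by N-1

# --- Dependent (paired) t: t = mean(change scores) / S_M of change scores ---
before = [10, 12, 9, 11, 13]
after  = [12, 14, 9, 13, 14]
change = [a - b for a, b in zip(after, before)]   # Time 2 - Time 1
n = len(change)
s_m = sqrt(variance(change) / n)                  # estimated standard error
t_dep = mean(change) / s_m                        # null says mean(change) = 0

# --- Independent t: pooled variance weighted by each sample's df ---
g1, g2 = [4, 5, 6, 5], [7, 8, 6, 9]
n1, n2 = len(g1), len(g2)
df1, df2 = n1 - 1, n2 - 1
df_total = df1 + df2
s2_pooled = (df1 / df_total) * variance(g1) + (df2 / df_total) * variance(g2)
s_diff = sqrt(s2_pooled / n1 + s2_pooled / n2)    # SE of the mean difference
t_ind = (mean(g2) - mean(g1)) / s_diff
```

Each t is then compared against the t distribution with the matching df (n − 1 for the paired test, df_total for the independent test) to decide whether to reject the null.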

Conducting a Repeated-Measures t-test in SPSS
1) Click on "Analyze"
2) Then "Compare Means"
3) Select "Paired-Samples T Test"
4) Choose the paired variables
5) Click OK

Conducting a Repeated-Measures t-test in SPSS
- Make sure N is correct
- Make sure df is correct
- Use 2-tailed tests

Conducting an Independent-Samples t-test in SPSS
1) Click on "Analyze"
2) Then "Compare Means"
3) Select "Independent-Samples T Test"
4) Add your DV here
5) Add your IV (grouping variable)
6) Click on "Define Groups"

Conducting an Independent-Samples t-test in SPSS
7) Define your groups according to the structure of your data file (be sure to remember which group is which)
8) Click on "Continue"
9) Click on "OK"

Back to the GLM (Remember how I said it's all the same?)
- You don't have to use t-tests (even if you only have two groups/scores)
- Can also use regression or ANOVA
- Comparing results from a paired-samples t-test with results from a repeated-measures ANOVA:
  - The p values are identical
  - F = t²
- Using the ANOVA module gives you more options, such as effect sizes, power, and parameter estimates...
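The F = t² identity is easy to verify by hand for the two-group case. A sketch in pure Python computing both statistics on the same (invented) data; for two independent groups, the ANOVA's MS-within is exactly the pooled variance from the t-test, so the two tests must agree.

```python
# Check the slide's claim that with two groups F = t^2, computing both
# statistics by hand on hypothetical data.
from math import sqrt, isclose
from statistics import mean, variance

g1, g2 = [3, 4, 5, 4], [6, 7, 5, 8]
n1, n2 = len(g1), len(g2)
grand = mean(g1 + g2)

# Independent-samples t with pooled variance
df1, df2 = n1 - 1, n2 - 1
s2_pooled = (df1 * variance(g1) + df2 * variance(g2)) / (df1 + df2)
t = (mean(g2) - mean(g1)) / sqrt(s2_pooled / n1 + s2_pooled / n2)

# One-way between-groups ANOVA on the same data
ss_between = n1 * (mean(g1) - grand) ** 2 + n2 * (mean(g2) - grand) ** 2
ms_between = ss_between / 1        # df_between = k - 1 = 1 for two groups
ms_within = s2_pooled              # MS_within is the pooled variance
F = ms_between / ms_within

assert isclose(F, t ** 2)          # the two analyses are the same GLM
```

This is the "special case" relationship in action: the t-test and the two-group ANOVA are the same GLM comparison, just reported on different scales.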

Extra Slides on Estimating Variance in Independent-Samples t-tests.

All Sorts of Estimating...
Remember, the only thing we KNOW is what we know about our samples.
- Need to use this to estimate everything else...

Estimating Population Variance
- We assume that the two populations have the same variance...
- So we could ESTIMATE the population variance from EITHER sample (or both)
- But what are the chances that we would get exactly the SAME estimate of population variance (S²) from 2 different samples?
- In practice, we would get two DIFFERENT estimates for what should be the SAME number.

Estimating Population Variance
- If we have two different estimates, the best strategy is to average them somehow...
- Pooled Estimate of Population Variance (S²_pooled)
- But we can't just average the two estimates together (especially if one sample is larger)...
  - One estimate would be better than the other
- We need a "weighted average" that controls for the quality of the estimates we get from different N's
- We control for the proportion of the total df each sample contributes:

df_total = df_sample1 + df_sample2

Ex: Sample 1 has N = 10, df = 9; Sample 2 has N = 9, df = 8
We would calculate S² for each sample just as we've done it before (SS/df).
Ex: S²_1 = 12, S²_2 = 10

S²_pooled = (df_1/df_total) * S²_1 + (df_2/df_total) * S²_2
df_total = df_sample1 + df_sample2 = 9 + 8 = 17

S²_pooled = (9/17) * 12 + (8/17) * 10 = .53 * 12 + .47 * 10 = 6.36 + 4.70 = 11.06

If we had just taken the plain average, we would have estimated 11.
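The pooled-variance arithmetic from this example is simple to reproduce. A short Python sketch using the slide's numbers:

```python
# Reproduce the slide's pooled-variance example: N1=10, N2=9, S2_1=12, S2_2=10.
n1, n2 = 10, 9
df1, df2 = n1 - 1, n2 - 1      # 9 and 8
s2_1, s2_2 = 12, 10            # per-sample variance estimates (SS/df)
df_total = df1 + df2           # 17

# Weighted average: each estimate weighted by its share of the total df
s2_pooled = (df1 / df_total) * s2_1 + (df2 / df_total) * s2_2
plain_average = (s2_1 + s2_2) / 2

assert round(s2_pooled, 2) == 11.06   # weighted, as on the slide
assert plain_average == 11.0          # unweighted average would misestimate
```

The weighted estimate leans toward 12 because the larger sample (df = 9 of 17) contributes more reliable information than the smaller one.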

Estimating the Variance of the 2 Sampling Distributions...
We need to do this before we can describe the shape of the comparison distribution (the Distribution of the Differences Between Means).
- We assume that the variance of the population for each group is the same
- But if N differs, our estimates of the variance of the sampling distribution are not the same (affected by sample size)
- So we need to estimate the variance of the sampling distribution for EACH population (using the N from each sample):

S²_M1 = S²_pooled / N_1    S²_M2 = S²_pooled / N_2

Ex: S²_M1 = 11.06/10 = 1.11    S²_M2 = 11.06/9 = 1.23

Finally, we can find the variance and SD of the Distribution of the Differences Between Means (the comparison distribution):

S²_Difference = S²_M1 + S²_M2
Ex: S²_Difference = 1.11 + 1.23 = 2.34

From this, we can find the SD of the Distribution of the Differences Between Means:

S_difference = √(S²_Difference)
Ex: S_difference = √2.34 = 1.53

t = (M_2 − M_1) / S_difference
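The whole chain from S²_pooled to S_difference can be run as one calculation. A sketch using the slide's numbers, rounding each step to two decimals as the slide does:

```python
# Continue the slide example: from the pooled variance to the SD of the
# distribution of differences between means (intermediate values rounded
# to 2 decimals, matching the slide's arithmetic).
from math import sqrt

s2_pooled = 11.06
n1, n2 = 10, 9

s2_m1 = round(s2_pooled / n1, 2)    # variance of sampling dist 1: 1.11
s2_m2 = round(s2_pooled / n2, 2)    # variance of sampling dist 2: 1.23
s2_difference = s2_m1 + s2_m2       # 2.34
s_difference = sqrt(s2_difference)  # about 1.53

# With sample means in hand, the test statistic would be:
# t = (m2 - m1) / s_difference, compared to the t dist with df_total = 17
```

(Carrying full precision instead of rounding each step gives 2.33 rather than 2.34; the slide's 2.34 comes from summing the rounded intermediates.)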