Introduction to Statistics for the Social Sciences SBS200 - Lecture Section 001, Fall 2018 Room 150 Harvill Building 10:00 - 10:50 Mondays, Wednesdays.

Introduction to Statistics for the Social Sciences SBS200 - Lecture Section 001, Fall 2018 Room 150 Harvill Building 10:00 - 10:50 Mondays, Wednesdays & Fridays. Welcome Monday 10/29/18

The Green Sheets

Before next exam (November 16th)
Schedule of readings
Please read chapters in OpenStax textbook
Please read Chapters 2, 3, and 4 in Plous
  Chapter 2: Cognitive Dissonance
  Chapter 3: Memory and Hindsight Bias
  Chapter 4: Context Dependence

Lab sessions this week: Project 3

Project 3: Analysis of Variance (ANOVA)

Preview of homework assignment

Five steps to hypothesis testing
Step 1: Identify the research problem (hypothesis)
  Describe the null and alternative hypotheses
Step 2: Decision rule
  Alpha level? (α = .05 or .01)
  One or two tailed test?
  Balance between Type I versus Type II error
  Critical statistic (e.g. z or t or F or r) value?
Step 3: Calculations
Step 4: Make decision whether or not to reject null hypothesis
  If observed z (or t) is bigger than critical z (or t) then reject null
Step 5: Conclusion - tie findings back in to research problem

We lose one degree of freedom for every parameter we estimate
Degrees of Freedom (d.f.) is a parameter based on the sample size that is used to determine the value of the t statistic. Degrees of freedom tell how many observations are used to calculate s, less the number of intermediate estimates used in the calculation. For a single sample we estimate one parameter (the mean) before calculating s, so d.f. = n - 1.
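A small Python sketch of why estimating the mean costs one degree of freedom: deviations from the sample mean always sum to zero, so once n - 1 of them are known the last one is forced. The four scores are hypothetical values chosen only for illustration.

```python
# Hypothetical sample (not from the lecture), used only to illustrate the idea.
scores = [4, 8, 6, 2]
n = len(scores)
mean = sum(scores) / n

# Deviations from the sample mean always add up to zero.
deviations = [x - mean for x in scores]

# Knowing the first n - 1 deviations pins down the last one,
# so only n - 1 deviations are free to vary.
forced_last = -sum(deviations[:-1])

print("deviations:", deviations)
print("sum of deviations:", sum(deviations))
print("last deviation implied by the others:", forced_last)
print("degrees of freedom:", n - 1)
```

This is why s is calculated by dividing the sum of squared deviations by n - 1 rather than n.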

Comparing z score distributions with t-score distributions
Similarities between z-scores and t-scores include:
  Using bell-shaped distributions to make confidence interval estimations and decisions in hypothesis testing
  Use table to find areas under the curve (different table, though: areas often differ from z scores)
Summary of 2 main differences for t-scores:
  We are now estimating standard deviation from the sample (we don’t know population standard deviation)
  We have to deal with degrees of freedom

Differences include:
1) We use t-distribution when we don’t know standard deviation of population, and have to estimate it from our sample
2) The shape of the sampling distribution is very sensitive to small sample sizes (it actually changes shape depending on n)
Please notice: as sample sizes get smaller, the tails get thicker. As sample sizes get bigger, the tails get thinner and look more like the z-distribution

Please note: Once sample sizes get big enough, the t distribution (curve) looks almost exactly like the z distribution (curve)
3) Because the shape changes, the relationship between the scores and proportions under the curve changes (so we would need a different table for every possible n, but just the important ones are summarized in our t-table)

We use degrees of freedom (df) to approximate sample size
Interpreting t-table
Technically, we have a different t-distribution for each sample size
This t-table presents the most useful values for several distributions (organized by degrees of freedom)
Each curve is based on its own degrees of freedom (df), which is based on sample size, and has its own table tying together t-scores with area under the curve
[Figure: t-distribution curves for n = 17 and n = 5]
Remember these useful values for z-scores? 1.64 1.96 2.58

[t-table: rows are organized by df; columns give the area between two scores, the area beyond two scores (out in tails), and the area in each tail]
Notice that with a large sample size the t-table values are the same as the useful values for z-scores: 1.64, 1.96, 2.58
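The way the t-table values approach the z values can be checked numerically. The Python sketch below is only an illustration, not how the printed table was produced: it integrates the t density with Simpson's rule and uses bisection to find the two-tailed critical value for α = .05. The function names (t_pdf, central_area, critical_t) are made up for this example.

```python
import math
from statistics import NormalDist

def t_pdf(t, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_norm = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                - 0.5 * math.log(df * math.pi))
    return math.exp(log_norm - (df + 1) / 2 * math.log1p(t * t / df))

def central_area(t, df, steps=1000):
    """Area between -t and +t: Simpson's rule on [0, t], then doubled."""
    h = t / steps
    total = t_pdf(0.0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 2 * total * h / 3

def critical_t(df, alpha=0.05):
    """Two-tailed critical t: area 1 - alpha lies between -t and +t."""
    lo, hi = 0.0, 100.0
    for _ in range(50):  # bisection on the central area
        mid = (lo + hi) / 2
        if central_area(mid, df) < 1 - alpha:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

# df = n - 1, so n = 5 gives df = 4 and n = 17 gives df = 16.
table = {df: critical_t(df) for df in (4, 16, 24, 100, 1000)}
z_crit = NormalDist().inv_cdf(0.975)  # two-tailed z for alpha = .05

for df, value in table.items():
    print(f"df = {df:4d}   critical t = {value:.3f}")
print(f"z            critical z = {z_crit:.3f}")
```

As df grows, the critical t shrinks toward the familiar z value of 1.96, which is the pattern visible in the bottom rows of the t-table.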

Hypothesis testing: one sample t-test
Let’s jump right in and do a t-test
Is the mean of my observed sample consistent with the known population mean or did it come from some other distribution?
We are given the following problem: 800 students took a chemistry exam. Accidentally, 25 students got an additional ten minutes. Did this extra time make a significant difference in the scores? The average number correct by the large class was 74. The scores for the sample of 25 were:
76, 72, 78, 80, 73
70, 81, 75, 79, 76
77, 79, 81, 74, 62
95, 81, 69, 84, 76
75, 77, 74, 72, 75
Please note: In this example we are comparing our sample mean with the population mean (one-sample t-test)

Hypothesis testing
Step 1: Identify the research problem / hypothesis
  Did the extra time given to this sample of students affect their chemistry test scores?
  Describe the null and alternative hypotheses
  One tail or two tail test?
  Ho: µ = 74
  H1: µ ≠ 74

We use a different table for t-tests
Hypothesis testing Step 2: Decision rule
  α = .05
  n = 25
  Degrees of freedom (df) = (n - 1) = (25 - 1) = 24
  two tail test
The critical values we used before were for z scores. We use a different table for t-tests

two tail test, α = .05, df = 24: Critical t(24) = 2.064

Hypothesis testing Step 3: Calculations
µ = 74
N = 25
Σx = 1911
x̄ = Σx / N = 1911 / 25 = 76.44
For each score x, find the deviation (x - x̄) and the squared deviation (x - x̄)², for example:
  x     (x - x̄)    (x - x̄)²
  76    -0.44      0.1936
  78     1.56      2.4336
  75    -1.44      2.0736
  79     2.56      6.5536
  77     0.56      0.3136
  74    -2.44      5.9536
  (the remaining scores follow the same pattern)
Σ(x - x̄) = 0
Σ(x - x̄)² = 868.16
s = √(Σ(x - x̄)² / (N - 1)) = √(868.16 / 24) = 6.01
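The Step 3 arithmetic can be reproduced with a few lines of Python. This is just a sketch that follows the slide's columns; the variable names are chosen for this example, and statistics.stdev is used only as a cross-check.

```python
from statistics import stdev

# The 25 exam scores from the problem statement.
scores = [76, 72, 78, 80, 73,
          70, 81, 75, 79, 76,
          77, 79, 81, 74, 62,
          95, 81, 69, 84, 76,
          75, 77, 74, 72, 75]

n = len(scores)
total = sum(scores)                         # Σx
x_bar = total / n                           # sample mean
deviations = [x - x_bar for x in scores]    # (x - x̄) column
sum_sq_dev = sum(d ** 2 for d in deviations)  # Σ(x - x̄)²
s = (sum_sq_dev / (n - 1)) ** 0.5           # sample standard deviation

print(f"N = {n}, sum = {total}, mean = {x_bar:.2f}")
print(f"sum of squared deviations = {sum_sq_dev:.2f}")
print(f"s = {s:.2f}  (statistics.stdev gives {stdev(scores):.2f})")
```

Dividing by n - 1 rather than n is what makes s the sample estimate used by the t-test.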

Hypothesis testing Step 3: Calculations (continued)
µ = 74, x̄ = 76.44, N = 25, s = 6.01
Standard error of the mean = s / √N = 6.01 / √25 = 1.20
Observed t = (x̄ - µ) / standard error = (76.44 - 74) / 1.20 = 2.03
Observed t(24) = 2.03

Hypothesis testing Step 4: Make decision whether or not to reject null hypothesis
Observed t(24) = 2.03
Critical t(24) = 2.064
2.03 is not farther out on the curve than 2.064, so we do not reject the null hypothesis
Step 5: Conclusion: The extra time did not have a significant effect on the scores
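Steps 3 and 4 for this example can be put together in one short Python sketch. The helper name one_sample_t is made up for illustration, and the critical value 2.064 is taken from the t-table (two tail, α = .05, df = 24) rather than computed.

```python
import math

def one_sample_t(sample, mu):
    """Return (observed t, degrees of freedom) for a one-sample t-test."""
    n = len(sample)
    x_bar = sum(sample) / n
    s = math.sqrt(sum((x - x_bar) ** 2 for x in sample) / (n - 1))
    standard_error = s / math.sqrt(n)
    return (x_bar - mu) / standard_error, n - 1

scores = [76, 72, 78, 80, 73,
          70, 81, 75, 79, 76,
          77, 79, 81, 74, 62,
          95, 81, 69, 84, 76,
          75, 77, 74, 72, 75]

observed_t, df = one_sample_t(scores, mu=74)
critical_t = 2.064  # from the t-table: two tail, alpha = .05, df = 24

# Two-tailed test: compare the size of the observed t with the critical t.
reject_null = abs(observed_t) > critical_t

print(f"observed t({df}) = {observed_t:.2f}, critical t({df}) = {critical_t}")
print("reject the null" if reject_null else "do not reject the null")
```

The absolute value matters for a two tail test: an observed t of -2.50 would be just as far out on the curve as +2.50.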

Hypothesis testing: Did the extra time given to these 25 students affect their average test score?
Start summary with two means (based on DV) for two levels of the IV (notice we are comparing a sample mean with a population mean: single sample t-test)
Describe type of test (t-test versus z-test) with brief overview of results
Finish with statistical summary: t(24) = 2.03; n.s. (or, if it had been different results that *were* significant, something like: t(24) = -5.71; p < 0.05)
Example: The mean score for those students who were given extra time was 76.44 percent correct, while the mean score for the rest of the class was only 74 percent correct. A t-test was completed and there appears to be no significant difference in the test scores for these two groups, t(24) = 2.03; n.s.
In the statistical summary, “t(24)” is the type of test with degrees of freedom, “2.03” is the value of the observed statistic, n.s. = “not significant” and p < 0.05 = “significant”

Independent samples t-test

Five steps to hypothesis testing
Step 1: Identify the research problem (hypothesis)
  Describe the null and alternative hypotheses
Step 2: Decision rule
  Alpha level? (α = .05 or .01)
  Critical statistic (e.g. z or t) value?
Step 3: Calculations
Step 4: Make decision whether or not to reject null hypothesis
  If observed z (or t) is bigger than critical z (or t) then reject null
Step 5: Conclusion - tie findings back in to research problem

Hypothesis testing with t-tests
Review: The result is “statistically significant” if:
  the observed statistic is larger than the critical statistic: observed stat > critical stat
    If we want to reject the null, we want our t (or z or r or F or χ²) to be big!!
  the p value is less than 0.05 (which is our alpha): p < 0.05
    If we want to reject the null, we want our “p” to be small!!
If we reject the null hypothesis, then we have support for our alternative hypothesis

Independent samples t-test
Are the two means significantly different from each other, or is the difference just due to chance?
Donald is a consultant and leads training sessions. As part of his training sessions, he provides the students with breakfast. He has noticed that when he provides a full breakfast people seem to learn better than when he provides just a small meal (donuts and muffins). So, he put his hunch to the test. He had two classes, both with three people enrolled. One group was given a big meal and the other group was given only a small meal. He then compared their test performance at the end of the day. Please test with an alpha = .05
Big meal: 22, 25, 25 (Mean = 24)
Small meal: 19, 23, 21 (Mean = 21)
t = (x̄1 - x̄2) / variability = (24 - 21) / variability
Got to figure this part out (the variability): We want to average the variability from the 2 samples. Call it “pooled”

Independent samples t-test (α = .05)
Step 1: Identify the research problem
  Did the size of the meal affect the learning / test scores?
Step 2: Describe the null and alternative hypotheses
Step 3: Decision rule
  α = .05
  Two tailed test
  n1 = 3; n2 = 3
  Degrees of freedom total (df total) = (n1 - 1) + (n2 - 1) = (3 - 1) + (3 - 1) = 4
  Critical t(4) = 2.776
Step 4: Calculate observed t score

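Pooled variance
  Participant   Big meal   Deviation from mean   Squared deviation   Small meal   Deviation from mean   Squared deviation
  1             22         -2                    4                   19           -2                    4
  2             25          1                    1                   23            2                    4
  3             25          1                    1                   21            0                    0
  Big meal: Mean = 24, Σ of squared deviations = 6
  Small meal: Mean = 21, Σ of squared deviations = 8
Notice: s1² = 6 / (3 - 1) = 3.0
Notice: s2² = 8 / (3 - 1) = 4.0
Notice: Simple Average = (3.0 + 4.0) / 2 = 3.5
S²pooled = ((n1 - 1) s1² + (n2 - 1) s2²) / (n1 + n2 - 2)
S²pooled = ((3 - 1)(3) + (3 - 1)(4)) / (3 + 3 - 2) = 3.5
Because the two samples are the same size, the pooled variance equals the simple average of the two variances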
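The pooled variance calculation can be sketched in Python as follows. The helper names are made up for this example, and the third big meal score (25) is an assumption inferred from the stated mean of 24 and the sum of squared deviations of 6.

```python
def sample_variance(values):
    """Sum of squared deviations from the mean, divided by n - 1."""
    n = len(values)
    mean = sum(values) / n
    return sum((x - mean) ** 2 for x in values) / (n - 1)

def pooled_variance(a, b):
    """Weight each sample variance by its degrees of freedom."""
    n1, n2 = len(a), len(b)
    weighted = (n1 - 1) * sample_variance(a) + (n2 - 1) * sample_variance(b)
    return weighted / (n1 + n2 - 2)

big_meal = [22, 25, 25]    # third score inferred from Mean = 24 (assumption)
small_meal = [19, 23, 21]

s1_sq = sample_variance(big_meal)
s2_sq = sample_variance(small_meal)
s_pooled_sq = pooled_variance(big_meal, small_meal)

print(f"s1 squared = {s1_sq}, s2 squared = {s2_sq}")
print(f"simple average = {(s1_sq + s2_sq) / 2}")
print(f"pooled variance = {s_pooled_sq}")
```

With unequal sample sizes the pooled variance would lean toward the variance of the larger sample, and would no longer match the simple average.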

S²p = 3.5; Big meal Mean = 24; Small meal Mean = 21
Observed t = (x̄1 - x̄2) / √(S²p / n1 + S²p / n2) = (24 - 21) / √(3.5 / 3 + 3.5 / 3) = 3 / 1.5275 = 1.964
Observed t = 1.964
Critical t = 2.776
1.964 is not larger than 2.776, so we do not reject the null hypothesis
t(4) = 1.964; n.s.
Conclusion: There appears to be no significant difference between the groups
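The whole independent samples calculation fits in one short Python sketch. The function name independent_t is made up for illustration, the third big meal score (25) is inferred from the stated mean of 24, and the critical value 2.776 comes from the t-table (two tail, α = .05, df = 4).

```python
import math

def independent_t(a, b):
    """Return (observed t, df) for an independent samples t-test
    that uses the pooled variance."""
    n1, n2 = len(a), len(b)
    mean1, mean2 = sum(a) / n1, sum(b) / n2
    ss1 = sum((x - mean1) ** 2 for x in a)   # equals (n1 - 1) * s1 squared
    ss2 = sum((x - mean2) ** 2 for x in b)   # equals (n2 - 1) * s2 squared
    df = n1 + n2 - 2
    pooled = (ss1 + ss2) / df
    standard_error = math.sqrt(pooled / n1 + pooled / n2)
    return (mean1 - mean2) / standard_error, df

big_meal = [22, 25, 25]    # third score inferred from Mean = 24 (assumption)
small_meal = [19, 23, 21]

observed_t, df = independent_t(big_meal, small_meal)
critical_t = 2.776  # from the t-table: two tail, alpha = .05, df = 4
reject_null = abs(observed_t) > critical_t

print(f"observed t({df}) = {observed_t:.3f}, critical t({df}) = {critical_t}")
print("reject the null" if reject_null else "do not reject the null")
```

With only three people per group the critical value is large, so even a three point difference in means is not enough to reject the null.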

Start summary with two means (based on DV) for two levels of the IV
Describe type of test (t-test versus anova) with brief overview of results
Finish with statistical summary: t(4) = 1.96; n.s. (or, if it *were* significant, something like: t(9) = 3.93; p < 0.05)
Example: We compared test scores for large and small meals. The mean test score for the big meal was 24, and was 21 for the small meal. A t-test was calculated and there appears to be no significant difference in test scores between the two types of meals, t(4) = 1.964; n.s.
In the statistical summary, “t(4)” is the type of test with degrees of freedom, “1.964” is the value of the observed statistic, n.s. = “not significant” and p < 0.05 = “significant”

Thank you! See you next time!!
