Chapter 14 Comparing two groups Dr Richard Bußmann.

Slides:



Advertisements
Similar presentations
Lesson 14 Comparing Two Groups Copyright © 2012 Pearson Education.
Advertisements

Chapter 25: Paired Samples and Blocks. Paired Data Paired data arise in a number of ways. Compare subjects with themselves before and after treatment.
Copyright © 2007 Pearson Education, Inc. Publishing as Pearson Addison-Wesley Chapter 25 Paired Samples and Blocks.
1-1 Copyright © 2015, 2010, 2007 Pearson Education, Inc. Chapter 21, Slide 1 Chapter 21 Comparing Two Proportions.
Copyright © 2009 Pearson Education, Inc. Chapter 29 Multiple Regression.
Objective: To test claims about inferences for two sample means from matched-pair tests or blocked samples, under specific conditions.
One sample T Interval Example: speeding 90% confidence interval n=23 Check conditions Model: t n-1 Confidence interval: 31.0±1.52 = (29.48, 32.52) STAT.
Confidence Interval and Hypothesis Testing for:
Testing means, part III The two-sample t-test. Sample Null hypothesis The population mean is equal to  o One-sample t-test Test statistic Null distribution.
PSY 307 – Statistics for the Behavioral Sciences
BCOR 1020 Business Statistics Lecture 22 – April 10, 2008.
BCOR 1020 Business Statistics
Copyright © 2010 Pearson Education, Inc. Chapter 25 Paired Samples and Blocks.
Chapter 11: Inference for Distributions
Copyright © 2010 Pearson Education, Inc. Chapter 24 Comparing Means.
5-3 Inference on the Means of Two Populations, Variances Unknown
Objective: To test claims about inferences for two sample means, under specific conditions.
Chapter 24: Comparing Means.
T-test Mechanics. Z-score If we know the population mean and standard deviation, for any value of X we can compute a z-score Z-score tells us how far.
Copyright © 2012 Pearson Education. All rights reserved Copyright © 2012 Pearson Education. All rights reserved. Chapter 15 Inference for Counts:
STA291 Statistical Methods Lecture 31. Analyzing a Design in One Factor – The One-Way Analysis of Variance Consider an experiment with a single factor.
1-1 Copyright © 2015, 2010, 2007 Pearson Education, Inc. Chapter 23, Slide 1 Chapter 23 Comparing Means.
Copyright © 2010, 2007, 2004 Pearson Education, Inc. Chapter 25 Paired Samples and Blocks.
Comparing Two Proportions
Chapter 10 Comparing Two Means Target Goal: I can use two-sample t procedures to compare two means. 10.2a h.w: pg. 626: 29 – 32, pg. 652: 35, 37, 57.
Copyright © 2010, 2007, 2004 Pearson Education, Inc. Chapter 24 Comparing Means.
Business Statistics for Managerial Decision Comparing two Population Means.
1-1 Copyright © 2015, 2010, 2007 Pearson Education, Inc. Chapter 24, Slide 1 Chapter 24 Paired Samples and Blocks.
Copyright © 2007 Pearson Education, Inc. Publishing as Pearson Addison-Wesley Chapter 24 Comparing Means.
Chapter 22: Comparing Two Proportions
Copyright © 2007 Pearson Education, Inc. Publishing as Pearson Addison-Wesley Slide
Copyright © 2007 Pearson Education, Inc. Publishing as Pearson Addison-Wesley Slide
Copyright © 2010, 2007, 2004 Pearson Education, Inc. Chapter 25 Paired Samples and Blocks.
Chapter 22: Comparing Two Proportions. Yet Another Standard Deviation (YASD) Standard deviation of the sampling distribution The variance of the sum or.
Copyright © 2010 Pearson Education, Inc. Slide Beware: Lots of hidden slides!
Copyright © 2010 Pearson Education, Inc. Chapter 22 Comparing Two Proportions.
Copyright © 2010, 2007, 2004 Pearson Education, Inc. Chapter 22 Comparing Two Proportions.
Copyright © 2013, 2009, and 2007, Pearson Education, Inc. Chapter 14 Comparing Groups: Analysis of Variance Methods Section 14.1 One-Way ANOVA: Comparing.
Lesson Comparing Two Means. Knowledge Objectives Describe the three conditions necessary for doing inference involving two population means. Clarify.
STA 2023 Module 11 Inferences for Two Population Means.
AP Statistics Chapter 24 Comparing Means.
Copyright © 2010, 2007, 2004 Pearson Education, Inc. Chapter 24 Comparing Means.
Copyright © 2007 Pearson Education, Inc. Publishing as Pearson Addison-Wesley Chapter 25 Paired Samples and Blocks.
Copyright © 2007 Pearson Education, Inc. Publishing as Pearson Addison-Wesley Slide
Copyright © 2010, 2007, 2004 Pearson Education, Inc. Chapter 25 Paired Samples and Blocks.
Paired Samples and Blocks
1-1 Copyright © 2015, 2010, 2007 Pearson Education, Inc. Chapter 24, Slide 1 Chapter 25 Paired Samples and Blocks.
Comparing Means Chapter 24. Plot the Data The natural display for comparing two groups is boxplots of the data for the two groups, placed side-by-side.
Chapter 22 Comparing Two Proportions.  Comparisons between two percentages are much more common than questions about isolated percentages.  We often.
+ Unit 6: Comparing Two Populations or Groups Section 10.2 Comparing Two Means.
T tests comparing two means t tests comparing two means.
Copyright © 2007 Pearson Education, Inc. Publishing as Pearson Addison-Wesley Slide Next Time: Make sure to cover only pooling in TI-84 and note.
1-1 Copyright © 2015, 2010, 2007 Pearson Education, Inc. Chapter 23, Slide 1 Chapter 24 Comparing Means.
Statistical Inference for the Mean Objectives: (Chapter 8&9, DeCoursey) -To understand the terms variance and standard error of a sample mean, Null Hypothesis,
Statistics 25 Paired Samples. Paired Data Data are paired when the observations are collected in pairs or the observations in one group are naturally.
Statistics 24 Comparing Means. Plot the Data The natural display for comparing two groups is boxplots of the data for the two groups, placed side-by-side.
Statistics 22 Comparing Two Proportions. Comparisons between two percentages are much more common than questions about isolated percentages. And they.
AP Statistics Chapter 25 Paired Samples and Blocks.
Copyright © 2009 Pearson Education, Inc. Chapter 25 Paired Samples and Blocks.
Copyright © 2007 Pearson Education, Inc. Publishing as Pearson Addison-Wesley Slide
Copyright © 2007 Pearson Education, Inc. Publishing as Pearson Addison-Wesley Slide
Comparing Two Proportions
Chapter 24 Comparing Means.
Chapter 23 Comparing Means.
Two-Sample Hypothesis Testing
Comparing Two Proportions
Hypothesis tests for the difference between two means: Independent samples Section 11.1.
Chapter 23 Comparing Means.
Comparing Two Proportions
Chapter 24 Comparing Means Copyright © 2009 Pearson Education, Inc.
Presentation transcript:

Chapter 14 Comparing two groups Dr Richard Bußmann

Comparing performances QTM1310/ Sharpe Comparing performances Do customers spend more using their credit card if they are given an incentive such as “double miles” or “double coupons” toward flights, hotel stays, or store purchases? To answer questions such as this, credit card issuers often perform experiments on a sample of customers, making them an offer of an incentive, while other customers receive no offer. By comparing the performance of the two offers on the sample, they can decide whether the new offer would provide enough potential profit if they were to “roll it out” and offer it to their entire customer base. 2

QTM1310/ Sharpe Comparing two means The statistic of interest is the difference in the observed means of the offer and no offer groups: 𝑦 0 − 𝑦 𝑛 What we’d really like to know is the difference of the means in the population at large: 𝜇 0 − 𝜇 𝑛 Now the population model parameter of interest is the difference between the means. 3

QTM1310/ Sharpe Comparing two means In order to tell if a difference, that we observe in the sample means, indicates a real difference in the underlying population means, we’ll need to know the sampling distribution model and standard deviation of the difference. Once we know those, we can build a confidence interval and test a hypothesis, just as we did for a single mean. 4

QTM1310/ Sharpe Comparing two means It’s easy to find the mean and standard deviation of the spend lift (increase in spending) for each of these groups, but that’s not what we want. We need the standard deviation of the difference in their means. For that, we can use a simple rule: If the sample means come from independent samples, the variance of their sum or difference is the sum of their variances. 5

QTM1310/ Sharpe Comparing two means As long as the two groups are independent, we find the standard deviation of the difference between the two sample means by adding their variances and then taking the square root: 𝑆𝐷 𝑦 1 − 𝑦 2 = 𝑉𝑎𝑟 𝑦 1 +𝑉𝑎𝑟 𝑦 2 = 𝜎 1 𝑛 1 2 + 𝜎 2 𝑛 2 2 = 𝜎 1 2 𝑛 1 + 𝜎 2 2 𝑛 2 6

QTM1310/ Sharpe Comparing two means Usually we don’t know the true standard deviations of the two groups, 𝜎 1 and 𝜎 2 , so we substitute the estimates, 𝑠 1 and 𝑠 2 , and find a standard error: 𝑆𝐸 𝑦 1 − 𝑦 2 = 𝑠 1 2 𝑛 1 + 𝑠 2 2 𝑛 2 We’ll use the standard error to see how big the difference really is. 7

QTM1310/ Sharpe Comparing two means Just as for a single mean, the ratio of the difference in the means to the standard error of that difference has a sampling model that follows a Student’s t distribution. The sampling model isn’t really Student’s t, but by using a special, adjusted degrees of freedom value, we can find a Student’s t-model that is so close to the right sampling distribution model that nobody can tell the difference. Since it doesn’t help our understanding, we leave it to the computer or calculator. 8

QTM1310/ Sharpe Comparing two means We can model the standardized sample difference between the means of two independent groups with a Student’s t-model 𝑡= 𝑦 1 − 𝑦 2 −( 𝜇 1 − 𝜇 2 ) 𝑆𝐸( 𝑦 1 − 𝑦 2 ) with 𝑆𝐸 𝑦 1 − 𝑦 2 = 𝑠 1 2 𝑛 1 + 𝑠 2 2 𝑛 2 9

QTM1310/ Sharpe Two sample t-test To construct the hypothesis test for the difference of the means, we start by hypothesizing a value for the true difference of the means. We’ll call that hypothesized difference ∆ 0 . It’s so common for that hypothesized difference to be zero that we often just assume ∆ 0 =0 10

QTM1310/ Sharpe Two sample t-test The two-sample t-test is the ratio of the difference in the means from our samples to its standard error compared to a critical value from a Student’s t-model. Hence 𝐻 0 : 𝜇 1 − 𝜇 2 = ∆ 0 Assuming ∆ 0 =0 So 𝑡= 𝑦 1 − 𝑦 2 − ∆ 0 𝑆𝐸( 𝑦 1 − 𝑦 2 ) with 𝑆𝐸 𝑦 1 − 𝑦 2 = 𝑠 1 2 𝑛 1 + 𝑠 2 2 𝑛 2 When the null holds, the statistic can be modelled by a Student’s t-model with df given by a special formula. The t-ratio (p-value) can then be compared with a 𝑡 𝑐𝑟𝑖𝑡; 𝑑𝑓 11

Assumptions & conditions QTM1310/ Sharpe Assumptions & conditions Independence Assumption The data in each group must be drawn independently and at random from each group’s own homogeneous population or generated by a randomized comparative experiment. Randomization Condition Without randomization of some sort, there are no sampling distribution models and no inference. 10% Condition We usually don’t check this condition for differences of means. We needn’t worry about it at all for randomized experiments. 12

Assumptions & conditions QTM1310/ Sharpe Assumptions & conditions Normal Population Assumption We need the assumption that the underlying populations are each Normally distributed. Nearly Normal Condition We must check this for both groups; a violation by either one violates the condition. Independent Groups Assumption To use the two-sample t methods, the two groups we are comparing must be independent of each other. No statistical test can verify that the groups are independent. You have to think about how the data were collected. 13

Confidence interval for the difference between two means QTM1310/ Sharpe Confidence interval for the difference between two means A hypothesis test really says nothing about the size of the difference. All it says is that the observed difference is large enough that we can be confident it isn’t zero. Given the conditions and assumptions, a two-sample t- interval for the difference between means 𝜇 1 and 𝜇 2 is found using the critical value 𝑡 𝑐𝑟𝑖𝑡; 𝑑𝑓 , given the df and a confidence level. Hence the confidence interval becomes 𝑦 1 − 𝑦 2 ± 𝑡 𝑐𝑟𝑖𝑡; 𝑑𝑓 ∗𝑆𝐸 𝑦 1 − 𝑦 2 𝑤𝑖𝑡ℎ 𝑆𝐸 𝑦 1 − 𝑦 2 = 𝑠 1 2 𝑛 1 + 𝑠 2 2 𝑛 2 14

QTM1310/ Sharpe example A market analyst wants to know if a new website is showing increased page views per visit. A customer is randomly sent to one of two different websites, offering the same products, but with different designs. Given statistics below, find the estimated mean difference in page visits between the two websites. 15

QTM1310/ Sharpe example Given the statistics, find the estimated mean difference in page visits between the two websites. 𝑙𝑒𝑡 𝑑𝑓=163.59, 𝑡ℎ𝑒𝑛 𝑦 1 − 𝑦 2 ± 𝑡 𝑐𝑟𝑖𝑡; 𝑑𝑓 ∗ 𝑠 1 2 𝑛 1 + 𝑠 2 2 𝑛 2 = 7.7−7.3 ±1.9676 4.6 2 80 + 4.3 2 95 =0.4±1.338 =[−0.938,1.738] 16

QTM1310/ Sharpe example A market analyst wants to know if a new website is showing increased page views per visit. A customer is randomly sent to one of two different websites, offering the same products, but with different designs. What does the confidence interval [–0.938, 1,738] say about the null hypothesis that the mean difference in page views from the two websites? 17

QTM1310/ Sharpe example A market analyst wants to know if a new website is showing increased page views per visit. A customer is randomly sent to one of two different websites, offering the same products, but with different designs. What does the confidence interval [–0.938, 1,738] say about the null hypothesis that the mean difference in page views from the two websites? Fail to reject the null hypothesis. Since 0 is in the interval, it is a plausible value for the true difference in means. 18

QTM1310/ Sharpe example A market analyst wants to know if a new website is showing increased page views per visit. A customer is randomly sent to one of two different websites, offering the same products, but with different designs. Given the statistics, find the t-statistic for the observed difference in mean page visits. 𝑔𝑖𝑣𝑒𝑛 𝑑𝑓=163.59; 𝑡= 𝑦 1 − 𝑦 2 𝑠 1 2 𝑛 1 + 𝑠 2 2 𝑛 2 = 7.7−7.3 4.6 2 80 + 4.3 2 90 = 0.4 0.68 =0.588 19

QTM1310/ Sharpe example A market analyst wants to know if a new website is showing increased page views per visit. A customer is randomly sent to one of two different websites, offering the same products, but with different designs. What does the hypothesis test results, t = 0.59 and p– value = 0.5557 say about the null hypothesis that the mean difference in page views from the two websites? Fail to reject the null hypothesis. There is insufficient evidence to conclude a statistically significant mean difference in the number of webpage visits. Is this consistent with the conclusion drawn from the confidence interval? Yes, the conclusion is the same. 20

QTM1310/ Sharpe Pooled t-test If you bought a used camera in good condition from a friend, would you pay the same as you would if you bought the same item from a stranger? One group of people in an experiment was told to imagine buying from a friend whom they expected to see again. The other group was told to imagine buying from a stranger. 21

QTM1310/ Sharpe Pooled t-test Here are the prices they offered for a used camera in good condition. 22

QTM1310/ Sharpe Pooled t-test The usual null hypothesis is that there’s no difference in means and that’s what we’ll use for the camera purchase prices. When we performed the t-test earlier in the chapter, we used an approximation formula that adjusts the degrees of freedom to a lower value. When 𝑛 1 + 𝑛 2 is only 15, as it is here, we don’t really want to lose any degrees of freedom. 23

QTM1310/ Sharpe Pooled t-test If we’re willing to assume that the variances of the groups are equal (at least when the null hypothesis is true), then we can save some degrees of freedom. To do that, we have to pool the two variances that we estimate from the groups into one common, or pooled, estimate of the variance: 𝑠 𝑝𝑜𝑜𝑙𝑒𝑑 2 = 𝑛 1 −1 𝑠 1 2 +( 𝑛 2 −1) 𝑠 2 2 𝑛 1 −1 +( 𝑛 2 −1) 24

QTM1310/ Sharpe Pooled t-test Now we substitute the common pooled variance for each of the two variances in the standard error formula, making the pooled standard error formula simpler: 𝑆𝐸 𝑝𝑜𝑜𝑙𝑒𝑑 𝑦 1 − 𝑦 2 = 𝑠 𝑝𝑜𝑜𝑙𝑒𝑑 2 𝑛 1 + 𝑠 𝑝𝑜𝑜𝑙𝑒𝑑 2 𝑛 2 = 𝑠 𝑝𝑜𝑜𝑙𝑒𝑑 1 𝑛 1 + 1 𝑛 2 The formula for degrees of freedom for the Student’s t- model is simpler, too. Now it’s just: 𝑑𝑓= 𝑛 1 −1 +( 𝑛 2 −1) 25

Equal variance assumption QTM1310/ Sharpe Equal variance assumption To use the pooled t-methods, you’ll need to add the Equal Variance Assumption that the variances of the two populations from which the samples have been drawn are equal. That is 𝜎 1 2 = 𝜎 2 2 The remaining conditions are the same as for the two sample t-test. 26

Pooled t-test We then test the hypothesis 𝐻 0 : 𝜇 1 − 𝜇 2 = ∆ 0 With 𝐻 0 : 𝜇 1 − 𝜇 2 = ∆ 0 With ∆ 0 𝑎𝑙𝑤𝑎𝑦𝑠 𝑎𝑙𝑚𝑜𝑠𝑡=0 Using 𝑡= 𝑦 1 − 𝑦 2 − ∆ 0 𝑆𝐸 𝑝𝑜𝑜𝑙𝑒𝑑 ( 𝑦 1 − 𝑦 2 )

Pooled t-test Where 𝑆𝐸 𝑝𝑜𝑜𝑙𝑒𝑑 𝑦 1 − 𝑦 2 = 𝑠 𝑝𝑜𝑜𝑙𝑒𝑑 1 𝑛 1 + 1 𝑛 2 And 𝑆𝐸 𝑝𝑜𝑜𝑙𝑒𝑑 𝑦 1 − 𝑦 2 = 𝑠 𝑝𝑜𝑜𝑙𝑒𝑑 1 𝑛 1 + 1 𝑛 2 And 𝑠 𝑝𝑜𝑜𝑙𝑒𝑑 2 = 𝑛 1 −1 𝑠 1 2 + 𝑛 2 −1 𝑠 1 2 𝑛 1 −1 + 𝑛 2 −1 If the conditions and asssumtions hold, then the statistics sampling distribution can be modelled with a Student’s t- model with 𝑛 1 −1 + 𝑛 2 −1 =𝑑𝑓

Pooled t-test This gives the pooled-t confidence interval of QTM1310/ Sharpe Pooled t-test This gives the pooled-t confidence interval of 𝑦 1 − 𝑦 2 ± 𝑡 𝑐𝑟𝑖𝑡; 𝑑𝑓 ∗ 𝑆𝐸 𝑝𝑜𝑜𝑙𝑒𝑑 ( 𝑦 1 − 𝑦 2 ) With 𝑡 𝑐𝑟𝑖𝑡; 𝑑𝑓 that depends on the confidence level and degrees of freedom of 𝑛 1 −1 + 𝑛 2 −1 29

When to use a pooled t-test QTM1310/ Sharpe When to use a pooled t-test (Many software packages offer the pooled t-test as the default for comparing means of two groups and require you to specifically request the two-sample t-test (or sometimes the misleadingly named “unequal variance t- test”) as an option.) Be careful when using software to specify the right test. The advantages of pooling are small and pooling is only rarely allowed, i.e. when the condition 𝐻 0 : 𝜇 1 − 𝜇 2 = ∆ 0 holds. As it’s also never wrong not to pool, just don’t pool… 30

QTM1310/ Sharpe Turkey’s quick test To use Tukey’s test, one group must have the highest value, and the other must have the lowest. We count how many values in the high group are higher than all the values of the lower group. Add to this the number of values in the low group that are lower than all the values of the higher group. You can count ties as ½. 31

QTM1310/ Sharpe Turkey’s quick test If the total of these exceedances is 7 or more, we can reject the null hypothesis of equal means at α = 0.05. The “critical values” of 10 and 13 correspond to α’s of 0.01 and 0.001. The only assumption it requires is that the two samples be independent. Tukey’s quick test is not as widely known or accepted as the two-sample t-test, so you still need to know and use the two-sample t. 32

QTM1310/ Sharpe Paired data The independence assumption for the two-sample t-test is crucial. However, quite commonly, we have data on the same cases in two different circumstances. Data such as these are called paired. Paired data commonly occur as “before and after” measurements of some property. Note: You should not use the two-sample (or pooled two- sample) method when the data are paired. © 2010 Pearson Education 33

QTM1310/ Sharpe Paired data Because it is the difference of the pair that is interesting, treat the data as if there was a single variable holding these differences. Create this variable from the data… So, a paired t-test is just a one-sample t-test for the mean of the pairwise differences. Because the paired t-test is mechanically the same as a one-sample t-test (of the pairwise differences), the assumptions and conditions for the paired t-test are the same as for the one-sample t-test. 34

Paired data assumptions & conditions QTM1310/ Sharpe Paired data assumptions & conditions The data must actually be paired. Determine from examining how the data were collected whether the two groups are paired or independent. Independence It is the differences that must be independent of one another. Randomization Condition There is no test for independence. But, in an experiment, the treatments should be randomly assigned to subjects. 35

Paired data assumptions & conditions QTM1310/ Sharpe Paired data assumptions & conditions 10% Condition Also, if the population is finite, then the sample should be no more than 10% of the population. Normal Population Assumption The population of differences should follow a Normal model. Nearly Normal Condition Check the Nearly Normal Condition using a histogram of the differences. 36

Paired t-test The paired t-test is mechanically a one-sample t-test. QTM1310/ Sharpe Paired t-test The paired t-test is mechanically a one-sample t-test. After the conditions are met, we test if the difference in means is significantly different from a hypothesized value. Hence 𝐻 0 : 𝜇 𝑑 = ∆ 0 Where the 𝑑’s are the pairwise differences and ∆ 0 =𝑎𝑙𝑚𝑜𝑠𝑡 𝑎𝑙𝑤𝑎𝑦𝑠 0 37

Paired t-test Use 𝑡 𝑛−1 = 𝑑 − ∆ 0 𝑆𝐸( 𝑑 ) 𝑡 𝑛−1 = 𝑑 − ∆ 0 𝑆𝐸( 𝑑 ) Where 𝑑 is the mean of the pairwise differences and n is the number of pairs and 𝑆𝐸 𝑑 = 𝑠 𝑑 √𝑛 Where 𝑠 𝑑 is the standard deviation of the pairwise differences. If the condition are met and the null is true, the sampling distribution of this statistic is a Student’s t-model with n-1 degrees of freedom. Using this information, you can find p-values.

Paired t-test Subsequently we can find the confidence interval for the mean of the paired differences as 𝑑 ± 𝑡 𝑐𝑟𝑖𝑡; 𝑛−1 ∗𝑆𝐸( 𝑑 ) With 𝑆𝐸 𝑑 = 𝑠 𝑑 √𝑛 By now, you will know that 𝑡 𝑐𝑟𝑖𝑡; 𝑛−1 will depend on the particular confidence level and the degrees of freedom, n-1, based on the number of pairs, n.

QTM1310/ Sharpe example After a new ERP system is installed, the acceleration of the financial close process is measured to gauge the effectiveness of the new system. Data are gathered from 8 companies who reported their average time (in weeks) to financial close before and after implementation of their ERP system. State the conditions to use a paired t-test. Paired data assumption Randomization condition Normal population assumption 40

Example Paired data assumption: Randomization condition: QTM1310/ Sharpe Example Paired data assumption: Data are paired by company. Randomization condition: Assume these 8 companies are representative of all companies who recently implemented the ERP system. Normal population assumption: A histogram is needed to be sure the data could have come from a normal population. 41

QTM1310/ Sharpe example After a new ERP system is installed, the acceleration of the financial close process is measured to gauge the effectiveness of the new system. Given the 95% confidence interval for mean difference [0.053, 1.847], t = 2.50 P-value = 0.041, 𝛼 = 0.05, what is your conclusion to the test of there being a difference before and after implementation of the new ERP system? At 𝛼 = 0.05, reject the null hypothesis (p-value = 0.041). There is sufficient evidence that the average acceleration (in weeks) of the financial close process is different after implementation of a new ERP system. 42

caution Watch out for paired data when using the two-sample t- test. QTM1310/ Sharpe caution Watch out for paired data when using the two-sample t- test. Don’t use individual confidence intervals for each group to test the difference between their means. Look at the plots. Don’t use a paired t-method when the samples aren’t paired. Don’t forget to look for outliers when using paired methods. 43

QTM1310/ Sharpe summary Know how to test whether the difference in the means of two independent groups is equal to some hypothesized value. The two-sample t-test is appropriate for independent groups. It uses a special formula for degrees of freedom. The Assumptions and Conditions are the same as for one- sample inferences for means with the addition of assuming that the groups are independent of each other. The most common null hypothesis is that the means are equal. 44

QTM1310/ Sharpe summary Be able to construct and interpret a confidence interval for the difference between the means of two independent groups. The confidence interval inverts the t-test in the natural way. Know how and when to use pooled t inference methods. There is an additional assumption that the variances of the two groups are equal. This may be a plausible assumption in a randomized experiment. 45

QTM1310/ Sharpe summary Recognize when you have paired or matched samples and use an appropriate inference method. Paired t methods are the same as one-sample t methods applied to the pairwise differences. If data are paired they cannot be independent, so two- sample t and pooled-t methods would not be applicable. 46

QTM1310/ Sharpe summary The standard error for the difference in sample means relies on the assumption that our data come from independent groups. Pooling is usually not the best choice here. We can add variances of independent random variables to find the standard deviation of the difference in two independent means. 47