Measuring Evidence with p-values

Section 4.2 Measuring Evidence with p-values

Question of the Day
Does drinking tea boost your immune system?

Tea and Immune Response
Participants were randomized to drink five or six cups of either black tea or coffee every day for two weeks (both drinks have caffeine, but only tea has L-theanine).
After two weeks, blood samples were exposed to an antigen, and production of interferon gamma (an immune system response) was measured.
Explanatory variable: tea or coffee
Response variable: immune system response
Does drinking tea actually boost your immunity?
Source: Kamath et al., "Antigens in tea-beverage prime human Vγ2Vδ2 T cells in vitro and in vivo for memory and nonmemory antibacterial cytokine responses," Proceedings of the National Academy of Sciences, May 13, 2003.

Tea and the Immune System
If the tea drinkers have sufficiently higher levels of immune system response, can we conclude that drinking tea rather than coffee caused an increase in this aspect of the immune response?
Options: Yes; No
Answer: Yes. A randomized experiment allows conclusions about causality.

Review
µT = mean immune system response after drinking tea
µC = mean immune system response after drinking coffee
Does drinking tea boost immunity? The relevant hypotheses are:
Options: H0: µT > µC, Ha: µT = µC; H0: µT < µC, Ha: µT = µC; H0: µT = µC, Ha: µT > µC; H0: µT = µC, Ha: µT < µC; H0: µT = µC, Ha: µT ≠ µC
Answer: H0: µT = µC, Ha: µT > µC

Tea and Immune System
The explanatory variable is tea or coffee, and the response variable is immune system response, measured as the amount of interferon gamma produced. How could we visualize this data?
Options: Bar chart; Histogram; Side-by-side boxplots; Scatterplot
Answer: Side-by-side boxplots (one categorical and one quantitative variable).

Tea and Immune System
Observed statistic: x̄T − x̄C = 34.82 − 17.70 = 17.12
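The observed statistic can be reproduced directly from the responses. This is a minimal sketch, not part of the original slides. The tea values and eight of the coffee values are the ones that appear in the slides' experiment diagram. The two coffee values of 0 are an assumption: the legible coffee values sum to 177, so a group mean of 17.70 implies ten coffee drinkers, two of whom contribute nothing to the sum.

```python
# Interferon gamma responses by group, as read from the experiment diagram.
tea = [5, 11, 13, 18, 20, 47, 48, 52, 55, 56, 58]
# ASSUMPTION: the two 0 values are inferred from the reported mean of 17.70.
coffee = [0, 0, 3, 11, 15, 16, 21, 21, 38, 52]


def mean(values):
    """Arithmetic mean of a list of numbers."""
    return sum(values) / len(values)


# Order matters: tea minus coffee, matching Ha: muT > muC.
observed_diff = mean(tea) - mean(coffee)

print(f"tea mean    = {mean(tea):.2f}")
print(f"coffee mean = {mean(coffee):.2f}")
print(f"difference  = {observed_diff:.2f}")
```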

Two Plausible Explanations
Why might the tea drinkers have higher levels of immune system response? Two plausible explanations:
Alternative true: tea drinkers have higher immune system responses than coffee drinkers.
Null true, random chance: the people who happened to be randomly assigned to the tea group have better immune systems than those randomly assigned to the coffee group.

The Plausibility of the Null
The goal is to determine whether the null hypothesis and random chance are a plausible explanation, given the observed data. What kinds of statistics might we get, just by random chance, if the null hypothesis were true?

Actual Experiment
1. Randomize units to treatment groups.
2. Conduct the experiment.
3. Measure the response variable.
4. Calculate the statistic.
[Figure: each unit is marked R (randomized) and listed under Tea or Coffee with its measured response.]
Tea: 5, 11, 13, 18, 20, 47, 48, 52, 55, 56, 58
Coffee: 3, 11, 15, 16, 21, 21, 38, 52 (the diagram shows 21 randomized units but only 19 values; two coffee responses are missing from the transcript)

Actual Experiment
Two plausible explanations for the observed difference:
Tea boosts immunity
Random chance
What might happen just by random chance?

Simulation
1. Re-randomize units to treatment groups (Tea or Coffee), keeping each unit's measured response fixed.
2. Calculate the statistic for the re-randomized groups.
Repeat many times! Use technology (e.g., StatKey or other) to repeat this process many times and build a randomization distribution.
[Figure: the same responses shown under several different random assignments to Tea and Coffee.]
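The re-randomization steps can be sketched in a few lines of Python as a stand-in for StatKey. This is an illustrative sketch only: the function name `randomization_distribution`, the seed, and the number of simulations are choices made here, and the two coffee values of 0 are an assumption inferred from the reported coffee mean of 17.70.

```python
import random

tea = [5, 11, 13, 18, 20, 47, 48, 52, 55, 56, 58]
# ASSUMPTION: the two 0 values are inferred from the reported mean of 17.70.
coffee = [0, 0, 3, 11, 15, 16, 21, 21, 38, 52]


def mean(values):
    return sum(values) / len(values)


def randomization_distribution(group_a, group_b, n_sims=10_000, seed=0):
    """Differences in means (a - b) after re-randomizing the group labels."""
    rng = random.Random(seed)
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    stats = []
    for _ in range(n_sims):
        # Step 1: re-randomize units to groups; the responses stay fixed.
        rng.shuffle(pooled)
        # Step 2: calculate the statistic for this re-randomized sample.
        stats.append(mean(pooled[:n_a]) - mean(pooled[n_a:]))
    return stats


null_stats = randomization_distribution(tea, coffee)
observed = mean(tea) - mean(coffee)

print(f"observed difference: {observed:.2f}")
print(f"center of randomization distribution: {mean(null_stats):.2f}")
print(f"range of simulated differences: {min(null_stats):.2f} to {max(null_stats):.2f}")
```

Because the labels are shuffled as if tea and coffee had the same effect, the simulated differences should scatter around 0, which is the value specified by H0: µT = µC.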

Distribution of Statistic Under H0
How extreme is the observed statistic? Is the null hypothesis a plausible explanation? (Note: you shouldn’t be able to answer this question quite yet, but you should be thinking about what would or wouldn’t convince you to reject the null as a plausible explanation.)

Randomization Distribution
A randomization distribution is a collection of statistics from samples simulated assuming the null hypothesis is true.
The randomization distribution shows what types of statistics would be observed, just by random chance, if the null hypothesis were true.

Green Tea and Prostate Cancer
A study was conducted on 60 men with PIN lesions, some of which turn into prostate cancer.
Half of these men were randomized to take 600 mg of green tea extract daily, while the other half were given a placebo pill.
The study was double-blind: neither the participants nor the doctors knew who was actually receiving green tea.
After one year, only 1 person taking green tea had gotten cancer, while 9 taking the placebo had gotten cancer.

The explanatory variable is green tea extract or placebo; the response variable is whether or not the person developed prostate cancer. Which statistic and parameter are most relevant?
Options: Mean; Proportion; Difference in means; Difference in proportions; Correlation
Answer: Difference in proportions (two categorical variables).

p1 = proportion of green tea consumers who get prostate cancer
p2 = proportion of placebo consumers who get prostate cancer
State the null hypothesis.
Options: H0: p1 = p2; H0: p1 < p2; H0: p1 > p2; H0: p1 ≠ p2
Answer: H0: p1 = p2. The null hypothesis always includes an equals sign.

p1 = proportion of green tea consumers who get prostate cancer
p2 = proportion of placebo consumers who get prostate cancer
State the alternative hypothesis.
Options: Ha: p1 = p2; Ha: p1 < p2; Ha: p1 > p2; Ha: p1 ≠ p2
Answer: Ha: p1 < p2. The alternative hypothesis is the claim the researchers are seeking evidence for.

Randomization Test
1. State hypotheses.
2. Collect data.
3. Calculate statistic. For the green tea study: p̂1 − p̂2 = 1/30 − 9/30 ≈ −0.267.
4. Simulate statistics that could be observed, just by random chance, if the null hypothesis were true (create a randomization distribution).
5. How extreme is the observed statistic?
6. Is the null hypothesis (random chance) a plausible explanation?

Based on the randomization distribution, would the observed statistic be extreme if the null hypothesis were true? Yes No

Do you think the null hypothesis is a plausible explanation for these results? Yes No

In a hypothesis test for H0: µ = 12 vs Ha: µ < 12, we have a sample with n = 45 and x̄ = 10.2. What do we require about the method to produce randomization samples?
Options: µ = 12; µ < 12; x̄ = 10.2
Answer: µ = 12. We need to generate randomization samples assuming the null hypothesis is true.

In a hypothesis test for H0: µ = 12 vs Ha: µ < 12, we have a sample with n = 45 and x̄ = 10.2. Where will the randomization distribution be centered?
Options: 10.2; 12; 45; 1.8
Answer: 12. Randomization distributions are always centered around the null hypothesized value.

Randomization Distribution Center
A randomization distribution simulates samples assuming the null hypothesis is true, so it is centered at the value of the parameter given in the null hypothesis.
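To see why the center lands on the null value, consider the single-mean example with H0: µ = 12, n = 45, and x̄ = 10.2. The slides do not say how the randomization samples are generated, so the sketch below assumes one common approach: shift the data so that its mean equals 12, then resample with replacement. The 45 data values are hypothetical placeholders (the slides give only n and x̄), so only the center of the result is meaningful, not its spread.

```python
import random

rng = random.Random(1)

NULL_MEAN = 12.0     # from H0: mu = 12
SAMPLE_MEAN = 10.2   # observed x-bar
N = 45               # sample size

# HYPOTHETICAL data: 45 placeholder values, recentred to have mean 10.2.
# The spread (standard deviation 3) is an arbitrary assumption.
raw = [rng.gauss(0, 3) for _ in range(N)]
raw_mean = sum(raw) / N
sample = [x - raw_mean + SAMPLE_MEAN for x in raw]

# Make the null hypothesis true for the data we resample from:
# shift every value so that the mean becomes 12.
shifted = [x + (NULL_MEAN - SAMPLE_MEAN) for x in sample]

# Resample with replacement and record the mean of each randomization sample.
null_means = []
for _ in range(5000):
    resample = rng.choices(shifted, k=N)
    null_means.append(sum(resample) / N)

center = sum(null_means) / len(null_means)
print(f"center of randomization distribution: {center:.2f}")
```

The printed center should sit very close to 12 rather than 10.2, because the samples were generated under H0, not from the data as observed.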

In a hypothesis test for H0: µ = 12 vs Ha: µ < 12, we have a sample with n = 45 and x̄ = 10.2. What will we look for on the randomization distribution?
Options: How extreme 10.2 is; How extreme 12 is; How extreme 45 is; What the standard error is; How many randomization samples we collected
Answer: How extreme 10.2 is. We want to see how extreme the observed statistic is.

In a hypothesis test for H0: µ1 = µ2 vs Ha: µ1 > µ2, we have a sample with x̄1 = 26, x̄2 = 21. What do we require about the method to produce randomization samples?
Options: µ1 = µ2; µ1 > µ2; x̄1 = 26, x̄2 = 21; x̄1 − x̄2 = 5
Answer: µ1 = µ2. We need to generate randomization samples assuming the null hypothesis is true.

In a hypothesis test for H0: µ1 = µ2 vs Ha: µ1 > µ2, we have a sample with x̄1 = 26, x̄2 = 21. Where will the randomization distribution be centered?
Options: 0; 1; 21; 26; 5
Answer: 0. The randomization distribution is centered around the null hypothesized value, µ1 − µ2 = 0.

In a hypothesis test for H0: µ1 = µ2 vs Ha: µ1 > µ2, we have a sample with x̄1 = 26, x̄2 = 21. What do we look for on the randomization distribution?
Options: The standard error; The center point; How extreme 26 is; How extreme 21 is; How extreme 5 is
Answer: How extreme 5 is. We want to see how extreme the observed difference in means is.

Back to Tea vs Coffee… What do we do when the “extremity” of the observed statistic isn’t obvious? We need a formal way of measuring how extreme a statistic would be, if H0 were true…

p-value
The p-value is the proportion of samples, when the null hypothesis is true, that would give a statistic as extreme as (or more extreme than) the observed sample statistic.

Tea vs Coffee
[Figure: distribution of the statistic if H0 is true, with the observed statistic marked; the p-value is the proportion as extreme as the observed statistic.]
(Demo this on StatKey first.)
If there is no difference between tea and coffee regarding immunity, we would see results this extreme only about 26 out of 1000 times (p-value ≈ 0.026).

Calculating a p-value
1. What kinds of statistics would we get, just by random chance, if the null hypothesis were true? (randomization distribution)
2. What proportion of these statistics are as extreme as our original sample statistic? (p-value)
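The two steps above can be sketched for the tea versus coffee data. This is an illustration rather than the StatKey procedure itself. The seed and number of simulations are arbitrary, and the two coffee values of 0 are an assumption inferred from the reported coffee mean of 17.70. Since Ha: µT > µC, "as extreme" means a simulated difference at least as large as the observed one.

```python
import random

tea = [5, 11, 13, 18, 20, 47, 48, 52, 55, 56, 58]
# ASSUMPTION: the two 0 values are inferred from the reported mean of 17.70.
coffee = [0, 0, 3, 11, 15, 16, 21, 21, 38, 52]


def mean(values):
    return sum(values) / len(values)


observed = mean(tea) - mean(coffee)  # tea minus coffee, matching Ha

rng = random.Random(42)
pooled = tea + coffee
n_tea = len(tea)
n_sims = 10_000

as_extreme = 0
for _ in range(n_sims):
    # Step 1: one randomization sample, generated as if H0 were true.
    rng.shuffle(pooled)
    simulated = mean(pooled[:n_tea]) - mean(pooled[n_tea:])
    # Step 2: count it if it is as extreme as (or more extreme than) the
    # observed statistic; the small tolerance guards against rounding in ties.
    if simulated >= observed - 1e-9:
        as_extreme += 1

p_value = as_extreme / n_sims
print(f"observed difference = {observed:.2f}")
print(f"p-value (upper tail) = {p_value:.3f}")
```

The exact proportion changes from run to run and with the seed, so it should land near, but not exactly on, the value quoted from the StatKey demonstration.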

Green Tea Supplements
[Figure: distribution of the statistic if H0 is true, with the observed statistic and the p-value marked.]
(Demo this on StatKey first.)
If green tea supplements do not help prevent cancer, the chance of seeing results this extreme is only about 1 out of 2000 samples.
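The same idea applies to a difference in proportions. The sketch below is a hedged illustration, not the StatKey procedure. It assumes that, under H0, the 10 cancers (1 + 9) would have occurred regardless of treatment, so only the group labels are re-randomized. Since Ha: p1 < p2, the lower tail is the relevant one. The sketch works with counts because the difference in proportions falls exactly when the number of cancers in the green tea group falls, and integer comparisons avoid floating-point trouble with ties.

```python
import random

N_PER_GROUP = 30       # 60 men, half in each group
TOTAL_CANCERS = 10     # 1 (green tea) + 9 (placebo)
OBSERVED_TEA_CANCERS = 1

observed_diff = 1 / 30 - 9 / 30  # p-hat1 - p-hat2, about -0.267

# 1 = developed cancer, 0 = did not. Under H0 these outcomes are fixed.
outcomes = [1] * TOTAL_CANCERS + [0] * (2 * N_PER_GROUP - TOTAL_CANCERS)

rng = random.Random(0)
n_sims = 20_000
at_or_below = 0      # as extreme as, or more extreme than, observed
strictly_below = 0   # more extreme only (ties excluded)

for _ in range(n_sims):
    rng.shuffle(outcomes)
    # The first 30 shuffled outcomes play the role of the green tea group.
    tea_cancers = sum(outcomes[:N_PER_GROUP])
    if tea_cancers <= OBSERVED_TEA_CANCERS:
        at_or_below += 1
    if tea_cancers < OBSERVED_TEA_CANCERS:
        strictly_below += 1

p_inclusive = at_or_below / n_sims
p_strict = strictly_below / n_sims
print(f"observed difference = {observed_diff:.3f}")
print(f"proportion as extreme or more extreme = {p_inclusive:.4f}")
print(f"proportion strictly more extreme      = {p_strict:.4f}")
```

Because this statistic takes only a handful of distinct values, the result is sensitive to whether samples that tie the observed statistic are counted, and the two proportions can differ by an order of magnitude. The definition of the p-value given earlier ("as extreme as, or more extreme than") corresponds to the inclusive count. The strict count is shown only for comparison.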

p-value
Use the randomization distribution below to test H0: ρ = 0 vs Ha: ρ > 0.
Match the sample statistics r = 0.1, r = 0.3, and r = 0.5 with their p-values. Which sample statistic goes with which p-value?
Answer: r = 0.1 with p = 0.5, r = 0.3 with p = 0.15, and r = 0.5 with the smallest p-value.
Emphasize that as the sample statistic gets farther into the tail, the p-value gets smaller.

Alternative Hypothesis
Tea versus coffee: Ha: µT > µC (upper tail)
Green tea: Ha: p1 < p2 (lower tail)

Alternative Hypothesis
A one-sided alternative contains either > or <.
A two-sided alternative contains ≠.
The p-value is the proportion in the tail in the direction specified by Ha.
For a two-sided alternative, the p-value is twice the proportion in the smaller tail.
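These tail rules can be collected into one small helper. The function name `p_value` and the labels "greater", "less", and "two-sided" are naming choices made for this sketch, and the ten-value list is a toy stand-in for a randomization distribution, not data from either study.

```python
def p_value(null_stats, observed, alternative):
    """Proportion of null statistics as extreme as `observed`.

    alternative: "greater" (Ha uses >), "less" (Ha uses <),
                 or "two-sided" (Ha uses a not-equal sign).
    """
    n = len(null_stats)
    upper = sum(s >= observed for s in null_stats) / n  # right tail
    lower = sum(s <= observed for s in null_stats) / n  # left tail
    if alternative == "greater":
        return upper
    if alternative == "less":
        return lower
    if alternative == "two-sided":
        # Twice the smaller tail, capped at 1 because it is a proportion.
        return min(1.0, 2 * min(upper, lower))
    raise ValueError(f"unknown alternative: {alternative!r}")


# Toy stand-in for a randomization distribution centered at 0.
toy_null = [-3, -2, -1, -1, 0, 0, 1, 1, 2, 3]

print(p_value(toy_null, 2, "greater"))    # 2 of 10 values are >= 2
print(p_value(toy_null, 2, "less"))       # 9 of 10 values are <= 2
print(p_value(toy_null, 2, "two-sided"))  # twice the smaller tail
```

The cap at 1 is a design choice for statistics near the center of the distribution, where doubling the smaller tail could otherwise exceed 1.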

Tea versus Coffee
In the tea versus coffee example, suppose instead of asking whether tea boosts immunity, the study was designed to investigate whether tea or coffee is better for the immune system. State the alternative hypothesis:
Options: Ha: µT = µC; Ha: µT < µC; Ha: µT > µC; Ha: µT ≠ µC
Answer: Ha: µT ≠ µC. No specific direction is specified in the question of interest.

Tea versus Coffee
p-value = 2 × 0.026 = 0.052
When Ha contains ≠, the p-value is twice the proportion in the smaller tail. (In StatKey, you can equivalently click Two-Tail and add the two tails.)

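p-value and Ha
[Figure: three randomization distributions with the p-value region shaded.]
Upper-tail (right tail): H0: µ = 0, Ha: µ > 0, x̄ = 2
Lower-tail (left tail): H0: µ = 0, Ha: µ < 0, x̄ = −1
Two-tailed: H0: µ = 0, Ha: µ ≠ 0, x̄ = 2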

Warning: Check Order of Groups!
The p-value can be calculated based on the direction of the alternative hypothesis, as long as the order of the groups in Ha matches the order used when the statistic is calculated! As a check, remember that if the data support the alternative hypothesis, the p-value of a one-sided test should not be more than 0.5.
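A quick illustration of this warning, using the reported group means from the tea versus coffee example. The null distribution here is a stand-in: normal draws centered at 0 with a spread of 8, which is an assumption made only so the sketch is self-contained. The point is the comparison between the two orderings, not the specific p-values.

```python
import random

rng = random.Random(0)

# STAND-IN randomization distribution: centered at 0 (H0: muT = muC).
# The spread of 8 is an assumption chosen for illustration only.
null_stats = [rng.gauss(0, 8) for _ in range(10_000)]


def upper_tail(stats, observed):
    """Proportion of statistics at or above the observed value."""
    return sum(s >= observed for s in stats) / len(stats)


mean_tea, mean_coffee = 34.82, 17.70

# Ha: muT > muC, statistic computed as tea - coffee: the orders match.
p_matching = upper_tail(null_stats, mean_tea - mean_coffee)

# Same upper tail, but the statistic computed as coffee - tea: a mismatch.
p_reversed = upper_tail(null_stats, mean_coffee - mean_tea)

for label, p in [("tea - coffee", p_matching), ("coffee - tea", p_reversed)]:
    flag = "  <-- above 0.5: check the order of the groups" if p > 0.5 else ""
    print(f"{label}: p-value = {p:.3f}{flag}")
```

The data favor tea, so a one-sided p-value above 0.5 signals that the statistic and the alternative hypothesis were written with the groups in opposite orders.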

Summary
The randomization distribution shows what types of statistics would be observed, just by random chance, if the null hypothesis were true.
A p-value is the chance of getting a statistic as extreme as that observed, if H0 is true.
A p-value can be calculated as the proportion of statistics in the randomization distribution as extreme as (or more extreme than) the observed sample statistic.
