Summary.


1 summary

2 Central limit theorem

3 Statistical inference
If we can’t conduct a census, we collect data from a sample of the population. Goal: draw conclusions about that population.

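4 Confidence interval
for $n \ge 30$: $\bar{x} \pm Z \times \frac{s}{\sqrt{n}}$
for $n < 30$: $\bar{x} \pm t_{n-1} \times \frac{s}{\sqrt{n}}$
$Z$ or $t_{n-1}$ is the critical value (Czech: kritická hodnota). The term after the $\pm$ sign, the critical value times $\frac{s}{\sqrt{n}}$, is the margin of error (Czech: možná odchylka).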

5 The margin of error depends on
margin of error = (critical value) × $\frac{s}{\sqrt{n}}$
The margin of error depends on:
the critical value, which is set by the confidence level (95% is common)
s or σ, the variability of the data
the sample size $n$
The margin of error measures nothing other than chance variation. It does not measure any bias or errors that arise during the process, and it says nothing about the correctness of your data!
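As a rough illustration of how these three ingredients interact, the sketch below computes the margin of error for the large-sample (Z) case only. The function name `margin_of_error` and the input values are assumptions chosen for illustration; they are not data from the lecture.

```python
from math import sqrt
from statistics import NormalDist

def margin_of_error(s, n, confidence=0.95):
    """Margin of error = critical value * s / sqrt(n), large-sample (Z) case."""
    # Two-sided critical value: leave (1 - confidence) / 2 in each tail.
    z_crit = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return z_crit * s / sqrt(n)

# Hypothetical numbers, for illustration only.
moe_30 = margin_of_error(s=0.76, n=30)
moe_120 = margin_of_error(s=0.76, n=120)
moe_30_99 = margin_of_error(s=0.76, n=30, confidence=0.99)

print(f"n=30,  95%: {moe_30:.3f}")
print(f"n=120, 95%: {moe_120:.3f}")    # four times the sample, half the margin
print(f"n=30,  99%: {moe_30_99:.3f}")  # higher confidence, wider margin
```

Quadrupling the sample size halves the margin, while raising the confidence level widens it. None of this says anything about bias in how the data were collected.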

6 hypothesis testing

7 Aim of hypothesis testing
decision making

8 Engagement self-assessment
Hopefully, you like this course so far. How to measure this? On a scale from 1 to 10, self-report how engaged you think you are during the lecture (1 is the lowest value, 10 is the highest value).

9 Engagement distribution

10 Engagement distribution
𝜇=7.8 𝜎=0.76

11 Engagement distribution
$n = 30$, $M = \mu = 7.8$, $SE = \frac{\sigma}{\sqrt{n}} = \frac{0.76}{\sqrt{30}} = 0.14$ (population: $\mu = 7.8$, $\sigma = 0.76$)

12 Engagement distribution
$n = 30$, $M = \mu = 7.8$, $SE = \frac{\sigma}{\sqrt{n}} = \frac{0.76}{\sqrt{30}} = 0.14$ (population: $\mu = 7.8$, $\sigma = 0.76$)

13 Hypothesis testing song
$n = 30$, $\bar{x} = 8.2$ [Figure: sample mean 8.2 marked against the population mean 7.8]

14 Did the song help?
$n = 30$, $M = 7.8$, $SE = 0.14$
The mean engagement of 30 randomly chosen students from the population of 100 students can lie anywhere on the blue curve. What is the probability of getting a value of at least 8.2? Use Z-tables.
$Z = \frac{8.2 - 7.8}{0.14} \approx 2.85$; the corresponding probability is 0.0022.
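The table lookup can be cross-checked numerically. The following is a minimal sketch using Python's standard library; the variable names are illustrative. Because it works with the unrounded standard error instead of 0.14, its Z-score and probability differ slightly from the slide's values.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma = 7.8, 0.76   # population parameters from the slides
n, x_bar = 30, 8.2      # sample size and observed sample mean

se = sigma / sqrt(n)               # standard error of the mean
z = (x_bar - mu) / se              # Z-score of the sample mean
p_upper = 1 - NormalDist().cdf(z)  # P(sample mean >= 8.2) in the no-song population

print(f"SE = {se:.2f}, Z = {z:.2f}, p = {p_upper:.4f}")
```

The result is a Z-score a little under 2.9 and an upper-tail probability of roughly 0.002, in line with the table-based answer.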

15 Situation on the battlefield
From the no-song population, there is a 0.22% chance that a random sample (n = 30) will have a mean engagement of 8.2 or more.
The 30 students who were exposed to the song showed a mean engagement of 8.2.
If there were no difference between the no-song and song populations, it would be rather unlikely (0.22% is really not much) that we happened to choose 30 students with a mean engagement of 8.2 or more.
Conclusion: Because of such a low probability, we interpret 8.2 as a significant increase over 7.8, caused by the undeniable pedagogical qualities of the 'Hypothesis testing song'.

16 Hypothesis testing
So, where are the hypotheses?
Null hypothesis H0: The song did not cause any observable effect.
H0 generally: there is no significant difference between the current population parameters and the new population parameters after some sort of intervention ($\mu = \mu_I$).
Alternative (research) hypothesis H1 (also written Ha): The song improved students' engagement.
H1 generally: there is a significant difference between the current population parameters and the new population parameters after an intervention.

17 Null hypothesis
H0 states that nothing happened: there is no change, no difference.
Does the song improve students' engagement? H0: students' engagement is the same regardless of the song.
Does Diet Coke taste different from regular Coke? H0: there is no difference in taste between Diet Coke and regular Coke.
Does this drug increase blood pressure? H0: this drug causes no increase in blood pressure.
Is this a fair coin? H0: the coin is fair.

18 Alternative hypothesis
The alternative hypothesis states the opposite of the null and is usually the hypothesis you are trying to prove.
Does the song improve students' engagement? Ha: the song improved students' engagement.
Does Diet Coke taste different from regular Coke? Ha: Diet Coke tastes much worse/better than regular Coke.
Does this drug increase blood pressure? Ha: this drug leads to an increase in blood pressure.
Is this a fair coin? Ha: the coin is unfair.

19 Hypothesis testing
Formulate the null hypothesis. There is no statistical testing without the null. You assume that H0 is true, i.e., you assume that the Hypothesis testing song did not influence students' engagement. Then you must find evidence for rejecting or not rejecting the null.
Collect the data.
Population data, no song: $\mu = 7.8$, $\sigma = 0.76$
Sample data, with song: $n = 30$, $\bar{x} = 8.2$

20 Hypothesis testing
Then you calculate the probability of observing your sample results (or more extreme ones) given that the null hypothesis is true.
Null hypothesis, which we assume to be true: teaching with and without the song produces the same effect. Under this assumption, the higher mean of the song sample ($\bar{x} = 8.2$) compared to the no-song mean ($\mu = 7.8$) is caused only by the variability of random sampling.

21 Hypothesis testing
If there is really no difference between the two teaching methods (with and without song) in the population (i.e., given that the null hypothesis is true), how likely would it be to see a difference in the mean engagement between the two teaching methods as large as (or larger than) the one observed in our sample?
Given the null is true (both populations are the same), there is only a 0.22% chance that random sampling would lead to a mean engagement of 8.2 or more. The value 0.0022 is referred to as the p-value.

22 Levels of likelihood
Conclusion: 8.2 is a statistically significant increase over 7.8.
We want a crisp decision: which probability is still considered 'unlikely' and which is considered 'likely'?

23 Levels of likelihood - 𝛼 levels
0.05 (5%), 0.01 (1%), 0.001 (0.1%)
If the probability of getting a sample mean is less than 0.05 (or, more strictly, 0.01 or 0.001), it is usually considered unlikely. These thresholds are called the 𝜶 levels, or significance levels (Czech: hladiny významnosti). The 𝛼 level is our criterion for deciding whether something is likely or unlikely.

24 Lingo
The probability by which we decide whether the result is likely or unlikely is called the p-value. We compare the p-value to the α level: if the p-value is less than the α level, the result is considered unlikely.
Alternatively, we can calculate the Z-score of the result and compare it with the Z-score corresponding to the α level (the so-called Z-critical value).

25 Z-critical value
If the probability of obtaining a particular sample mean is less than the alpha level, the mean falls in the tail called the critical region. The boundary of that region is the Z-critical value, Z*.
If the Z-score of the sample mean is greater than the Z-critical value, we have evidence that this mean differs from that of the regular population (the population that did not watch the musical lesson).

26 Critical regions
What is the Z-critical value for 𝛼 = 0.05? Using the Z-table, find the Z-value for a probability of 0.95, which is 1.65.
What is the Z-critical value for 𝛼 = 0.01? 2.33
What is the Z-critical value for 𝛼 = 0.001? 3.08
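The same lookup can be done with the inverse of the standard normal distribution function instead of a printed table. This is a small sketch with Python's standard library; the helper name `z_critical` is an assumption made for illustration.

```python
from statistics import NormalDist

def z_critical(alpha):
    """Upper-tail critical value: the Z with probability alpha to its right."""
    return NormalDist().inv_cdf(1 - alpha)

for alpha in (0.05, 0.01, 0.001):
    print(f"alpha = {alpha}: Z* = {z_critical(alpha):.3f}")
```

The computed values (about 1.645, 2.326 and 3.090) can differ in the last digit from values read off a Z-table, which lists probabilities rounded to four decimals.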

27 We take the mean from a sample of size $n$. Then we calculate its Z-score
$Z = \frac{\bar{x} - \mu}{\sigma / \sqrt{n}}$ (this Z-score is the test statistic)
Suppose we get a Z-score of 1.82. We say that this is significant at $p < 0.05$. 1.82 is somewhere in the red region in the previous picture. The probability of obtaining this sample mean is less than 5% (0.05, the alpha level), but not less than 1%.

28 Significance quiz
𝛼 level    Z-critical value
0.05       1.65
0.01       2.33
0.001      3.08

Z-score    Significant at
3.14       p < ?
2.07       p < ?
2.57       p < ?
14.31      p < ?

29 Significance quiz
𝛼 level    Z-critical value
0.05       1.65
0.01       2.33
0.001      3.08

Z-score    Significant at
3.14       p < 0.001
2.07       p < 0.05
2.57       p < 0.01
14.31      p < 0.001
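The quiz logic, finding the strictest 𝛼 level whose critical value the Z-score exceeds, can be sketched as follows. The function name `significant_at` is hypothetical, and the critical values are computed for a one-tailed (upper) test rather than taken from the table.

```python
from statistics import NormalDist

ALPHA_LEVELS = (0.001, 0.01, 0.05)   # strictest first

def significant_at(z):
    """Strictest alpha level at which a one-tailed (upper) Z-score is significant."""
    for alpha in ALPHA_LEVELS:
        z_star = NormalDist().inv_cdf(1 - alpha)  # upper-tail critical value
        if z > z_star:
            return f"p < {alpha}"
    return "not significant"

for z in (3.14, 2.07, 2.57, 14.31):
    print(f"{z:>5}: {significant_at(z)}")
```

Checking the strictest level first means each Z-score is reported at the smallest 𝛼 it satisfies, which is how the quiz answers are stated.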

30 Quiz: Sampling distribution
Focus on 𝛼 = 0.05. Which of the following are true, if the null hypothesis were true?
If the probability of getting a particular sample mean is less than 𝛼, it is unlikely to occur.
If a sample mean has a Z-score greater than Z*, it is “unlikely” to occur.
If the probability of getting a particular sample mean is “unlikely”, the sample mean is in the orange region.
The alpha level corresponds to the orange region.

31 Another engagement score
After the sample of 30 students heard the 'Hypothesis testing song', their mean engagement score is 7.13. Just to remind you, the population parameters are $\mu = 7.8$, $\sigma = 0.76$.
Z-score of this sample mean: $Z = \frac{7.13 - 7.8}{0.76 / \sqrt{30}} = -4.83$
The Z-score is -4.83. What does it mean?

32

33 Two-tailed test (Czech: oboustranný test)
The mean engagement score of 7.13 is significant at p < 0.05. For 𝛼 = 0.05, what is the Z-critical value?

34 One-tailed, two-tailed
One-tailed: critical region probability = 𝛼. The critical region can also be on the left.
𝜶 level    Z-critical value
0.05       ±1.65
0.01       ±2.33
0.001      ±3.08

Two-tailed: critical region probability = 𝛼, split into 𝛼/2 in each tail.
𝜶 level    Z-critical value
0.05       ±1.96
0.01       ±2.57
0.001      ±3.27

In the critical region, we (most likely) did not get the sample mean by chance.
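For the two-tailed case, the critical value leaves 𝛼/2 in each tail and the p-value counts both directions. The sketch below applies this to the sample mean of 7.13 from the earlier slide; the helper names `two_tailed_critical` and `two_tailed_p` are assumptions made for illustration.

```python
from math import sqrt
from statistics import NormalDist

std_normal = NormalDist()

def two_tailed_critical(alpha):
    """Positive critical value with alpha / 2 in each tail."""
    return std_normal.inv_cdf(1 - alpha / 2)

def two_tailed_p(z):
    """Probability of a Z-score at least as far from 0 as z, in either direction."""
    return 2 * (1 - std_normal.cdf(abs(z)))

# Sample mean 7.13 against the population from the slides (mu = 7.8, sigma = 0.76, n = 30).
z = (7.13 - 7.8) / (0.76 / sqrt(30))

for alpha in (0.05, 0.01, 0.001):
    z_star = two_tailed_critical(alpha)
    print(f"alpha = {alpha}: reject H0 if |Z| > {z_star:.3f} -> {abs(z) > z_star}")
print(f"Z = {z:.2f}, two-tailed p = {two_tailed_p(z):.2g}")
```

The computed critical values (about 1.960, 2.576 and 3.291) are slightly larger than the table-rounded 2.57 and 3.27. A Z-score of -4.83 lies beyond all of them, so the conclusion does not depend on the rounding.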

35 One-tailed and two-tailed
one-tailed (directional) test
two-tailed (non-directional) test

36 One-tailed or two-tailed
In general, we use two-tailed tests. One exception to this general rule is when we’re comparing a new treatment with an established treatment. In such cases we often only care if the new treatment is better than the old one. And we would use a one-tailed directional test.

37 Alternative hypothesis
$\mu \ne \mu_I$: two-tailed test
$\mu < \mu_I$ or $\mu > \mu_I$: one-tailed test

38 Quiz: reject the null
What does it mean to reject the null using a two-tailed test?
Our sample mean falls within/outside the critical region.
The Z-score of our sample mean is less than/greater than the Z-critical value.
The probability of obtaining the sample mean is less than/greater than the alpha level.

39 Four steps of hypothesis testing
1. Formulate the null and the alternative hypothesis (this includes choosing a one- or two-tailed test).
2. Select the significance level α, the criterion by which we decide whether to reject the claim being tested.
--- COLLECT DATA ---
3. Compute the p-value. The p-value is the probability that the data would be at least as extreme as those observed, if the null hypothesis were true.
4. Compare the p-value to the α level. If p ≤ α, the observed effect is statistically significant, the null is rejected, and the alternative hypothesis is accepted.
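The four steps can be put together in one small function. This is a sketch under the lecture's assumptions (known population σ, normally distributed sample mean), not a general-purpose implementation; the function name `z_test` and its `alternative` parameter are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def z_test(x_bar, mu, sigma, n, alpha=0.05, alternative="two-sided"):
    """One-sample Z-test; returns (z, p_value, reject_null)."""
    if alternative not in ("two-sided", "greater", "less"):
        raise ValueError("alternative must be 'two-sided', 'greater' or 'less'")
    z = (x_bar - mu) / (sigma / sqrt(n))   # test statistic
    cdf = NormalDist().cdf(z)
    if alternative == "greater":           # H1: mu_I > mu
        p = 1 - cdf
    elif alternative == "less":            # H1: mu_I < mu
        p = cdf
    else:                                  # H1: mu_I != mu
        p = 2 * min(cdf, 1 - cdf)
    return z, p, p <= alpha                # step 4: compare p with alpha

# Steps 1-2: H0 "the song has no effect", H1 "the song increases engagement", alpha = 0.05.
# Steps 3-4, with the numbers from the slides:
z, p, reject = z_test(x_bar=8.2, mu=7.8, sigma=0.76, n=30, alpha=0.05, alternative="greater")
print(f"Z = {z:.2f}, p = {p:.4f}, reject H0: {reject}")
```

Fixing the alternative and α before calling the function mirrors the order of the steps: both are chosen before the data are collected.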

