summary

Homoscedasticity http://blog.minitab.com/blog/statistics-and-quality-data-analysis/dont-be-a-victim-of-statistical-hippopotomonstrosesquipedaliophobia

Tests for homoscedasticity
$H_0: \sigma_1^2 = \sigma_2^2$
F-test of equality of variances (Hartley's test): $F_{n_L - 1,\, n_S - 1} = \frac{s_L^2}{s_S^2}$, where $s_L^2$ is the larger and $s_S^2$ the smaller sample variance.
The F-test is extremely sensitive to departures from normality. An alternative is Levene's test, performed on the absolute values of the deviations from the mean; its test statistic follows an F-distribution.
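A minimal sketch of both tests in R, using two hypothetical samples x and y (var.test() is built in; leveneTest() comes from the add-on car package):

# F-test of equality of variances (very sensitive to non-normality)
x <- rnorm(30, mean = 10, sd = 2)    # hypothetical sample 1
y <- rnorm(25, mean = 10, sd = 3)    # hypothetical sample 2
var.test(x, y)                       # H0: the two variances are equal

# Levene's test on the same data, assuming the 'car' package is installed
library(car)
d <- data.frame(value = c(x, y),
                group = factor(rep(c("x", "y"), times = c(length(x), length(y)))))
leveneTest(value ~ group, data = d, center = mean)   # absolute deviations from the group means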

Power of the test
The probability that the test correctly rejects the null hypothesis (H0) when it is false. Equivalently, it is the probability of correctly accepting the alternative hypothesis (Ha) when it is true – that is, the ability of a test to detect an effect if the effect actually exists.
Decision vs. state of the world:
H0 true, reject H0 → Type I error (false positive), probability α
H0 true, retain H0 → correct decision
H0 false, reject H0 → correct decision, power = 1 − β
H0 false, retain H0 → Type II error (false negative), probability β

What factors affect the power?
To increase the power of your test, you may do any of the following:
Increase the effect size (the difference between the null and alternative values) to be detected. The reasoning is that any test will have trouble rejecting the null hypothesis if the null hypothesis is only 'slightly' wrong; if the effect size is large, it is easier to detect and the null hypothesis will be soundly rejected.
Increase the sample size(s) – power analysis.
Decrease the variability in the sample(s).
Increase the significance level (α) of the test. The shortcoming of setting a higher α is that Type I errors become more likely, which may not be desirable.
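For example, the effect of sample size can be seen directly with the pwr package (the numbers are purely illustrative):

library(pwr)   # install.packages("pwr") if needed
# Power of a two-sample t-test for a medium effect (d = 0.5) at alpha = 0.05,
# evaluated for several group sizes
sapply(c(10, 20, 50, 100), function(n)
  pwr.t.test(n = n, d = 0.5, sig.level = 0.05, type = "two.sample")$power)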

new stuff

Effect size
When a difference is statistically significant, it does not necessarily mean that it is big, important, or helpful in decision-making. It simply means you can be confident that there is a difference. For example, suppose you evaluate the effect of solar eruptions on student knowledge (n = 2000). The mean score on the pretest was 84 out of 100; the mean score on the posttest was 83. Although the difference in scores is statistically significant (because of the large sample size), it is very small, suggesting that eruptions do not lead to a meaningful decrease in student knowledge.

Effect size
To know whether an observed difference is not only statistically significant but also practically important, you have to calculate its effect size. The effect size in our case is 84 − 83 = 1. To put effect sizes on a common scale, they are standardized: the raw difference is divided by a standard deviation (giving Cohen's d).
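A minimal sketch of this standardization in base R, with two hypothetical score vectors (Cohen's d with a pooled standard deviation):

pretest  <- c(80, 85, 88, 84, 83)    # hypothetical pretest scores
posttest <- c(79, 84, 86, 84, 82)    # hypothetical posttest scores
n1 <- length(pretest); n2 <- length(posttest)
pooled_sd <- sqrt(((n1 - 1) * var(pretest) + (n2 - 1) * var(posttest)) / (n1 + n2 - 2))
d <- (mean(pretest) - mean(posttest)) / pooled_sd   # standardized effect size (Cohen's d)
d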

Power analysis
To ensure that your sample size is big enough, you will need to conduct a power analysis. For any power calculation, you will need to know:
the type of test you plan to use (e.g., independent t-test),
the alpha value (usually 0.05),
the expected effect size,
the sample size you are planning to use (or the power you want, if you are solving for the sample size).
Because the effect size can only be calculated after you collect the data, you will have to use an estimate for the power analysis. Cohen suggests that for the t-test, d values of 0.2, 0.5, and 0.8 represent small, medium, and large effect sizes, respectively.

Power analysis in R (paired t-test)
install.packages("pwr")
library(pwr)
pwr.t.test(d = 0.8, power = 0.8, sig.level = 0.05, type = "paired", alternative = "two.sided")

     Paired t test power calculation
              n = 14.30278
              d = 0.8
      sig.level = 0.05
          power = 0.8
    alternative = two.sided
NOTE: n is number of *pairs*

Check for normality – histogram
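In R this is a one-liner; the rivers data set (lengths of 141 major North American rivers, in miles) is built in and is used on the next slides as well:

hist(rivers, breaks = 30,
     main = "Histogram of rivers", xlab = "river length (miles)")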

Check for normality – QQ-plot
qqnorm(rivers)
qqline(rivers)

Check for normality – tests
The graphical methods for checking normality still leave much to your own interpretation: show any of these plots to ten different statisticians and you may get ten different answers. Formal tests use H0: the data follow a normal distribution.
Shapiro-Wilk test:
> shapiro.test(rivers)
        Shapiro-Wilk normality test
data:  rivers
W = 0.6666, p-value < 2.2e-16

p-value < 2.2e-16 for rivers; for the log-transformed data, shapiro.test(log(rivers)) gives p-value = 3.945e-05.

Nonparametric statistics
For small samples from considerably non-normal distributions we use non-parametric tests. They make no assumption about the shape of the distribution and no assumption about its parameters (hence "non-parametric"). They are simple to carry out, although their theory is quite involved – we won't cover it at all. However, they are generally less powerful than their parametric counterparts, so if your data fulfill the normality assumptions, use parametric tests (t-test, F-test).

Nonparametric tests
If the normality assumption of the t-test is violated, its nonparametric alternative should be used. The nonparametric alternative to the t-test is the Wilcoxon test: wilcox.test()
http://stat.ethz.ch/R-manual/R-patched/library/stats/html/wilcox.test.html
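A minimal sketch with two small hypothetical samples; the same function also handles the paired case:

x <- c(1.8, 3.4, 2.6, 4.1, 2.9)      # hypothetical sample 1
y <- c(3.9, 4.4, 3.1, 5.0, 4.6)      # hypothetical sample 2
wilcox.test(x, y)                    # two-sample (Mann-Whitney) test
wilcox.test(x, y, paired = TRUE)     # paired Wilcoxon signed-rank test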

anova (analysis of variance)

A problem You're comparing three brands of beer.

A problem
You buy four bottles of each brand for the following prices:
Primátor  Kocour  Matuška
      15      39       65
      12      45       45
      14      48       32
      11      60       38
What do you think – which of these brands have significantly different prices?
No significant difference between any of these
Primátor and Kocour
Primátor and Matuška
Kocour and Matuška

t-test
We could do three t-tests to show whether there is a significant difference between these brands. How many t-tests would you need to compare four samples? 6. In general, comparing k samples pairwise requires k(k−1)/2 t-tests, so to compare 10 samples you would need 45 of them. That is a lot – we don't want to do a million t-tests. In this lesson you'll learn a simpler method: Analysis of Variance (ANOVA).

Multiple comparisons problem
If you make two comparisons and both null hypotheses are true, what is the chance that neither comparison will be statistically significant (α = 0.05)? 0.95 × 0.95 = 0.9025. And what is the chance that one or both comparisons will be statistically significant just by chance? 1.0 − 0.9025 = 0.0975, i.e. about 10%. For N independent comparisons this probability is in general $1 - 0.95^N$. So, for example, for 13 independent tests there is about a 50:50 chance of obtaining at least one false positive.
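The calculation is easy to reproduce in R:

N <- c(1, 2, 5, 13, 20, 100)
data.frame(N, at_least_one_FP = 1 - 0.95^N)   # ~0.49 for N = 13, ~0.99 for N = 100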

Multiple comparisons problem
Bennett et al., Journal of Serendipitous and Unexpected Results, 1, 1-5, 2010
http://www.graphpad.com/guides/prism/6/statistics/index.htm?beware_of_multiple_comparisons.htm

Correcting for multiple comparisons
Bonferroni correction – the simplest approach is to divide the α value by the number of comparisons N and declare a particular comparison statistically significant only when its p-value is less than α/N. For example, for 100 comparisons reject the null hypothesis in each one only if its p-value is less than 0.05/100 = 0.0005. However, this is a bit too conservative, and other approaches exist – see p.adjust() in R: "There seems no reason to use the unmodified Bonferroni correction because it is dominated by Holm's method."
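A small sketch of p.adjust() on a few made-up p-values:

p <- c(0.001, 0.012, 0.030, 0.047, 0.200)    # hypothetical raw p-values
p.adjust(p, method = "bonferroni")           # multiplies each p-value by length(p), capped at 1
p.adjust(p, method = "holm")                 # Holm's step-down method (the default)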

Main idea of ANOVA
To compare three or more samples, we can use the same ideas that underlie the t-test. The general form of the t-statistic is
$t = \frac{\bar{x}_1 - \bar{x}_2}{SE}$
Similarly, for three or more samples we put the variability between the sample means in the numerator and the error (the variability within the samples) in the denominator.

Variability between sample means

Variability within samples

ANOVA hypothesis
$H_0: \mu_1 = \mu_2 = \mu_3$
$H_1$: at least one pair of samples is significantly different
Follow-up multiple comparison steps – see which means are different from each other.

F ratio
$F = \frac{\text{between-group variability}}{\text{within-group variability}}$
As the between-group variability increases, the F-statistic increases, which leans more in favor of the alternative hypothesis that at least one pair of means is significantly different. As the within-group variability increases, the F-statistic decreases, which leans more in favor of the null hypothesis that the means are not significantly different.

Beer brands – a boxplot
[Figure: boxplots of the three brands' prices, with the sample means (13, 45, 48) and the grand mean (≈35) marked as $\bar{x}_P$, $\bar{x}_K$, $\bar{x}_M$ and $\bar{x}_G$.]

Between-group variability
SS – sum of squares; MS – mean square
SSB – sum of squares between groups; MSB – mean square between groups
$SSB = \sum_k n_k (\bar{x}_k - \bar{x}_G)^2$, with $df_B = k - 1$
$MSB = \frac{SSB}{df_B} = \frac{\sum_k n_k (\bar{x}_k - \bar{x}_G)^2}{k - 1}$
[Figure: the squared deviations $(\bar{x}_P - \bar{x}_G)^2$, $(\bar{x}_K - \bar{x}_G)^2$, $(\bar{x}_M - \bar{x}_G)^2$ of the sample means from the grand mean, shown on the boxplot.]

Within-group variability
SSW – sum of squares within groups; MSW – mean square within groups
$SSW = \sum_k \sum_i (x_i - \bar{x}_k)^2$, with $df_W = N - k$
$MSW = \frac{SSW}{df_W} = \frac{\sum_k \sum_i (x_i - \bar{x}_k)^2}{N - k}$

The summary of variabilities
$MSB = \frac{SSB}{df_B} = \frac{\sum_k n_k (\bar{x}_k - \bar{x}_G)^2}{k - 1}$
$MSW = \frac{SSW}{df_W} = \frac{\sum_k \sum_i (x_i - \bar{x}_k)^2}{N - k}$
Notation (for the beer-price table above):
$x_i$ … value of each data point
$\bar{x}_k$ … sample mean
$N$ … total number of data points
$k$ … number of samples
$n_k$ … number of data points in each sample
$\bar{x}_G$ … grand mean

F-ratio
$F_{df_B, df_W} = \frac{MSB}{MSW}$, where $df_B = k - 1$ and $df_W = N - k$

F-distribution

F distribution

Beer prices
$\bar{x}_G = 35.33$
$SSB = \sum_k n_k (\bar{x}_k - \bar{x}_G)^2 = 3011$, $df_B = k - 1 = 2$, $MSB = \frac{SSB}{df_B} = 1505.3$
$SSW = \sum_k \sum_i (x_i - \bar{x}_k)^2 = 862$, $df_W = N - k = 9$, $MSW = \frac{SSW}{df_W} = 95.78$
$F_{2,9} = \frac{MSB}{MSW} = 15.72$, critical value $F^*_{2,9} = 4.25$
Since 15.72 > 4.25, we reject H0 – at least one pair of mean prices is significantly different.
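These numbers can be reproduced step by step in R. The price table is reconstructed from the slides; one value (a second 45) is inferred from the grand mean of 35.33, so treat the exact data as an assumption:

primator <- c(15, 12, 14, 11)
kocour   <- c(39, 45, 48, 60)        # reconstructed; the 45 is the inferred value
matuska  <- c(65, 45, 32, 38)

groups <- list(primator, kocour, matuska)
x_G    <- mean(unlist(groups))                                             # grand mean, 35.33
SSB    <- sum(sapply(groups, function(g) length(g) * (mean(g) - x_G)^2))   # ~3011
SSW    <- sum(sapply(groups, function(g) sum((g - mean(g))^2)))            # 862
df_B   <- length(groups) - 1                                               # 2
df_W   <- length(unlist(groups)) - length(groups)                          # 9
F_stat <- (SSB / df_B) / (SSW / df_W)                                      # ~15.72
c(F_stat = F_stat, F_crit = qf(0.95, df_B, df_W))                          # critical value ~4.26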

[Figure: density curves of the F(9,2) and F(2,9) distributions.]

Beer brands – ANOVA
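A sketch of the same analysis with R's built-in aov(), using the reconstructed prices (same caveat about the inferred value as above):

price <- c(15, 12, 14, 11,  39, 45, 48, 60,  65, 45, 32, 38)
brand <- factor(rep(c("Primator", "Kocour", "Matuska"), each = 4))
fit   <- aov(price ~ brand)
summary(fit)        # F(2, 9) ≈ 15.7, p < 0.01 -> reject H0: not all mean prices are equal
TukeyHSD(fit)       # follow-up pairwise comparisons between the brands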