SUMMARY Hypothesis testing. Self-engagement assessment.


SUMMARY Hypothesis testing

Self-engagement assessment

Null hypothesis
Two populations: no song, song.
Null hypothesis: we assume that the populations without and with the song are the same. At the beginning of our calculations, we assume the null hypothesis is true.

Hypothesis testing song Because of such a low probability, we interpret 8.2 as a significant increase over 7.8 caused by undeniable pedagogical qualities of the 'Hypothesis testing song'.

Four steps of hypothesis testing
1. Formulate the null and the alternative hypothesis (this includes choosing a one- or two-directional test).
2. Select the significance level α, the criterion by which we decide whether to reject the null hypothesis.
--- COLLECT DATA ---
3. Compute the p-value. The p-value is the probability that the data would be at least as extreme as those observed, if the null hypothesis were true.
4. Compare the p-value to the α-level. If p ≤ α, the observed effect is statistically significant and the null is rejected in favor of the alternative hypothesis.
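The four steps can be sketched in R for the song example. This is only an illustration: the means 7.8 and 8.2 come from the slides, but the population standard deviation `sigma` and the sample size `n` are not given there, so the values below are assumptions chosen to make the arithmetic easy to follow.

```r
# Step 1: H0: mu = 7.8 (the song has no effect); H1: mu > 7.8 (one-tailed).
mu0   <- 7.8   # population mean without the song (from the slides)
xbar  <- 8.2   # sample mean with the song (from the slides)
sigma <- 1     # ASSUMED population standard deviation (hypothetical)
n     <- 100   # ASSUMED sample size (hypothetical)

# Step 2: select the significance level.
alpha <- 0.05

# Step 3: compute the Z statistic and the one-tailed p-value.
z       <- (xbar - mu0) / (sigma / sqrt(n))
p_value <- pnorm(z, lower.tail = FALSE)

# Step 4: compare the p-value to alpha.
reject_h0 <- p_value <= alpha
cat("z =", round(z, 2), " p =", signif(p_value, 3), " reject H0:", reject_h0, "\n")
```

With a different assumed `sigma` or `n` the decision can change, which is why both must be stated before the data are interpreted.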

One-tailed and two-tailed
one-tailed (directional) test
two-tailed (non-directional) test
Z-critical value, what is it?
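As a small illustration of the Z-critical value, the standard normal quantile function `qnorm()` gives the cut-off that separates the rejection region from the rest of the distribution. The significance level of 0.05 is an assumption; the slides do not fix one here.

```r
alpha <- 0.05  # ASSUMED significance level

# One-tailed test: all of alpha sits in one tail.
z_crit_one_tailed <- qnorm(1 - alpha)

# Two-tailed test: alpha is split evenly between both tails.
z_crit_two_tailed <- qnorm(1 - alpha / 2)

cat("one-tailed:", round(z_crit_one_tailed, 3),
    " two-tailed: +/-", round(z_crit_two_tailed, 3), "\n")
```

The two-tailed cut-off is further from zero, so the same Z value can be significant in a one-tailed test and not in a two-tailed one.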

NEW STUFF

Decision errors
Hypothesis testing is prone to misinterpretation. It's possible that the students selected for the musical lesson were already more engaged, and that we wrongly attributed their high engagement score to the song. Of course, it's unlikely to select, just by chance, a sample with a mean engagement of 8.2. The probability of doing so is pretty low, so we concluded that chance was not the explanation. But it's still possible to have randomly obtained a sample with such a mean.

Four possible things can happen
                      Decision
State of the world    Reject H0    Retain H0
H0 true               1            3
H0 false              2            4
In which cases did we make a wrong decision?

Four possible things can happen
                      Decision
State of the world    Reject H0    Retain H0
H0 true               1
H0 false                           4
In which cases did we make a wrong decision?

Four possible things can happen
                      Decision
State of the world    Reject H0       Retain H0
H0 true               Type I error
H0 false                              Type II error

Type I error
When there really is no difference between the populations, random sampling can lead to a difference large enough to be statistically significant. You reject the null, but you shouldn't. False positive: the person doesn't have the disease, but the test says they do.
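A quick simulation can make the Type I error concrete. In this sketch both groups are drawn from the same normal population, so the null hypothesis is true by construction and every rejection is a false positive. The population mean and standard deviation, the group size, and the number of repetitions are all assumptions chosen for illustration.

```r
set.seed(1)       # fixed seed so the run is reproducible
alpha  <- 0.05    # ASSUMED significance level
n      <- 30      # ASSUMED size of each group
n_sims <- 10000   # number of simulated experiments

# Both groups come from the SAME population, so H0 is true.
p_values <- replicate(n_sims, {
  x <- rnorm(n, mean = 7.8, sd = 1)
  y <- rnorm(n, mean = 7.8, sd = 1)
  t.test(x, y)$p.value
})

# Share of experiments that (wrongly) rejected H0.
type1_rate <- mean(p_values <= alpha)
cat("Observed Type I error rate:", type1_rate, "\n")
```

The observed rate should land close to `alpha`, which is exactly what the significance level controls.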

Type II error
When there really is a difference between the populations, random sampling can lead to a difference small enough to be not statistically significant. You do not reject the null, but you should. False negative: the person has the disease, but the test doesn't pick it up.
Type I and II errors are theoretical concepts. When you analyze your data, you don't know if the populations are identical. You only know the data in your particular samples. You will never know whether you made one of these errors.

The trade-off
If you set the α level to a very low value, you will make few Type I errors. But by reducing the α level you also increase the chance of a Type II error.
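The trade-off can be shown numerically for a one-tailed Z-test. For a true increase `delta`, the probability of a Type II error is the chance that Z falls below the critical value even though the alternative is true. All numbers below (`delta`, `sigma`, `n`, and the list of α levels) are hypothetical.

```r
# One-tailed Z-test, H1: mu > mu0. All numbers are ASSUMED for illustration.
delta <- 0.4   # true increase in the population mean
sigma <- 1     # population standard deviation
n     <- 30    # sample size

alphas <- c(0.10, 0.05, 0.01, 0.001)

# P(Type II error) = P(Z < z_crit - delta / (sigma / sqrt(n))) when H1 is true.
betas <- pnorm(qnorm(1 - alphas) - delta / (sigma / sqrt(n)))

print(data.frame(alpha = alphas, type2_error = round(betas, 3)))
```

Reading the table from top to bottom, each stricter α comes with a larger Type II error probability, as long as the sample size stays the same.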

Clinical trial for a novel drug
A drug that should treat a disease for which there exists no therapy.
If the result is statistically significant, the drug will be marketed.
If the result is not statistically significant, work on the drug will cease.
Type I error: treat future patients with an ineffective drug.
Type II error: cancel the development of a functional drug for a condition that is currently not treatable.
Which error is worse? I would say the Type II error. To reduce its risk, it makes sense to set α = 0.10 or even higher.
Harvey Motulsky, Intuitive Biostatistics

Clinical trial for a me-too drug
A drug that should treat a disease for which there already exists another therapy.
Again, if the result is statistically significant, the drug will be marketed.
Again, if the result is not statistically significant, work on the drug will cease.
Type I error: treat future patients with an ineffective drug.
Type II error: cancel the development of a functional drug for a condition that can be treated adequately with existing drugs.
Thinking scientifically (not commercially), I would minimize the risk of a Type I error (set α to a very low value).
Harvey Motulsky, Intuitive Biostatistics

Engagement example, n = 30
Z = 0.79

Engagement example, n = 30
Decision: Reject H0 or Retain H0. State of the world: H0 true or H0 false. Which of these four quadrants represents the result of our hypothesis test?

Engagement example, n = 30
Decision: Reject H0 or Retain H0. State of the world: H0 true (X) or H0 false. Which of these four quadrants represents the result of our hypothesis test?

Engagement example, n = 50
Z = 1.02

Engagement example, n = 50
Decision: Reject H0 or Retain H0. State of the world: H0 true or H0 false. Which of these four quadrants represents the result of our hypothesis test?

Engagement example, n = 50
Decision: Reject H0 or Retain H0. State of the world: H0 true (X) or H0 false. Which of these four quadrants represents the result of our hypothesis test?

Population of students that did not attend the musical lesson: parameters are known.
Population of students that did attend the musical lesson: sample statistic is known.

Test statistic
Z-test

New situation
The average engagement score in the population of 100 students is 7.5. A sample of 50 students was exposed to the musical lesson. Their mean engagement score was 7.72, with a standard deviation of 0.6.
DECISION: Does a musical performance lead to a change in the students' engagement? Answer YES/NO.
Set up a hypothesis test, please.

Hypothesis test

Formulate the test statistic
... but this is unknown!
Population of students that did not attend the musical lesson: known.
Population of students that did attend the musical lesson: unknown, we only have a sample.

t-statistic
one-sample t-test

t-distribution

One-sample t-test
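A sketch of the one-sample t-test for the "New situation" numbers (population mean 7.5, sample of 50 students, sample mean 7.72, sample standard deviation 0.6). Only summary statistics are given, so `t.test()` cannot be called directly and the statistic is computed by hand. The two-tailed alternative follows from the question about a "change"; the significance level of 0.05 is an assumption.

```r
mu0   <- 7.5    # population mean (from the slides)
xbar  <- 7.72   # sample mean (from the slides)
s     <- 0.6    # sample standard deviation (from the slides)
n     <- 50     # sample size (from the slides)
alpha <- 0.05   # ASSUMED significance level

# t statistic: the sample s.d. replaces the unknown population s.d.
t_stat <- (xbar - mu0) / (s / sqrt(n))
dof    <- n - 1   # degrees of freedom

# Two-tailed p-value and critical value from the t-distribution.
p_value <- 2 * pt(abs(t_stat), df = dof, lower.tail = FALSE)
t_crit  <- qt(1 - alpha / 2, df = dof)

cat("t =", round(t_stat, 3), " df =", dof,
    " t_crit = +/-", round(t_crit, 3),
    " reject H0:", p_value <= alpha, "\n")
```

If the raw scores were available, `t.test(scores, mu = 7.5)` would perform the same test.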

Quiz

Z-test vs. t-test

Typical example of one-sample t-test

Dependent t-test for paired samples
Two samples are dependent when the same subject takes the test twice. The test for this case is the paired t-test. This is a two-sample test, as we work with two samples.
Examples of such situations:
Each subject is assigned to two different conditions (e.g., use a QWERTZ keyboard and an AZERTY keyboard and compare the error rates).
Pre-test … post-test.
Growth over time.

Example
Each student (student 1, student 2, …, student n) is measured twice: without the song and with the song.

Do the hypothesis test
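A minimal sketch of the paired t-test in R. The scores below are invented for illustration (the slides' data are not in the transcript); each position in the two vectors belongs to the same student.

```r
# HYPOTHETICAL engagement scores for the same eight students.
no_song <- c(7.1, 7.9, 8.0, 6.8, 7.5, 8.2, 7.4, 7.7)
song    <- c(7.6, 8.1, 8.4, 7.0, 8.1, 8.3, 7.9, 8.0)

# Paired t-test: the vectors must be in the same subject order.
paired <- t.test(song, no_song, paired = TRUE)

# Equivalent view: a one-sample t-test on the within-student differences.
on_diff <- t.test(song - no_song, mu = 0)

print(paired)
```

Both calls give the same t statistic and p-value, which shows that the paired test only looks at the differences within each student.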

Dependent samples
E.g., give one person two different conditions to see how they react. Maybe one control and one treatment, or two types of treatment.
Advantages: we can use fewer subjects; cost-effective; less time-consuming.
Disadvantages: carry-over effects; order may influence results.

Independent samples

This is true only if two samples are independent!

Independent samples

An example
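For independent samples, a sketch with invented data follows. The two groups contain different students and need not have the same size. By default `t.test()` runs Welch's version, which does not assume equal variances; `var.equal = TRUE` requests the classic pooled-variance test.

```r
# HYPOTHETICAL scores from two DIFFERENT groups of students.
no_song <- c(7.1, 7.9, 8.0, 6.8, 7.5, 8.2, 7.4, 7.7, 7.3)
song    <- c(7.6, 8.1, 8.4, 7.0, 8.1, 8.3, 7.9, 8.0)

welch  <- t.test(song, no_song)                    # default: unequal variances
pooled <- t.test(song, no_song, var.equal = TRUE)  # pooled variance

cat("Welch:  df =", round(welch$parameter, 2),
    " p =", signif(welch$p.value, 3), "\n")
cat("Pooled: df =", pooled$parameter,
    " p =", signif(pooled$p.value, 3), "\n")
```

The pooled test uses n1 + n2 - 2 degrees of freedom; Welch's approximation never gives more than that.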

Summary of t-tests two-sample tests

F-test of equality of variances source: Wikipedia
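In R the F-test of equality of variances is available as `var.test()`. The sketch below uses invented data; the test statistic is simply the ratio of the two sample variances.

```r
# HYPOTHETICAL scores from two independent groups.
no_song <- c(7.1, 7.9, 8.0, 6.8, 7.5, 8.2, 7.4, 7.7, 7.3)
song    <- c(7.6, 8.1, 8.4, 7.0, 8.1, 8.3, 7.9, 8.0)

# H0: both populations have the same variance.
f_test <- var.test(song, no_song)

cat("F =", round(f_test$statistic, 3),
    " p =", signif(f_test$p.value, 3), "\n")
```

Note that this test relies on normality of both populations, so it is worth checking that assumption before using the result to choose between the pooled and Welch t-tests.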

t-test in R
t.test()
Let's have a look at the R manual. See my website for a link to a PDF explaining the various t-tests in R (with examples).

Assumptions
1. Unpaired t-tests are highly sensitive to violation of the independence assumption.
2. The populations the samples come from should be approximately normal. This is less important for large sample sizes.
What to do if these assumptions are not fulfilled:
1. Use a paired t-test.
2. Let's see further.

Check for normality – histogram

Check for normality – QQ-plot
qqnorm(rivers)
qqline(rivers)

Check for normality – tests
The graphical methods for checking data normality still leave much to your own interpretation. If you show any of these plots to ten different statisticians, you can get ten different answers.
H0: Data follow a normal distribution.
Shapiro-Wilk test: shapiro.test(rivers)
Shapiro-Wilk normality test
data: rivers
W = , p-value < 2.2e-16
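The Shapiro-Wilk result can also be used programmatically. The sketch below stores the test object and compares its p-value with a significance level; `rivers` is the dataset that ships with R, and the level of 0.05 is an assumption.

```r
alpha <- 0.05                  # ASSUMED significance level

# 'rivers' comes with R (datasets package): lengths of major rivers.
res <- shapiro.test(rivers)

# H0: the data follow a normal distribution.
normality_plausible <- res$p.value > alpha

cat("p-value:", signif(res$p.value, 3),
    " normality plausible:", normality_plausible, "\n")
```

A small p-value here means rejecting normality, so for `rivers` the t-test's normality assumption is in doubt.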

Nonparametric statistics
Nonparametric tests are meant for small samples from considerably non-normal distributions. They make no assumption about the shape of the distribution and no assumption about its parameters (thus they are called nonparametric). They are simple to do, but their theory is extremely complicated, so we won't cover it at all. However, they are less powerful than their parametric counterparts. So if your data fulfill the normality assumptions, use parametric tests (t-test, F-test).

Nonparametric tests
If the normality assumption of the t-test is violated and the sample sizes are too small, then its nonparametric alternative should be used. The nonparametric alternative to the t-test is the Wilcoxon test.
wilcox.test()
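A minimal sketch of `wilcox.test()` on two independent groups. The data are invented and chosen without tied values, so that R can compute an exact p-value without warnings.

```r
# HYPOTHETICAL scores from two independent groups (no tied values).
no_song <- c(7.1, 7.9, 8.0, 6.8, 7.5, 8.2, 7.4, 7.7, 7.3)
song    <- c(7.6, 8.1, 8.4, 7.0, 8.15, 8.3, 7.95, 8.05)

# Wilcoxon rank-sum test (also known as the Mann-Whitney test).
rank_sum <- wilcox.test(song, no_song)

cat("W =", rank_sum$statistic,
    " p =", signif(rank_sum$p.value, 3), "\n")
```

For dependent samples of equal length, passing `paired = TRUE` switches to the Wilcoxon signed-rank test, the counterpart of the paired t-test.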