Quantitative methods and R – (2)
LING115, December 2, 2009

Two-sample t-test
Check if the means of two samples are different:
– Calculate the difference between the two means
– Normalize it by the standard error
Which standard error to use becomes an issue, since there are two samples.

Two-sample t-test – (2)
If the two sample variances are roughly the same, pool the two sample variances and then estimate the SE:
– Take the weighted average of the two variances, each weighted by its degrees of freedom (n - 1)
– df = n_a + n_b - 2
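As a rough illustration, here is a minimal R sketch of the pooled estimate; the data vectors x and y are hypothetical, not from the course materials:

# Hypothetical data for two groups
x <- rnorm(20, 100, 15)
y <- rnorm(25, 110, 15)
n_a <- length(x)
n_b <- length(y)

# Pooled variance: weighted average of the two sample variances,
# each weighted by its degrees of freedom (n - 1)
var_pooled <- ((n_a - 1) * var(x) + (n_b - 1) * var(y)) / (n_a + n_b - 2)

# Standard error of the difference between the two means
se <- sqrt(var_pooled * (1 / n_a + 1 / n_b))

t_score <- (mean(x) - mean(y)) / se
df <- n_a + n_b - 2
2 * pt(-abs(t_score), df)   # two-tailed p-value

The result should match t.test(x, y, var.equal = TRUE).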

Two-sample t-test – (3)
If the two sample variances are not the same, the SE is estimated from each sample's variance separately (Welch's method; see the sketch below).
The degrees of freedom are calculated differently depending on the sample size:
– If both samples consist of more than 30 data points, we can use the normal distribution as the distribution of t-scores
– If not, estimate the degrees of freedom with the Welch-Satterthwaite formula (also in the sketch below)
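A minimal R sketch of the unequal-variance (Welch) estimates, again with hypothetical vectors x and y; the comments spell out the SE and the Welch-Satterthwaite degrees of freedom:

x <- rnorm(15, 500, 40)
y <- rnorm(40, 550, 90)
a <- var(x) / length(x)
b <- var(y) / length(y)

# SE = sqrt(s_x^2 / n_x + s_y^2 / n_y)
se <- sqrt(a + b)
t_score <- (mean(x) - mean(y)) / se

# Welch-Satterthwaite df = (a + b)^2 / (a^2 / (n_x - 1) + b^2 / (n_y - 1))
df <- (a + b)^2 / (a^2 / (length(x) - 1) + b^2 / (length(y) - 1))
2 * pt(-abs(t_score), df)   # two-tailed p-value

This is what t.test(x, y) computes by default (var.equal = FALSE).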

Two-sample t-test in R
$ cd /home/ling115/r
$ R

Two-sample t-test in R – (2)
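A minimal sketch of the two-sample t-test call, using hypothetical F1 data (the variable names are illustrative):

# Hypothetical F1 values (Hz) for two groups of speakers
f1_a <- rnorm(20, 450, 50)
f1_b <- rnorm(20, 550, 60)

t.test(f1_a, f1_b, var.equal = TRUE)   # pooled-variance t-test
t.test(f1_a, f1_b)                     # Welch t-test (default)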

Paired t-test
Some data make more sense when paired:
– F1 of a set of vowels from males and females (paired by vowel)
– Difference in frequency of the same set of words between two corpora (paired by word)
Pairing controls for the variation due to the factor by which observations are paired (e.g. which vowel it is).
Procedure:
– Calculate the difference in score for each pair
– Run a one-sample t-test to see if the mean difference is different from zero

Paired t-test in R
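A minimal sketch with hypothetical word-frequency data paired by word:

# Hypothetical log frequencies of the same 30 words in two corpora
freq_corpus1 <- rnorm(30, 5, 1)
freq_corpus2 <- freq_corpus1 + rnorm(30, 0.3, 0.5)

t.test(freq_corpus1, freq_corpus2, paired = TRUE)

# Equivalently, a one-sample t-test on the pairwise differences against 0
t.test(freq_corpus1 - freq_corpus2, mu = 0)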

Parametric vs. nonparametric
Parametric test (of statistical significance):
– Assumes a normal distribution
– Data are measured on an interval scale
– Makes use of parameters such as the mean and variance
Nonparametric test:
– Does not assume a normal distribution
– Knowledge of parameters is not necessary
– e.g. the Wilcoxon test instead of the t-test
Use the Shapiro-Wilk test to check whether the normality assumption holds.
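A sketch of the corresponding R calls, with hypothetical data:

x <- rnorm(25)   # roughly normal data
y <- rexp(25)    # clearly non-normal data

shapiro.test(y)                    # Shapiro-Wilk test for normality
wilcox.test(x, y)                  # Wilcoxon rank-sum test instead of the two-sample t-test
wilcox.test(x, y, paired = TRUE)   # Wilcoxon signed-rank test instead of the paired t-test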

ANOVA
Data points are grouped by a factor with more than two levels, e.g.:
– F1 of a set of vowels produced by speakers from five different dialect groups
– Difference in frequency of the same set of words among ten corpora
The goal of Analysis of Variance is to check whether the differences among the means of the different groups are greater than the differences among the observations in the data set as a whole.

Variance among groups
– Calculate the mean for each group
– Calculate the overall mean of the data pooled from all groups
– Calculate the squared deviation of each group mean from the overall mean
– Multiply each squared deviation by the number of data points in that group, so that the amount reflects the size of each group
– Add up the values (SS_group)
– Normalize SS_group by the degrees of freedom: df = number of groups minus one
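These steps can be written out directly in R; a sketch with hypothetical F1 data grouped by dialect:

# Hypothetical F1 data for three dialect groups
f1      <- c(rnorm(10, 500, 40), rnorm(10, 530, 40), rnorm(10, 560, 40))
dialect <- factor(rep(c("north", "central", "south"), each = 10))

group_means  <- tapply(f1, dialect, mean)    # mean for each group
group_sizes  <- tapply(f1, dialect, length)  # data points per group
overall_mean <- mean(f1)                     # overall mean of the pooled data

# Squared deviation of each group mean from the overall mean,
# weighted by group size and summed over groups
ss_group  <- sum(group_sizes * (group_means - overall_mean)^2)
df_group  <- nlevels(dialect) - 1            # number of groups minus one
var_group <- ss_group / df_group             # "among groups" variance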

Variance within the entire data
At first glance, this might simply be the sample variance of the whole data set.
But that variance includes the variance due to group differences.
So we want the variance of the whole with the variance due to group differences removed.

Variance within the entire data – (2)
Sum of squares of error (SS_error):
– Method 1: calculate the sum of squared deviations over all data points (SS_total), then subtract the sum of squared deviations due to group differences (SS_group), i.e. SS_error = SS_total - SS_group
– Method 2: calculate the sum of squared deviations within each group, then add them up
Normalize SS_error by the degrees of freedom: df = number of data points minus the number of groups.
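Continuing the sketch above (reusing f1, dialect, ss_group and overall_mean), both methods give the same SS_error:

# Method 1: total sum of squares minus the group sum of squares
ss_total <- sum((f1 - overall_mean)^2)
ss_error <- ss_total - ss_group

# Method 2: sum of squared deviations within each group, added up
ss_error2 <- sum(tapply(f1, dialect, function(g) sum((g - mean(g))^2)))

df_error  <- length(f1) - nlevels(dialect)   # data points minus number of groups
var_error <- ss_error / df_error             # "within" (error) variance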

ANOVA and F-ratio
If the variance among groups is different from the variance within the entire data, we conclude that the group means are different.
F = VAR_group / VAR_error
– F = 1 if the two variances are exactly the same
– The farther F is from 1, the less likely it is that the two variances are the same
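Putting the two variance estimates from the sketches above together, and checking the result against R's built-in functions:

f_ratio <- var_group / var_error
f_ratio

# Probability of an F-ratio this large or larger under the null hypothesis
pf(f_ratio, df_group, df_error, lower.tail = FALSE)

# The same analysis with built-in functions
summary(aov(f1 ~ dialect))
oneway.test(f1 ~ dialect, var.equal = TRUE)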

F-distribution
The probability distribution of a ratio of variances:
– Note that each variance has its own degrees of freedom
– F = 1 if the two variances are the same
– The farther F is from 1, the less likely it is that the two variances are the same
The F-distribution is sensitive to whether the population distribution is normal.
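A sketch of working with the F-distribution directly in R (the degrees of freedom here are just for illustration):

curve(df(x, df1 = 2, df2 = 27), from = 0, to = 6)   # plot the F density
pf(3.5, df1 = 2, df2 = 27, lower.tail = FALSE)      # P(F >= 3.5)
qf(0.95, df1 = 2, df2 = 27)                         # 95th-percentile critical value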

F-distribution graph (from wikimedia)

Comparison of variances in R
var.test(x, y)
We can also calculate the F-ratio directly as var(x) / var(y).
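A short usage example with hypothetical data:

x <- rnorm(30, 0, 1)
y <- rnorm(30, 0, 2)

var.test(x, y)     # F test of the ratio of the two variances
var(x) / var(y)    # the same F-ratio computed by hand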

Two factors
The ANOVA we have discussed so far assumed a single factor dividing the data points into multiple groups.
There may be more than one factor, e.g.:
– Number of adjectives in each sentence
– Length of each sentence

Interaction
With two factors, assuming both are meaningful, there are three ways the factors can affect the value of an observation:
– Factor 1
– Factor 2
– The interaction of factor 1 and factor 2
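A sketch of a two-factor ANOVA with an interaction term in R; the data and factor names are hypothetical, not from the course materials:

# Hypothetical data: a measurement crossed by two two-level factors
y       <- rnorm(80, 600, 80)
factor1 <- factor(rep(c("high", "low"), each = 40))
factor2 <- factor(rep(c("short", "long"), times = 40))

# y ~ factor1 * factor2 is shorthand for factor1 + factor2 + factor1:factor2
# (both main effects plus their interaction)
summary(aov(y ~ factor1 * factor2))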

Repeated measures
The ANOVA discussed so far assumes the data points are independent of each other.
As with the paired t-test, some data make more sense when observations are matched.
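A sketch of one common way to fit a repeated-measures (within-subject) ANOVA in R, with hypothetical data:

# Hypothetical data: each of 10 subjects measured in all 3 conditions
subject   <- factor(rep(1:10, times = 3))
condition <- factor(rep(c("a", "b", "c"), each = 10))
score     <- rnorm(30, 50, 10) + rep(rnorm(10, 0, 5), times = 3)  # add subject-level variation

# Error(subject/condition) tells aov that condition varies within subjects
summary(aov(score ~ condition + Error(subject / condition)))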