Bivariate Testing (ttests and proportion tests)

HMI 7530 - Programming in R
STATISTICS MODULE: Bivariate Testing (ttests and proportion tests)
Jennifer Lewis Priestley, Ph.D.
Kennesaw State University

STATISTICS MODULE
Basic Descriptive Statistics and Confidence Intervals
Basic Visualizations: Histograms, Pie Charts, Bar Charts, Scatterplots
Ttests: One Sample, Paired, Independent Two Sample
Proportion Testing
ANOVA
Chi Square and Odds
Regression Basics

STATISTICS MODULE
A side note of interest from Wikipedia:
The t-statistic was introduced in 1908 by William Sealy Gosset, a chemist working for the Guinness Brewery in Dublin, Ireland. Gosset had been hired due to Claude Guinness's innovative policy of recruiting the best graduates from Oxford and Cambridge to apply biochemistry and statistics to Guinness's industrial processes. Gosset devised the t-test as a way to cheaply monitor the quality of beer. He published the test in Biometrika in 1908, but was forced to use a pen name by his employer, who regarded the fact that they were using statistics as a trade secret.

STATISTICS MODULE: Bivariate Testing
Ttests take three forms:
1. One Sample Ttest - compares the mean of the sample to a given number.
e.g. Is average monthly revenue per customer > $50?
Formal Hypothesis Statement examples:
H0: μ ≤ $50   H1: μ > $50
H0: μ = $50   H1: μ ≠ $50

#here, the syntax looks like this:
One sample, two sided, confidence level at 95%, tested against a designated value:
t.test(vector, alternative = "two.sided", mu = 55, conf.level = 0.95)
One sample, one sided, confidence level at 99%, tested against a designated value:
t.test(vector, alternative = "greater", mu = 55, conf.level = 0.99)

#here, the output looks like this:
One Sample t-test
data: Activity
t = , df = 39, p-value =
alternative hypothesis: true mean is not equal to 55
95 percent confidence interval:
sample estimates:
mean of x
59.3
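The one sample syntax can be tried on simulated data. The sketch below is only an illustration: the vector `activity` and its values are hypothetical and are not the course data set. It also shows that `t.test` returns a list, so the individual pieces of the printed output can be pulled out by name.

```r
# Simulated data: 40 hypothetical activity scores (NOT the course data set)
set.seed(1)
activity <- rnorm(40, mean = 59, sd = 10)

# Two sided test of H0: mu = 55 at the 95% confidence level
two_sided <- t.test(activity, alternative = "two.sided", mu = 55, conf.level = 0.95)

# One sided test of H0: mu <= 55 against H1: mu > 55 at the 99% confidence level
one_sided <- t.test(activity, alternative = "greater", mu = 55, conf.level = 0.99)

# Pieces of the printed output, extracted by name
two_sided$statistic   # t statistic
two_sided$parameter   # degrees of freedom (n - 1)
two_sided$p.value     # p-value
two_sided$conf.int    # confidence interval for the mean
two_sided$estimate    # sample mean
```

The t statistic is the same for both calls; only the p-value and the confidence interval change with `alternative`. For `alternative = "greater"` the interval is one sided, with an upper bound of `Inf`.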

#note that you can also execute ttests by group. This is NOT a two sample test but rather two one sample tests:
t.test(Activity[Group=="NORMAL"], mu=55, alternative = "two.sided", conf.level = 0.99)
t.test(Activity[Group=="HYPER"], mu=55, alternative = "two.sided", conf.level = 0.99)

2. Paired Sample Ttest - compares the mean of the differences between paired observations to a given number.
e.g. Is there a difference in the production output of a facility after the implementation of new procedures?
Formal Hypothesis Statement example:
H0: μdiff = 0   H1: μdiff ≠ 0

#here the syntax looks like this:
t.test(vector1, vector2, paired = TRUE, conf.level = 0.90)
#and the output looks like this:
Paired t-test
data: WidgeOne$Post_Training_Productivity and WidgeOne$Pre_Training_Productivity
t = , df = 39, p-value =
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
sample estimates:
mean of the differences

Note that mathematically, the paired ttest is a one sample ttest applied to the differences between the paired observations. Therefore, we can do this:
WidgeOne$Diff <- WidgeOne$Post_Training_Productivity - WidgeOne$Pre_Training_Productivity
mean(WidgeOne$Diff)
t.test(WidgeOne$Diff, conf.level = 0.95)
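The equivalence can be checked directly. The sketch below uses simulated data; the vectors `pre` and `post` are hypothetical stand-ins for the pre-training and post-training productivity columns.

```r
# Simulated paired data: 40 hypothetical workers measured twice
set.seed(2)
pre  <- rnorm(40, mean = 100, sd = 15)       # hypothetical pre-training output
post <- pre + rnorm(40, mean = 3, sd = 5)    # hypothetical post-training output

# Paired ttest on the two vectors
paired_result <- t.test(post, pre, paired = TRUE, conf.level = 0.95)

# One sample ttest on the differences, tested against 0
diff_result <- t.test(post - pre, mu = 0, conf.level = 0.95)

# The statistic, p-value, and confidence interval all agree
all.equal(unname(paired_result$statistic), unname(diff_result$statistic))
all.equal(paired_result$p.value, diff_result$p.value)
all.equal(as.numeric(paired_result$conf.int), as.numeric(diff_result$conf.int))
```

Each `all.equal` call returns `TRUE`. The only difference between the two results is the label on the estimate.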

3. Two Sample Ttest - compares the mean of the first sample minus the mean of the second sample to a given number.
e.g. Is there a difference in the production output of two facilities?
Formal Hypothesis Statement examples:
H0: μa - μb = 0   H1: μa - μb ≠ 0
H0: μa - μb ≤ 0   H1: μa - μb > 0

When dealing with two samples, it is important to check the following assumptions:
The samples are independent
The samples have approximately equal variance
The distribution of each sample is approximately normal
Note: if the assumptions are violated and/or if the sample sizes are very small, we first try a transformation (e.g., take the log or the square root). If this does not work, then we engage in non-parametric analysis: the Wilcoxon Rank Sum test (also known as the Mann-Whitney test).
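One possible way to check these assumptions in base R is sketched below. The slides do not name specific functions for the checks, so the choice of `var.test` (F test for equal variances) and `shapiro.test` (Shapiro-Wilk normality test) is an assumption, and the data frame `dat` is simulated rather than taken from the course data.

```r
# Simulated data: two hypothetical groups of 20 observations each
set.seed(3)
dat <- data.frame(
  Activity = c(rnorm(20, mean = 55, sd = 8), rnorm(20, mean = 62, sd = 8)),
  Group    = factor(rep(c("NORMAL", "HYPER"), each = 20))
)

# Equal variance check: F test comparing the two group variances
var_check <- var.test(Activity ~ Group, data = dat)

# Normality check: Shapiro-Wilk p-value for each group
norm_check <- tapply(dat$Activity, dat$Group, function(x) shapiro.test(x)$p.value)

# If an assumption looks doubtful, try a transformation first...
log_result <- t.test(log(Activity) ~ Group, data = dat)

# ...and fall back to the Wilcoxon Rank Sum (Mann-Whitney) test if needed
rank_sum <- wilcox.test(Activity ~ Group, data = dat)

var_check$p.value
norm_check
rank_sum$p.value
```

Small p-values from `var.test` or `shapiro.test` suggest that the corresponding assumption may not hold. With small samples these tests have little power, so plots such as histograms are a useful complement.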

#here the code looks like this:
t.test(Activity ~ Group, alternative = "two.sided", conf.level = 0.90)
#and the output looks like this:
data: Activity by Group
t = , df = , p-value =
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
sample estimates:
mean in group HYPER mean in group NORMAL
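A point worth knowing when running the two sample test: by default, `t.test` uses the Welch version, which does not assume equal variances. The pooled-variance version is requested with `var.equal = TRUE`. The sketch below compares the two on simulated data; the data frame `dat` is hypothetical.

```r
# Simulated data: two hypothetical groups of 20 observations each
set.seed(4)
dat <- data.frame(
  Activity = c(rnorm(20, mean = 55, sd = 8), rnorm(20, mean = 62, sd = 8)),
  Group    = factor(rep(c("NORMAL", "HYPER"), each = 20))
)

# Default: Welch two sample ttest (no equal variance assumption)
welch <- t.test(Activity ~ Group, data = dat,
                alternative = "two.sided", conf.level = 0.90)

# Pooled variance two sample ttest (assumes equal variances)
pooled <- t.test(Activity ~ Group, data = dat,
                 alternative = "two.sided", conf.level = 0.90, var.equal = TRUE)

welch$method      # names the Welch test
welch$parameter   # Welch degrees of freedom, usually not a whole number
pooled$parameter  # pooled degrees of freedom: n1 + n2 - 2
welch$estimate    # group means, ordered by factor level (HYPER, then NORMAL)
```

The groups are ordered by factor level, so the difference being tested is the HYPER mean minus the NORMAL mean.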

Proportion testing works effectively the same way as ttesting. The main difference is that prop.test reports a Chi Square statistic rather than a t statistic: the variance of a proportion is determined by the proportion itself, so there is no separate standard deviation to estimate.
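To make the connection concrete: for a single proportion without continuity correction, the Chi Square statistic from `prop.test` equals the square of the familiar z statistic. The counts below are hypothetical and chosen only to make the arithmetic easy to follow.

```r
# Hypothetical counts: 60 successes out of 100, tested against p0 = 0.5
x  <- 60
n  <- 100
p0 <- 0.5

# z statistic computed by hand; the standard error uses the null proportion
p_hat <- x / n
z <- (p_hat - p0) / sqrt(p0 * (1 - p0) / n)

# Same test via prop.test, without continuity correction
result <- prop.test(x, n, p = p0, correct = FALSE)

z^2                        # squared z statistic
unname(result$statistic)   # X-squared reported by prop.test
all.equal(unname(result$statistic), z^2)
all.equal(result$p.value, 2 * pnorm(-abs(z)))
```

Both comparisons return `TRUE`: the two sided Chi Square test with 1 degree of freedom and the two sided z test are the same test.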

1. One Sample Proportion Test - compares the proportion of the sample to a given number.
e.g. Is the proportion of students who believe in love at first sight greater than 50%?
H0: p ≤ 0.50   H1: p > 0.50

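#The code here takes a bit of work. R is case sensitive, so the function names must be lowercase:
object1 <- table(factor)
totaln <- sum(object1)
prop.test(object1[factor level], totaln, correct = FALSE, p = null hypothesis)
Example:
loveatfirst.count <- table(PSU$atfirst)
prop.test(loveatfirst.count[3], sum(loveatfirst.count), alternative = "greater", correct = FALSE, p = 0.50)
Note that the "3" indicates the third level of the factor, which is "Yes". The alternative = "greater" argument matches H1: p > 0.50; without it, prop.test runs a two sided test.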

The output looks like this:
data: grtpers.count[2] out of sum(grtpers.count), null probability 0.5
X-squared = , df = 1, p-value =
alternative hypothesis: true p is greater than 0.5
95 percent confidence interval:
sample estimates:
p
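The table-then-test steps can be run end to end on simulated data. In the sketch below the factor `answers`, its level names, and the counts are all hypothetical. It indexes the table by level name rather than by position, which is one way to avoid picking the wrong level if the factor levels are ever reordered.

```r
# Simulated survey answers: 100 hypothetical students
answers <- factor(
  c(rep("No", 30), rep("Unsure", 10), rep("Yes", 60)),
  levels = c("No", "Unsure", "Yes")
)

# Step 1: count the responses at each level of the factor
counts <- table(answers)

# Step 2: test H0: p <= 0.50 against H1: p > 0.50 for the "Yes" level
result <- prop.test(counts["Yes"], sum(counts),
                    p = 0.50, alternative = "greater", correct = FALSE)

result$estimate   # sample proportion of "Yes"
result$p.value    # one sided p-value
result$conf.int   # one sided interval; the upper bound is 1
```

Here `counts["Yes"]` and `counts[3]` select the same cell, because "Yes" is the third level of this factor.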

2. Two Sample Proportion Test - compares the proportion of the first sample minus the proportion of the second sample to a given number. It is of common interest to test whether two population proportions are equal.
e.g. Is the proportion of students who believe in love at first sight different by gender?

#basically, you need to create a table, and then execute the prop.test function:
sex.by.grtpers.count <- table(PSU3b$Sex, droplevels(PSU3b)$grtpers)
#note that this will compare the Female % No to the Male % No
#this is because "No" is in the first column
prop.test(sex.by.grtpers.count, correct = FALSE)
#and the output looks like this:
data: sex.by.grtpers.count
X-squared = , df = 1, p-value =
alternative hypothesis: two.sided
95 percent confidence interval:
sample estimates:
prop 1 prop 2
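The first-column behavior can be seen on simulated data. In the sketch below the factors `sex` and `belief` and their counts are hypothetical. It also shows one way to compare the "Yes" proportions instead, by reordering the table columns before calling `prop.test`.

```r
# Simulated data: 50 hypothetical women and 50 hypothetical men
sex    <- factor(c(rep("Female", 50), rep("Male", 50)))
belief <- factor(c(rep("No", 20), rep("Yes", 30),    # Female answers
                   rep("No", 30), rep("Yes", 20)))   # Male answers

# Rows are the groups, columns are the outcomes ("No" comes first alphabetically)
tab <- table(sex, belief)

# prop.test treats the FIRST column as the "success" count,
# so this compares Female % No with Male % No
no_result <- prop.test(tab, correct = FALSE)
no_result$estimate    # prop 1 = Female % No, prop 2 = Male % No

# Reordering the columns compares Female % Yes with Male % Yes instead
yes_result <- prop.test(tab[, c("Yes", "No")], correct = FALSE)
yes_result$estimate   # prop 1 = Female % Yes, prop 2 = Male % Yes
```

The test statistic and p-value are the same either way; only the reported proportions and the sign of the confidence interval for the difference change.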
