Ttests Programming in R

The first part of these notes will address ttesting basics. The second part of these notes will address z test (or proportion testing) basics.

Ttests in R
The term “Ttest” comes from the application of the t-distribution to evaluate a hypothesis. The t-distribution is used when the population standard deviation is unknown and the sample size is too small (conventionally less than 30) for the sample standard deviation s to stand in reliably for it in the standard error s/SQRT(n). In practice, even hypothesis tests with sample sizes greater than 30, which can rely on the normal distribution, are commonly referred to as “ttests”. Note: a “t-statistic” and a “z-score” are conceptually similar: both convert measurements into standardized scores which follow a roughly normal distribution.
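To see why a sample size near 30 is a common rule of thumb, one can compare two-sided 5% critical values of the t-distribution with the normal value. This is a minimal sketch using base R's `qt` and `qnorm`; the degrees of freedom listed are chosen only for illustration.

```r
# Two-sided 5% critical values: t-distribution vs. standard normal
df     <- c(5, 10, 29, 100)       # illustrative degrees of freedom (n - 1)
t_crit <- qt(0.975, df = df)      # t critical value for each df
z_crit <- qnorm(0.975)            # normal critical value, about 1.96

# The t critical values shrink toward the normal value as df grows
print(round(data.frame(df = df, t_crit = t_crit, z_crit = z_crit), 3))
```

With few degrees of freedom the t cutoff is noticeably larger than the normal cutoff, so small samples need stronger evidence to reject the null hypothesis.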

A side note of interest from Wikipedia: The t-statistic was introduced in 1908 by William Sealy Gosset, a chemist working for the Guinness Brewery in Dublin, Ireland. Gosset had been hired due to Claude Guinness's innovative policy of recruiting the best graduates from Oxford and Cambridge to apply biochemistry and statistics to Guinness' industrial processes. Gosset devised the t-test as a way to cheaply monitor the quality of beer. He published the test in Biometrika in 1908, but was forced to use a pen name by his employer, who regarded the fact that they were using statistics as a trade secret.

Ttests take three forms:
1. One Sample Ttest - compares the mean of the sample to a given number.
e.g. Is the average monthly revenue per customer who switches greater than $50?
Formal Hypothesis Statement examples:
H0: μ ≤ $50, H1: μ > $50
H0: μ = $50, H1: μ ≠ $50

Example: After a massive outbreak of salmonella, the CDC determined that the source was a particular manufacturer of ice cream. The CDC sampled 9 production runs of the manufacturer, with the following results (all in MPN/g): Use this data to determine if the average level of salmonella is greater than 0.3 MPN/g, which is considered to be dangerous.

First, identify the hypothesis statements, including the Type I and Type II errors, and your assignment of alpha. Then, do the computation by hand…
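The by-hand computation can be sketched in R as follows. The nine values below are hypothetical and are NOT the CDC measurements from the example; the null value 0.3 and the one-sided alternative (mean greater than 0.3) come from the example's wording.

```r
# Hypothetical measurements in MPN/g (NOT the CDC data from the example)
x   <- c(0.31, 0.42, 0.28, 0.55, 0.47, 0.39, 0.61, 0.33, 0.44)
mu0 <- 0.3                        # null value from the hypothesis statement

n      <- length(x)
se     <- sd(x) / sqrt(n)         # estimated standard error of the mean
t_stat <- (mean(x) - mu0) / se    # t statistic
p_val  <- pt(t_stat, df = n - 1, lower.tail = FALSE)  # one-sided: H1 is mu > mu0

print(c(t = t_stat, df = n - 1, p.value = p_val))
```

The p-value is then compared with the alpha chosen before looking at the data.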

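#here, the syntax is:
t.test(vector to be analyzed, second vector to be analyzed*, alternative = alternative hypothesis)
*The second vector is only needed for a two sample or paired ttest; add paired = TRUE for a paired ttest. With a single vector, the one sample ttest is the default, and the null value is supplied with mu = (default 0).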
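A minimal sketch of the one sample call, assuming a hypothetical vector `x` (illustrative values, not data from these notes) and the null value 0.3 from the salmonella example:

```r
# Hypothetical measurements (illustrative only)
x <- c(0.31, 0.42, 0.28, 0.55, 0.47, 0.39, 0.61, 0.33, 0.44)

# Only one vector is supplied, so this is a one sample ttest
# H0: mu <= 0.3 vs H1: mu > 0.3
result <- t.test(x, mu = 0.3, alternative = "greater")
print(result)

result$statistic   # t statistic
result$p.value     # one-sided p-value
result$conf.int    # one-sided 95% confidence bound for the mean
```

The returned object is a list, so individual pieces such as the p-value can be pulled out by name for reporting.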

2. Two Sample Ttest - compares the mean of the first sample minus the mean of the second sample to a given number.
e.g. Is there a difference in the production output of two facilities?
Formal Hypothesis Statement examples:
H0: μa - μb = 0, H1: μa - μb ≠ 0

When dealing with two sample or paired ttests, it is important to check the following assumptions:
1. The samples are independent (for a paired ttest, the pairs are independent of each other)
2. The samples have approximately equal variance
3. The distribution of each sample is approximately normal
Note: if the assumptions are violated and/or if the sample sizes are very small, we first try a transformation (e.g., take the log or the square root). If this does not work, then we engage in non-parametric analysis: the Wilcoxon Rank Sum test (two independent samples) or the Wilcoxon Signed Rank test (paired samples).
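One possible way to check these assumptions in base R is sketched below. The data are simulated for illustration only, and the choice of `shapiro.test` and `var.test` as the checks is an assumption; the notes do not name specific diagnostic functions.

```r
set.seed(1)  # simulated data, for illustration only
a <- rnorm(15, mean = 100, sd = 10)
b <- rnorm(15, mean = 105, sd = 10)

shapiro.test(a)    # normality check, sample a
shapiro.test(b)    # normality check, sample b
var.test(a, b)     # F test for equal variances

# If the assumptions look doubtful, try a transformation first
t.test(log(a), log(b))

# Non-parametric fallbacks named in the notes
wilcox.test(a, b)                  # Wilcoxon Rank Sum (two independent samples)
wilcox.test(a, b, paired = TRUE)   # Wilcoxon Signed Rank (paired samples)
```

Small p-values from the normality or variance checks suggest the corresponding assumption may not hold.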

# here the syntax is:
t.test(vector to be tested ~ two-level factor, data = data, var.equal = FALSE*)
# boxplot of the two groups for a visual comparison
boxplot(vector to be tested ~ two-level factor, data = data)
*If the variances are similar, this would be set to TRUE.
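A minimal sketch of the formula interface, assuming a hypothetical data frame `output` with a numeric column `units` and a two-level factor `facility`. The names and the simulated values are illustrative only.

```r
set.seed(2)  # simulated production output, illustrative only
output <- data.frame(
  units    = c(rnorm(12, mean = 500, sd = 20), rnorm(12, mean = 520, sd = 20)),
  facility = factor(rep(c("A", "B"), each = 12))
)

# H0: mu_A - mu_B = 0 vs H1: mu_A - mu_B != 0
# var.equal = FALSE requests the Welch version of the test
res <- t.test(units ~ facility, data = output, var.equal = FALSE)
print(res)

res$estimate   # the two group means
res$p.value    # two-sided p-value
```

Setting `var.equal = TRUE` would instead pool the two sample variances, which is appropriate only when they are similar.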

3. Paired Sample Ttest - compares the mean of the differences in the observations to a given number.
e.g. Is there a difference in the production output of a facility after the implementation of new procedures?
Formal Hypothesis Statement example:
H0: μdiff = 0, H1: μdiff ≠ 0

#here, the syntax is:
t.test(vector to be analyzed, vector to be analyzed, paired = TRUE, alternative = "greater"*)
*The alternative hypothesis could also be "less". The default is "two.sided" (not equal).
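A minimal sketch of a paired ttest, using hypothetical before and after measurements for six production lines (values invented for illustration). It also shows that the paired test is the same as a one sample ttest on the differences.

```r
# Hypothetical output of six lines before and after a procedure change
before <- c(102, 98, 110, 105, 95, 101)
after  <- c(108, 101, 112, 109, 99, 104)

# H0: mean difference (after - before) <= 0 vs H1: mean difference > 0
res <- t.test(after, before, paired = TRUE, alternative = "greater")
print(res)

# Equivalent one sample ttest on the differences
res_diff <- t.test(after - before, mu = 0, alternative = "greater")
print(res_diff)
```

The order of the two vectors matters for a one-sided alternative, because the differences are computed as first minus second.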

Z testing…or proportion based testing…

Proportion tests in R
The testing formula for a one sample proportion is a simple z calculation:
Z = (sample estimate - null value) / null standard error
For a proportion, this would be:
Z = (p - p0) / SQRT(p0(1 - p0)/n)
where p is the sample proportion, p0 is the null value and n is the sample size.

Example of a one sample proportion test: If 30% of cars on a street are found to be speeding, the city will install “traffic calming” devices. John used his radar gun to measure the speeds of 400 cars on his street. He found that 32% were speeding. Will John get “traffic calming” devices on his street?
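The z calculation for this example can be sketched as below. The one-sided alternative (proportion greater than 30%) is an assumption about how the question is meant to be read, and the count of 128 speeding cars is simply 32% of 400.

```r
p_hat <- 0.32   # sample proportion speeding
p0    <- 0.30   # null value
n     <- 400    # cars measured

z     <- (p_hat - p0) / sqrt(p0 * (1 - p0) / n)   # z statistic
p_val <- pnorm(z, lower.tail = FALSE)             # one-sided p-value, H1: p > p0
print(round(c(z = z, p.value = p_val), 3))        # z is about 0.873, p-value about 0.191

# Same test with prop.test: 128 of 400 cars (32%) were speeding
res <- prop.test(128, 400, p = 0.30, alternative = "greater", correct = FALSE)
res$p.value
```

`prop.test` reports a chi-squared statistic, which here is the square of the z statistic, so the two approaches give the same p-value when `correct = FALSE`.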

# here, the syntax is:
object1 <- table(factor)   # table of counts for each level of the factor
sum(object1)               # total n
prop.test(object1[factor level], total n, correct = FALSE, p = null hypothesis)
Example:
loveatfirst.count <- table(PSU$atfirst)
prop.test(loveatfirst.count[3], 227, correct = FALSE, p = 0.45)
Note that the “3” indicates the third level of the factor, which is “Yes”.
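The same pattern can be tried without the PSU data set. In this sketch the factor `atfirst` and its counts are hypothetical stand-ins for `PSU$atfirst`; only the total of 227 and the null value 0.45 are taken from the example above.

```r
# Hypothetical survey responses standing in for PSU$atfirst
atfirst <- factor(c(rep("No", 110), rep("Unsure", 20), rep("Yes", 97)),
                  levels = c("No", "Unsure", "Yes"))

counts <- table(atfirst)   # counts per factor level
total  <- sum(counts)      # total n (227 here)

# Selecting the level by name avoids depending on the level order
res <- prop.test(counts[["Yes"]], total, p = 0.45, correct = FALSE)
print(res)
```

Indexing by name (`"Yes"`) rather than by position (`3`) is a design choice made here so the code keeps working if the factor levels are reordered.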

Answer the following:
1. Identify the Null and Alternative Hypotheses
2. Identify the Type I and Type II errors, including the implications
3. What is an appropriate alpha value?
4. What is the associated p-value?
5. What is your conclusion?

2. Two Sample Test - compares the proportion of the first sample minus the proportion of the second sample to a given number. It is of common interest to test if two population proportions are equal.
e.g. Is there a difference in the percentage of students who pass a standardized test between those who took a prep course and those who did not?
Formal Hypothesis Statement examples:
H0: pa - pb = 0, H1: pa - pb ≠ 0
H0: pa - pb ≤ 0, H1: pa - pb > 0

Before you undertake a two sample test, there are a few things to be determined:
1. The two samples must be independent
2. The number of individuals with each trait of interest and the number without the trait of interest must be at least 10 in each sample.

#here, the code is pretty easy…just make the 2x2 table and then apply the prop.test function:
FactorVar1.by.FactorVar2 <- table(FactorVar1, FactorVar2)
prop.test(FactorVar1.by.FactorVar2, correct = FALSE)
Example:
# recode WtFeel into "Right" vs "Wrong"; anything else becomes ""
PSU$Wt <- ifelse(PSU$WtFeel == "RightWt", "Right",
          ifelse(PSU$WtFeel == "OverWt" | PSU$WtFeel == "UnderWt", "Wrong", ""))
# keep only the rows classified as Right or Wrong
PSU <- PSU[which(PSU$Wt != ""), ]
sex.by.wt <- table(PSU$Sex, PSU$Wt)
prop.test(sex.by.wt, correct = FALSE)
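A self-contained sketch of the two sample workflow, using the prep course question from earlier in these notes. All counts are hypothetical, and the variable names `prep` and `outcome` are invented for illustration.

```r
# Hypothetical data: prep course attendance vs. passing a standardized test
prep    <- factor(c(rep("Prep", 60), rep("NoPrep", 80)),
                  levels = c("Prep", "NoPrep"))
outcome <- factor(c(rep("Pass", 45), rep("Fail", 15),    # Prep group
                    rep("Pass", 48), rep("Fail", 32)),   # NoPrep group
                  levels = c("Pass", "Fail"))

# Rows are the two samples; the first column is treated as the "success" count
prep.by.outcome <- table(prep, outcome)
print(prep.by.outcome)

# Condition check: at least 10 with and 10 without the trait in each sample
all(prep.by.outcome >= 10)

# H0: p_prep - p_noprep = 0 vs H1: p_prep - p_noprep != 0
res <- prop.test(prep.by.outcome, correct = FALSE)
print(res)
```

Setting the factor levels explicitly controls which row and column come first, which determines the direction of the difference and which outcome counts as a success.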

Answer the following:
1. Identify the Null and Alternative Hypotheses
2. Identify the Type I and Type II errors, including the implications
3. What is an appropriate alpha value?
4. Using the formula on page 529, determine the test statistic. What is the associated p-value?
5. What is your conclusion?