Comparing Two Samples: Part I

Decision flowchart for choosing a test:
Measurements (data) → Descriptive statistics → Normality check: frequency histogram (skewness & kurtosis), probability plot, K-S test.
–If normality fails (NO): transform the data, or report median, range, Q1 and Q3 and use non-parametric tests (2 samples: Mann-Whitney; 2 paired samples: Wilcoxon; >2 samples: Kruskal-Wallis, Scheirer-Ray-Hare).
–If normality holds (YES): report mean, SD, SEM and 95% confidence interval, check the homogeneity of variance, and if the variances are homogeneous use parametric tests (Student's t test for 2 samples; ANOVA for ≥2 samples; post hoc tests for multiple comparisons of means).

Procedures for comparing two samples:
–Test the normality of the data; if passed, then
–Compare the measures both of location (central value) and of dispersion (spread or range) of the thickness indices between the two sites.
–If there is a difference only in the measure of location, a parametric statistical test based upon the difference between the two sample means can be applied.

One-sample t test: t = (X̄ − μ)/S_X̄, where S_X̄ = standard error of the mean = SD/√n.
The 2-tailed t test for a significant difference between the mean longevity of horses and a hypothesized population mean of μ = 22 yr.
Ho: μ = 22 years. Age at death (in years) of 25 horses, with SD = 4.25:
t = (X̄ − 22)/(4.25/√25); df = n − 1 = 25 − 1 = 24; critical t0.05(2),24 = 2.064.
The calculated t exceeds the critical value, so reject Ho (0.01 < p < 0.02).
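The same test can be reproduced in a few lines of code. Below is a minimal sketch using scipy.stats.ttest_1samp; the ages array is hypothetical (the 25 slide values are not reproduced here), and only the hypothesized mean of 22 years and SD of about 4.25 come from the slide.

```python
import numpy as np
from scipy import stats

# Hypothetical ages at death (years) of 25 horses; the real slide data are not shown.
rng = np.random.default_rng(0)
ages = rng.normal(loc=23.5, scale=4.25, size=25)

mu0 = 22.0                                        # hypothesized population mean (years)
t_stat, p_two_tailed = stats.ttest_1samp(ages, popmean=mu0)

# Equivalent manual computation: t = (X-bar - mu0) / (SD / sqrt(n))
xbar, sd, n = ages.mean(), ages.std(ddof=1), ages.size
t_manual = (xbar - mu0) / (sd / np.sqrt(n))

print(f"t = {t_stat:.3f} (manual {t_manual:.3f}), p = {p_two_tailed:.4f}, df = {n - 1}")
```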

Confidence interval for a mean. A researcher often needs to know how close a sample mean (X̄) is to the population mean (μ), for example for the amount of dissolved/dispersed petroleum hydrocarbons (DDPH) in the upper ocean. It is impossible to sample the entire ocean, so the population mean at any time is unknown. If the population data are normally distributed, the sample mean will be normally distributed. The standard deviation (or standard error) of the mean is obtained by S_X̄ = S.E.M. = S/√n. In general, the population mean lies within one standard error of the sample mean with about 68% confidence. It is usual to quote 95% confidence intervals by multiplying the S.E.M. by (1) the appropriate value of z, namely 1.96 (for n > 30), or (2) the appropriate value of t with df = n − 1 (for n < 30).

Confidence interval for a mean, e.g. 50 measurements of DDPH in the upper ocean were made, with sample mean = 4.75 ppb and S = 3.99 ppb. Then S_X̄ = 3.99/√50 = 0.5643, and the 95% confidence interval is X̄ ± z0.05 S_X̄ = 4.75 ± (1.96)(0.5643) = 4.75 ± 1.11 ppb, i.e. 3.64 to 5.86 ppb.
E.g. only 10 measurements of DDPH in the upper ocean were made, with sample mean = 78.2 ppb and S = 38.6 ppb. Then S_X̄ = 38.6/√10 = 12.206, and the 95% confidence interval is X̄ ± t0.05,n−1 S_X̄ = 78.2 ± (2.262)(12.206) = 78.2 ± 27.6 ppb, i.e. 50.6 to 105.8 ppb.
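As a sketch, the two confidence intervals above can be reproduced with scipy.stats, using the z multiplier for the larger sample and the t multiplier for the smaller one; the function name ci_95 is just illustrative.

```python
import math
from scipy import stats

def ci_95(mean, s, n):
    """95% confidence interval for a mean: z multiplier for n > 30, t (df = n - 1) otherwise."""
    sem = s / math.sqrt(n)
    if n > 30:
        mult = stats.norm.ppf(0.975)         # approximately 1.96
    else:
        mult = stats.t.ppf(0.975, df=n - 1)  # e.g. 2.262 for df = 9
    return mean - mult * sem, mean + mult * sem

print(ci_95(4.75, 3.99, 50))   # DDPH example with 50 measurements
print(ci_95(78.2, 38.6, 10))   # DDPH example with 10 measurements
```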

Two-tailed and one-tailed tests. If we do not know which of the means is greater than the other, we reject extreme values in either direction (i.e. HA: μa ≠ μb); this procedure is known as a two-tailed test. However, if sufficient information is available for us to specify the direction of the difference between the population means, we can frame the alternative hypothesis as HA: μa > μb or μa < μb. For example, consider a comparison of total organic carbon (TOC) concentrations in a sewage effluent before and after the works is upgraded to advanced treatment. In this case a reduction in TOC would be expected after improvement of the sewage treatment, and therefore HA: μbefore > μafter. The corresponding critical value of z for 1-tailed tests is slightly different from that for 2-tailed tests (see Table B3).
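A small sketch of how the one-tailed and two-tailed critical values of z differ at α = 0.05 (the equivalent of looking them up in Table B3), using scipy.stats.norm:

```python
from scipy import stats

alpha = 0.05
z_two_tailed = stats.norm.ppf(1 - alpha / 2)   # about 1.96: reject extreme values in either tail
z_one_tailed = stats.norm.ppf(1 - alpha)       # about 1.645: reject in one specified tail only
print(round(z_two_tailed, 3), round(z_one_tailed, 3))
```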

The difference between two sample means with limited data. If n < 30, the above method gives an unreliable estimate of z. This problem was solved by 'Student', who introduced the t test early in the 20th century. The test is similar to the z test, but instead of referring to z, a value of t is required (Table B3), with df = 2n − 2 when n1 = n2. For all degrees of freedom below infinity, the t distribution appears leptokurtic compared with the normal distribution, and this property becomes extreme at small degrees of freedom.

[Figures: means of samples A and B plotted in measured units with error bars = ±2 SD, illustrating the two-sample t test.]

Importance of equal variance. If the measure of dispersion (i.e. the homogeneity of variance) also differs significantly between the samples:
–use a data transformation procedure: if there is no significant difference between the variances of the transformed data, a parametric test can be applied; otherwise consider
–a non-parametric test or Welch's approximate t' (Zar); see the sketch below.
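Welch's approximate t' is available in scipy as ttest_ind with equal_var=False. A minimal sketch with hypothetical, unequal-variance samples (no data are given on this slide):

```python
import numpy as np
from scipy import stats

# Hypothetical samples with clearly unequal variances.
rng = np.random.default_rng(1)
a = rng.normal(loc=10.0, scale=1.0, size=20)
b = rng.normal(loc=11.0, scale=3.0, size=20)

# equal_var=False gives Welch's approximate t', which does not assume equal variances.
t_welch, p_welch = stats.ttest_ind(a, b, equal_var=False)
print(t_welch, p_welch)
```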

Assumptions for z and t tests. For the t and z tests to be valid:
–the measurements should be at least approximately normally distributed, and
–the sample variances should be similar (i.e. homoscedastic; homogeneity of variance).
Check the normality of both datasets. Homoscedasticity can be checked by an F test:
F ratio = largest sample variance / smallest sample variance.

Check for homogeneity of variance. Ho: equal variances between the two samples; HA: unequal variances. Given S²a (n = 15) and S²b (n = 12), the F ratio is the larger variance divided by the smaller. Using the F table (Table B4, Zar 99, App. 34), the critical F0.05(2),14,11 = 3.36; the calculated F ratio is smaller than this, so accept Ho. If Ho is rejected, transform the data and redo the F test; if it still fails, use a non-parametric test or Welch's approximate t'.
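A sketch of the variance ratio (F) test under the same convention, with the larger variance in the numerator. The variance values below are hypothetical, since the slide's values are not reproduced, but the critical value should agree with the tabulated F0.05(2),14,11 = 3.36.

```python
from scipy import stats

def variance_ratio_test(s2_a, n_a, s2_b, n_b, alpha=0.05):
    """Two-tailed F test for homogeneity of two sample variances."""
    # Put the larger variance in the numerator, so the upper alpha/2 quantile is the critical value.
    if s2_a >= s2_b:
        f_ratio, df_num, df_den = s2_a / s2_b, n_a - 1, n_b - 1
    else:
        f_ratio, df_num, df_den = s2_b / s2_a, n_b - 1, n_a - 1
    f_crit = stats.f.ppf(1 - alpha / 2, df_num, df_den)
    return f_ratio, f_crit, f_ratio > f_crit   # True means reject Ho (unequal variances)

# Hypothetical variances for samples of n = 15 and n = 12.
print(variance_ratio_test(s2_a=2.5, n_a=15, s2_b=1.1, n_b=12))
```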

An example. Birds' eggshells are thought to be influenced by acid rain, which reduces the egg thickness index (eggshell mass/surface area; mg cm⁻²). We investigate gulls' eggs at two nesting sites, a control site and a site affected by acid rain, taking a single egg at random from a number of nests at each site and determining the egg thickness index.

Two-sample z test. Ho: there is no significant difference in the thickness indices between the two sites, i.e. the thickness indices belong to the same population (or probability density function). Then the population means for the two sites will be the same and their difference will be zero. Based on the central limit theorem, a collection of sample means will follow a normal distribution, and it can be shown that the distribution of the difference of two means (X̄a − X̄b) will also be normal. Recall the distribution of z: if (X̄a − X̄b) can be divided by an appropriate standard deviation, a value of z can be calculated:

Two-sample z test. The standard deviation of (X̄a − X̄b) = √[(1/n)(S²a + S²b)]. Then z = (X̄a − X̄b)/√[(1/n)(S²a + S²b)]. This equation can be used provided the number of measurements in each sample is reasonably large (n > 30), the measurements are approximately normally distributed, and n is the same in each sample. Ho: μa = μb; HA: μa ≠ μb. If the sample means are similar, the z value will be small and Ho will be accepted. The critical values of z can be obtained from the t table, Table B3 (df = ∞).
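The slide's formula translates directly into code. In the sketch below the summary statistics are hypothetical placeholders, not the egg thickness results.

```python
import math
from scipy import stats

def two_sample_z(mean_a, var_a, mean_b, var_b, n):
    """Two-sample z test for equal sample sizes, following the formula on this slide."""
    z = (mean_a - mean_b) / math.sqrt((1.0 / n) * (var_a + var_b))
    p_two_tailed = 2 * stats.norm.sf(abs(z))   # compare |z| with 1.96 at alpha = 0.05
    return z, p_two_tailed

# Hypothetical summary statistics for two sites with n = 50 eggs each.
print(two_sample_z(mean_a=2.10, var_a=0.09, mean_b=2.05, var_b=0.10, n=50))
```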

Two-sample z test. The following data were obtained at two gull nesting sites: 50 eggs were taken at random from each site, with 1 egg taken randomly from each nest, and the egg thickness index (mg cm⁻²) determined, giving a mean and S for site a (control) and site b, each with n = 50.
Ho: μa = μb; HA: μa ≠ μb.
z = (X̄a − X̄b)/√[(1/50)(S²a + S²b)].
As the calculated z is smaller than the critical z (α = 0.05, df = ∞) = 1.96, accept Ho.
Remember to always check the homogeneity of variance before running the test.

Exercise. The following data were obtained at two gull nesting sites: 50 eggs were taken at random from each site, with 1 egg taken randomly from each nest, and the egg thickness index (mg cm⁻²) determined for site a (control) and site b, each with n = 50.
Test Ho: μa = μb (HA: μa ≠ μb) using the two-sample z test: z = (X̄a − X̄b)/√[(1/n)(S²a + S²b)].
Please do it later! Remember to always check the homogeneity of variance before running the test.

Example: a t test with equal numbers of measurements in each sample (na = nb). The chemical oxygen demand (COD) is measured at two industrial effluent outfalls, a and b, as part of a consent procedure. Test the null hypothesis Ho: μa = μb against HA: μa ≠ μb (given that the data are normally distributed). Here n = 12 at each outfall, with X̄a = 3.701, S²a = 0.257 and X̄b = 3.406, S²b = 0.366.
sp² = (SS1 + SS2)/(ν1 + ν2) = [(0.257 × 11) + (0.366 × 11)]/(11 + 11) = 0.312, where SS = sum of squares = S² × ν.
s(X̄1 − X̄2) = √(sp²/n1 + sp²/n2) = √[(0.312/12) × 2] = 0.228.
t = (X̄1 − X̄2)/s(X̄1 − X̄2) = (3.701 − 3.406)/0.228 = 1.294; df = 2n − 2 = 22.
t(0.05, df = 22, 2-tailed) = 2.074 > |t observed| = 1.294, so p > 0.05. The calculated t value is less than the critical t value; thus, accept Ho.
Remember to always check the homogeneity of variance before running the t test.
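The pooled-variance calculation above can be checked from the summary statistics alone with scipy.stats.ttest_ind_from_stats (which expects standard deviations, not variances); a minimal sketch:

```python
import math
from scipy import stats

# Summary statistics from the COD example (n = 12 at each outfall).
mean_a, var_a, n_a = 3.701, 0.257, 12
mean_b, var_b, n_b = 3.406, 0.366, 12

t_stat, p_two_tailed = stats.ttest_ind_from_stats(
    mean_a, math.sqrt(var_a), n_a,
    mean_b, math.sqrt(var_b), n_b,
    equal_var=True,                  # pooled-variance Student's t test
)
print(t_stat, p_two_tailed)          # t should come out close to 1.294, with p > 0.05
```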

Example. Growth of 30-week-old non-transgenic and transgenic tilapia was determined by measuring body mass (wet weight). Since transgenic fish carrying the growth hormone (GH) related gene construct OPAFPcsGH are known to grow faster in other fish species (Rahman et al. 2001), it is hypothesized that HA: μtransgenic > μnon-transgenic, while the null hypothesis is Ho: μtransgenic ≤ μnon-transgenic.

Example. Ho: μtransgenic ≤ μnon-transgenic; HA: μtransgenic > μnon-transgenic. Given that the body masses (g) of tilapia are normally distributed, with n = 8 in each group:
sp² = (SS1 + SS2)/(ν1 + ν2); s(X̄1 − X̄2) = √(sp²/n1 + sp²/n2); t = (X̄1 − X̄2)/s(X̄1 − X̄2) = 8.29; df = 2n − 2 = 14.
t(0.05, df = 14, 1-tailed) = 1.761 << 8.29, so p is well below 0.05. The calculated t is far greater than the critical t value; thus, reject Ho.
Remember to always check the homogeneity of variance before running the t test.

Example. The data are human blood-clotting times (in minutes) of individuals given one of two different drugs. It is hypothesized that Ho: μa = μb, while HA: μa ≠ μb (given that the data are normally distributed). Here na = 6 and nb = 7.
sp² = (SS1 + SS2)/(ν1 + ν2); s(X̄1 − X̄2) = √(sp²/n1 + sp²/n2); t = (X̄1 − X̄2)/s(X̄1 − X̄2) = 2.470.
t(0.05, df = 6 + 7 − 2 = 11, 2-tailed) = 2.201 < 2.470, so 0.02 < p < 0.05. Thus, reject Ho and accept HA.

Power of the two-sample t test. The power of the two-sample t test is greatest when the number of measurements in each sample is the same (n1 = n2); the power for different numbers of measurements in each sample (n1 ≠ n2) is smaller. When n1 ≠ n2, the effective n = 2n1n2/(n1 + n2), e.g. with n1 = 6 and n2 = 7, effective n = 2(6 × 7)/(6 + 7) = 6.46, which is smaller than the average of 6 and 7 (6.5). Therefore, we should always use a 'balanced design' where possible.
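The effective-n formula is a one-liner; the sketch below simply reproduces the 6-and-7 example from the slide.

```python
def effective_n(n1, n2):
    """Effective sample size per group when the two-sample design is unbalanced."""
    return 2 * n1 * n2 / (n1 + n2)

print(effective_n(6, 7))   # 6.46..., slightly below the average of 6.5
```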

Decision flowchart (repeated): Measurements (data) → normality check (frequency histogram with skewness & kurtosis, probability plot, K-S test). If normality fails, transform the data or use non-parametric tests and report median, range, Q1 and Q3; if normality holds, report mean, SD, SEM and 95% confidence interval, check the homogeneity of variance, and apply parametric tests (Student's t tests for 2 samples; ANOVA for ≥2 samples; post hoc tests for multiple comparisons of means).
For comparison of two independent samples: F-test (homogeneity of variance), z test, t tests, or the Mann-Whitney test.

Non-parametric methods for comparison of two independent samples. Why do we need to transform the data towards a normal distribution for parametric tests when we could simply apply non-parametric tests? Because parametric tests are more powerful and more sensitive than non-parametric ones.

Non-parametric tests for two samples: the Mann-Whitney test. Non-parametric tests are also called 'distribution-free' tests because they are independent of the underlying population distribution; the only assumptions are independence of the observations and continuity of the variable being measured. Most of these tests involve a ranking procedure. The Mann-Whitney test is useful for two independent samples and can be used as an alternative to the t test, particularly where the assumptions of the t test cannot be demonstrated. It can also be applied to ordinal data. It is usually assumed that the distributions of the measurements in the two samples are of the same general form (shown by a frequency histogram or stem-and-leaf plot). The test can be undertaken with any sample size, but the method of calculating the test statistic (U) depends upon the size of n.

[Figure: Mann-Whitney test, basic principle. Four cases comparing measurements in Group A and Group B, illustrating unequal medians.]

Non-parametric tests for two samples: the Mann-Whitney test. In the Mann-Whitney test, the null hypothesis cannot be stated in terms of population parameters; it is defined instead as the equality of the medians of the populations from which the two samples are drawn.
Example: Mann-Whitney test where n is small. The measurements of Sample A (n = 4) and Sample B (n = 5) are combined and ranked in descending order: B B B A B B A A A.

Example I: Mann-Whitney test (2-tailed test). The measurements are combined and ranked in descending order: B B B A B B A A A. Ho: equal medians.
Sum of ranks for the A scores: ΣR2 = 4 + 7 + 8 + 9 = 28. Sum of ranks for the B scores: ΣR1 = 1 + 2 + 3 + 5 + 6 = 17.
U = n1n2 + [n1(n1 + 1)/2] − ΣR1, where the group containing the greatest measurement (and hence the smaller sum of ranks) is given rank 1 and provides R1; thus group B provides R1 in this case.
U = (5)(4) + [(5)(5 + 1)/2] − 17 = 18. U0.05(2),5,4 = U0.05(2),4,5 = 19 > 18; thus accept Ho, i.e. equal medians.
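In practice the test is usually run with scipy.stats.mannwhitneyu. The sketch below uses hypothetical measurements (the slide's raw values are not reproduced) chosen so their descending rank order matches the B B B A B B A A A pattern; note that scipy's U convention and tie handling can differ from the hand calculation above.

```python
from scipy import stats

# Hypothetical measurements: sample B (n = 5) and sample A (n = 4),
# chosen so that the descending rank order is B B B A B B A A A.
sample_b = [9.1, 8.7, 8.5, 7.9, 7.6]
sample_a = [8.2, 7.1, 6.8, 6.5]

# Two-tailed Mann-Whitney test of the null hypothesis of equal medians.
u_stat, p_two_tailed = stats.mannwhitneyu(sample_b, sample_a, alternative="two-sided")
print(u_stat, p_two_tailed)
```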

[Table: measurements and ranks for Sample A and Sample B.]
Ho: equal medians. ΣR1 = 17 (group B, the smaller rank sum); ΣR2 = 28.
U = n1n2 + [n1(n1 + 1)/2] − ΣR1, where the group containing the greatest measurement is given rank 1 and provides R1; thus group B provides R1 here.
U = (5)(4) + [(5)(5 + 1)/2] − 17 = 18. U0.05(2),5,4 = U0.05(2),4,5 = 19 > 18; thus accept Ho, i.e. equal medians.

[Table: measurements and ranks for Sample A and Sample B.]
ΣR1 = 16; ΣR2 = 29.
U = n1n2 + [n1(n1 + 1)/2] − ΣR1 = (5)(4) + [(5)(5 + 1)/2] − 16 = 19.
U0.05(2),5,4 = U0.05(2),4,5 = 19; thus reject Ho at p = 0.05, i.e. unequal medians.

[Table: measurements and ranks for Sample A and Sample B.]
ΣR1 = 15; ΣR2 = 30.
U = n1n2 + [n1(n1 + 1)/2] − ΣR1 = (5)(4) + [(5)(5 + 1)/2] − 15 = 20.
U0.05(2),5,4 = U0.05(2),4,5 = 19 < 20; thus reject Ho at p = 0.02, i.e. unequal medians.

Example II: Mann-Whitney test (2-tailed test). The following data give representative soil moisture contents on south-facing and north-facing slopes under grassland in June. The Mann-Whitney test is used to test the null hypothesis that the population medians of the two samples are the same. A variance ratio test will show that the measurements are heteroscedastic; also, the measurements are percentages, which would not be expected to be normally distributed.

With n = 14 on one slope (ΣR1 = 116) and n = 17 on the other (ΣR2 = 380):
U = n1n2 + [n1(n1 + 1)/2] − ΣR1 = (14)(17) + [(14)(14 + 1)/2] − 116 = 227.
U0.05(2),14,17 = 169 < 227, p < 0.001; thus reject Ho. The medians of the two samples are significantly different.

Non-parametric tests for two samples: the Mann-Whitney test for n > 20. Critical tables for U become unwieldy as n increases above 20, and Mann & Whitney obtained z as a function of U and n as shown below:
z = (U − n1n2/2)/√[n1n2(n1 + n2 + 1)/12].
Suppose that the test value of U is found to be 148, with n1 = 16 and n2 = 29. Then z = [148 − (16 × 29)/2]/√[(16)(29)(16 + 29 + 1)/12] = −84/42.17 = −1.99. For a 2-tailed test the modulus (1.99) is taken. Referring to Table B3 (df = infinity), at p = 0.05, z = 1.96 < 1.99; thus, reject Ho.
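A quick check of this normal approximation, using the slide's numbers (U = 148, n1 = 16, n2 = 29):

```python
import math
from scipy import stats

def mann_whitney_z(u, n1, n2):
    """Normal approximation to the Mann-Whitney U statistic for larger samples."""
    return (u - n1 * n2 / 2.0) / math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12.0)

z = mann_whitney_z(148, 16, 29)
p_two_tailed = 2 * stats.norm.sf(abs(z))
print(z, p_two_tailed)   # |z| is about 1.99, just beyond the 1.96 cut-off at alpha = 0.05
```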

Important Notes
–Comparisons between two samples can be made with reference to the difference between the sample means or medians.
–Comparisons between sample means are made by calculating their difference and dividing by an appropriate sample standard deviation.
–Where the sample size is large (e.g. n > 50), a two-sample z test can be used; for small samples, a Student's t test can be used.
–The parametric t and z tests should be applied to independent interval/ratio measurements which are at least approximately normally distributed.
–A simple test of homoscedasticity (the F ratio test) should be applied prior to application of the t and z tests.
–For non-normally distributed or heteroscedastic data, the non-parametric Mann-Whitney test can be utilized.