Biomedical Presentation. Name: 牟汝振. Professor: 蔡章仁.


Outline
Symmetry, Skewness and Kurtosis
  a. Symmetry and Skewness
  b. Kurtosis
Resampling
  a. One sample case
  b. Two independent samples
  c. Two matched samples

Skewness and Kurtosis We consider a random variable x and a data set S = {x1, x2, …, xn} of size n which contains possible values of x. Viewing S as representing a distribution, the skewness of S is a measure of its symmetry and the kurtosis is a measure of the peakedness of the data in S.

Symmetry and Skewness We use skewness as a measure of symmetry. If the skewness of S is 0, then the distribution represented by S is perfectly symmetric. If the skewness is negative, the distribution is skewed to the left; if it is positive, the distribution is skewed to the right.

Consistent with Excel, we calculate the skewness of S as follows:

skew(S) = \frac{n}{(n-1)(n-2)} \sum_{i=1}^{n} \left( \frac{x_i - \bar{x}}{s} \right)^3

where \bar{x} is the mean and s is the standard deviation of S.

Observation: When a distribution is symmetric, the mean = median; when the distribution is positively skewed, the mean > median; and when the distribution is negatively skewed, the mean < median.

Example: Suppose S = {2, 5, -1, 3, 4, 5, 0, 2}. The skewness of S is -0.43, i.e. SKEW(R) = -0.43, where R is a range in an Excel worksheet containing the data in S. Since this value is negative, the curve representing the distribution is skewed to the left (i.e. the fatter part of the curve is on the right). Also SKEW.P(R) = -0.34, the population version of the skewness, which omits the sample-size correction.
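The following is a minimal Python sketch (not part of the original slides) that reproduces these Excel values; the helper names skew_excel and skew_population are mine, and the formulas are the ones SKEW and SKEW.P use.

import math

def skew_excel(data):
    # Bias-corrected sample skewness, matching Excel's SKEW
    n = len(data)
    mean = sum(data) / n
    s = math.sqrt(sum((x - mean) ** 2 for x in data) / (n - 1))  # sample standard deviation
    return n / ((n - 1) * (n - 2)) * sum(((x - mean) / s) ** 3 for x in data)

def skew_population(data):
    # Population skewness, matching Excel's SKEW.P
    n = len(data)
    mean = sum(data) / n
    sigma = math.sqrt(sum((x - mean) ** 2 for x in data) / n)  # population standard deviation
    return sum((x - mean) ** 3 for x in data) / n / sigma ** 3

S = [2, 5, -1, 3, 4, 5, 0, 2]
print(round(skew_excel(S), 2))       # -0.43
print(round(skew_population(S), 2))  # -0.34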

Kurtosis We use kurtosis as a measure of peakedness (or flatness): positive kurtosis indicates a relatively peaked distribution, and negative kurtosis indicates a relatively flat one. Consistent with Excel, we calculate the kurtosis of S as follows:

kurt(S) = \frac{n(n+1)}{(n-1)(n-2)(n-3)} \sum_{i=1}^{n} \left( \frac{x_i - \bar{x}}{s} \right)^4 - \frac{3(n-1)^2}{(n-2)(n-3)}

where \bar{x} is the mean and s is the standard deviation of S.

Example: Suppose S = {2, 5, -1, 3, 4, 5, 0, 2}. The kurtosis of S is -0.94, i.e. KURT(R) = -0.94, where R is a range in an Excel worksheet containing the data in S. Since this value is negative, the curve representing the distribution is relatively flat.
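A corresponding sketch for kurtosis (again not from the slides; kurt_excel is my own helper name), implementing the same bias-corrected excess-kurtosis formula that KURT uses.

def kurt_excel(data):
    # Excess kurtosis with the bias correction used by Excel's KURT
    n = len(data)
    mean = sum(data) / n
    s = (sum((x - mean) ** 2 for x in data) / (n - 1)) ** 0.5  # sample standard deviation
    term = sum(((x - mean) / s) ** 4 for x in data)
    return (n * (n + 1) / ((n - 1) * (n - 2) * (n - 3))) * term - 3 * (n - 1) ** 2 / ((n - 2) * (n - 3))

S = [2, 5, -1, 3, 4, 5, 0, 2]
print(round(kurt_excel(S), 2))  # -0.94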

Resampling Resampling procedures are based on the assumption that the underlying population distribution is the same as that of a given sample. Resampling is useful when the population distribution is unknown or other techniques are not available.

We consider two types of resampling procedures: bootstrapping, where sampling is done with replacement, and permutation tests (also known as randomization tests), where the data are rearranged without replacement, in principle over all possible permutations.
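A minimal illustration of the distinction (my own example, using NumPy's random sampling utilities): a bootstrap sample may repeat or omit values, while a permutation keeps exactly the original values in a new order.

import numpy as np

rng = np.random.default_rng(0)
data = np.array([2, 5, -1, 3, 4, 5, 0, 2])

boot = rng.choice(data, size=len(data), replace=True)  # bootstrap: sampling with replacement
perm = rng.permutation(data)                           # permutation: rearrangement, no replacement

print(boot)  # some values may repeat, others may be missing
print(perm)  # exactly the original values, shuffled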

One sample case Example 1. Calculate a 95% confidence interval around the median for the memory loss program described in Example 1 of the Sign Test, but with the data given in columns A and B of Figure 1.

Figure 1 – Resampling – One sample case

We treat the sample as the population and draw 2,000 samples of size 20 (the same size as the original sample) with replacement.

Referring to Figure 1, each element in each sample is selected using the following function: =INDEX(B4:B23,RANDBETWEEN(1,20))

We now take the median of each of the 2,000 samples (only the first 21 samples are shown in Figure 1) and plot their distribution in a histogram. The results are displayed in Figure 2.

Figure 2 – Analysis for Example 1

The value at the 2.5th percentile is 3 and the value at the 97.5th percentile is 13. Thus we can take [3, 13] as the confidence interval, which contains the sample median of 9.5.
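The same bootstrap percentile interval can be sketched in Python. The data array below is a placeholder, since the actual memory-loss scores from Figure 1 are not reproduced in this transcript; the 2.5th and 97.5th percentiles of the bootstrap medians give the interval.

import numpy as np

rng = np.random.default_rng(1)
sample = np.array([11, 9, 3, 14, 6, 8, 10, 13, 5, 9,
                   12, 7, 4, 10, 15, 9, 2, 11, 8, 13])  # placeholder data, size 20

n_boot = 2000
medians = np.array([np.median(rng.choice(sample, size=sample.size, replace=True))
                    for _ in range(n_boot)])  # median of each bootstrap sample

ci_low, ci_high = np.percentile(medians, [2.5, 97.5])
print(ci_low, ci_high)  # 95% bootstrap confidence interval for the median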

Two independent samples We now consider the case where we have two independent samples. When the data are normally distributed, we would use the t-test. We can also use the Wilcoxon Rank Sum (Mann-Whitney) non-parametric test. We now show how to address such problems using the permutation version of resampling.

Example 2. Using resampling, determine whether there is a significant difference between the median life expectancy of smokers and non-smokers using the data described in Figure 3.
Figure 3 – Data for Example 2

Note that the median score of the non-smokers is 76.5 while the median score of the smokers is 70.5, a difference of 6. The null hypothesis is that there is no difference between the two groups, i.e. H0: the median scores for the populations of smokers and non-smokers are the same.

Based on the null hypothesis, we can assume that we have a single population of 78 scores. To test the hypothesis we take 2,000 random samples of size 78 from this population without replacement and assume that for each sample the first 40 scores come from the non-smokers and the remaining 38 come from the smokers.

We use formulas of the form =INDEX(J4:CI4,1,RANK(DC6,DC6:GB6))

where the range J4:CI4 contains all 78 data elements in the “population” and DC6:GB6 contains 78 random numbers generated using RAND(). For each of the 2,000 samples we calculate the median of the non-smokers and the median of the smokers and record the difference.

Figure 4 – Resampling for two independent samples
Now we need to check whether the median difference of the original sample falls in either extreme 2.5% tail of the above data (two-tailed test). From Figure 4, we see that 1.60% of the samples have a median difference of -6 or less and 2.35% of the samples have a median difference of 6 or more, for a total of 3.95%.

This means that the probability of getting a sample in either tail based on the null hypothesis is .0395 < .05 = α, and so we reject the null hypothesis and conclude with 95% confidence that there is a significant difference between the life expectancy of smokers and non-smokers.
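A hedged Python sketch of this permutation test follows. The two group arrays are placeholders for the 40 non-smoker and 38 smoker scores, which are not listed in the transcript; the structure (pool, reshuffle, split, compare medians) is what matters.

import numpy as np

rng = np.random.default_rng(2)
non_smokers = rng.normal(76, 8, size=40).round()  # placeholder scores
smokers = rng.normal(71, 8, size=38).round()      # placeholder scores

observed = np.median(non_smokers) - np.median(smokers)
pooled = np.concatenate([non_smokers, smokers])

n_perm = 2000
diffs = np.empty(n_perm)
for i in range(n_perm):
    shuffled = rng.permutation(pooled)            # reassign group labels without replacement
    diffs[i] = np.median(shuffled[:40]) - np.median(shuffled[40:])

# two-tailed p-value: fraction of permuted differences at least as extreme as observed
p_value = np.mean(np.abs(diffs) >= abs(observed))
print(observed, p_value)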

Two matched samples We now consider the case where we have two matched samples. When the data are normally distributed, we would use the paired sample t-test. Even for non-normal data we can use the Wilcoxon Signed-Ranks non-parametric test.

Example 3: Using resampling, determine whether there is a significant difference between the right eye's and left eye's ability to recognize objects, based on paired scores for 15 subjects. The null hypothesis is that there is no difference between the right and left eye's ability to recognize objects, i.e. the median difference is zero.

If the null hypothesis is true, then each of the 15 scores for the right eye is just as likely to be larger as it is to be smaller than the corresponding score for the left eye. This is a form of sampling without replacement: the absolute values of the elements in each sample are the same as in the population; only the signs vary.

Figure 5 shows the first 16 samples (out of 2,000).
Figure 5 – Resampling for paired samples

The same sign assignment is carried out for each of the 2,000 samples. For each sample we calculate the median and create a histogram of the 2,000 median values, as in Figure 6.
Figure 6 – Analysis for Example 3

The median of the original sample (i.e. the resampling “population”) is 3. From Figure 6 we see that 10.00% of the samples have a median ≤ -3 and 12.30% have a median ≥ 3. Since 10.00% + 12.30% = 22.30% ≥ 5% = α, we cannot reject the null hypothesis, and so we conclude there is no significant difference between the right and left eyes in the population.
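A sketch of this sign-flip (paired) permutation test in Python; the 15 right-minus-left differences below are placeholders, since the actual eye-test scores are not included in the transcript.

import numpy as np

rng = np.random.default_rng(3)
diffs = np.array([3, -1, 5, 2, 0, 4, -2, 6, 3, 1, -3, 7, 2, 5, -4])  # placeholder paired differences

observed = np.median(diffs)

n_perm = 2000
medians = np.empty(n_perm)
for i in range(n_perm):
    signs = rng.choice([-1, 1], size=diffs.size)  # flip each sign at random (sampling without replacement of magnitudes)
    medians[i] = np.median(signs * np.abs(diffs))

# two-tailed: fraction of sign-flipped medians at least as extreme as the observed median
p_value = np.mean(np.abs(medians) >= abs(observed))
print(observed, p_value)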