Midterm
T/F
(a) False: step function
(b) False: $n F_n(x) \sim \mathrm{Bin}(n, F(x))$, so, inverting and estimating the standard error, we see that a factor of $n^{-1/2}$ is missing
(c) False: we would change n (by deleting the ties)
(d) True: the averages cannot get outside the range
(e) True: it looks at the sign of the pairwise slopes

The effect of a sleep treatment
The average amount of sleep over two weeks was recorded for a control group (m=15) and a treatment group (n=20). The treatment was advice on how to get more sleep.

A shift plot

A two-sample test of equal location
$X_1,\dots,X_n$ and $Y_1,\dots,Y_m$ are iid samples from two distributions, F and G.
Let $r_i$ be the rank of $X_i$ in the combined sample, and $W = \sum_i r_i$. W is called the Wilcoxon two-sample statistic.
An equivalent statistic, due to Mann and Whitney, counts the number U of pairs with $X_i > Y_j$.
[Photos: Henry Mann, Ransom Whitney]

Sleep treatment data
[Data table with columns Treatment and Control]
Sum of treatment ranks: 324
U = 324 - 20*21/2 = 114
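The relation between the rank sum W and the Mann-Whitney count U used for the sleep data can be checked numerically. The following is a minimal Python sketch (the slides themselves use R); the two small samples are hypothetical, not the sleep data, and the helper names are made up for illustration. It assumes there are no ties.

```python
def wilcoxon_rank_sum(x, y):
    """Return (W, U) for sample x against sample y, assuming no ties."""
    combined = sorted(x + y)
    rank = {v: i + 1 for i, v in enumerate(combined)}  # ranks 1..n+m
    n = len(x)
    w = sum(rank[v] for v in x)      # Wilcoxon: sum of the ranks of the X's
    u = w - n * (n + 1) // 2         # Mann-Whitney statistic via the identity
    return w, u


def mann_whitney_count(x, y):
    """Count the pairs (i, j) with x[i] > y[j] directly."""
    return sum(1 for xi in x for yj in y if xi > yj)


# Hypothetical data, chosen only to illustrate the identity.
x = [7.4, 6.1, 8.3, 7.9]
y = [6.8, 5.9, 7.1]
w, u = wilcoxon_rank_sum(x, y)
print(w, u, mann_whitney_count(x, y))
```

Both routes give the same U, which is why the two statistics lead to the same test.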

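Test procedure
Reject for large or small values of $U = W - n(n+1)/2$.
The distributions of U and W are symmetric about their midpoints.
To see what U looks like, consider the case n=1. Under $H_0$ the m+1 variables $X_1, Y_1,\dots,Y_m$ are iid, so $X_1$ is equally likely to fall in any of the m+1 gaps left by the ordered $Y_j$. Thus $\#\{j : X_1 - Y_j > 0\}$ is equally likely to be 0,...,m, a distribution symmetric around m/2.
In general U is the sum of n such counts, each Unif{0,...,m} under $H_0$ (identically distributed but not independent, since they share the same $Y_j$), so E(U)=nm/2. Symmetry about nm/2 holds because reversing the order of the combined sample maps U to nm - U without changing its null distribution.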
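The symmetry and the mean nm/2 can be verified by brute force for small samples. This is a Python sketch under the assumption of no ties: under the null hypothesis every choice of n ranks for the X's is equally likely, so the exact distribution of U follows by enumeration. The function name is hypothetical.

```python
from collections import Counter
from fractions import Fraction
from itertools import combinations


def exact_null_u(n, m):
    """Exact null pmf of U for n X's and m Y's (no ties), by enumeration."""
    counts = Counter()
    for x_ranks in combinations(range(1, n + m + 1), n):
        counts[sum(x_ranks) - n * (n + 1) // 2] += 1
    total = sum(counts.values())
    return {u: Fraction(c, total) for u, c in sorted(counts.items())}


n, m = 3, 4
pmf = exact_null_u(n, m)
mean = sum(u * p for u, p in pmf.items())
var = sum((u - mean) ** 2 * p for u, p in pmf.items())
symmetric = all(p == pmf.get(n * m - u, 0) for u, p in pmf.items())
print(mean, var, symmetric)

# The n = 1 case from the slide: uniform on 0, ..., m.
print(exact_null_u(1, 4))
```

Enumeration is only practical for small n and m, since the number of rank subsets grows quickly.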

Null distribution
For small values of n,m use the exact distribution (dwilcox(x,n,m) in R).
For larger values (n,m≥30) a normal approximation works well, using the variance Var(U)=mn(m+n+1)/12.
For dealing with a null hypothesis of a shift θ, we just subtract θ from each $Y_j$.
Confidence band: go in an equal number from each side among the ordered $X_i - Y_j$.
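As a rough illustration of the normal approximation, the sketch below plugs in the sleep-data value U = 114 with n = 20 and m = 15. These sample sizes are below the n,m≥30 guideline, so this is a ballpark check and not the exact P-value reported on the slides. It is written in Python, uses no continuity correction, and the function name is hypothetical.

```python
import math


def wmw_normal_pvalue(u, n, m):
    """Two-sided normal-approximation P-value for the Mann-Whitney U."""
    mean = n * m / 2
    sd = math.sqrt(n * m * (n + m + 1) / 12)
    z = (u - mean) / sd
    # erfc(|z| / sqrt(2)) equals 2 * (1 - Phi(|z|)); no continuity correction.
    p = math.erfc(abs(z) / math.sqrt(2))
    return z, p


z, p = wmw_normal_pvalue(114, 20, 15)
print(round(z, 3), round(p, 3))
```

Here E(U) = 150 and Var(U) = 900, so z = -1.2 and the two-sided P-value is about 0.23.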

Estimate
Possible confidence levels for m=15, n=20 are computed by, e.g., 1-2*pwilcox(70:120,15,20):
99%: 73 in from either side
95%: 90 in
90%: 100 in
The Hodges-Lehmann estimator corresponding to WMW is the median of the mn differences, here (difference in medians is )
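The estimator and the "count in from either side" interval can be sketched in a few lines of Python. The data are hypothetical and the function names are made up. The count k would come from the null distribution of U (for example via pwilcox in R); the exact off-by-one convention for k is an assumption here.

```python
from statistics import median


def pairwise_differences(x, y):
    """All n*m differences x_i - y_j, sorted in increasing order."""
    return sorted(xi - yj for xi in x for yj in y)


def hodges_lehmann(x, y):
    """Median of the pairwise differences."""
    return median(pairwise_differences(x, y))


def shift_interval(x, y, k):
    """The k-th smallest and k-th largest difference (k counted from 1)."""
    d = pairwise_differences(x, y)
    return d[k - 1], d[-k]


# Hypothetical data, not the sleep measurements.
x = [1.0, 3.0, 5.0]
y = [0.0, 2.0]
print(hodges_lehmann(x, y), shift_interval(x, y, 2))
```

For the sleep data the same functions would be applied to the 20 x 15 = 300 differences, with k taken from the list above.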

Sleep data, cont. P-value = % CI (-0.96,0.25)

Null hypothesis
The null distribution actually requires P(X>Y-θ)=1/2. That follows if Y-X has a symmetric distribution about θ. If G(y)=F(y-θ) this is true, and in that case we are just comparing medians.
The WMW test does not work well when G and F have different shapes (in particular, different spreads).

Dealing with ties
For any rank-based method, ties can be dealt with by replacing the tied values by their average rank, the midrank. This affects the variance.
For the Wilcoxon test there is an R function called wilcox.exact in the library exactRankTests, or you can use wilcox_test in the package coin.
Note that since all we need is ranks, the WMW test can be used for ordinal data.
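Midranks are simple to compute by hand or in code. Below is a minimal Python sketch with a hypothetical helper name: each group of tied values receives the average of the ranks the group occupies.

```python
def midranks(values):
    """Ranks 1..N with tied values replaced by their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of the ranks i+1, ..., j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


print(midranks([10, 20, 20, 30]))
print(midranks([3, 1, 3, 2]))
```

The rank sum W is then computed from these midranks exactly as before; only the null variance needs adjusting.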

Comparison with t-test
The WMW test is equivalent to the two-sample t-test with equal variances applied to the ranks instead of the data.
This approach is particularly helpful if there are outliers in the data.

How about the sign test?
For the sleep treatment data, the overall median is 7.05. Assuming that the two samples have the same median, we can set down a 2x2 table:
Sample      <7.05   >7.05   Total
Treatment      11       8      19
Control         6       9      15
Total          17      17      34
Why aren’t there 20 treated values? What (row and column) totals are fixed?

Fisher’s exact test
Consider a table
$$\begin{array}{cc|c} n_{11} & n_{12} & n_{1\cdot} \\ n_{21} & n_{22} & n_{2\cdot} \\ \hline n_{\cdot 1} & n_{\cdot 2} & n \end{array}$$
Think of column 1 as success (in our example obs < 7.05), column 2 as failure, while the rows are different groups (in our case treatment and control).
Since all row and column sums are given, only one observation matters, say $N_{11} = n_{11}$. What is the distribution of $N_{11}$?
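One way to write down the answer, assuming both margins are fixed and the row grouping has no effect: choosing which observations fall in column 1 is then like drawing without replacement, so $N_{11}$ is hypergeometric. The dot-subscript notation for the row and column totals is an assumption of this sketch.

```latex
P(N_{11} = x)
  = \frac{\binom{n_{1\cdot}}{x}\,\binom{n_{2\cdot}}{\,n_{\cdot 1} - x\,}}
         {\binom{n}{n_{\cdot 1}}},
\qquad
\max(0,\; n_{\cdot 1} - n_{2\cdot}) \;\le\; x \;\le\; \min(n_{1\cdot},\; n_{\cdot 1}).
```

For the sleep table this gives row totals 19 and 15, first-column total 17, and n = 34.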

Odds and odds ratio
In a 2x2 table, a “natural” parameter is the odds ratio: the odds of success in the first row divided by the odds of success in the second row.
If the treatment has no effect, the odds ratio is 1. The further the odds ratio is from 1, the stronger the effect of the treatment.

Estimating the odds ratio
CI? Figure out possible values x of $n_{11}$ from the hypergeometric distribution, write

Fisher’s test revisited
P-value: 2 P(X ≥ 11) = 0.49.
To get a confidence interval, use x=7,8,...,12, so the odds ratio CI is between 0.29 and 3.43 (the R function uses a different calculation).
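The numbers on this slide can be reproduced from the sleep table, assuming the margins are 19 (treatment), 15 (control) and 17 (observations below 7.05). The Python sketch below doubles the upper hypergeometric tail for the P-value and evaluates the table odds ratio at the endpoints x = 7 and x = 12. The function names are hypothetical, and this is not the calculation R's fisher.test performs.

```python
from math import comb

row1, row2, col1 = 19, 15, 17  # treatment total, control total, count < 7.05


def hypergeom_pmf(x):
    """P(N11 = x) with all margins fixed."""
    return comb(row1, x) * comb(row2, col1 - x) / comb(row1 + row2, col1)


def table_odds_ratio(x):
    """Odds ratio n11*n22 / (n12*n21) of the table determined by n11 = x."""
    n12 = row1 - x
    n21 = col1 - x
    n22 = row2 - n21
    return (x * n22) / (n12 * n21)


upper_tail = sum(hypergeom_pmf(x) for x in range(11, min(row1, col1) + 1))
p_value = 2 * upper_tail
or_low, or_high = table_odds_ratio(7), table_odds_ratio(12)
print(round(p_value, 2), round(or_low, 2), round(or_high, 2))
print(table_odds_ratio(11))  # odds ratio of the observed table
```

This agrees with the P-value of 0.49 and the interval from 0.29 to 3.43. The observed table itself has odds ratio 11*9/(8*6), roughly 2.06.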

Assumptions
[Table comparing two tests, Fisher’s exact test of median equality and the WMW test, on two assumptions: iid observations; distribution of X-Y is symmetric]