Lecture 11: Parametric hypothesis testing

The logic behind a statistical test. A statistical test is the comparison of the probabilities in favour of a hypothesis H1 with the respective probabilities of an appropriate null hypothesis H0. Accepting the wrong hypothesis H1 is termed a type I error; rejecting the correct hypothesis H1 is termed a type II error. The power of a test is its probability of avoiding a type II error.

Testing simple hypotheses. Karl Pearson tossed a coin 24,000 times to see whether, in the real world, deviations from the expected numbers of heads and tails occur. He obtained heads 12,012 times. Does this result deviate from our expectation? We can answer with the exact solution of the binomial or with the normal approximation.
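Pearson's coin experiment above can be checked with the normal approximation to the binomial. A minimal sketch, using Pearson's published counts (24,000 tosses, 12,012 heads):

```python
from math import erfc, sqrt

# Pearson's classic experiment: 24,000 tosses, 12,012 heads, p(head) = 0.5
n, k, p = 24_000, 12_012, 0.5

mu = n * p                      # expected number of heads: 12,000
sigma = sqrt(n * p * (1 - p))   # standard deviation of the binomial

# Normal approximation with continuity correction
z = (abs(k - mu) - 0.5) / sigma
p_two_sided = erfc(z / sqrt(2))  # two-sided tail probability, 2 * (1 - Phi(z))
```

The deviation of 12 heads is a tiny fraction of a standard deviation, so the two-sided probability is close to 1 and the result is well within chance expectation.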

The χ² test. Consider a sum of squared Z-transformed (standardized) variables, χ² = ΣZᵢ². Each variance is one, so the expected value of χ² is n. The χ² distribution is a family of distributions of such sums of variances, depending on the number of elements n. Observed values of χ² can be compared to predicted ones and allow for statistical hypothesis testing. Applied to Pearson's coin example, this yields the probability of H0.
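The χ² statistic for the coin example can be computed directly. For 1 degree of freedom the χ² tail probability has a closed form, so no table is needed:

```python
from math import erfc, sqrt

observed = [12_012, 11_988]   # heads, tails (Pearson's counts)
expected = [12_000, 12_000]

chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# For 1 degree of freedom: P(X > x) = erfc(sqrt(x / 2))
p_h0 = erfc(sqrt(chi2 / 2))
```

χ² comes out at 0.024, far below the 5% critical value of 3.84, so H0 (a fair coin) stands.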

Mendel's dihybrid cross predicts a 9:3:3:1 ratio:
- 9 times green, yellow seed
- 3 times green, green seed
- 3 times yellow, yellow seed
- 1 time yellow, green seed
Does the observation confirm the prediction? The χ² test has k − 1 degrees of freedom.
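The 9:3:3:1 goodness-of-fit test can be sketched as follows. As example data I use Mendel's published seed counts (315, 101, 108, 32 over 556 plants); any observed counts in the same category order would work the same way:

```python
# Observed counts (Mendel's published dihybrid seed data, as an example)
observed = [315, 101, 108, 32]
ratio = [9, 3, 3, 1]            # predicted 9:3:3:1 ratio

total = sum(observed)
expected = [total * r / 16 for r in ratio]

chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# k - 1 = 3 degrees of freedom; the 5% critical value is 7.815
reject_h0 = chi2 > 7.815
```

Here χ² is about 0.47, far below 7.815, so the observation confirms the predicted ratio.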

All statistical programs report the probability level (p-value) for the null hypothesis H0.

Advice for applying a χ²-test (dealing with frequencies):
- χ²-tests compare observations and expectations. Total numbers of observations and expectations must be equal.
- The absolute values should not be too small (as a rule, the smallest expected value should be larger than 10). At small event numbers the Yates correction should be used.
- The classification of events must be unequivocal.
- χ²-tests were found to be quite robust. They are conservative and rather favour H0, the hypothesis of no deviation.
- The applicability of the χ²-test does not depend on the underlying distributions; they need not be normally or binomially distributed.

G-test or log-likelihood test. χ² relies on absolute differences between observed and expected frequencies. However, it is also possible to take the quotient L = observed/expected as a measure of goodness of fit: G = 2 Σ Oᵢ ln(Oᵢ/Eᵢ). G is approximately χ² distributed with k − 1 degrees of freedom.
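A minimal sketch of the G statistic, reusing the Mendel counts from above as assumed example data:

```python
from math import log

observed = [315, 101, 108, 32]                    # example data (Mendel)
expected = [sum(observed) * r / 16 for r in (9, 3, 3, 1)]

# G = 2 * sum of O * ln(O / E), approximately chi-square with k - 1 df
G = 2 * sum(o * log(o / e) for o, e in zip(observed, expected))
```

For well-fitting data, G and χ² give almost identical values (here about 0.48 versus 0.47); they diverge when observed and expected frequencies differ strongly.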

A species–area relation is expected to follow a power function of the form S = cA^z. Do the following data points (area, species number) confirm this expectation: A1 (1,12), A2 (2,18), A3 (4,14), A4 (8,30), A5 (16,35), A6 (32,38), A7 (64,33), A8 (128,35), A9 (256,56), A10 (512,70)? We try different tests. Both tests indicate that the regression line doesn't fit. The pattern is better seen in a double-log plot. We have seven points above and three points below the regression line. Is there a systematic error?

Tests for systematic errors: the binomial test and the χ² test.
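The "seven points above, three below" question is an exact binomial test: under H0 (no systematic error) each point lies above the line with probability 0.5. A minimal sketch:

```python
from math import comb

n, k = 10, 7          # 10 residuals, 7 above the regression line

# Exact two-sided binomial test with p = 0.5
p_upper = sum(comb(n, i) for i in range(k, n + 1)) / 2 ** n
p_two_sided = min(1.0, 2 * p_upper)
```

The two-sided probability is 0.34375, so a 7:3 split among only ten points is no evidence of a systematic error by itself; the G-test on the full frequencies is more sensitive.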

Now we try the best-fit model. The G-test identifies even the best-fit model as having larger deviations than expected from a simple normal random sample model.

The best-fit model. Observation and expectation can be compared by a Kolmogorov–Smirnov test. The test compares the maximum cumulative deviation with that expected from a normal distribution. Both results are qualitatively identical but differ quantitatively; the programs use different algorithms.
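The Kolmogorov–Smirnov statistic is just the largest gap between the empirical and the fitted cumulative distribution. A minimal sketch, with assumed example residuals (the slide's actual data are not reproduced here):

```python
from math import erf, sqrt

def normal_cdf(x, mu, sigma):
    return 0.5 * (1 + erf((x - mu) / (sigma * sqrt(2))))

# Assumed example residuals; in practice use the deviations from the model
data = sorted([-1.3, -0.8, -0.4, -0.1, 0.2, 0.5, 0.9, 1.1, 1.6, 2.3])
n = len(data)
mu = sum(data) / n
sigma = sqrt(sum((x - mu) ** 2 for x in data) / (n - 1))

# D = maximum distance between the empirical step function and the
# fitted normal cumulative distribution function
D = max(
    max((i + 1) / n - normal_cdf(x, mu, sigma),
        normal_cdf(x, mu, sigma) - i / n)
    for i, x in enumerate(data)
)
# Compare D with the critical value (about 1.36 / sqrt(n) at the 5% level
# for large n); different programs use different small-sample corrections,
# which explains the quantitative differences mentioned above.
```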

2×2 contingency table. 1000 Drosophila flies with normal and curled wings and two alleles A and B supposed to influence wing form. Do flies with allele A more often have curled wings than flies with allele B? The predicted number of flies with allele A and curled wings follows from the row and column totals. A contingency-table χ² test with n rows and m columns has (n − 1)(m − 1) degrees of freedom; the 2×2 table has 1 degree of freedom.
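A minimal sketch of the 2×2 χ² test. The slide's concrete cell counts are not given here, so the numbers below are hypothetical; the expected cell values are computed from the row and column totals exactly as described above:

```python
from math import erfc, sqrt

# Hypothetical counts for the 1000 flies:
# rows = alleles A/B, columns = curled/normal wings
table = [[300, 200],   # allele A: curled, normal
         [200, 300]]   # allele B: curled, normal

row = [sum(r) for r in table]
col = [sum(c) for c in zip(*table)]
total = sum(row)

# Expected cell value: row total * column total / grand total
chi2 = sum(
    (table[i][j] - row[i] * col[j] / total) ** 2 / (row[i] * col[j] / total)
    for i in range(2) for j in range(2)
)

# (2 - 1) * (2 - 1) = 1 degree of freedom
p_h0 = erfc(sqrt(chi2 / 2))
```

With these assumed counts χ² = 40 and H0 is rejected decisively; with more balanced counts the same code would retain it.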

Relative abundance distributions: dominant, intermediate, and rare species together form the "hollow curve"; evenness describes how equally abundances are distributed. Abundance is the total number of individuals in a population; density refers to the number of individuals per unit of measurement. Species abundances often follow a log-normal distribution.

The distribution of species abundance distributions across vertebrates and invertebrates. Three types of distributions occur: log-series, power function, and log-normal. We compare 99 such distributions from all over the world. Row and column sums are identical due to our classification, so we expect equal entries for each cell.

Do vertebrates and invertebrates differ in abundance distributions? Counting the number of log-normal best fits alone gives one answer, but if we take the whole pattern into account we get a different result.

Bivariate comparisons of means: Student's t-test for equal sample sizes and similar variances; the Welch t-test for unequal variances and sample sizes; the F-test for comparing variances.

In a physiological experiment mean metabolism rates were measured. A first treatment gave mean = 100, variance = 45; a second treatment gave mean = 120, variance = 55. In the first case 30 animals and in the second case 50 animals were tested. Do means and variances differ? The t-test has N1 + N2 − 2 degrees of freedom and yields the probability level for the null hypothesis.
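Both t statistics can be computed directly from the summary statistics in the slide, with no raw data needed. A minimal sketch:

```python
from math import sqrt

# Summary statistics from the experiment above
m1, v1, n1 = 100, 45, 30
m2, v2, n2 = 120, 55, 50

# Student's t with pooled variance (similar variances assumed)
df = n1 + n2 - 2                                   # 78 degrees of freedom
sp2 = ((n1 - 1) * v1 + (n2 - 1) * v2) / df         # pooled variance
t_pooled = (m1 - m2) / sqrt(sp2 * (1 / n1 + 1 / n2))

# Welch's t for unequal variances, with its own degrees of freedom
t_welch = (m1 - m2) / sqrt(v1 / n1 + v2 / n2)
df_welch = (v1 / n1 + v2 / n2) ** 2 / (
    (v1 / n1) ** 2 / (n1 - 1) + (v2 / n2) ** 2 / (n2 - 1)
)
```

Both statistics come out near −12 with dozens of degrees of freedom, so the difference in means is highly significant whichever variant is used.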

The comparison of variances (F-test). Degrees of freedom: N − 1 for each sample. The one-sided probability for the null hypothesis of no difference is p = 0.713: the probability that the variance of the first sample (n = 50) is larger than that of the second (n = 30) is the complementary 0.287. PAST gives the probability for a two-sided test (2 × 0.287) that one variance is either larger or smaller than the second.
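Setting up the F statistic for the metabolism data is straightforward; only the tail probability needs a table or a library. A minimal sketch:

```python
# Variances and sample sizes from the metabolism experiment above
v1, n1 = 45, 30
v2, n2 = 55, 50

# Conventionally the larger variance goes in the numerator
F = v2 / v1                        # about 1.22
df_num, df_den = n2 - 1, n1 - 1    # 49 and 29 degrees of freedom

# Compare F with tabulated critical values, or use a library for the
# p-value, e.g. scipy.stats.f.sf(F, df_num, df_den) for the one-sided
# test; double it for the two-sided test, as PAST does.
```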

Power analysis and effect size. In an experiment you estimated two means, each time from 20 replicates. Was this sample size large enough to confirm a difference between both means? We use the t-distribution with 19 degrees of freedom. You needed 15 replicates to confirm a difference at the 5% error level.

The t-test can be used to estimate the number of observations needed to detect a significant signal for a given effect size. From a physiological experiment we want to test whether a certain medicament enhances short-term memory. How many persons should we test (with and without the treatment) to confirm a difference in memory of about 5%? We don't know the variances and assume a Poisson random sample; hence σ² = μ. We don't know the degrees of freedom, so we use a large number, for which t approaches the normal value (t ≈ 1.96 at the 5% level).
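Solving the two-sample t formula for n under the Poisson assumption (σ² = μ) gives the required sample size directly. A minimal sketch; the mean memory score of 100 is an assumed example value, not given in the slide:

```python
from math import ceil

def required_n(mu, effect=0.05, t=1.96):
    """Replicates per group needed to detect a relative difference
    `effect` in a Poisson-distributed variable with mean mu (so that
    variance = mu), using the large-sample t value.

    Derivation: t = delta / sqrt(2 * mu / n)  =>  n = 2 * mu * t**2 / delta**2
    """
    delta = effect * mu              # absolute difference to detect
    return ceil(2 * mu * t ** 2 / delta ** 2)

# Assumed mean score of 100 -> about 31 persons per group
n_per_group = required_n(100)
```

Note that the answer scales inversely with the mean: the larger the Poisson mean, the smaller its relative spread, so fewer replicates are needed for the same 5% effect.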

Homework and literature. Refresh: χ² test, Mendel's rules, t-test, F-test, contingency tables, G-test. Prepare for the next lecture: coefficient of correlation, maxima and minima of functions, matrix multiplication, eigenvalues. Literature: Łomnicki, Statystyka dla biologów.