Contingency (frequency) tables


Contingency (frequency) tables: dependence of two qualitative variables

Examples of problems: Does the survival of a person sent to a cholera-affected area depend on whether that person was vaccinated against cholera? Is there any association between hair colour and sex? Are parasite species distributed independently of one another?

Contingency table

Examples: dependence of survival on vaccination; mutual dependence of two species.

A table describes the relationship between two categorical variables in three situations: (1) one of the variables is manipulated; (2) one of the variables is probably a cause and the other a consequence (response), but the study is based on non-manipulative observations; (3) the possible causality is unclear.

Basic rules from the theory of probability: The probability of the joint occurrence of two independent events is $P_{i,j} = P_i \cdot P_j$. Example: half of the members of a population are male ($P_{male} = 0.5$) and a tenth of all individuals are albino ($P_{albino} = 0.1$). If albinos are equally common in both sexes (i.e. albinism and sex are independent events), then the probability that a randomly chosen individual is an albino male is $P_{male} \cdot P_{albino} = 0.5 \cdot 0.1 = 0.05$.

Basic rules from the theory of probability: The expected number of successes $E(a)$ in $n$ experiments, where the probability of a success is $P_a$, is $E(a) = P_a \cdot n$. Example: the probability that a mutation occurs is 0.02, so among 100 randomly chosen individuals we expect 2 individuals with this mutation.
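The two basic rules can be checked with a few lines of Python. This is only a minimal sketch using the numbers from the slides; the variable names are illustrative.

```python
# Rule 1: the joint probability of two independent events is the product
# of their individual probabilities.
p_male = 0.5       # from the slide
p_albino = 0.1     # from the slide
p_albino_male = p_male * p_albino

# Rule 2: the expected number of successes in n trials is P * n.
p_mutation = 0.02  # from the slide
n_individuals = 100
expected_mutants = p_mutation * n_individuals

print(f"P(albino male) = {p_albino_male:.2f}")
print(f"Expected individuals with the mutation = {expected_mutants:.0f}")
```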

How do we compute χ²? How do we obtain the expected values? H0 says that the events are independent, so we use the probability of the joint occurrence of two independent events.

Calculation of expected values using the marginal sums: $P_{i\cdot} = R_i / n$, $P_{\cdot j} = C_j / n$, $P_{ij} = P_{i\cdot} \, P_{\cdot j}$, so $E(f_{ij}) = P_{ij} \cdot n = (R_i / n) \cdot (C_j / n) \cdot n = R_i \, C_j / n$.
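The calculation from marginal sums can be sketched as follows. The statistic is assumed to be the usual Pearson form, the sum of (observed - expected)² / expected over all cells, because the slide's own formula did not survive in the transcript. The counts in `observed` are invented for illustration only.

```python
def expected_counts(table):
    """Expected cell counts under independence: E_ij = R_i * C_j / n."""
    row_sums = [sum(row) for row in table]
    col_sums = [sum(col) for col in zip(*table)]
    n = sum(row_sums)
    return [[r * c / n for c in col_sums] for r in row_sums]


def pearson_chi2(table):
    """Pearson chi-square: sum over cells of (observed - expected)^2 / expected."""
    expected = expected_counts(table)
    return sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(table, expected)
        for o, e in zip(obs_row, exp_row)
    )


# Hypothetical 2 x 2 table of counts (not taken from the slides).
observed = [[30, 10],
            [20, 40]]

expected = expected_counts(observed)
chi2 = pearson_chi2(observed)
print(expected)
print(round(chi2, 3))
```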

How many cell values do I need to know in order to know the result of the complete experiment, given fixed marginal frequencies? $df = (c - 1) \cdot (r - 1)$, where $r$ is the number of rows and $c$ is the number of columns.

Critical value at the 5% level of significance for df = 3.

What we usually write in a paper: this area (the tail of the distribution beyond the observed value) is 0.029, so we write χ² = 8.99, df = 3, P = 0.029.
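The reported tail area can be checked without statistical tables. The sketch below uses a closed form for the upper tail of the χ² distribution that is valid only for df = 3. The function name is illustrative, and 7.815 is the standard tabulated 5% critical value for df = 3, not a number from the slides.

```python
import math


def chi2_sf_df3(x):
    """Upper-tail probability P(X > x) for a chi-square variable with 3 df.

    Closed form for df = 3 only:
    erfc(sqrt(x / 2)) + sqrt(2 * x / pi) * exp(-x / 2).
    """
    return math.erfc(math.sqrt(x / 2)) + math.sqrt(2 * x / math.pi) * math.exp(-x / 2)


p_value = chi2_sf_df3(8.99)            # statistic reported on the slide
tail_at_critical = chi2_sf_df3(7.815)  # should be close to 0.05

print(f"chi2 = 8.99, df = 3, P = {p_value:.3f}")
print(f"tail area beyond 7.815: {tail_at_critical:.3f}")
```

For other degrees of freedom a library routine would be the practical choice; the closed form here only keeps the example dependency-free.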

Yates’ correction is sometimes used here as well (when expected frequencies are extremely low): it gives better protection against Type I error, but the test is weaker.
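A minimal sketch of the correction for a 2 x 2 table, assuming the common textbook form in which 0.5 is subtracted from each absolute deviation |observed - expected| before squaring. The slide does not give the formula, and the counts are invented for illustration.

```python
def yates_chi2(table):
    """Chi-square for a 2 x 2 table with Yates' continuity correction."""
    (a, b), (c, d) = table
    n = a + b + c + d
    row_sums = (a + b, c + d)
    col_sums = (a + c, b + d)
    chi2 = 0.0
    for i, obs_row in enumerate(table):
        for j, o in enumerate(obs_row):
            e = row_sums[i] * col_sums[j] / n       # expected count R_i * C_j / n
            chi2 += (abs(o - e) - 0.5) ** 2 / e     # corrected contribution
    return chi2


# Hypothetical counts (not from the slides).
observed = [[30, 10],
            [20, 40]]
corrected = yates_chi2(observed)
print(round(corrected, 3))
```

For this table the uncorrected statistic is 50/3, so the corrected value is smaller, which illustrates the more conservative test.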

Another test criterion, which also follows the χ² distribution: the so-called likelihood-ratio χ² (LR).
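The likelihood-ratio criterion is commonly computed as 2 · Σ observed · ln(observed / expected). The sketch below assumes that form, since the formula on the slide was not preserved, and uses invented counts.

```python
import math


def lr_chi2(table):
    """Likelihood-ratio chi-square: 2 * sum(O * ln(O / E)) over all cells."""
    row_sums = [sum(row) for row in table]
    col_sums = [sum(col) for col in zip(*table)]
    n = sum(row_sums)
    g = 0.0
    for i, row in enumerate(table):
        for j, o in enumerate(row):
            e = row_sums[i] * col_sums[j] / n
            if o > 0:  # a cell with O = 0 contributes nothing to the sum
                g += o * math.log(o / e)
    return 2 * g


# Hypothetical counts (not from the slides).
observed = [[30, 10],
            [20, 40]]
g_stat = lr_chi2(observed)
print(round(g_stat, 2))
```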

Similar results: “normal” (Pearson) χ² = 8.99.

2 × 2 tables: note that ad = bc holds for a table that corresponds exactly to the null hypothesis.
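A short derivation of this identity, assuming the usual layout with a, b in the first row and c, d in the second, row sums $R_1, R_2$, column sums $C_1, C_2$, and each cell equal to its expected value $R_i C_j / n$:

```latex
a = \frac{R_1 C_1}{n},\quad
b = \frac{R_1 C_2}{n},\quad
c = \frac{R_2 C_1}{n},\quad
d = \frac{R_2 C_2}{n}
\;\Longrightarrow\;
ad = \frac{R_1 R_2 C_1 C_2}{n^2} = bc
```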

Statistical and causal dependence: Causal dependence can be demonstrated only by a manipulative experiment. For a “correct” experiment everyone has to be vaccinated, but half of the participants get just a placebo (compare what is possible with what statistics demands).

Fundamentals of experimentation: Every treatment has to have its control. The control differs from the treatment only in the effect that I want to demonstrate (this is often very difficult). I have to have independent replications.

Advantages of experiments and of observational studies: Causality can be demonstrated by an experiment. The range of experimental manipulations is usually limited. Almost every experimental intervention has side effects, which are sometimes unpredictable.

Fisher’s exact test: How large is the probability that I get this table, or one that deviates even more, for the given marginal frequencies (provided that the null hypothesis is true; computed with the help of combinatorics)? It is used for 2 × 2 tables when the numbers of observations are low.

If I have a given 2 × 2 table, then Fisher’s test directly computes the probability of this table and of all tables that are more extreme (from the point of view of H0). The sum of all these probabilities is the achieved level of significance for a one-sided test (that is why the statistical software also prints 2*p).
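The procedure can be sketched with the hypergeometric probability of a 2 x 2 table with fixed margins. The sketch assumes that "more extreme" means a larger count in the upper-left cell; the function names and the example counts are illustrative, not taken from the slides.

```python
from math import comb


def table_probability(a, b, c, d):
    """Probability of the table [[a, b], [c, d]] given its marginal sums."""
    n = a + b + c + d
    return comb(a + b, a) * comb(c + d, c) / comb(n, a + c)


def fisher_one_sided(a, b, c, d):
    """Sum the probabilities of the observed table and of all tables with a
    larger upper-left cell and the same margins (assumed direction of 'extreme')."""
    p = 0.0
    while b >= 0 and c >= 0:
        p += table_probability(a, b, c, d)
        # Move one observation along the diagonal; the margins stay fixed.
        a, b, c, d = a + 1, b - 1, c - 1, d + 1
    return p


# Hypothetical table [[3, 1], [1, 3]].
p_one_sided = fisher_one_sided(3, 1, 1, 3)
print(round(p_one_sided, 4))          # one-sided significance level
print(round(min(1.0, 2 * p_one_sided), 4))  # the doubled value, 2*p
```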

Let us compare two tables: χ² and the power of the test grow with the number of observations, although both tables are very probably samples from the same population.
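The growth of χ² with sample size can be illustrated with two invented tables that have identical proportions but differ tenfold in the number of observations. The tables shown on the original slide were not preserved, so these counts are assumptions.

```python
def pearson_chi2(table):
    """Pearson chi-square with expected counts R_i * C_j / n."""
    row_sums = [sum(row) for row in table]
    col_sums = [sum(col) for col in zip(*table)]
    n = sum(row_sums)
    return sum(
        (o - row_sums[i] * col_sums[j] / n) ** 2 / (row_sums[i] * col_sums[j] / n)
        for i, row in enumerate(table)
        for j, o in enumerate(row)
    )


small = [[6, 4], [4, 6]]        # hypothetical, n = 20
large = [[60, 40], [40, 60]]    # same proportions, n = 200

chi2_small = pearson_chi2(small)
chi2_large = pearson_chi2(large)
print(round(chi2_small, 3), round(chi2_large, 3))
```

With df = 1 the tabulated 5% critical value is 3.841, so only the larger table gives a significant result even though the proportions are the same.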

Measures of association strength in a 2 × 2 table, independent of sample size: Y = ad/bc = $f_{11} f_{22} / (f_{21} f_{12})$; its disadvantage is that it is asymmetric: values between 0 and 1 for negative association, 1 for independence, and up to +infinity for positive association. Two further coefficients range from -1 through 0 (independence) to +1: for the first, -1 and +1 are the maximal possible association for the given values of the marginal frequencies; for the second, -1 and +1 are the maximal possible association for any values of the marginal frequencies.
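A sketch of three such measures. Only the ratio ad/bc is given explicitly in the transcript. Taking the two symmetric coefficients to be Yule's Q and the phi coefficient is an assumption, since their formulas were lost. The counts are invented.

```python
import math


def odds_ratio(a, b, c, d):
    """ad / bc: 1 under independence; undefined when b * c == 0."""
    return (a * d) / (b * c)


def yules_q(a, b, c, d):
    """Yule's Q (assumed): (ad - bc) / (ad + bc), ranges from -1 to +1."""
    return (a * d - b * c) / (a * d + b * c)


def phi_coefficient(a, b, c, d):
    """Phi (assumed): (ad - bc) / sqrt(product of the four marginal sums)."""
    return (a * d - b * c) / math.sqrt((a + b) * (c + d) * (a + c) * (b + d))


# Hypothetical tables with the same proportions but different sample sizes.
for a, b, c, d in [(6, 4, 4, 6), (60, 40, 40, 60)]:
    print(odds_ratio(a, b, c, d),
          round(yules_q(a, b, c, d), 4),
          round(phi_coefficient(a, b, c, d), 4))
```

Both tables give the same three values, which illustrates that these measures do not depend on sample size.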

Multidimensional frequency tables: for example, Species A (present/absent) by Species B (present/absent) by years. Nowadays generalized linear models are used in these cases.