Biostatistics. 林隆慧

Chi-square (χ²) goodness of fit
Chi-square goodness of fit is widely used to infer whether the population from which a sample of nominal data came conforms to a certain theoretical distribution. For example, a plant geneticist may raise 100 progeny from a cross that is hypothesized to result in a 3:1 phenotypic ratio of pink-flowered to white-flowered plants. Perhaps a ratio of 84 pink : 16 white is observed, whereas for this total of 100 the geneticist's hypothesis would predict a ratio of 75 pink : 25 white. The question to be answered, then, is whether the observed frequencies deviate significantly from the frequencies expected if the hypothesis were true.

Chi-square (χ²) goodness of fit
The following statistic, called chi-square, is used as a measure of how far a sample distribution deviates from a theoretical distribution:

$$\chi^2 = \sum_{i=1}^{k} \frac{(O_i - E_i)^2}{E_i}$$

Here, O_i is the frequency, or number of counts, observed in class i, E_i is the frequency expected in class i if the null hypothesis is true, and the summation is performed over all k categories of data. Larger disagreement between observed and expected frequencies results in a larger χ² value. Thus, this type of calculation is referred to as a measure of goodness of fit. A calculated χ² value can be as small as zero, in the case of a perfect fit.

Chi-square goodness of fit for two categories
Calculation of chi-square goodness of fit for k = 2 (e.g., data on 100 flower colors compared with a hypothesized color ratio of 3:1).

H0: The sample data came from a population having a 3:1 ratio of pink to white flowers.
HA: The sample data came from a population not having a 3:1 flower color ratio.

Categories (flower color)   Pink    White   n
Oi                          84      16      100
(Ei)                        (75)    (25)

Degrees of freedom: ν = k − 1 = 2 − 1 = 1

$$\chi^2 = \frac{(84 - 75)^2}{75} + \frac{(16 - 25)^2}{25} = 1.080 + 3.240 = 4.320$$

0.025 < P < 0.05. Therefore, reject H0 and accept HA.
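The two-category calculation can be checked with a short Python sketch. The function names are illustrative and not from the slides; the P value uses the closed form for one degree of freedom, P = erfc(√(χ²/2)), instead of a printed table.

```python
import math

def chi_square_gof(observed, expected):
    """Chi-square goodness-of-fit statistic: sum of (O - E)^2 / E."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

def p_value_df1(chi2):
    """Upper-tail probability for a chi-square variate with nu = 1."""
    # For one degree of freedom, P(X > x) = erfc(sqrt(x / 2)).
    return math.erfc(math.sqrt(chi2 / 2.0))

observed = [84, 16]                 # pink, white
ratio = [3, 1]                      # hypothesized 3:1 ratio
n = sum(observed)
expected = [n * r / sum(ratio) for r in ratio]   # 75 and 25

chi2 = chi_square_gof(observed, expected)
p = p_value_df1(chi2)
print(f"chi-square = {chi2:.3f}, P = {p:.4f}")
```

The statistic comes out at 4.320 and P lies between 0.025 and 0.05, in agreement with the table lookup.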

Statistical errors in hypothesis testing
A probability of 5% or less is commonly used as the criterion for rejection of H0. The probability used as the criterion for rejection is termed the significance level, denoted by α, and the value of the test statistic corresponding to this probability is the critical value of the statistic. It is very important to realize that a true null hypothesis will occasionally be rejected, which of course means that we have committed an error. This error will be committed with a frequency of α. That is, with α = 0.05, if H0 is in fact a true statement about a statistical population, it will be erroneously concluded to be false 5% of the time.

Two types of statistical errors
Type I error: The rejection of a null hypothesis when it is in fact a true statement is a Type I error (also called an α error, or an error of the first kind; "rejecting the true").
Type II error: On the other hand, if H0 is in fact false, our test may occasionally not detect this fact, and we shall have reached an erroneous conclusion by not rejecting H0. This error, of not rejecting the null hypothesis when it is in fact false, is a Type II error (also called a β error, or an error of the second kind; "accepting the false").

                         If H0 is true       If H0 is false
If H0 is rejected        Type I error (α)    No error (1 − β)
If H0 is not rejected    No error (1 − α)    Type II error (β)
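To make the meaning of α concrete, the following sketch computes the exact Type I error rate of the two-category chi-square test when H0 is true. It assumes n = 100 and a true 3:1 ratio, as in the flower example, and uses 3.841, the standard tabulated critical value of χ² for α = 0.05 and one degree of freedom. The function names are illustrative.

```python
from math import comb

def chi_square_two_classes(o1, n, p1):
    """Chi-square statistic for two classes given the count in class 1."""
    e1, e2 = n * p1, n * (1 - p1)
    o2 = n - o1
    return (o1 - e1) ** 2 / e1 + (o2 - e2) ** 2 / e2

def exact_type_i_rate(n, p1, critical):
    """Probability of rejecting H0 when H0 (class-1 proportion p1) is true."""
    rate = 0.0
    for o1 in range(n + 1):
        if chi_square_two_classes(o1, n, p1) > critical:
            # Binomial probability of observing exactly o1 under H0.
            rate += comb(n, o1) * p1 ** o1 * (1 - p1) ** (n - o1)
    return rate

CRITICAL_05_DF1 = 3.841   # tabulated chi-square critical value, alpha = 0.05, nu = 1
rate = exact_type_i_rate(100, 0.75, CRITICAL_05_DF1)
print(f"Exact Type I error rate: {rate:.4f}")
```

Because counts are discrete, the realized rate is close to, but not exactly, the nominal 5%.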

Chi-square goodness of fit for more than two categories
Calculation of chi-square goodness of fit for k = 4.

H0: The sample came from a population having a 9:3:3:1 color pattern of flowers.
HA: The sample came from a population not having a 9:3:3:1 color pattern of flowers.

Categories (flower color)   Red rayed   Red margined   Blizzard   Rayed    n
Oi                          152         39             53         6        250
(Ei)                        (140.6)     (46.9)         (46.9)     (15.6)

ν = k − 1 = 4 − 1 = 3

$$\chi^2 = \frac{(152 - 140.6)^2}{140.6} + \frac{(39 - 46.9)^2}{46.9} + \frac{(53 - 46.9)^2}{46.9} + \frac{(6 - 15.6)^2}{15.6} = 8.956$$

0.025 < P < 0.05. Therefore, reject H0 and accept HA.
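The four-category example can be reproduced as follows. This is a sketch with illustrative function names. It uses unrounded expected frequencies (140.625, 46.875, 46.875, 15.625), so the statistic comes out slightly above the 8.956 obtained from expected values rounded to one decimal. The P value uses the closed form for three degrees of freedom.

```python
import math

def chi_square_from_ratio(observed, ratio):
    """Return (chi-square, expected) for observed counts and a hypothesized ratio."""
    n = sum(observed)
    total = sum(ratio)
    expected = [n * r / total for r in ratio]
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return chi2, expected

def p_value_df3(chi2):
    """Upper-tail probability for a chi-square variate with nu = 3."""
    # Closed form for nu = 3: erfc(sqrt(x/2)) + sqrt(2x/pi) * exp(-x/2).
    return (math.erfc(math.sqrt(chi2 / 2.0))
            + math.sqrt(2.0 * chi2 / math.pi) * math.exp(-chi2 / 2.0))

observed = [152, 39, 53, 6]   # red rayed, red margined, blizzard, rayed
chi2, expected = chi_square_from_ratio(observed, [9, 3, 3, 1])
p = p_value_df3(chi2)
print(f"expected = {expected}")
print(f"chi-square = {chi2:.3f}, P = {p:.4f}")
```

With unrounded expected values the statistic is about 8.97, and P still falls between 0.025 and 0.05, so the conclusion is unchanged.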

Chi-square correction for continuity
Chi-square values obtained from actual data belong to a discrete (discontinuous) distribution, whereas the theoretical χ² distribution is continuous. χ² values calculated from discrete data (particularly when ν = 1) are often overestimated, and may therefore cause us to commit a Type I error with a probability greater than the stated α. The Yates correction should routinely be used when ν = 1:

$$\chi^2_c = \sum_{i=1}^{k} \frac{\left(|O_i - E_i| - 0.5\right)^2}{E_i}$$
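As an illustration, the sketch below applies the continuity correction to the earlier two-category flower data (84 pink, 16 white, expected 75 and 25). It assumes the usual form of the Yates correction, in which 0.5 is subtracted from each absolute deviation |O − E| before squaring. The function names are illustrative.

```python
def chi_square_uncorrected(observed, expected):
    """Ordinary chi-square: sum of (O - E)^2 / E."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

def chi_square_yates(observed, expected):
    """Yates-corrected chi-square: sum of (|O - E| - 0.5)^2 / E."""
    return sum((abs(o - e) - 0.5) ** 2 / e for o, e in zip(observed, expected))

observed = [84, 16]
expected = [75.0, 25.0]

uncorrected = chi_square_uncorrected(observed, expected)
corrected = chi_square_yates(observed, expected)
print(f"uncorrected = {uncorrected:.3f}, Yates-corrected = {corrected:.3f}")
```

The correction lowers the statistic from 4.320 to about 3.853, which is only just above the tabulated 5% critical value of 3.841 for ν = 1.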

The log-likelihood ratio (G-test)
The χ² test is the traditional method for tests of goodness of fit (GOF). The G-test is an alternative to the χ² test for analyzing frequencies, and the two methods are interchangeable. The G-test is increasingly used because it is easier to calculate and because mathematicians believe it has theoretical advantages in advanced applications.

$$G = 2 \sum O \ln\left(\frac{O}{E}\right) \qquad (\ln = \text{natural logarithm})$$

The G-test statistic (G) uses the same tables as the χ² test. The G-test is based on the principle that the ratio of two probabilities can be used as a test statistic to measure the degree of agreement between sampled and expected frequencies. Williams (1976) recommends that G be used in preference to χ² whenever any |O − E| > expected frequency. The two methods often yield the same conclusions; when they do not, many statisticians prefer the G-test and therefore recommend its routine use.

G-test for more than two categories

H0: The sample came from a population having a 9:3:3:1 color pattern of flowers.
HA: The sample came from a population not having a 9:3:3:1 color pattern of flowers.

Categories (flower color)   Red rayed   Red margined   Blizzard   Rayed    n
Oi                          152         39             53         6        250
(Ei)                        (140.6)     (46.9)         (46.9)     (15.6)

ν = k − 1 = 4 − 1 = 3

$$G = 2 \sum O \ln\left(\frac{O}{E}\right) = 10.807$$

0.01 < P < 0.025. Therefore, reject H0 and accept HA.
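A minimal sketch of the G calculation for the same four-category data follows. The function name is illustrative, and unrounded expected frequencies are used, so the result differs slightly from the 10.807 obtained with expected values rounded to one decimal. The critical values 9.348 (α = 0.025) and 11.345 (α = 0.01) are the standard tabulated χ² values for ν = 3.

```python
import math

def g_statistic(observed, expected):
    """Log-likelihood ratio statistic G = 2 * sum(O * ln(O / E))."""
    # Classes with O = 0 contribute nothing to the sum and are skipped.
    return 2.0 * sum(o * math.log(o / e)
                     for o, e in zip(observed, expected) if o > 0)

observed = [152, 39, 53, 6]        # red rayed, red margined, blizzard, rayed
ratio = [9, 3, 3, 1]
n = sum(observed)
expected = [n * r / sum(ratio) for r in ratio]

g = g_statistic(observed, expected)

CRITICAL_025_DF3 = 9.348           # tabulated chi-square, alpha = 0.025, nu = 3
CRITICAL_010_DF3 = 11.345          # tabulated chi-square, alpha = 0.01, nu = 3
print(f"G = {g:.3f}")
print("0.01 < P < 0.025:", CRITICAL_025_DF3 < g < CRITICAL_010_DF3)
```

G is about 10.83 here, larger than the χ² value for the same data, and it falls between the two critical values, so H0 is rejected at the 5% level.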

Test of independence for 2×2 contingency tables

Test of independence for R×C contingency tables

Test of independence for R×C contingency tables: Fisher's Exact Test

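Paired chi-square test
Each sample is split into two equal portions, and each portion is tested with one of two detection methods. The results of the two methods (two categories of count data) are then compared to see whether they agree, or where the two methods disagree. Alternatively, the same batch of animals or plants is examined with both methods, to determine whether the results of the two methods differ fundamentally.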
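The slides give no formula or data for this design, so the following is only a sketch under stated assumptions. It assumes the paired chi-square test meant here is McNemar's test, which uses only the discordant pairs: b (positive by method A, negative by method B) and c (negative by A, positive by B), with χ² = (b − c)² / (b + c) and ν = 1. The counts below are hypothetical and chosen purely for illustration.

```python
import math

def mcnemar_chi_square(b, c, continuity=False):
    """McNemar chi-square from the two discordant-pair counts b and c."""
    diff = abs(b - c)
    if continuity:
        # Continuity-corrected form: (|b - c| - 1)^2 / (b + c).
        diff = max(diff - 1, 0)
    return diff ** 2 / (b + c)

def p_value_df1(chi2):
    """Upper-tail probability for a chi-square variate with nu = 1."""
    return math.erfc(math.sqrt(chi2 / 2.0))

# Hypothetical paired results (not from the slides):
# 12 samples positive only by method A, 5 positive only by method B.
b, c = 12, 5

chi2 = mcnemar_chi_square(b, c)
chi2_corrected = mcnemar_chi_square(b, c, continuity=True)
print(f"chi-square = {chi2:.3f}, P = {p_value_df1(chi2):.4f}")
print(f"corrected chi-square = {chi2_corrected:.3f}")
```

Pairs on which the two methods agree do not enter the statistic; only the disagreements carry information about whether the methods differ.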
