Chi-square and F Distributions


Children of the Normal

Questions What is the chi-square distribution? How is it related to the Normal? How is the chi-square distribution related to the sampling distribution of the variance? How do we test a hypothesis about a population value of the variance, and how do we put a confidence interval around one?

Questions How is the F distribution related to the Normal? To chi-square?

Distributions There are many theoretical distributions, both continuous and discrete; Howell calls these test statistics. We use 4 test statistics a lot: z (unit normal), t, chi-square (χ²), and F. z and t are closely related to the sampling distribution of means; chi-square and F are closely related to the sampling distribution of variances.

Chi-square Distribution (1) Start with a z score, z = (x − μ)/σ, and square it to get z². Make it Greek: a single squared unit-normal deviate is distributed as χ² with 1 df. What would its sampling distribution look like? Its minimum value is zero and its maximum is infinite. Most values are between zero and 1; most are around zero.
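The slides' own code examples are in R; as a quick numerical illustration, here is a hedged sketch in Python (numpy and scipy.stats assumed available) showing that squared unit-normal deviates pile up near zero, exactly as the chi-square(1) distribution predicts:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
z2 = rng.standard_normal(100_000) ** 2   # squared unit-normal deviates

# Most of the mass sits below 1, because P(z^2 < 1) = P(-1 < z < 1) ~ .683
print((z2 < 1).mean())                   # close to 0.683
print(stats.chi2.cdf(1, df=1))           # 0.6827, the exact chi-square(1) value
```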

Chi-square (2) What if we took 2 values of z² at random and added them? Same minimum and maximum as before, but now the average should be a bit bigger. Chi-square is the distribution of a sum of squares; each squared deviation is taken from the unit normal, N(0,1). The shape of the chi-square distribution depends on the number of squared deviates that are added together.
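To make the dependence on the number of summed deviates concrete, one can simulate sums of k squared unit normals and compare their average to the chi-square mean. A sketch in Python (numpy/scipy assumed; the slides themselves use R):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
k = 5                                    # number of squared deviates summed
sums = (rng.standard_normal((50_000, k)) ** 2).sum(axis=1)

print(sums.mean())                       # close to k
print(stats.chi2.mean(df=k))             # exactly 5.0: mean of chi-square is its df
```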

Chi-square (3) The distribution of chi-square depends on 1 parameter, its degrees of freedom (df or v). As df gets large, the curve becomes less skewed and more normal.

Chi-square (4) The expected value of chi-square is df; that is, the mean of the chi-square distribution is its degrees of freedom. The expected variance of the distribution is 2df, so the standard deviation must be sqrt(2df). There are tables of chi-square so you can find the upper 5 or 1 percent of the distribution. Chi-square is additive: the sum of independent chi-squares is itself chi-square, with the summed df.
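These facts can be checked numerically; a minimal sketch using scipy.stats in Python (the slides' own examples use R's qchisq/pchisq, of which chi2.ppf/chi2.cdf are the analogues):

```python
from scipy import stats

df = 7
print(stats.chi2.mean(df))        # 7.0  -> mean equals df
print(stats.chi2.var(df))         # 14.0 -> variance equals 2*df
print(stats.chi2.ppf(0.95, df))   # 14.07, the upper 5 percent cutoff
```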

Distribution of Sample Variance The sample estimate of the population variance (unbiased) is s² = Σ(x − x̄)²/(N − 1). Multiply the variance estimate by N − 1 to get the sum of squares, then divide by the population variance to standardize. The result, (N − 1)s²/σ², is a random variable distributed as chi-square with (N − 1) df. We can use this fact about the sampling distribution of the variance estimate to find confidence intervals and conduct statistical tests.
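A simulation makes the claim concrete: computing (N − 1)s²/σ² over many samples should give a variable whose mean is the chi-square df, N − 1. A hedged Python sketch (numpy assumed; the population values are arbitrary for illustration):

```python
import numpy as np

rng = np.random.default_rng(2)
N, sigma2 = 15, 4.0
samples = rng.normal(0.0, np.sqrt(sigma2), size=(20_000, N))
s2 = samples.var(axis=1, ddof=1)      # unbiased sample variances (divide by N-1)
stat = (N - 1) * s2 / sigma2          # standardized: distributed as chi-square(N-1)

print(stat.mean())                    # close to N - 1 = 14
```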

Testing Exact Hypotheses about a Variance Test the null hypothesis that the population variance has some specific value, σ₀². Pick alpha and the rejection region. Then plug the hypothesized population variance and the sample variance into χ² = (N − 1)s²/σ₀², along with the sample size used to estimate the variance, and compare the result to the chi-square distribution with N − 1 df.

Example of Exact Test Test a claim about the variance of height of people in inches. Grab 30 people at random and measure their heights, so v = N − 1 = 29. Note: 1-tailed test on the small side, with alpha = .01. The mean of chi-square with 29 df is 29, and the computed statistic falls below it, so it is on the small side. But for Q = .99, the critical value of chi-square is 14.257, and the statistic does not fall below that: cannot reject the null. Note: with a 2-tailed test at alpha = .01, chi-square with v = 29 and Q = .995 gives 13.121, and Q = .005 gives 52.336. N.S. either way.
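The critical values quoted on the slide can be reproduced with scipy.stats (Python rather than the slides' R; chi2.ppf plays the role of qchisq):

```python
from scipy import stats

df = 29  # N = 30 heights, so N - 1 = 29 degrees of freedom

# 1-tailed test on the small side, alpha = .01 (lower 1 percent of the curve):
print(stats.chi2.ppf(0.01, df))    # 14.256..., the slide's 14.257
# 2-tailed test, alpha = .01 (.005 in each tail):
print(stats.chi2.ppf(0.005, df))   # 13.121
print(stats.chi2.ppf(0.995, df))   # 52.336
```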

Confidence Intervals for the Variance We use s² to estimate σ². It can be shown that a 95 percent confidence interval for σ² runs from (N − 1)s² divided by the upper tabled chi-square value to (N − 1)s² divided by the lower tabled value. Suppose N = 15 and s² is 10. Then df = 14; for Q = .025 the chi-square value is 26.12, and for Q = .975 the value is 5.63, so the interval is (140/26.12, 140/5.63) ≈ (5.36, 24.87).
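The same interval can be computed directly; a sketch in Python with scipy.stats (the slides use R, where qchisq would play the same role):

```python
from scipy import stats

N, s2 = 15, 10.0
df = N - 1
hi_q = stats.chi2.ppf(0.975, df)    # 26.12 (the table's Q = .025 value)
lo_q = stats.chi2.ppf(0.025, df)    # 5.63  (the table's Q = .975 value)

lower = df * s2 / hi_q              # about 5.36
upper = df * s2 / lo_q              # about 24.87
print(lower, upper)
```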

Normality Assumption We assume normal distributions to figure sampling distributions and thus p levels. Violations of normality have minor implications for testing means, especially as N gets large, but they are more serious for testing variances. Look at your data before conducting this test; you can also test for normality.

R functions for chi-square
M(chisq) = df (expected value)
Var(chisq) = 2df, so SD(chisq) = sqrt(2df)
qchisq(.95, df=7) = 14.07
qchisq(.99, df=7) = 18.48
pchisq(7, df=7) = .57
pchisq(100, df=100) = .52
rchisq(4, df=3) [1] 0.6 2.3 2.9 1.2

Review You have sampled 25 children from an elementary school 5th grade class and measured the height of each. You wonder whether these children are more variable in height than typical children. Their variance in height is 4. Compute a confidence interval for this variance. If the variance of height for 5th grade children nationally is 2, do you consider this sample ordinary?

The F Distribution (1) The F distribution is the ratio of two variance estimates: F = s1²/s2². It is also the ratio of two chi-squares, each divided by its degrees of freedom: F = (χ1²/v1)/(χ2²/v2). In our applications, v2 will be larger than v1 and v2 will be larger than 2. In such a case, the mean of the F distribution (its expected value) is v2/(v2 − 2).
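Both characterizations can be checked numerically: simulate the ratio of two chi-squares, each divided by its df, and compare the average to v2/(v2 − 2). A Python sketch (numpy/scipy assumed; the df values are arbitrary for illustration):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)
v1, v2 = 3, 12
# Ratio of two chi-squares, each over its df: an F(v1, v2) variable.
ratio = (rng.chisquare(v1, 50_000) / v1) / (rng.chisquare(v2, 50_000) / v2)

print(ratio.mean())            # close to v2 / (v2 - 2) = 1.2
print(stats.f.mean(v1, v2))    # 1.2, the exact F mean
```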

F Distribution (2) F depends on two parameters, v1 and v2 (df1 and df2), and the shape of F changes with these. The range is 0 to infinity, and the shape is a bit like chi-square. F tables show critical values for df in the numerator and df in the denominator. F tables are 1-tailed; you can figure 2-tailed values if you need to (but you usually don't).

F table – critical values E.g., the critical value of F at alpha = .05 with 3 and 12 df is 3.49.

Testing Hypotheses about 2 Variances Suppose we test the null hypothesis σ1² = σ2² against the alternative σ1² > σ2² (note: 1-tailed). We compute the two sample variances and form F = s1²/s2², with df1 = df2 = 15. Going to the F table with 15 and 15 df, we find that for alpha = .05 (1-tailed) the critical value is 2.40; our F exceeds it, so the result is significant. Hint: put the larger variance on the top. That way you are looking at the right tail.
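A sketch of the same test in Python with scipy.stats: the sample variances below are hypothetical stand-ins (the slide's actual values did not survive transcription), but the critical value matches the slide's 2.40.

```python
from scipy import stats

# Hypothetical sample variances; put the larger one on top (right tail).
s2_big, s2_small = 7.5, 2.5
F = s2_big / s2_small                  # 3.0

df1 = df2 = 15
crit = stats.f.ppf(0.95, df1, df2)     # 2.403, the slide's 2.40
print(F > crit)                        # exceeds the cutoff: significant at .05, 1-tailed
```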

R code for the F distribution
qf(.95, df1=3, df2=12) = 3.49
pf(15, df1=3, df2=12) = .9998
1 - pf(15, df1=3, df2=12) = .0002

A Look Ahead The F distribution is used in many statistical tests: tests for equality of variances, tests for differences in means in ANOVA, and tests for regression models (models relating one continuous variable to another, like SAT and GPA), including R-square and R-square change or increment.

Relations among Distributions – the Children of the Normal Chi-square is drawn from the normal: N(0,1) deviates squared and summed. F is the ratio of two chi-squares, each divided by its df; a chi-square divided by its df is a variance estimate, that is, a sum of squares divided by degrees of freedom. t² = F: if you square t, you get an F with 1 df in the numerator.
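The t-and-F relation is easy to verify with scipy.stats (Python; the df of 10 is arbitrary here): the squared 2-tailed t cutoff equals the 1-tailed F cutoff with 1 numerator df.

```python
from scipy import stats

df = 10
t_crit = stats.t.ppf(0.975, df)        # 2-tailed t cutoff at alpha = .05
f_crit = stats.f.ppf(0.95, 1, df)      # 1-tailed F cutoff with 1 and df degrees of freedom
print(t_crit ** 2, f_crit)             # the two numbers agree: t squared is F(1, df)
```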

Review How is F related to the Normal? To chi-square? Suppose we have 2 samples and we want to know whether they were drawn from populations with equal variances. Sample 1: N = 50, s² = 25; Sample 2: N = 60, s² = 30. How can we test? What is the best conclusion for these data?