The Chi-Square Distribution. The student will be able to: perform a Goodness of Fit hypothesis test; perform a Test of Independence hypothesis test.

Presentation transcript:

The Chi-Square Distribution

The student will be able to:
• Perform a Goodness of Fit hypothesis test
• Perform a Test of Independence hypothesis test

• The chi-square distribution supplies the test statistic for three kinds of tests:
  - Goodness-of-fit: does our data fit a certain distribution?
  - Test of independence: are two factors independent?
  - Test of a single variance: does our variance change?

• Notation
  - new random variable: $\chi^2 \sim \chi^2_{df}$
  - $\mu = df$
  - $\sigma^2 = 2\,df$
• Facts about chi-square
  - nonsymmetrical and skewed right
  - value is never negative
  - curve looks different for different degrees of freedom; as df gets larger (df > 90), the curve approaches the normal
  - mean is located to the right of the peak
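The facts above can be checked with a small simulation. This is a rough sketch, not part of the slides: it assumes the standard construction of a chi-square variable as a sum of df squared standard normal draws, and the function name, the choice df = 4, the number of draws, and the seed are all illustrative.

```python
import random
import statistics

def simulate_chi_square(df, n_draws, seed=0):
    """Draw chi-square values by summing df squared standard normal draws."""
    rng = random.Random(seed)
    return [sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(df))
            for _ in range(n_draws)]

df = 4  # assumed degrees of freedom, chosen only for illustration
draws = simulate_chi_square(df, n_draws=20000)

sample_mean = statistics.mean(draws)        # theory: df
sample_var = statistics.variance(draws)     # theory: 2 * df
sample_median = statistics.median(draws)    # below the mean when skewed right

print(f"mean     ~ {sample_mean:.2f} (theory {df})")
print(f"variance ~ {sample_var:.2f} (theory {2 * df})")
print(f"median   ~ {sample_median:.2f} (less than the mean: skewed right)")
```

The simulated mean and variance should land near df and 2·df, and the median falling below the mean reflects the right skew.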

• Hypothesis test steps are the same as always, with the following changes:
  - the test is always a right-tailed test
  - null and alternate hypotheses are stated in words rather than equations
  - degrees of freedom = number of intervals - 1
  - the test statistic is defined as $\chi^2 = \sum \frac{(O - E)^2}{E}$, where O is an observed count and E is the corresponding expected count
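As a minimal sketch of the statistic, the function below sums (observed - expected)² / expected over the categories. The counts and the hypothesized proportions are made up for illustration, and the helper name is hypothetical.

```python
def chi_square_statistic(observed, expected):
    """Sum of (O - E)^2 / E over all categories."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Hypothetical example: 100 observations in three categories,
# with hypothesized proportions 50%, 25%, 25%.
observed = [50, 30, 20]
proportions = [0.50, 0.25, 0.25]
n = sum(observed)
expected = [n * p for p in proportions]    # [50.0, 25.0, 25.0]

stat = chi_square_statistic(observed, expected)
df = len(observed) - 1                     # number of intervals - 1

print(f"chi-square = {stat:.2f}, df = {df}")
```

Expected counts come from the hypothesized distribution, so they need not be whole numbers.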

A 6-sided die is rolled 120 times. The results are in the table below. Conduct a hypothesis test to determine if the die is fair.

[Table: Face Value, Frequency]

• Contradictory hypotheses
  - $H_0$: observed data fits a uniform distribution (die is fair)
  - $H_a$: observed data does not fit a uniform distribution (die is not fair)
• Determine distribution
  - chi-square goodness-of-fit
  - right-tailed test
• Perform calculations to find the p-value
  - enter observed into L1
  - enter expected into L2

• Perform calculations (cont.)
  - TI-83: access LIST, MATH, SUM and enter sum((L1 - L2)^2/L2); this is the test statistic. For our problem chi-square = 13.6
  - Access DISTR and χ²cdf; the syntax is (test stat, 1E99, df); this generates the p-value. For our problem p-value =
• Make decision
  - since α > p-value, reject the null
• Concluding statement
  - There is sufficient evidence to conclude that the observed data does not fit a uniform distribution. (The die is not fair.)
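The calculator steps can be mirrored in a few lines of Python. This is a sketch under stated assumptions: the counts in `L1` are hypothetical, chosen only because they reproduce the reported statistic of 13.6 (the table itself is not available here), and `chi_square_sf` is a hand-written stand-in for the calculator's right-tail area, using the closed form that holds for whole-number degrees of freedom.

```python
import math

def chi_square_sf(x, df):
    """Right-tail area P(X > x) for a chi-square variable with integer df."""
    if x <= 0:
        return 1.0
    y = x / 2.0
    if df % 2 == 0:
        # Even df: exp(-y) * sum_{k=0}^{df/2 - 1} y^k / k!
        total = sum(y ** k / math.factorial(k) for k in range(df // 2))
        return math.exp(-y) * total
    # Odd df: erfc(sqrt(y)) + exp(-y) * sum_{k=1}^{m} y^(k - 1/2) / Gamma(k + 1/2)
    m = (df - 1) // 2
    total = sum(y ** (k - 0.5) / math.gamma(k + 0.5) for k in range(1, m + 1))
    return math.erfc(math.sqrt(y)) + math.exp(-y) * total

L1 = [15, 29, 16, 15, 30, 15]   # hypothetical observed counts (sum to 120)
L2 = [120 / 6] * 6              # expected counts for a fair die

test_stat = sum((o - e) ** 2 / e for o, e in zip(L1, L2))
p_value = chi_square_sf(test_stat, df=len(L1) - 1)

print(f"chi-square = {test_stat:.1f}, p-value = {p_value:.4f}")
```

With a statistic of 13.6 and df = 5, the right-tail area comes out near 0.018, so the null would be rejected at any α larger than that.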

• Hypothesis testing steps are the same, with the following edits:
  - null and alternate hypotheses are in words
  - the data are in a contingency table
  - expected values are calculated from the table: (row total)(column total) / sample size
  - the test statistic is the same
  - df = (# columns - 1)(# rows - 1)
  - always a right-tailed test
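A short sketch of the expected-value rule and the degrees of freedom follows. The 3 x 3 table is hypothetical and the function names are illustrative, not taken from the slides.

```python
def expected_counts(table):
    """Expected count for each cell: (row total)(column total) / sample size."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]

def independence_statistic(table):
    """Chi-square statistic and df for a test of independence."""
    expected = expected_counts(table)
    stat = sum((o - e) ** 2 / e
               for obs_row, exp_row in zip(table, expected)
               for o, e in zip(obs_row, exp_row))
    df = (len(table[0]) - 1) * (len(table) - 1)   # (# columns - 1)(# rows - 1)
    return stat, df

# Hypothetical 3 x 3 contingency table of observed counts.
table = [[10, 20, 30],
         [20, 20, 20],
         [30, 20, 10]]

expected = expected_counts(table)
stat, df = independence_statistic(table)
print(f"expected = {expected}")
print(f"chi-square = {stat:.1f}, df = {df}")
```

Every row and column total here is 60 out of 180, so each expected count is 20 and only the four corner cells contribute to the statistic.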

• Conduct a hypothesis test to determine whether there is a relationship between an employee's performance in a company's training program and his/her ultimate success on the job. Use a level of significance of 1%.
  - $H_0$: Performance in training and success on the job are independent
  - $H_a$: Performance in training and success on the job are not independent (that is, dependent)

• Performance on job versus performance in training

[Table: rows are Performance in Training (Poor, Average, Very Good, TOTAL); columns are Performance on Job (Below Average, Average, Above Average, TOTAL)]

• Determine distribution
  - right-tailed
  - chi-square
• Perform calculations to find the p-value
  - The calculator will calculate the expected values. We must enter the contingency table as a matrix (ack!)
  - Access MATRIX and edit Matrix A
  - Access the chi-square test
  - Matrix A = observed
  - Matrix B: the calculator places the expected values here

• Perform calculations (cont.)
  - p-value =
• Make decision
  - α = 0.01 > p-value
  - reject the null hypothesis
• Concluding statement
  - Performance in training and job success are dependent.

Linear Regression and Correlation
Chapter Objectives

The student should be able to:
• Discuss basic ideas of linear regression and correlation.
• Create and interpret a line of best fit.
• Calculate and interpret the correlation coefficient.
• Find outliers.

• Method for finding the “best fit” line through a scatterplot of paired data
  - independent variable (x) versus dependent variable (y)
• Recall from algebra
  - equation of a line: y = a + bx, where a is the y-intercept and b is the slope of the line
  - if b > 0, the line slopes upward to the right
  - if b < 0, the line slopes downward to the right
  - if b = 0, the line is horizontal

• The eye-ball method
  - Draw what looks to you to be the best straight-line fit
  - Pick two points on the line and find the equation of the line
• The calculated method
  - using calculus, we find the line that minimizes the sum of the squared vertical distances from each point to the line; this is the line that best fits the scatterplot
  - let the calculator do the work using LinRegTTest
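For readers who want to see the calculated method without a calculator, here is a minimal sketch of the least-squares slope and intercept for y = a + bx. The data points are hypothetical and the function name is illustrative.

```python
def least_squares_line(xs, ys):
    """Return (a, b) for the least-squares line y = a + b*x."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    b = sxy / sxx            # slope
    a = y_bar - b * x_bar    # y-intercept: the line passes through (x_bar, y_bar)
    return a, b

# Hypothetical paired data.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

a, b = least_squares_line(xs, ys)
print(f"y = {a:.2f} + {b:.2f}x")
```

For this data the slope is 0.6 and the intercept is 2.2, so the line slopes upward to the right.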

Used to determine if the regression line is a “good fit”
• ρ is the population correlation coefficient
• r is the sample correlation coefficient
  - formidable equation: see text
  - the calculator does the work
• Sign of r
  - r positive: line slopes upward to the right
  - r negative: line slopes downward to the right
  - r zero: no correlation
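The formula for r is less formidable in code. This sketch uses the standard sample (Pearson) correlation formula, which is assumed to be the one the text refers to; the data are hypothetical and the function name is illustrative.

```python
import math

def correlation_coefficient(xs, ys):
    """Sample correlation r = Sxy / sqrt(Sxx * Syy)."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical paired data.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

r = correlation_coefficient(xs, ys)
print(f"r = {r:.4f}")   # positive, so the line slopes upward to the right
```

Points that fall exactly on an upward-sloping line give r = 1, and points exactly on a downward-sloping line give r = -1.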

Determining if there is a “good fit”
• Gut method
  - if the calculated r is close to 1 or -1, there's a good fit
• Hypothesis test (LinRegTTest)
  - $H_0: \rho = 0$
  - $H_a: \rho \neq 0$
  - $H_0$ means there IS NOT a significant linear relationship (correlation) between x and y in the population.
  - $H_a$ means there IS a significant linear relationship (correlation) between x and y in the population.
  - To reject $H_0$ means that there is a linear relationship between x and y in the population. It does not mean that one CAUSES the other.
• Comparison to critical value
  - use the table at the end of the chapter
  - determine degrees of freedom: df = n - 2
  - if r < negative critical value, then r is significant and we have a good fit
  - if r > positive critical value, then r is significant and we have a good fit
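The hypothesis test of ρ = 0 is conventionally based on a t statistic with df = n - 2. The sketch below computes only that statistic; it assumes the standard formula t = r·sqrt(n - 2) / sqrt(1 - r²), and the values of r and n are hypothetical. Turning t into a p-value still needs a t table or the calculator.

```python
import math

def correlation_t_statistic(r, n):
    """t statistic and df for testing H0: rho = 0, given sample r and size n."""
    if n <= 2:
        raise ValueError("need more than two data pairs")
    if abs(r) >= 1:
        raise ValueError("r must be strictly between -1 and 1")
    df = n - 2
    t = r * math.sqrt(df) / math.sqrt(1 - r ** 2)
    return t, df

# Hypothetical values: r = 0.6 from n = 10 data pairs.
t, df = correlation_t_statistic(0.6, 10)
print(f"t = {t:.4f}, df = {df}")
```

The further r is from zero, or the larger the sample, the larger |t| becomes and the stronger the evidence against $H_0$.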

• If the line is determined to be a good fit, the equation can be used to predict y or x values from x or y values
  - Plug the numbers into the equation
  - The equation is only valid within the DOMAIN of the paired data
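One way to keep predictions inside the domain is to check the x value before plugging it in. This is an illustrative sketch: the function name, the coefficients, and the domain limits are all hypothetical.

```python
def predict_within_domain(a, b, x, x_min, x_max):
    """Predict y = a + b*x, refusing x values outside the observed domain."""
    if not (x_min <= x <= x_max):
        raise ValueError(f"x = {x} is outside the data domain [{x_min}, {x_max}]")
    return a + b * x

# Hypothetical line y = 2.2 + 0.6x fitted to x values between 1 and 5.
y_hat = predict_within_domain(2.2, 0.6, 3, x_min=1, x_max=5)
print(f"predicted y at x = 3: {y_hat:.1f}")
```

Calling the function with x = 8 would raise an error instead of returning a prediction, since 8 lies outside the range of the data used to fit the line.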

Compare 1.9s to |y - yhat| for each (x, y) pair, where s is the standard deviation of the residuals
• if |y - yhat| > 1.9s, the point could be an outlier
  - LinRegTTest gives us s
  - y - yhat is put into the RESID list when the LinRegTTest is done
• To see the RESID list: go to STAT, Edit, move the cursor to a blank list name and type RESID; the residuals will show up.
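The 1.9s rule can be sketched as follows. The code assumes s = sqrt(SSE / (n - 2)), the usual standard deviation of the residuals for a fitted line; the data are hypothetical (points on y = 2x with one value deliberately pushed off the line) and the function name is illustrative.

```python
import math

def flag_outliers(xs, ys, multiplier=1.9):
    """Return the (x, y) pairs whose residual exceeds multiplier * s."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    b = (sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
         / sum((x - x_bar) ** 2 for x in xs))
    a = y_bar - b * x_bar
    residuals = [y - (a + b * x) for x, y in zip(xs, ys)]   # y - yhat
    s = math.sqrt(sum(e ** 2 for e in residuals) / (n - 2))
    return [(x, y) for x, y, e in zip(xs, ys, residuals)
            if abs(e) > multiplier * s]

# Hypothetical data: y = 2x, except the point at x = 5.
xs = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
ys = [2, 4, 6, 8, 20, 12, 14, 16, 18, 20]

outliers = flag_outliers(xs, ys)
print(f"possible outliers: {outliers}")
```

Only the point at x = 5 is flagged; a flagged point is a candidate outlier to investigate, not proof of an error.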

F Distribution and ANOVA

The student should be able to:
• Interpret the F distribution as the number of groups and the sample size change.
• Discuss two uses for the F distribution and ANOVA.
• Conduct and interpret ANOVA.

• What is it good for?
  - Determines the existence of statistically significant differences among several group means.
• Basic assumptions
  - Each population from which a sample is taken is assumed to be normal.
  - Each sample is randomly selected and independent.
  - The populations are assumed to have equal standard deviations (or variances).
  - The factor is the categorical variable.
  - The response is the numerical variable.
• The hypotheses
  - $H_0: \mu_1 = \mu_2 = \mu_3 = \dots = \mu_k$
  - $H_a$: At least two of the group means are not equal
  - Always a right-tailed test

• Named after Sir Ronald Fisher
  - the F statistic is a ratio (i.e., a fraction)
  - two sets of degrees of freedom (numerator and denominator)
  - $F \sim F_{df(num),\, df(denom)}$
• Two estimates of variance are made
  - Variation between samples: an estimate of $\sigma^2$ based on the variance of the sample means; variation due to treatment (i.e., explained variation)
  - Variation within samples: an estimate of $\sigma^2$ that is the average of the sample variances; variation due to error (i.e., unexplained variation)

• Curve is skewed right.
• Different curve for each set of degrees of freedom.
• As the dfs for numerator and denominator get larger, the curve approximates the normal distribution.
• F statistic is greater than or equal to zero.
• Other uses
  - Comparing two variances
  - Two-Way Analysis of Variance

• Formula: $F = \frac{MS_{between}}{MS_{within}}$
  - $MS_{between}$: mean square explained by the different groups
  - $MS_{within}$: mean square that is due to chance
  - $SS_{between}$: sum of squares that represents the variation among the different samples
  - $SS_{within}$: sum of squares that represents the variation within samples that is due to chance

• Enter the table data by columns into L1, L2, L3, ...
• Do the ANOVA test: ANOVA(L1, L2, ...)
• What the calculator gives
  - F: the F statistic
  - p: the p-value
  - Factor (the between stuff): df = # groups - 1 = k - 1, SS between, MS between
  - Error (the within stuff): df = total number of data values - # of groups = N - k, SS within, MS within
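The quantities the calculator reports can be reproduced by hand. The sketch below assumes the usual one-way ANOVA definitions (each mean square is its sum of squares divided by its df); the three groups are hypothetical and the function name is illustrative. It stops at the F statistic, since the p-value needs an F table or the calculator.

```python
def one_way_anova(groups):
    """Return the pieces of a one-way ANOVA table as a dict."""
    k = len(groups)                              # number of groups
    n_total = sum(len(g) for g in groups)        # N, total number of data values
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]

    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, means))
    ss_within = sum((x - m) ** 2
                    for g, m in zip(groups, means) for x in g)

    df_between = k - 1
    df_within = n_total - k
    ms_between = ss_between / df_between
    ms_within = ss_within / df_within
    return {
        "df_between": df_between, "ss_between": ss_between, "ms_between": ms_between,
        "df_within": df_within, "ss_within": ss_within, "ms_within": ms_within,
        "F": ms_between / ms_within,
    }

# Hypothetical data: three groups of three values each (L1, L2, L3).
result = one_way_anova([[1, 2, 3], [2, 3, 4], [5, 6, 7]])
for name, value in result.items():
    print(f"{name}: {value:.2f}")
```

A large F means the variation between group means is large relative to the variation within groups, which is evidence against equal means.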

Four sororities took a random sample of sisters regarding their grade averages for the past term. The results are shown below.

[Table: Sorority 1, Sorority 2, Sorority 3, Sorority 4]

Using a significance level of 1%, is there a difference in grade averages among the sororities?

• What's fair game
  - Chapters 1 through 12
  - 42 multiple choice questions
  - Do problems from each chapter
• What to bring with you
  - Scantron (#2052), pencil, eraser, calculator, 2 sheets of notes (8.5 x 11 inches, both sides)

• Prepare for the Final exam
• It has been a pleasure having you in class. Good luck and Godspeed with whatever path you take in life.