Example 12.4 Possible Gender Discrimination in Salary at Fifth National Bank of Springfield The Partial F Test.

Objective
To use several partial F tests to see whether various groups of explanatory variables should be included in a regression equation for salary, given that other variables are already in the equation.

BANK.XLS
- Recall from Example 11.3 that the Fifth National Bank has 208 employees.
- The data for these employees are stored in this file.
- In the previous chapter we ran several regressions for Salary to see whether there is convincing evidence of salary discrimination against females.
- We will continue this analysis here.

Analysis Overview
- First, we will regress Salary versus the Female dummy, YrsExper, and the interaction between Female and YrsExper, labeled Fem_YrsExper. This will be the reduced equation.
- Then we’ll see whether the JobGrade dummies Job_2 to Job_6 add anything significant to the reduced equation. If so, we will then see whether the interactions between the Female dummy and the JobGrade dummies, labeled Fem_Job2 to Fem_Job6, add anything significant to what we already have.

Analysis Overview -- continued
- If so, we’ll finally see whether the education dummies Ed_2 to Ed_5 add anything significant to what we already have.

Solution
- First, note that we created all of the dummies and interaction variables with StatPro’s Data Utilities procedures.
- Also, note that we have used three sets of dummies, for gender, job grade and education level.
- When we use these in a regression equation, the dummy for one category of each set should always be excluded; it is the reference category. The reference categories we have used are “male”, job grade 1 and education level 1.
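
The dummy and interaction columns described above can be sketched in a few lines of plain Python. This is only an illustration of the idea, not StatPro's Data Utilities procedure: the helper names `make_dummies` and `interact` and the four toy rows are hypothetical, and the column labels simply follow the naming used in this example.

```python
def make_dummies(values, prefix, reference):
    """Return one 0/1 column per category, leaving out the reference category."""
    levels = sorted(set(values) - {reference})
    return {
        f"{prefix}_{level}": [1 if v == level else 0 for v in values]
        for level in levels
    }


def interact(col_a, col_b):
    """Interaction variable: the row-by-row product of two columns."""
    return [a * b for a, b in zip(col_a, col_b)]


# Hypothetical toy rows (NOT the BANK.XLS data).
gender = ["Female", "Male", "Female", "Male"]
job_grade = [1, 2, 3, 2]
yrs_exper = [3.0, 10.0, 7.5, 1.0]

# "Male" is the reference category, so only a Female dummy is created.
female = [1 if g == "Female" else 0 for g in gender]

# Job grade 1 is the reference category, so there is no Job_1 column.
job = make_dummies(job_grade, "Job", reference=1)

# Interactions: Female x YrsExper and Female x each job-grade dummy.
fem_yrs_exper = interact(female, yrs_exper)
fem_job = {
    "Fem_" + name.replace("_", ""): interact(female, col)
    for name, col in job.items()
}

print(job)
print(fem_yrs_exper)
print(fem_job)
```

Leaving out the reference category keeps the dummies in each set from summing to a column of ones, which would duplicate the intercept.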

Solution -- continued
- The output for the “smallest” equation using Female, YrsExper, and Fem_YrsExper as explanatory variables is shown here.

Solution -- continued
- We’re off to a good start. These three variables already explain 63.9% of the variation of Salary.
- The output for the next equation, which adds the explanatory variables Job_2 to Job_6, is on the next slide.
- This equation appears much better. For example, R² has increased to 81.1%. We check whether it is significantly better with the partial F test.

Solution -- continued
- The degrees of freedom in cell C28 is the same as the value in cell C12, the degrees of freedom for SSE.
- Then we calculate the F-ratio in cell C29 with the formula =((Reduced!D12-Complete!D12)/Complete!C27)/Complete!E12, where Reduced!D12 refers to SSE for the reduced equation from the Reduced sheet.
- Finally, we calculate the corresponding p-value in cell C30 with the formula =FDIST(C29,C27,C28). It is practically 0, so there is no doubt that the job grade dummies add significantly to the explanatory power of the equation.
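
The spreadsheet calculation above can be mirrored in plain Python as a rough sketch. The function names are hypothetical, and the SSE values in the usage lines are made up for illustration (they are not taken from BANK.XLS). Here `n_extra` plays the role of cell C27 and `df_complete` the role of cell C28; the upper-tail F probability that FDIST returns is computed from the regularized incomplete beta function with a standard continued-fraction method, since the Python standard library has no F distribution.

```python
import math


def _betacf(a, b, x, max_iter=300, eps=3e-14):
    """Continued fraction for the incomplete beta function (modified Lentz)."""
    tiny = 1e-300
    qab, qap, qam = a + b, a + 1.0, a - 1.0
    c = 1.0
    d = 1.0 - qab * x / qap
    d = 1.0 / (d if abs(d) > tiny else tiny)
    h = d
    for m in range(1, max_iter + 1):
        m2 = 2 * m
        # Even-numbered term of the continued fraction.
        aa = m * (b - m) * x / ((qam + m2) * (a + m2))
        d = 1.0 + aa * d
        d = 1.0 / (d if abs(d) > tiny else tiny)
        c = 1.0 + aa / c
        c = c if abs(c) > tiny else tiny
        h *= d * c
        # Odd-numbered term of the continued fraction.
        aa = -(a + m) * (qab + m) * x / ((a + m2) * (qap + m2))
        d = 1.0 + aa * d
        d = 1.0 / (d if abs(d) > tiny else tiny)
        c = 1.0 + aa / c
        c = c if abs(c) > tiny else tiny
        delta = d * c
        h *= delta
        if abs(delta - 1.0) < eps:
            break
    return h


def _reg_inc_beta(a, b, x):
    """Regularized incomplete beta function I_x(a, b)."""
    if x <= 0.0:
        return 0.0
    if x >= 1.0:
        return 1.0
    front = math.exp(
        math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
        + a * math.log(x) + b * math.log1p(-x)
    )
    if x < (a + 1.0) / (a + b + 2.0):
        return front * _betacf(a, b, x) / a
    return 1.0 - front * _betacf(b, a, 1.0 - x) / b


def f_upper_tail(f, df_num, df_den):
    """P(F > f) for an F distribution with (df_num, df_den) degrees of freedom."""
    if f <= 0.0:
        return 1.0
    x = df_den / (df_den + df_num * f)
    return _reg_inc_beta(df_den / 2.0, df_num / 2.0, x)


def partial_f_test(sse_reduced, sse_complete, n_extra, df_complete):
    """Return (F-ratio, p-value) for adding n_extra variables to the reduced equation."""
    mse_complete = sse_complete / df_complete
    f_ratio = ((sse_reduced - sse_complete) / n_extra) / mse_complete
    return f_ratio, f_upper_tail(f_ratio, n_extra, df_complete)


# Made-up SSE values, for illustration only.
f_ratio, p_value = partial_f_test(1200.0, 1000.0, n_extra=2, df_complete=20)
print(f_ratio, p_value)
```

A small p-value means the drop in SSE is larger than the extra variables would be expected to produce by chance alone.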

Solution -- continued
- Do the interactions between the Female dummy and the job dummies add anything more?
- We again use the partial F test, but now the previous complete equation becomes the new reduced equation, and the equation that includes the new interaction terms becomes the new complete equation.
- The output for this new complete equation is shown on the next slide.
- We perform the partial F test exactly as before. The formula in C34 is =((Complete!D12-MoreComplete!D12)/MoreComplete!C32)/MoreComplete!E12.

Solution -- continued
- Again the p-value is extremely small, so there is no doubt that the interaction terms add significantly to what we already had.
- Finally, we add the education dummies.
- The resulting output is shown on the next slide. We see how the terms reduced and complete are relative.
- This output now corresponds to the complete equation, and the previous output corresponds to the reduced equation.

Solution -- continued
- The formula in cell C38 for the F-ratio is now =((MoreComplete!D12-StillMoreComplete!D12)/StillMoreComplete!C36)/StillMoreComplete!E12. The R² value increased only from 84.0% to 84.7%, and the p-value is not extremely small.
- According to the partial F test, the increase is not quite enough to qualify for statistical significance at the 5% level.
- Based on this evidence, there is not much to gain from including the education dummies in the equation, so we would probably elect to exclude them.
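
As a rough cross-check, the same F-ratio can be written in terms of R² values, because both equations share the same total sum of squares. The worked numbers below are only approximate: they use the rounded R² values quoted above, n = 208 employees, k = 4 education dummies, and the assumption that the complete equation contains p = 17 explanatory variables (3 + 5 + 5 + 4).

```latex
F = \frac{\left(R^2_{\text{complete}} - R^2_{\text{reduced}}\right)/k}
         {\left(1 - R^2_{\text{complete}}\right)/(n - p - 1)}
  \approx \frac{(0.847 - 0.840)/4}{(1 - 0.847)/(208 - 17 - 1)}
  \approx 2.2
```

The 5% critical value of an F distribution with 4 and 190 degrees of freedom is roughly 2.4, so a ratio near 2.2 agrees with the conclusion that the education dummies fall just short of significance.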

Concluding Comments
- First, the partial F test is the formal test of significance for an extra set of variables. Many users look only at the R² and/or s_e values to check whether extra variables are doing a “good job”.
- Second, if the partial F test shows that a block of variables is significant, it does not imply that each variable in this block is significant. Some of these variables can have low t-values.

Concluding Comments -- continued
- Third, producing all of these outputs and doing the partial F tests is a lot of work. Therefore, we included a “Block” option in StatPro to make life easier. To run the analysis in this example, use the StatPro/Regression Analysis/Block menu item. After selecting Salary as the response variable, we see this dialog box.

Concluding Comments -- continued
- We want four blocks of explanatory variables, and we want a given block to enter only if it passes the partial F test at the 5% level. In later dialog boxes we specify the explanatory variables. Once we have specified all this, the regression calculations are done in stages. The output, which spans two figures, appears on the next two slides. Note that the output for Block 4 has been left off because it did not pass the F test at the 5% level.

Concluding Comments -- continued
- Finally, we have concentrated on the partial F test and statistical significance in this example. We don’t want you to lose sight, however, of the bigger picture. Once we have decided on a “final” regression equation, we need to analyze its implications for the problem at hand.
- In this case the bank is interested in possible salary discrimination against females, so we should interpret this final equation in these terms. Our point is simply that you shouldn’t get so caught up in the details of statistical significance that you lose sight of the original purpose of the analysis!