Published by Andrew Norman. Modified over 7 years ago.
1
Statistical Inference and Regression Analysis: GB.3302.30
Professor William Greene Stern School of Business IOMS Department Department of Economics
2
Inference and Regression
Not Perfect Collinearity
3
Variance Inflation and Multicollinearity
When variables are highly but not perfectly correlated, least squares is difficult to compute accurately.
Variances of least squares slopes become very large.
Variance inflation factors: For each x_k, VIF(k) = 1/[1 - R^2(k)], where R^2(k) is the R^2 in the regression of x_k on all the other x variables in the data matrix.
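The VIF definition above can be sketched in a few lines. This is a minimal illustration assuming NumPy; the function name `vif` and the simulated data are hypothetical and are not taken from the course materials.

```python
import numpy as np

def vif(X):
    """Return VIF(k) = 1 / (1 - R^2(k)) for each column of X.

    X holds the regressors WITHOUT a constant column; a constant is
    added to each auxiliary regression of x_k on the other columns.
    """
    X = np.asarray(X, dtype=float)
    n, k = X.shape
    out = np.empty(k)
    for j in range(k):
        y = X[:, j]
        others = np.delete(X, j, axis=1)
        Z = np.column_stack([np.ones(n), others])
        coef, *_ = np.linalg.lstsq(Z, y, rcond=None)
        resid = y - Z @ coef
        r2 = 1.0 - (resid @ resid) / np.sum((y - y.mean()) ** 2)
        out[j] = 1.0 / (1.0 - r2)
    return out

# Simulated (hypothetical) data: x2 is nearly collinear with x1, x3 is unrelated.
rng = np.random.default_rng(0)
x1 = rng.normal(size=500)
x2 = x1 + 0.1 * rng.normal(size=500)
x3 = rng.normal(size=500)
vifs = vif(np.column_stack([x1, x2, x3]))
print(vifs)  # large values for x1 and x2, close to 1 for x3
```

A VIF near 1 means the variable is almost unrelated to the other regressors; very large values signal the near collinearity described on this slide.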
4
Gasoline Market Regression Analysis: logG versus logIncome, logPG
[Regression output; numeric values were lost in extraction.]
The regression equation is logG on logIncome and logPG.
Predictors: Constant, logIncome, logPG
R-Sq = 93.6%   R-Sq(adj) = 93.4%
Analysis of Variance rows: Regression, Residual Error, Total
R2 = Regression SS / Total SS
5
Gasoline Market Regression Analysis: logG versus logIncome, logPG, ...
[Regression output; numeric values were lost in extraction.]
The regression equation is logG on logIncome, logPG, logPNC, logPUC and logPPT.
Predictors: Constant, logIncome, logPG, logPNC, logPUC, logPPT
R-Sq = 96.0%   R-Sq(adj) = 95.6%
Analysis of Variance rows: Regression, Residual Error, Total
R2 = Regression SS / Total SS
logPG is no longer statistically significant when the other variables are added to the model.
6
Evidence of Multicollinearity: Regression of logPG on the other variables gives a very good fit.
7
Diagnostic Tools
Look for incremental contributions to R2 when additional predictors are added.
Look for predictor variables not to be well explained by other predictors. (These are all the same.)
Look for "information" and independent sources of information.
Collinearity and influential observations can be related.
Removing influential observations can make it worse or better.
The relationship is far too complicated to say anything useful about how these two might interact.
8
NIST Statistical Reference Data Sets – Accuracy Tests
9
The Filippelli Problem
10
VIF for X10: R2 = VIF = D+15
12
Other software: Minitab reports the correct answer
Stata drops X10
13
Accurate and Inaccurate Computation of Filippelli Results
Accurate computation requires not actually computing (X'X)^(-1). We (and others) use the QR method. See text for details.
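The idea of avoiding the explicit inverse can be sketched as follows. This is a minimal illustration assuming NumPy, not the routine used by any of the packages named in these slides; the function names and the small polynomial test problem are hypothetical.

```python
import numpy as np

def ols_qr(X, y):
    """Least squares via the reduced QR factorization X = QR.

    Solves R b = Q'y, so X'X is never formed or inverted.
    """
    Q, R = np.linalg.qr(X)
    return np.linalg.solve(R, Q.T @ y)

def ols_normal_equations(X, y):
    """Textbook formula b = (X'X)^(-1) X'y, shown only for comparison."""
    return np.linalg.inv(X.T @ X) @ (X.T @ y)

# Hypothetical, well-conditioned cubic polynomial problem with no noise.
x = np.linspace(-1.0, 1.0, 50)
X = np.vander(x, 4, increasing=True)      # columns 1, x, x^2, x^3
beta_true = np.array([1.0, -2.0, 0.5, 3.0])
y = X @ beta_true

b_qr = ols_qr(X, y)
b_ne = ols_normal_equations(X, y)

# Forming X'X squares the condition number, which is what hurts accuracy
# in a nearly collinear problem such as a degree-10 polynomial fit.
cond_X = np.linalg.cond(X)
cond_XtX = np.linalg.cond(X.T @ X)
print(cond_X, cond_XtX)
```

On this easy problem both methods recover the coefficients. The point of the comparison is the condition number: because cond(X'X) equals cond(X) squared, a problem that is merely difficult for QR can be hopeless for the explicit inverse.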
14
Stata Filippelli Results
15
Even after dropping two (random) columns, results are only correct to 1 or 2 digits.
16
Inference and Regression
Testing Hypotheses
17
Testing Hypotheses
18
Hypothesis Testing: Criteria
19
The F Statistic has an F Distribution
20
Nonnormality or Large N
Denominator of F converges to 1. Numerator converges to chi squared[J]/J.
Rely on the law of large numbers for the denominator and the CLT for the numerator: J*F converges in distribution to chi squared[J].
Use critical values from chi squared.
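A small simulation can illustrate the large-sample result. This sketch assumes NumPy and simulates the ratio form of F directly (a chi squared[J]/J numerator over a chi squared[m]/m denominator), so it shows only the convergence of the denominator, not the CLT step for nonnormal disturbances. The seed, replication count and degrees of freedom are arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(42)
J = 3              # number of restrictions
REPS = 200_000     # simulation draws

def simulated_critical_value(denom_df):
    """Simulated 95th percentile of J*F when F ~ F[J, denom_df]."""
    numerator = rng.chisquare(J, size=REPS) / J
    denominator = rng.chisquare(denom_df, size=REPS) / denom_df
    return np.quantile(J * numerator / denominator, 0.95)

small_sample = simulated_critical_value(10)       # denominator still noisy
large_sample = simulated_critical_value(10_000)   # denominator is close to 1

CHI2_95_J3 = 7.815   # tabled 95% critical value of chi squared[3]
print(small_sample, large_sample, CHI2_95_J3)
```

With few denominator degrees of freedom the critical value for J*F is well above the chi squared value; with many, the two are nearly identical, which is why chi squared critical values can be used in large samples.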
21
Significance of the Regression - R*2 = 0
22
Table of 95% Critical Values for F
24
Ordinary least squares regression [numeric values were lost in extraction]
LHS = LOGBOX
Reported statistics: Mean, Standard deviation, Number of observs., Residuals Sum of squares, Standard error of e, R-squared, Adjusted R-squared
Columns: Variable, Coefficient, Standard Error, t-ratio, P[|T|>t], Mean of X
Variable   Significance
Constant   ***
LOGBUDGT   **
STARPOWR
SEQUEL
MPRATING
ACTION     **
COMEDY
ANIMATED   *
HORROR
PCBUZZ     ***
F = [( ... )/3] / [( ... )/(62 - 13)]; F* = 2.84
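The F test on this slide has J = 3 restrictions and 62 - 13 denominator degrees of freedom. One standard form of the statistic uses the R-squared of the unrestricted and restricted regressions; the sketch below assumes that form, and the two R-squared values are hypothetical placeholders because the slide's numbers did not survive.

```python
def f_statistic(r2_unrestricted, r2_restricted, J, n, K):
    """F = [(R2 - R*2)/J] / [(1 - R2)/(n - K)].

    R2 is from the unrestricted regression, R*2 from the restricted one,
    J is the number of restrictions, n the sample size and K the number
    of coefficients in the unrestricted model.
    """
    numerator = (r2_unrestricted - r2_restricted) / J
    denominator = (1.0 - r2_unrestricted) / (n - K)
    return numerator / denominator

# Hypothetical R-squared values, NOT the values from the slide.
F = f_statistic(r2_unrestricted=0.50, r2_restricted=0.40, J=3, n=62, K=13)
F_CRITICAL = 2.84            # F* quoted on the slide
reject = F > F_CRITICAL
print(round(F, 3), reject)
```

Setting `r2_restricted` to 0 gives the test of the significance of the regression as a whole, with J equal to K - 1.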
25
Inference and Regression
A Case Study
26
Mega Deals for Stars: A Capital Budgeting Computation
Costs and Benefits
Certainty: Costs
Uncertainty: Benefits
Long Term: Need for discounting
27
Baseball Story: A Huge Sports Contract
Alex Rodriguez was hired by the Texas Rangers for something like $25 million per year in 2000.
Costs: the salary, plus and minus some fine tuning of the numbers.
Benefits: more fans in the stands.
How to determine if the benefits exceed the costs? Use a regression model.
28
The Texas Deal for Alex Rodriguez
2001 Signing Bonus = $10M
Total: $252M ???
29
The Real Deal
Year   Salary   Bonus   Deferral
2001   21       2       5 to 2011
Deferrals accrue interest of 3% per year.
30
Costs
Insurance: About 10% of the contract per year
(Taxes: About 40% of the contract)
Some additional costs from league revenue sharing (anticipated, about 17.5% of marginal benefits; uncertain)
Interest on deferred salary: $150,000 in the first year, well over $1,000,000 in 2010
(Reduction) $3M it would cost to have a different shortstop (Nomar Garciaparra)
31
PDV of the Costs
Using an 8% discount rate (the rate they used)
Accounting for all costs: roughly $21M to $28M in each year from 2001 to 2010, then the deferred payments from 2010 to 2020
Total costs: About $165 Million in 2001 (present discounted value)
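The discounting step can be sketched as below. The 8% rate comes from the slide; the flat $25M cost stream is a hypothetical placeholder, not the actual contract schedule, so the result is not expected to reproduce the $165 million figure.

```python
def present_value(cash_flows, rate, first_period=0):
    """Present discounted value of a list of yearly cash flows.

    cash_flows[t] is paid in period first_period + t and is discounted
    by (1 + rate) raised to that period.
    """
    return sum(cf / (1.0 + rate) ** (first_period + t)
               for t, cf in enumerate(cash_flows))

DISCOUNT_RATE = 0.08                 # rate quoted on the slide
hypothetical_costs = [25.0] * 10     # $M per year for 10 years (assumed)

pdv = present_value(hypothetical_costs, DISCOUNT_RATE)
print(round(pdv, 2))                 # well below the undiscounted 250
```

Discounting is why a contract with a headline total far above $165 million can have a present value near that figure: payments made later, and especially the deferred payments, count for much less in 2001 dollars.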
32
Benefits
More fans in the seats
Gate
Parking
Merchandise
Increased chance at playoffs and World Series
Sponsorships
(Loss to revenue sharing)
Franchise value
33
How Many New Fans?
Projected 8 more wins per year.
What is the relationship between wins and attendance?
Not known precisely
Many empirical studies (The Journal of Sports Economics)
Use a regression model to find out.
34
Baseball Data
31 teams, 17 years (fewer years for 6 teams)
Winning percentage: Wins = 162 * percentage
Rank
Average attendance: Attendance = 81 * Average
Average team salary
Number of all stars
Manager years of experience
Percent of team that is rookies
Lineup changes
Mean player experience
Dummy variable for change in manager
35
Baseball Data (Panel Data)
36
A Dynamic Equation
41
About 220,000 fans
42
The Regression Model
45
Marginal Value of One Win
46
Marginal Value of an A Rod
8 games * 63,734 fans = 509,878 fans
509,878 fans * ($18 per ticket + $2.50 parking etc. + $1.80 stuff (hats, bobble head dolls, ...)) = about $11.3 Million per year !!!!!
It's not close. (Marginal cost is at least $16.5M / year)
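The arithmetic on this slide can be checked directly. The inputs below are the slide's own figures; the variable names are illustrative. With the rounded 63,734 fans per win the products come out slightly different from the slide's 509,878 and $11.3 million, presumably because the slide used unrounded values, but the conclusion is unchanged.

```python
EXTRA_WINS = 8                  # projected additional wins per year
FANS_PER_WIN = 63_734           # marginal attendance per win (rounded)
TICKET = 18.00                  # dollars per fan
PARKING = 2.50                  # dollars per fan
MERCHANDISE = 1.80              # dollars per fan
MARGINAL_COST = 16.5e6          # dollars per year, lower bound from the slide

extra_fans = EXTRA_WINS * FANS_PER_WIN
revenue_per_fan = TICKET + PARKING + MERCHANDISE
marginal_benefit = extra_fans * revenue_per_fan
shortfall = MARGINAL_COST - marginal_benefit

print(extra_fans)                       # 509872
print(round(marginal_benefit / 1e6, 2)) # millions of dollars per year
print(round(shortfall / 1e6, 2))        # positive: cost exceeds benefit
```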
47
The IPN Player
A-Rod and Yankees – The Iconic Performance Network Player
Attendance rose to 4M in 2005, 4.3M in 2007
MVP in 2005 and 2007
Huge growth in the YES network
Seemed certain to break Bonds' HR record (Asterisk?)
New deal: $275M over 10 years
Chicago Cubs offer included team ownership.
Drug problems probably derailed this career path.
48
The Ghosts of Seasons Past: Long Run Implications - The Shadow Cost
The commitment to A-Rod limited the ability of the Texas Rangers to field a great team. The same problem now faces the Yankees. A-Rod is aging and becoming less likely to break the records. His steroid use has tarnished his reputation and reduced the value of his history. Why do teams do these long term mega deals for baseball players?
49
Kershaw vs. A Rod
Shorter term, risk shifting onto the team
Bargaining strength has shifted in favor of the player.