1 Final Review Econ 240A
2 Outline The Big Picture Processes to remember (and habits to form) for your quantitative career (FYQC) Concepts to remember FYQC Discrete Distributions Continuous Distributions Central Limit Theorem Regression
3 The Classical Statistical Trail Descriptive Statistics Inferential Statistics Probability Discrete Random Variables Discrete Probability Distributions; Moments Binomial Application Rates & Proportions
4 Where Do We Go From Here? Regression Properties Assumptions Violations Diagnostics Modeling Probability Count ANOVA Contingency Tables
5 Processes to Remember. Exploratory Data Analysis: distribution of the random variable (histogram, Lab 1; stem and leaf diagram, Lab 1; box plot, Lab 1); time series plot of random variable y(t) vs. time index t; x-y plots: y vs. x_1, y vs. x_2, etc. Diagnostic Plots: actual, fitted and residual; cross-section data: heteroskedasticity, White test; time series data: autocorrelation, Durbin-Watson statistic
6 Time Series
8 UCBudsh(t) = a + b*timex(t) + e(t)
  e(t) = 0.68*e(t-1) + u(t)
  0.68*UCBudsh(t-1) = 0.68*a + b*0.68*timex(t-1) + 0.68*e(t-1)
  [UCBudsh(t) – 0.68*UCBudsh(t-1)] = [(1-0.68)*a] + b*[timex(t) – 0.68*timex(t-1)] + u(t)
  Y(t) = a* + b*x(t) + u(t), where Y(t) = UCBudsh(t) – 0.68*UCBudsh(t-1), x(t) = timex(t) – 0.68*timex(t-1), and a* = (1-0.68)*a
  Called autoregressive (autocorrelated) error
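The transformation on this slide can be tried on simulated data. The sketch below is only an illustration under stated assumptions: the intercept, slope, sample size, and error variance are made up, the 0.68 is taken as known rather than estimated, and the helper names `quasi_difference` and `ols_line` are hypothetical.

```python
import random

def quasi_difference(series, rho):
    """Return series(t) - rho*series(t-1) for t = 1, ..., n-1."""
    return [series[t] - rho * series[t - 1] for t in range(1, len(series))]

def ols_line(x, y):
    """Simple OLS of y on x; returns (intercept, slope)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

random.seed(0)
a, b, rho, n = 2.0, 0.5, 0.68, 400       # a, b, n are hypothetical; rho is the slide's 0.68
timex = list(range(n))

# Build autoregressive errors: e(t) = rho*e(t-1) + u(t)
e = [0.0]
for _ in range(n - 1):
    e.append(rho * e[-1] + random.gauss(0.0, 1.0))
y = [a + b * t + et for t, et in zip(timex, e)]

# Transform both sides, then regress Y(t) on x(t); the error is now u(t)
y_star = quasi_difference(y, rho)
x_star = quasi_difference(timex, rho)
a_star, b_hat = ols_line(x_star, y_star)
a_hat = a_star / (1 - rho)               # undo a* = (1 - rho)*a

print(f"slope estimate: {b_hat:.3f}, intercept estimate: {a_hat:.3f}")
```

In practice the autoregressive coefficient is estimated along with a and b rather than assumed.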
11 Concepts to Remember
  Random variable: takes on values with some probability
    Flipping a coin
  Repeated independent Bernoulli trials
    Flipping a coin twice or more
  Random sample
  Likelihood of a random sample
    Prob(e_1 ^ e_2 … ^ e_n) = Prob(e_1)*Prob(e_2)…*Prob(e_n)
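The product rule for the likelihood of a random sample can be written out for coin flips. This is a minimal sketch; the function name `bernoulli_likelihood` and the sample are hypothetical.

```python
import math

def bernoulli_likelihood(sample, p):
    """Likelihood of independent 0/1 outcomes: the product of the individual probabilities."""
    return math.prod(p if outcome == 1 else 1 - p for outcome in sample)

sample = [1, 0, 1, 1]                      # hypothetical: heads, tails, heads, heads
likelihood = bernoulli_likelihood(sample, 0.5)
print(likelihood)                          # 0.5**4 = 0.0625
```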
12 Discrete Distributions
  Discrete random variables
    Probability density (mass) function: Prob(x=x*)
    Cumulative distribution function, CDF
  Equi-probable or uniform
    E.g., x = 1, 2, 3: Prob(x=1) = Prob(x=2) = Prob(x=3) = 1/3
13 Discrete Distributions
  Binomial: Prob(k) = {n!/[k!*(n-k)!]} * p^k * (1-p)^(n-k)
    E(k) = n*p, Var(k) = n*p*(1-p)
    Simulated sample binomial random variable (Lab 2)
    Rates and proportions
  Poisson
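The binomial probabilities and their moments can be checked numerically. The sketch below uses hypothetical values n = 10 and p = 0.3 and computes the mean and variance directly from the probabilities.

```python
import math

def binomial_pmf(k, n, p):
    """Prob(k) = n!/[k!(n-k)!] * p^k * (1-p)^(n-k)."""
    return math.comb(n, k) * p**k * (1 - p)**(n - k)

n, p = 10, 0.3                                   # hypothetical values
probs = [binomial_pmf(k, n, p) for k in range(n + 1)]
mean = sum(k * pk for k, pk in enumerate(probs))
var = sum((k - mean) ** 2 * pk for k, pk in enumerate(probs))

print(f"E(k) = {mean:.4f} (n*p = {n * p})")
print(f"Var(k) = {var:.4f} (n*p*(1-p) = {n * p * (1 - p):.4f})")
```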
14 Continuous Distributions
  Continuous random variables
    Density function, f(x)
    Cumulative distribution function, F(x)
    Survivor function: S(x*) = 1 – F(x*)
    Hazard function: h(t) = f(t)/S(t)
    Cumulative hazard function, H(t)
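These functions are tied together by their definitions, which a short example can make concrete. The sketch below assumes an exponential distribution with a hypothetical rate `lam = 0.5`, and uses the standard identity H(t) = -ln S(t), which is not stated on the slide.

```python
import math

lam = 0.5   # hypothetical constant hazard rate

def density(t):
    return lam * math.exp(-lam * t)       # f(t)

def cdf(t):
    return 1 - math.exp(-lam * t)         # F(t)

def survivor(t):
    return 1 - cdf(t)                     # S(t) = 1 - F(t)

def hazard(t):
    return density(t) / survivor(t)       # h(t) = f(t)/S(t)

def cumulative_hazard(t):
    return -math.log(survivor(t))         # H(t) = -ln S(t)

for t in (0.5, 1.0, 2.0):
    print(t, round(survivor(t), 4), round(hazard(t), 4), round(cumulative_hazard(t), 4))
```

For this distribution the hazard stays at `lam` for every t, and the cumulative hazard grows linearly as `lam*t`.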
15 Continuous Distributions
  Simple moments
    E(x) = mean = expected value
    E(x^2)
  Central moments
    E[x – E(x)] = 0
    E{[x – E(x)]^2} = Var x
    E{[x – E(x)]^3}, a measure of skewness
    E{[x – E(x)]^4}, a measure of kurtosis
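The central moments can be computed for a small sample to see how they behave. The data below are hypothetical, and dividing the third and fourth moments by powers of the variance (the usual standardized skewness and kurtosis) is a convention added here, not taken from the slide.

```python
def central_moment(data, order):
    """Sample analogue of E{[x - E(x)]^order}, dividing by n."""
    mean = sum(data) / len(data)
    return sum((x - mean) ** order for x in data) / len(data)

data = [1, 2, 3, 4, 10]                    # hypothetical sample with a long right tail
m1 = central_moment(data, 1)               # always 0
m2 = central_moment(data, 2)               # variance
m3 = central_moment(data, 3)
m4 = central_moment(data, 4)
skewness = m3 / m2 ** 1.5                  # standardized third moment
kurtosis = m4 / m2 ** 2                    # standardized fourth moment

print(m1, m2, m3, m4, round(skewness, 4), round(kurtosis, 4))
```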
16 Continuous Distributions
  Normal distribution
    Simulated sample random normal variable (Lab 3)
    Approximation to the binomial, n*p>=5, n*(1-p)>=5
    Standardized normal variate: z = (x – μ)/σ
  Exponential distribution
  Weibull distribution
    Cumulative hazard function: H(t) = (1/α^β)*t^β
    Logarithmic transform: ln H(t) = ln(1/α^β) + β*ln t
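The normal approximation to the binomial can be compared against the exact binomial probability. The sketch below uses hypothetical values n = 100, p = 0.5 and asks for Prob(k <= 55); the half-unit continuity correction is an addition here, not something the slide mentions.

```python
import math

def binomial_cdf(k, n, p):
    """Exact Prob(X <= k) for a binomial random variable."""
    return sum(math.comb(n, j) * p**j * (1 - p)**(n - j) for j in range(k + 1))

def normal_cdf(z):
    """Standard normal CDF via the error function."""
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

n, p, k = 100, 0.5, 55                       # hypothetical; n*p and n*(1-p) both exceed 5
mu = n * p
sigma = math.sqrt(n * p * (1 - p))
z = (k + 0.5 - mu) / sigma                   # standardized value with continuity correction

exact = binomial_cdf(k, n, p)
approx = normal_cdf(z)
print(f"exact = {exact:.4f}, normal approximation = {approx:.4f}")
```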
19 Central Limit Theorem Sample mean,
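A small simulation illustrates the theorem for the sample mean. The sketch below assumes draws from an exponential population with mean 1 and standard deviation 1 (a deliberately skewed, hypothetical choice) and checks that the sample means center on 1 with spread close to 1/sqrt(n).

```python
import math
import random
import statistics

random.seed(1)
n, reps = 50, 2000                           # hypothetical sample size and number of samples

sample_means = []
for _ in range(reps):
    sample = [random.expovariate(1.0) for _ in range(n)]   # skewed population, mean 1, sd 1
    sample_means.append(statistics.mean(sample))

mean_of_means = statistics.mean(sample_means)
sd_of_means = statistics.stdev(sample_means)
print(f"mean of sample means: {mean_of_means:.3f} (population mean 1)")
print(f"sd of sample means:   {sd_of_means:.3f} (sigma/sqrt(n) = {1 / math.sqrt(n):.3f})")
```

A histogram of `sample_means` would look roughly bell-shaped even though the population itself is far from normal.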
20 [Diagram relating the population (random variable x, distribution f) to the sample and its sample statistics]
21 The Sample Variance, s^2: (n-1)*s^2/σ^2 is distributed chi-square with n-1 degrees of freedom when sampling from a normal population (text, 12.2 “Inference about a population variance”) (text, pp , Chi-Squared distribution)
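The chi-square result for the sample variance can be checked by simulation. The sketch below assumes normal samples with hypothetical n = 10 and σ = 2, scales each sample variance as (n-1)*s^2/σ^2, and compares the simulated mean and variance with the chi-square values n-1 and 2*(n-1).

```python
import random
import statistics

random.seed(2)
n, sigma, reps = 10, 2.0, 5000               # hypothetical values

scaled = []
for _ in range(reps):
    sample = [random.gauss(0.0, sigma) for _ in range(n)]
    s2 = statistics.variance(sample)          # sample variance, divides by n-1
    scaled.append((n - 1) * s2 / sigma**2)    # should behave like chi-square with n-1 df

mean_scaled = statistics.mean(scaled)
var_scaled = statistics.variance(scaled)
print(f"mean: {mean_scaled:.2f} (chi-square mean = {n - 1})")
print(f"variance: {var_scaled:.2f} (chi-square variance = {2 * (n - 1)})")
```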
22 Regression Models Statistical distributions and tests Student’s t F Chi Square Assumptions Pathologies
23 Regression Models
  Time series
    Linear trend model: y(t) = a + b*t + e(t) (Lab 4)
    Exponential trend model: y(t) = exp[a + b*t + e(t)]
    Natural logarithmic transformation: ln y(t) = a + b*t + e(t) (Lab 4)
  Linear rates of change: y_i = a + b*x_i + e_i, dy/dx = b
    Returns generating process: [r_i(t) – r_f^0] = α + β*[r_M(t) – r_f^0] + e_i(t) (Lab 6)
24 Regression Models
  Percentage rates of change, elasticities
    Cross-section: ln assets_i = a + b*ln revenue_i + e_i (Lab 5)
    dln assets/dln revenue = b = [dassets/drevenue]/[assets/revenue] = marginal/average
25 Linear Trend Model Linear trend model: y(t) =a + b*t +e(t) Lab 4
26 Lab 4
27 Lab Four
  t-test: H_0: b=0, H_A: b≠0
    t = [b_hat – 0]/se(b_hat) = -14
  F-test: F(1,36) = [R^2/1]/{[1-R^2]/36} = 196 = explained mean square/unexplained mean square
    Note F = t^2 = (-14)^2 = 196
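The two test statistics on this slide are linked: with one explanatory variable, F equals t squared. The sketch below starts from the slide's t = -14 and 36 residual degrees of freedom, backs out the implied R^2, and confirms that the F formula returns 196. The variable names are hypothetical.

```python
t_stat = -14.0                 # t statistic from the slide
df_resid = 36                  # residual degrees of freedom, from F(1,36)

f_stat = t_stat ** 2           # with one regressor, F = t^2
r2 = f_stat / (f_stat + df_resid)                 # R^2 implied by F = [R^2/1]/[(1-R^2)/36]
f_from_r2 = (r2 / 1) / ((1 - r2) / df_resid)      # plug R^2 back into the slide's formula

print(f"F = {f_stat:.0f}, implied R^2 = {r2:.4f}, F from R^2 = {f_from_r2:.0f}")
```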
28 Lab 4
29 Lab 4
30 Lab 4 [Figure; label: 2.5%]
31 Lab Four [Figure; label: 196]
32 Exponential Trend Model
  Exponential trend model: y(t) = exp[a + b*t + e(t)]
  Natural logarithmic transformation: ln y(t) = a + b*t + e(t) (Lab 4)
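Taking logs turns the exponential trend into a linear regression of ln y(t) on t. The sketch below simulates a series with hypothetical parameters (a = 1, b = 0.05, error standard deviation 0.1) and recovers them by ordinary least squares; it is not the Lab 4 data.

```python
import math
import random

def ols_line(x, y):
    """Simple OLS of y on x; returns (intercept, slope)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

random.seed(3)
a, b, n = 1.0, 0.05, 40                       # hypothetical trend parameters
t = list(range(1, n + 1))
y = [math.exp(a + b * ti + random.gauss(0.0, 0.1)) for ti in t]

ln_y = [math.log(yi) for yi in y]             # natural logarithmic transformation
a_hat, b_hat = ols_line(t, ln_y)
print(f"a_hat = {a_hat:.3f}, b_hat = {b_hat:.4f}")
```

The slope `b_hat` is interpreted as the approximate proportional growth rate per period.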
33 Lab Four
34 Lab Four
35 Percentage Rates of Change, Elasticities
  Percentage rates of change, elasticities
    Cross-section: ln assets_i = a + b*ln revenue_i + e_i (Lab 5)
    dln assets/dln revenue = b = [dassets/drevenue]/[assets/revenue] = marginal/average
36 Lab Five
  Elasticity b = 0.778
  H_0: b=1, H_A: b<1
  t_25 = [0.778 – 1]/0.148 = -1.50
  t-crit(5%) = -1.71; since -1.50 > -1.71, do not reject H_0
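The one-tailed test of unit elasticity is a short calculation. The sketch below uses the estimate, standard error, and critical value shown on the slide; the variable names are hypothetical.

```python
b_hat = 0.778          # estimated elasticity, from the slide
se_b = 0.148           # standard error, from the slide
t_crit = -1.71         # 5% one-tailed critical value with 25 df, from the slide

t_stat = (b_hat - 1.0) / se_b        # test H_0: b = 1 against H_A: b < 1
reject_h0 = t_stat < t_crit          # reject only if t falls below the critical value

print(f"t = {t_stat:.2f}, reject H_0: {reject_h0}")
```

Because the statistic does not fall below the critical value, the data are consistent with an elasticity of one at the 5% level.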
37 Linear Rates of Change
  Linear rates of change: y_i = a + b*x_i + e_i, dy/dx = b
  Returns generating process: [r_i(t) – r_f^0] = α + β*[r_M(t) – r_f^0] + e_i(t) (Lab 6)
38 Watch Excel on xy plots! True x axis: UC Net
39 Lab Six r_GE = a + b*r_SP500 + e
40 Lab Six
41 Lab Six
42 View/Residual tests/Histogram-Normality Test
43 Linear Multivariate Regression
  House price, # of bedrooms, house size, lot size
  P_i = a + b*bedrooms_i + c*house_size_i + d*lot_size_i + e_i
44 Lab Six price bedrooms House_size Lot_size
45 Price = a*dummy2 +b*dummy34 +c*dummy5 +d*house_size01 +e
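The dummy variables in this specification can be built from the bedroom count. The sketch below is an assumption about how they are defined: `dummy2` marks two-bedroom houses, `dummy34` marks three- and four-bedroom houses, and `dummy5` marks five-bedroom houses. The function name and the house list are hypothetical.

```python
def bedroom_dummies(bedrooms):
    """Return (dummy2, dummy34, dummy5) for a bedroom count (assumed coding)."""
    dummy2 = 1 if bedrooms == 2 else 0
    dummy34 = 1 if bedrooms in (3, 4) else 0
    dummy5 = 1 if bedrooms == 5 else 0
    return dummy2, dummy34, dummy5

houses = [2, 3, 4, 5, 3]                          # hypothetical bedroom counts
rows = [bedroom_dummies(b) for b in houses]
for bedrooms, row in zip(houses, rows):
    print(bedrooms, row)
```

Under this coding the three dummies sum to one for every house, which is why the equation carries no separate constant: each dummy coefficient acts as the intercept for its group.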
46 Lab Six C captures three and four bedroom houses
47 Regression Models How to handle zeros? Labs Six and Seven: Lottery data-file Linear probability model: dependent variable: zero-one Logit: dependent variable: zero-one Probit: dependent variable: zero-one Tobit: dependent variable: lottery See PowerPoint application to lottery with Bern variable
48 Regression Models
  Failure time models
    Exponential
      Survivor: S(t) = exp[-λ*t], ln S(t) = -λ*t
      Hazard rate, h(t) = λ
      Cumulative hazard function, H(t) = λ*t
    Weibull
      Hazard rate, h(t) = f(t)/S(t) = (β/α)*(t/α)^(β-1)
      Cumulative hazard function: H(t) = (1/α^β)*t^β
      Logarithmic transform: ln H(t) = ln(1/α^β) + β*ln t
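The logarithmic transform makes the Weibull cumulative hazard linear in ln t, so the slope reveals whether the hazard is constant, rising, or falling. The sketch below writes the Weibull scale as `alpha` and the shape as `beta` (symbol names assumed here), picks hypothetical values, and recovers the slope and intercept of ln H(t) against ln t from two points.

```python
import math

def weibull_cumulative_hazard(t, alpha, beta):
    """H(t) = (t/alpha)^beta, i.e. (1/alpha^beta) * t^beta."""
    return (t / alpha) ** beta

alpha, beta = 2.0, 1.5                      # hypothetical scale and shape
t1, t2 = 1.0, 4.0

ln_h1 = math.log(weibull_cumulative_hazard(t1, alpha, beta))
ln_h2 = math.log(weibull_cumulative_hazard(t2, alpha, beta))
slope = (ln_h2 - ln_h1) / (math.log(t2) - math.log(t1))     # equals beta
intercept = ln_h1 - slope * math.log(t1)                    # equals ln(1/alpha^beta)

print(f"slope = {slope:.3f}, intercept = {intercept:.3f}")
```

A slope of one corresponds to the exponential case with constant hazard 1/alpha; a slope above one indicates an increasing hazard and a slope below one a decreasing hazard.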
49 Applications: Discrete Distributions
  Binomial: rates & proportions, small samples, e.g., voting polls
  Equi-probable or uniform: if I asked a question every day, without replacement, what is the chance I will ask you a question today?
  Poisson: approximate the binomial where p→0
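The Poisson approximation to the binomial for small p can be checked directly. The sketch below uses hypothetical values n = 1000 and p = 0.002, so the Poisson mean is n*p = 2, and compares the two sets of probabilities for k = 0 through 5.

```python
import math

def binomial_pmf(k, n, p):
    return math.comb(n, k) * p**k * (1 - p)**(n - k)

def poisson_pmf(k, lam):
    return math.exp(-lam) * lam**k / math.factorial(k)

n, p = 1000, 0.002                  # hypothetical: many trials, small success probability
lam = n * p                         # Poisson mean matched to the binomial mean

diffs = []
for k in range(6):
    b, q = binomial_pmf(k, n, p), poisson_pmf(k, lam)
    diffs.append(abs(b - q))
    print(k, round(b, 5), round(q, 5))

max_diff = max(diffs)
```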
50 Applications: Discrete Distributions
  Multinomial: more than two outcomes, e.g., each face of a die (6 outcomes)
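The multinomial extends the binomial to more than two outcomes. The sketch below computes the probability that six rolls of a fair die show each face exactly once; the function name `multinomial_pmf` is hypothetical.

```python
import math

def multinomial_pmf(counts, probs):
    """n!/(k_1! ... k_m!) * p_1^k_1 * ... * p_m^k_m for outcome counts k_i."""
    coef = math.factorial(sum(counts))
    for c in counts:
        coef //= math.factorial(c)              # exact integer division at each step
    prob = float(coef)
    for c, p in zip(counts, probs):
        prob *= p ** c
    return prob

fair_die = [1 / 6] * 6
p_each_once = multinomial_pmf([1] * 6, fair_die)    # 6!/6**6
print(round(p_each_once, 5))
```

With only two outcomes the same function reproduces the binomial probability.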
51 Applications: Continuous Distributions
  Normal: rates & proportions, np>5, n(1-p)>5; tests about population means given σ^2
  Equi-probable or uniform
  Student's t: tests about population means, σ^2 not known; test regression parameter = 0
52 Applications: Continuous Distributions
  F: regression, ratio of explained mean square to unexplained mean square, i.e. [R^2/k] ÷ [(1-R^2)/(n-k)]; test dropping 2 or more variables (Wald test)
  Chi-Square, χ^2: contingency table analysis; likelihood ratio tests (Wald test)
53 Applications: Continuous Distributions
  Exponential: failure (survival) time with constant hazard rate
  Weibull: failure time analysis; test whether hazard rate is constant, increasing, or decreasing
54 Labs 7, 8, 9 Lab 7 Failure Time Analysis Lab 8 Contingency Table Analysis Lab 9 One-Way and Two-Way ANOVA