Linear model: a type of regression analysis, a statistical method in which both the response variable (Y) and the explanatory variable (X) are continuous variables.

1 Linear model

2 a type of regression analysis – a statistical method in which both the response variable (Y) and the explanatory variable (X) are continuous variables – predicts the value of one variable (the dependent variable, y) based on the values of other variables (the independent variables x1, x2, …, xk)

3 Linear model Simplest linear model: y = β0 + β1 x + ε, where y = dependent variable, x = independent variable, β0 = y-intercept, β1 = slope of the line, ε = error variable. β0 and β1 are unknown and are therefore estimated from the data.

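4 Linear model maximum likelihood estimates provide the 'best' estimates of the parameters; when the errors are normally distributed with constant variance, they coincide with the method of least squares, which minimizes the sum of squared residuals. residuals: vertical differences between the data and the fitted model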
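
As a rough sketch of the least-squares idea on this slide, the closed-form estimates for the simple model can be computed by hand and compared with lm(). The built-in cars data set (columns speed and dist) is an assumption chosen purely for illustration; it does not come from the slides.

```r
# Illustrative data: the built-in 'cars' data set (an assumption, not from the slides)
x <- cars$speed
y <- cars$dist

# Closed-form least-squares estimates for y = b0 + b1 * x
b1 <- sum((x - mean(x)) * (y - mean(y))) / sum((x - mean(x))^2)
b0 <- mean(y) - b1 * mean(x)

# Residuals: vertical differences between the data and the fitted line
res <- y - (b0 + b1 * x)
rss <- sum(res^2)  # the quantity that least squares minimizes

# lm() should reproduce the same estimates
fit_ls <- lm(dist ~ speed, data = cars)
print(c(b0 = b0, b1 = b1))
print(coef(fit_ls))
```

Shifting the intercept or slope away from these values can only increase rss, which is what "least squares" refers to.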

5 Linear model how to fit the simple linear regression model using lm(): lm(formula, data=…, subset=…). By default, lm() prints only the estimates of the coefficients. Usually, we store the result of the model in a variable, so that it can subsequently be queried for more information
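
A minimal sketch of the lm() workflow described on this slide, assuming the built-in cars data set as a stand-in (the slides do not name a data set; the object names fit and fit_fast are also illustrative).

```r
# Fit dist ~ speed on the built-in 'cars' data (illustrative choice)
fit <- lm(dist ~ speed, data = cars)

# Printing the object shows only the call and the coefficient estimates
print(fit)

# subset= restricts the fit to the rows that satisfy a condition
fit_fast <- lm(dist ~ speed, data = cars, subset = speed > 10)

# Because the result is stored, it can be queried further
coef(fit)          # estimated coefficients
head(fitted(fit))  # fitted values for the first few observations
summary(fit)       # coefficient table, residual summary, R-squared
```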

6 ref: www.pitt.edu/~njc23/Lecture10.pdf

7

8 Linear model Coefficient table: one row for each coefficient. first column: estimate; second column: standard error (SE); third column: t-statistic for testing the null hypothesis H0: β = 0; fourth column: p-value for testing H0: β = 0 against the two-sided alternative
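
The four columns described above can be reproduced by hand, which may make the table easier to read. This is a sketch on the built-in cars data (an assumption, not the data used in the slides); t_manual and p_manual are illustrative names.

```r
fit <- lm(dist ~ speed, data = cars)  # illustrative data set

# Matrix with columns: Estimate, Std. Error, t value, Pr(>|t|)
tab <- summary(fit)$coefficients
print(tab)

# Third column: t-statistic = estimate / standard error
t_manual <- tab[, "Estimate"] / tab[, "Std. Error"]

# Fourth column: two-sided p-value from the t distribution
# with the residual degrees of freedom
p_manual <- 2 * pt(abs(t_manual), df = df.residual(fit), lower.tail = FALSE)

print(cbind(t_manual, p_manual))
```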

9 Linear model residuals – differences between the actual values and the values predicted by the regression – if the residuals look roughly normally distributed around 0 when plotted, the assumption of normal errors is plausible (in a least-squares fit with an intercept, the residuals average to 0 by construction)
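
A short sketch of how the residuals might be extracted and inspected, again assuming the built-in cars data for illustration. The plots are only drawn in an interactive session.

```r
fit <- lm(dist ~ speed, data = cars)  # illustrative data set

# Residuals = actual values minus fitted (predicted) values
res <- residuals(fit)
res_manual <- cars$dist - fitted(fit)

# With an intercept in the model, the residuals average to (numerically) zero
print(mean(res))

# Visual checks of the shape of the residual distribution
if (interactive()) {
  hist(res, main = "Residuals", xlab = "residual")
  qqnorm(res)
  qqline(res)
}
```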

10 Linear model estimated coefficient: the value of the slope calculated by the regression. standard error of the estimated coefficient: a measure of the variability in the estimate of the coefficient; lower means a more precise estimate
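
For the simple one-predictor model, the standard error of the slope is the residual standard error divided by the square root of the sum of squared deviations of x. The sketch below checks this against what R reports, using the built-in cars data as an assumed example.

```r
fit <- lm(dist ~ speed, data = cars)  # illustrative data set

# Residual standard error (estimate of the error standard deviation)
s <- summary(fit)$sigma

# Sum of squared deviations of the predictor around its mean
sxx <- sum((cars$speed - mean(cars$speed))^2)

# Standard error of the slope, computed by hand
se_slope_manual <- s / sqrt(sxx)

# Same quantity from the estimated covariance matrix of the coefficients
se_slope <- sqrt(diag(vcov(fit)))["speed"]

print(c(manual = se_slope_manual, from_vcov = unname(se_slope)))
```

More spread in x (larger sxx) or less noise (smaller s) both shrink the standard error.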

11 Linear model R-squared: evaluates the goodness of fit of the model. Higher is better, with 1 being the best. It corresponds to the proportion of the variability in the response that is explained by the model.
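
R-squared can be written as 1 minus the ratio of the residual sum of squares to the total sum of squares. A minimal sketch, assuming the built-in cars data for illustration:

```r
fit <- lm(dist ~ speed, data = cars)  # illustrative data set

# Variability left unexplained by the model
rss <- sum(residuals(fit)^2)

# Total variability of the response around its mean
tss <- sum((cars$dist - mean(cars$dist))^2)

# Proportion of variability explained by the model
r2_manual <- 1 - rss / tss

# Value reported by summary()
r2 <- summary(fit)$r.squared

print(c(manual = r2_manual, reported = r2))
```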

12 Linear model Confidence intervals: a type of interval estimate of a population parameter; they indicate the reliability of an estimate
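
In R, confidence intervals for the coefficients of a fitted lm object are available through confint(). The sketch below also rebuilds them as estimate ± t-quantile × standard error; the cars data and the 95% level are assumptions for illustration.

```r
fit <- lm(dist ~ speed, data = cars)  # illustrative data set

# 95% confidence intervals for the intercept and the slope
ci <- confint(fit, level = 0.95)
print(ci)

# The same intervals by hand: estimate +/- t quantile * standard error
tab <- summary(fit)$coefficients
tcrit <- qt(0.975, df = df.residual(fit))
ci_manual <- cbind(
  lower = tab[, "Estimate"] - tcrit * tab[, "Std. Error"],
  upper = tab[, "Estimate"] + tcrit * tab[, "Std. Error"]
)
print(ci_manual)
```

A narrower interval corresponds to a more reliable (more precisely estimated) coefficient.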

13 Information Criterion: a measure of the relative quality of a statistical model. It is computed from the log-likelihood according to the formula -2*log-likelihood + k*npar, where npar represents the number of parameters in the fitted model – AIC: k = 2 – BIC or SBC (Schwarz's Bayesian criterion): k = log(n), where n is the number of observations
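
The formula on this slide can be checked directly against R's AIC() and BIC(). This is a sketch on the built-in cars data (an assumed example); note that for an lm fit, npar as reported by logLik() counts the error variance in addition to the regression coefficients.

```r
fit <- lm(dist ~ speed, data = cars)  # illustrative data set

ll <- logLik(fit)        # log-likelihood of the fitted model
npar <- attr(ll, "df")   # number of parameters (coefficients + error variance)
n <- nobs(fit)           # number of observations

# -2 * log-likelihood + k * npar
aic_manual <- -2 * as.numeric(ll) + 2 * npar        # AIC: k = 2
bic_manual <- -2 * as.numeric(ll) + log(n) * npar   # BIC: k = log(n)

print(c(aic_manual = aic_manual, AIC = AIC(fit)))
print(c(bic_manual = bic_manual, BIC = BIC(fit)))
```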

14 AIC Akaike Information Criterion (AIC): a method for comparing models. The index takes into account a model's statistical fit and the number of parameters needed to achieve this fit. Models with smaller AIC values are preferred, indicating adequate fit with fewer parameters
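
A hedged sketch of a model comparison by AIC: two candidate models are fitted to the same data and AIC() is called on both. The data set (cars) and the quadratic alternative are illustrative assumptions, not models taken from the slides.

```r
# Two candidate models for the same response and data (illustrative)
fit1 <- lm(dist ~ speed, data = cars)
fit2 <- lm(dist ~ speed + I(speed^2), data = cars)

# AIC() on several models returns a data frame with df and AIC per model
cmp <- AIC(fit1, fit2)
print(cmp)

# The model with the smallest AIC is the preferred one under this criterion
best <- rownames(cmp)[which.min(cmp$AIC)]
print(best)
```

AIC values are only comparable between models fitted to the same response on the same observations.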

15 ref: Robert Kabacoff. R in Action. New York: Manning Publishing; 2011

16 Homoskedasticity vs Heteroskedasticity. Homoskedasticity: the variance of the error term is constant. Heteroskedasticity: if the error terms do not have constant variance, they are said to be heteroskedastic.
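
To make the distinction concrete, the sketch below simulates data whose error standard deviation grows with x and then checks for non-constant variance. The check is a hand-computed, studentized Breusch-Pagan-style test (regress the squared residuals on x and compare n × R-squared with a chi-squared distribution); the slides do not name a test, so this choice, the simulated data, and all variable names are assumptions.

```r
set.seed(1)

# Simulated heteroskedastic data: the error sd is proportional to x
n <- 200
x <- runif(n, 1, 10)
y <- 2 + 3 * x + rnorm(n, sd = x)

fit <- lm(y ~ x)

# Auxiliary regression of the squared residuals on the predictor
e2 <- residuals(fit)^2
aux <- lm(e2 ~ x)

# Test statistic: n * R-squared, compared with chi-squared (1 df, one predictor)
stat <- n * summary(aux)$r.squared
p_value <- pchisq(stat, df = 1, lower.tail = FALSE)
print(c(statistic = stat, p_value = p_value))

# A funnel shape in residuals vs fitted values is the usual visual sign
if (interactive()) {
  plot(fitted(fit), residuals(fit), xlab = "fitted", ylab = "residual")
}
```

A small p-value is evidence against constant error variance, that is, against homoskedasticity.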

17 Robustness: Cook's distance. influential observation – an observation that has a disproportionate impact on the determination of the model parameters. Cook's distance is based on the difference between the predicted values of y when the point (x_i, y_i) is and is not included in the calculation of the regression coefficients
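
The definition above can be illustrated by refitting the model without one observation and comparing the predictions. The sketch uses the built-in cars data (an assumption) and the standard scaling of Cook's distance by the number of coefficients times the residual variance; d_manual and fit_without_i are illustrative names.

```r
fit <- lm(dist ~ speed, data = cars)  # illustrative data set

# Cook's distance for every observation
d <- cooks.distance(fit)

# Pick the most influential observation and refit without it
i <- which.max(d)
fit_without_i <- lm(dist ~ speed, data = cars[-i, ])

# Predictions for all points from both fits
pred_full <- predict(fit, newdata = cars)
pred_loo <- predict(fit_without_i, newdata = cars)

# Cook's distance by hand: squared change in predictions,
# scaled by (number of coefficients) * (residual variance of the full fit)
p <- length(coef(fit))
s2 <- summary(fit)$sigma^2
d_manual <- sum((pred_full - pred_loo)^2) / (p * s2)

print(c(index = unname(i), cooks = unname(d[i]), manual = d_manual))
```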

