Published by Lucinda Chambers; modified over 8 years ago
1
Linear model
2
A type of regression analysis (a statistical method)
– both the response variable (Y) and the explanatory variable (X) are continuous variables
– predicts the value of one variable (the dependent variable, y) based on the values of other variables (the independent variables x1, x2, …, xk)
3
Linear model
Simplest linear model: y = β0 + β1x + ε
– y = dependent variable
– x = independent variable
– β0 = y-intercept
– β1 = slope of the line
– ε = error variable
β0 and β1 are unknown and are therefore estimated from the data.
4
Linear model
– maximum likelihood estimates provide the ‘best’ estimates
– method of least squares: minimizes the sum of squared residuals
– residuals: vertical differences between the data and the fitted model
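As a sketch of the least-squares idea, the slope and intercept can be computed directly from the normal equations. The built-in `cars` data set (speed vs. stopping distance) is used here purely as an assumed example:

```r
# Least-squares estimates computed by hand from the normal equations,
# using the built-in cars data set as an illustrative example
x <- cars$speed
y <- cars$dist
b1 <- sum((x - mean(x)) * (y - mean(y))) / sum((x - mean(x))^2)  # slope
b0 <- mean(y) - b1 * mean(x)                                     # intercept
resid <- y - (b0 + b1 * x)  # vertical differences from the fitted line
c(intercept = b0, slope = b1)
```

With an intercept in the model, the residuals always sum to (numerically) zero; the squared residuals are what least squares minimizes.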
5
Linear model
How to fit the simple linear regression model using lm():
lm(formula, data=…, subset=…)
– by default, lm() prints out the estimates for the coefficients
– usually, we store the result of the model in a variable, so that it can subsequently be queried for more information
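A minimal sketch of this workflow, again assuming the built-in `cars` data set as the example:

```r
# Fit the model and store the result so it can be queried later
fit <- lm(dist ~ speed, data = cars)  # cars: built-in example data
fit            # printing shows only the coefficient estimates
summary(fit)   # the stored object yields much more: SEs, t-tests, R-squared
```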
6
ref: www.pitt.edu/~njc23/Lecture10.pdf
8
Linear model
Coefficient table: one row per coefficient
– first column: estimate
– second column: standard error (SE)
– third column: t-statistic, testing the null hypothesis H0: β = 0
– fourth column: p-value for testing H0: β = 0 against the two-tailed alternative
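The table described above can be extracted from a stored fit; a sketch using the assumed `cars` example:

```r
fit <- lm(dist ~ speed, data = cars)
tab <- summary(fit)$coefficients  # the coefficient table from the slide
colnames(tab)   # "Estimate", "Std. Error", "t value", "Pr(>|t|)"
tab             # one row per coefficient (intercept and slope)
```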
9
Linear model
Residuals
– difference between the actual values and the values predicted by the regression
– if the residuals look normally distributed when plotted, this indicates the mean of the differences between our predictions and the actual values is close to 0
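A quick way to inspect this, sketched on the assumed `cars` example:

```r
fit <- lm(dist ~ speed, data = cars)
r <- residuals(fit)   # actual minus predicted values
mean(r)               # essentially 0 for a least-squares fit with an intercept
hist(r)               # roughly bell-shaped if the errors are normal
```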
10
Linear model
– estimated coefficient: value of the slope calculated by the regression
– standard error of the estimated coefficient: a measure of the variability in the estimate for the coefficient; lower means better
11
Linear model
R-squared
– evaluates the goodness of fit of your model; higher is better, with 1 being the best
– corresponds to the proportion of the variability in what you're predicting that is explained by the model
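R-squared is reported by summary(); a sketch on the assumed `cars` example:

```r
fit <- lm(dist ~ speed, data = cars)
s <- summary(fit)
s$r.squared   # proportion of the variability in dist explained by speed
```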
12
Linear model
Confidence intervals
– a type of interval estimate of a population parameter
– indicate the reliability of an estimate
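For a stored fit, confint() returns these intervals; a sketch on the assumed `cars` example:

```r
fit <- lm(dist ~ speed, data = cars)
ci <- confint(fit, level = 0.95)  # 95% CIs for intercept and slope
ci
```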
13
Information Criterion
– a measure of the relative quality of a statistical model
– can be obtained from the log-likelihood according to the formula -2*log-likelihood + k*npar, where npar represents the number of parameters in the fitted model
– AIC: k = 2
– BIC or SBC (Schwarz's Bayesian criterion): k = log(n), where n is the number of observations
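The formula above can be checked against R's built-in AIC() and BIC(); a sketch on the assumed `cars` example:

```r
fit <- lm(dist ~ speed, data = cars)
ll <- logLik(fit)                # log-likelihood of the fitted model
npar <- attr(ll, "df")           # number of parameters (incl. error variance)
n <- nobs(fit)                   # number of observations
aic <- -2 * as.numeric(ll) + 2 * npar        # k = 2
bic <- -2 * as.numeric(ll) + log(n) * npar   # k = log(n)
c(by_hand_aic = aic, builtin_aic = AIC(fit),
  by_hand_bic = bic, builtin_bic = BIC(fit))  # the pairs agree
```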
14
AIC
Akaike Information Criterion (AIC)
– a method for comparing models
– the index takes into account a model's statistical fit and the number of parameters needed to achieve this fit
– models with smaller AIC values indicate adequate fit with fewer parameters
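For example, AIC() can compare a straight-line fit with a quadratic one; the `cars` data and the quadratic term are assumptions for illustration only:

```r
fit1 <- lm(dist ~ speed, data = cars)              # straight-line model
fit2 <- lm(dist ~ speed + I(speed^2), data = cars) # adds a quadratic term
cmp <- AIC(fit1, fit2)  # one row per model; smaller AIC is preferred
cmp
```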
15
ref: Robert Kabacoff. R in Action. New York: Manning Publications; 2011
16
Homoskedasticity vs Heteroskedasticity
– homoskedasticity: the variance of the error term is constant
– if the error terms do not have constant variance, they are said to be heteroskedastic
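A common visual check is to plot residuals against fitted values: a roughly even band suggests homoskedasticity, while a fan or funnel shape suggests heteroskedasticity. A sketch on the assumed `cars` example:

```r
fit <- lm(dist ~ speed, data = cars)
plot(fitted(fit), residuals(fit),
     xlab = "Fitted values", ylab = "Residuals")
abline(h = 0, lty = 2)  # residuals should scatter evenly around this line
```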
17
Robustness: Cook’s distance
– influential observation: an observation that has a disproportionate impact on the determination of the model parameters
– based on the difference between the predicted values of yi for a given xi when the point (xi, yi) is and isn’t included in the calculation of the regression coefficients
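In R, cooks.distance() returns one value per observation; a sketch on the assumed `cars` example (the 4/n cutoff is a common rule of thumb, not from the slides):

```r
fit <- lm(dist ~ speed, data = cars)
d <- cooks.distance(fit)   # one distance per observation
which(d > 4 / nobs(fit))   # flag potentially influential points (rule of thumb)
```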