Generalized regression techniques

Presentation transcript:

1 Generalized regression techniques
(and why you should be using them) Brady Brady & Scott Wise, JMP Global Enablement Team, SAS Institute, Inc. Discovery Summit 2015

2 Overview Current state For many engineers and scientists, modeling begins and ends with ordinary least squares (OLS) and stepwise/best-subsets regression. Unfortunately, the OLS assumptions are often violated, leading to substandard models.

3 Overview Historically speaking… Helpful techniques such as GLMs, penalized regression, and quantile regression are not new, but they have not generally been embraced in applied (i.e., non-academic) settings, for several reasons: the computational expense of the algorithms; few, if any, easy-to-use GUIs; and a lack of scaffolding and “smart” defaults for inexperienced users (“I need to be a statistician to use this”).

4 Overview Going forward… The present state of computing power and GUIs makes these techniques much easier for applied practitioners to use and interpret. Engineers and scientists can apply them readily in modern statistical software. Let’s get these tools into every modeler’s toolbox!

5 Penalized Regression Why is it useful? Multicollinearity: penalized methods produce more stable estimates by accepting a small bias in exchange for reduced estimate variability. Variable (feature) selection: choose the best few from among many possible predictors; more stable than stepwise regression. “Wide” problems, where p > n, which cannot be estimated by ordinary maximum likelihood (ML).

6 Penalized Regression Common Techniques Ridge Regression: penalizes the sum of the squares of the coefficients. Cannot perform variable selection. Good when you want to retain all model terms. LASSO: penalizes the sum of the absolute values of the coefficients. Performs variable selection. Elastic Net: a weighted combination of the Ridge and LASSO penalties. Performs variable selection.
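The shrinkage idea behind ridge regression can be sketched in a few lines. This is an illustrative toy, not the presenters' JMP workflow: the `ridge_2d` helper and the near-collinear data below are invented for the example, using the closed form beta = (X'X + λI)⁻¹X'y.

```python
# Minimal ridge regression sketch for two centered predictors (no intercept).
# Closed form: beta = (X'X + lambda*I)^(-1) X'y, solved by hand for the 2x2 case.

def ridge_2d(xs, ys, lam):
    """Ridge coefficients for two centered predictors."""
    # Build the entries of X'X and X'y directly.
    s11 = sum(x1 * x1 for x1, _ in xs)
    s22 = sum(x2 * x2 for _, x2 in xs)
    s12 = sum(x1 * x2 for x1, x2 in xs)
    g1 = sum(x1 * y for (x1, _), y in zip(xs, ys))
    g2 = sum(x2 * y for (_, x2), y in zip(xs, ys))
    # Add the penalty lambda to the diagonal, then invert the 2x2 system.
    a, b, c, d = s11 + lam, s12, s12, s22 + lam
    det = a * d - b * c
    return ((d * g1 - b * g2) / det, (a * g2 - c * g1) / det)

# Two nearly collinear predictors: x2 tracks x1 closely (hypothetical data).
xs = [(-2, -1.9), (-1, -1.1), (0, 0.1), (1, 0.9), (2, 2.0)]
ys = [-4.1, -2.0, 0.2, 1.9, 4.0]   # roughly y = x1 + x2

ols = ridge_2d(xs, ys, 0.0)        # lambda = 0 reduces to OLS
rdg = ridge_2d(xs, ys, 1.0)        # penalized fit
print(ols, rdg)
```

With λ = 0 the formula reduces to OLS; note how the penalty both shrinks the coefficients overall and pulls the two near-collinear estimates toward each other, which is the stability gain the slide describes.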

7 Penalized Regression Penalty Types As the LASSO penalty grows, its “sharp” constraint region often first touches a given error contour on a coefficient axis, setting those coefficients exactly to zero and thereby selecting the corresponding variables out of the model. Ridge regression’s smooth, spherical constraint region touches an axis with probability zero, so it shrinks coefficients but never zeroes them.
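The geometric picture above has a simple algebraic counterpart in the orthonormal-predictor special case, a standard textbook result: the LASSO solution soft-thresholds each OLS estimate, while ridge merely rescales it. The helper names and coefficient values below are our own illustration, not from the slides.

```python
# Why the L1 "corner" zeroes coefficients: with orthonormal predictors, the
# lasso solution is soft-thresholding of the OLS estimate, while ridge applies
# only proportional shrinkage and never reaches exactly zero.

def lasso_soft_threshold(beta_ols, lam):
    """Lasso coefficient for an orthonormal predictor: shrink toward 0, stop at 0."""
    if beta_ols > lam:
        return beta_ols - lam
    if beta_ols < -lam:
        return beta_ols + lam
    return 0.0   # the "sharp" corner: small effects are set exactly to zero

def ridge_shrink(beta_ols, lam):
    """Ridge coefficient for an orthonormal predictor: proportional shrinkage."""
    return beta_ols / (1.0 + lam)

ols_betas = [2.5, 0.3, -0.1]   # hypothetical OLS estimates
lasso = [lasso_soft_threshold(b, 0.5) for b in ols_betas]
ridge = [ridge_shrink(b, 0.5) for b in ols_betas]
print(lasso)   # small coefficients dropped to exactly 0.0 (variable selection)
print(ridge)   # every coefficient survives, just smaller
```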

8 Generalized Linear Models
Why are they useful? They handle violations of OLS assumptions: non-normal error distributions; non-normal response distributions; cases where the mean response is a nonlinear function of the linear predictor (handled through a link function); non-constant error variance.
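As a rough sketch of what a GLM fit does under the hood, here is a one-predictor Poisson regression with a log link, fit by damped Newton steps on the score function. The toy counts and the `fit_poisson_glm` name are invented for illustration; in practice you would let statistical software such as JMP do this.

```python
import math

# Hedged sketch: fit y ~ Poisson(exp(b * x)) for a single slope b (no intercept)
# by Newton's method on the log-likelihood, with a capped step for stability.

def fit_poisson_glm(xs, ys, iters=25):
    b = 0.0
    for _ in range(iters):
        mu = [math.exp(b * x) for x in xs]                  # fitted means
        score = sum(x * (y - m) for x, y, m in zip(xs, ys, mu))
        info = sum(x * x * m for x, m in zip(xs, mu))       # Fisher information
        step = score / info
        b += max(-1.0, min(1.0, step))                      # damp to avoid overshoot
    return b

xs = [0, 1, 2, 3]
ys = [1, 2, 8, 21]        # made-up counts growing roughly like exp(x)
b = fit_poisson_glm(xs, ys)
print(round(b, 3))
```

The log link keeps fitted means positive and the Poisson family lets the variance grow with the mean, the two OLS violations the slide calls out for count data.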

9 Generalized Linear Models
Fit the Right Distribution

10 Generalized Linear Models
Fit the Right Distribution Cont.

11 Quantile Regression Why is it useful? Makes no distributional or variance-based assumptions. Allows the response/predictor relationship to vary with the response. Allows modeling of not just the median, but of any quantile. Is more robust than OLS, lessening the influence of high-leverage points.

12 Quantile Regression Why Is It Useful? Cont. OLS assumes that the relationship between any predictor and the response does not vary with the response (the blue line would be horizontal). Here, though, the relationship between marital status and birth weight depends on the birth weight itself. Quantile regression lets us address this issue while still using all of the data.
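The engine behind quantile regression is the asymmetric "pinball" (check) loss; minimizing it over a constant recovers a sample quantile of the data, and a full quantile regression fits predictors under the same loss. The toy search and data below are our own illustration, not the presenters' implementation.

```python
# Pinball loss: undershooting a point costs tau, overshooting costs (1 - tau).
# Minimizing it over a single constant yields the tau-th sample quantile.

def pinball_loss(c, ys, tau):
    return sum(tau * (y - c) if y >= c else (1 - tau) * (c - y) for y in ys)

def fit_quantile_constant(ys, tau):
    """Best constant under pinball loss; a minimizer is always a data point."""
    return min(ys, key=lambda c: pinball_loss(c, ys, tau))

ys = [1, 2, 3, 4, 100]                     # note the extreme high value
median = fit_quantile_constant(ys, 0.5)    # tau = 0.5 gives the median
q90 = fit_quantile_constant(ys, 0.9)       # tau = 0.9 targets the upper tail
print(median, q90)
```

The median fit ignores the outlier entirely (the OLS analogue, the mean, is dragged to 22), and changing tau lets you model any part of the response distribution, the two advantages the slides emphasize.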

