Published by Betty Powers. Modified over 9 years ago.
1
Multiple Regression
Harry R. Erwin, PhD
School of Computing and Technology
University of Sunderland
2
Resources
Crawley, MJ (2005) Statistics: An Introduction Using R. Wiley.
Freund, RJ, and WJ Wilson (1998) Regression Analysis. Academic Press.
Gentle, JE (2002) Elements of Computational Statistics. Springer.
Gonick, L, and W Smith (1993) The Cartoon Guide to Statistics. HarperResource (for fun).
3
Introduction
In multiple regression, you have:
–A continuous response variable, and
–Two or more continuous explanatory variables.
Your problems are not restricted to order.
You often lack enough data to examine all the potential interactions and higher-order effects.
–To explore the possibility of a third-order interaction term with three explanatory variables (A:B:C) requires about 3 × 8 = 24 data values.
–If there’s potential for curvature, you need 3 × 3 = 9 more data values to pin that down.
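The data-value counts on this slide can be reproduced with a few lines of R. This is only a sketch under two assumptions that the slide does not spell out: that the rule of thumb is about 3 data values per estimated parameter, and that the 8 parameters are the intercept, three main effects, three two-way interactions and one three-way interaction. The data frame `d` is made up purely to let R expand the formula.

```r
# Assumed rule of thumb: about 3 data values per estimated parameter.
points_per_param <- 3

k <- 3                                   # explanatory variables A, B, C
n_params <- 1 + k + choose(k, 2) + choose(k, 3)   # intercept + main + 2-way + 3-way
needed_full <- points_per_param * n_params        # data values for A*B*C
needed_quadratic <- points_per_param * k          # extra values for A^2, B^2, C^2

# Cross-check the parameter count by letting R expand the formula A*B*C.
# The values in d are arbitrary; only the column count matters here.
d <- data.frame(A = 1:10, B = (1:10)^2, C = sqrt(1:10))
n_columns <- ncol(model.matrix(~ A * B * C, data = d))

print(c(parameters = n_params, columns = n_columns,
        full = needed_full, quadratic = needed_quadratic))
```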
4
Issues to Address
Which explanatory variables to include.
Curvature in the response to explanatory variables.
Interactions between explanatory variables. (High-order interactions tend to be rare.)
Correlation between explanatory variables.
Over-parameterization.
5
Crawley’s Approach
Use tree models to investigate complicated interactions.
Use generalised additive models (gam()) to investigate curvature.
6
Book Example 1.1
pairs()
Use of gam()
Plot the model
Use of tree()
Plot the model
Use of a linear model (lm())
Model reduction
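The sequence of steps in this example might look roughly like the following in R. This is a sketch, not the book's code: the built-in `airquality` data set is an assumed stand-in for the book's data, the model formulas are illustrative choices, `gam()` is taken from the `mgcv` package, and `tree()` comes from the separate `tree` package, so that step runs only if the package is installed.

```r
library(mgcv)   # provides gam() and s()

# Assumption: airquality stands in for the book's data set.
aq <- na.omit(airquality[, c("Ozone", "Solar.R", "Wind", "Temp")])

# Scatterplot matrix with a smoother in each panel
pairs(aq, panel = panel.smooth)

# Generalised additive model: smooth terms reveal curvature
g <- gam(Ozone ~ s(Solar.R) + s(Wind) + s(Temp), data = aq)
plot(g, pages = 1)

# Tree model: splits on different variables down a branch suggest interactions
if (requireNamespace("tree", quietly = TRUE)) {
  t1 <- tree::tree(Ozone ~ ., data = aq)
  plot(t1)
  text(t1)
}

# Maximal linear model: all interactions plus quadratic terms
m1 <- lm(Ozone ~ Temp * Wind * Solar.R +
           I(Temp^2) + I(Wind^2) + I(Solar.R^2), data = aq)
summary(m1)

# One model-reduction step: drop the three-way interaction and compare
m2 <- update(m1, . ~ . - Temp:Wind:Solar.R)
anova(m1, m2)
```

Model reduction then continues in the same way, removing one non-significant term at a time and checking each deletion with anova().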
7
Book Example 1.2
Plot the reduced model.
Problems with
–heteroscedasticity
–non-normal response
Transform the response (log())
Model reduction
Influential data point
Final model
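A possible way to carry out these checks in R is sketched below. The `airquality` data and the particular reduced model are assumptions made for illustration, and using Cook's distance to pick out the influential point is one common choice rather than necessarily the one used in the book.

```r
# Assumption: airquality stands in for the book's data set.
aq <- na.omit(airquality[, c("Ozone", "Solar.R", "Wind", "Temp")])

# Hypothetical reduced model on the raw response
m <- lm(Ozone ~ Temp + Wind + Solar.R + I(Wind^2), data = aq)
par(mfrow = c(2, 2))
plot(m)       # residuals vs fitted, normal Q-Q, scale-location, leverage

# Log-transform the response to address non-constant variance and non-normality
m_log <- lm(log(Ozone) ~ Temp + Wind + Solar.R + I(Wind^2), data = aq)
plot(m_log)

# Find the most influential observation and refit without it
cd <- cooks.distance(m_log)
worst <- which.max(cd)
m_final <- update(m_log, subset = -worst)
summary(m_final)
```

Comparing the coefficients of m_log and m_final shows how much the single point was driving the fit.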
8
Book Example 2.1
6 explanatory variables!
Tree model first
Interactions outweigh non-linearity.
–15 2-way interactions
–20 3-way interactions
–15 4-way interactions
–6 5-way interactions
–1 6-way interaction
–6 quadratic terms
~70 parameters to be estimated (requires about 210 data points).
41 data points…
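The counts on this slide follow from binomial coefficients and can be checked in R. The sketch assumes that the total of about 70 also includes the intercept and the six main effects, and that the 210 comes from a rule of thumb of 3 data points per parameter; neither is stated explicitly on the slide.

```r
k <- 6                                  # explanatory variables

# Number of 2-way, 3-way, ..., 6-way interactions
interactions <- choose(k, 2:k)
names(interactions) <- paste0(2:k, "-way")
print(interactions)

n_quadratic <- k                        # one squared term per variable

# Assumption: the slide's total also counts the intercept and main effects
n_params <- 1 + k + sum(interactions) + n_quadratic

# Assumption: about 3 data points per estimated parameter
needed <- 3 * n_params
available <- 41

print(c(parameters = n_params, needed = needed, available = available))
```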
9
Book Example 2.2
First eliminate curvature.
Then fit interaction terms in randomly selected sets.
Keep just the significant interactions and see what the model regards as significant.
plot()
Fit the third-order interactions that correspond to significant second-order interactions…
10
Help is at Hand!
Try using step()
Akaike’s Information Criterion (AIC) is used.
Specify lower to protect nuisance variables in complex contingency table models. (Nuisance variables here are covariates like tree identifier or rat number that have to be kept to constrain the marginal totals.)
Useful site: http://www.geodata.soton.ac.uk/biology/lexstats.html
Note the graph at the beginning 8)!
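A minimal sketch of step() with a protected term is shown below. The `airquality` data, the starting model, and the choice of `Solar.R` as the term to protect are all illustrative assumptions; in a contingency table model the protected terms would be the nuisance variables described above.

```r
# Assumption: airquality stands in for real data.
aq <- na.omit(airquality[, c("Ozone", "Solar.R", "Wind", "Temp")])

# Starting model: main effects plus all two-way interactions
full <- lm(log(Ozone) ~ (Temp + Wind + Solar.R)^2, data = aq)

# Backward selection by AIC; terms named in 'lower' can never be dropped.
# Solar.R plays the role of a protected variable here (hypothetical choice).
reduced <- step(full,
                scope = list(lower = ~ Solar.R),
                direction = "backward",
                trace = 0)

formula(reduced)
extractAIC(full)      # equivalent degrees of freedom and AIC before
extractAIC(reduced)   # and after selection
```

Setting trace = 1 prints each deletion step() considers, which helps when checking why a term was kept or dropped.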