Multiple Regression Analysis of Biological Data

Multiple Regression. Analysis of Biological Data. Ryan McEwan and Julia Chapman, Department of Biology, University of Dayton. ryan.mcewan@udayton.edu

Simple linear regression is a way of understanding the relationship between two variables in which the data analyst assumes that one variable (the predictor, or independent variable) drives a second variable (the response, or dependent variable). This is extremely useful, and yet in most biological situations any given response variable is likely to be determined by more than a single predictor. In this case, wing length is related to age, but you can imagine that nutritional status or gender could be important as well.

Here is aboveground biomass (AGB, Y axis) in a forest and stem density in that forest. You see a relationship, but a messy one. Maybe adding other variables would help explain AGB. How about soil nitrogen? How about species diversity? How about mean temperature at each point? Etc.

In biology you may be collecting a slew of values that might serve as predictors for a potential response.

Consider a correlation matrix!!
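As a rough illustration of what a correlation matrix contains, here is a minimal sketch in plain Python. The variable names (`soil_n`, `light`, `litter`) and the four-plot data set are made up for this example and are not from the slides.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def correlation_matrix(data):
    """Return {name: {name: r}} for every pair of variables in data."""
    return {a: {b: pearson(data[a], data[b]) for b in data} for a in data}

# Made-up measurements from four plots (illustrative values only).
plots = {
    "soil_n": [1, 2, 3, 4],
    "light": [1, 3, 2, 4],
    "litter": [4, 3, 2, 1],
}

matrix = correlation_matrix(plots)
for name, row in matrix.items():
    # One row of the matrix per variable, rounded for display.
    print(name, [round(row[other], 2) for other in plots])
```

Each row shows how strongly one candidate predictor tracks every other one, which is a quick first look at which variables might be worth putting in a model and which ones overlap.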

Herbaceous cover = [predictor]

You are building a model!! Herbaceous cover = [predictor] + [predictor]

Herbaceous cover = [predictor] + [predictor] + [predictor] + [predictor]

Multiple regression is a process of figuring out statistically which suite of variables best predicts a particular response. Okay, how do you proceed?
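Written out, a multiple regression model has the general form below. The symbols are assumptions for illustration, not notation from the slides: $y_i$ is herbaceous cover at plot $i$, $x_{ij}$ is the value of predictor $j$ at that plot, the $\beta_j$ are coefficients estimated from the data, and $\varepsilon_i$ is the unexplained residual.

```latex
y_i = \beta_0 + \beta_1 x_{i1} + \beta_2 x_{i2} + \dots + \beta_k x_{ik} + \varepsilon_i
```

Choosing "which suite of variables" amounts to deciding which of the $x_j$ terms stay in this equation.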

Forward selection: (1) Select the variable that forms the best regression relationship with the response variable. (2) Add each of the remaining variables in the pool, in a stepwise fashion, to find the best relationship, throwing the weaker ones back into the pool. (3) Repeat step 2 until adding variables no longer makes a stronger relationship.
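The forward procedure can be sketched in plain Python as follows. This is one possible reading, not the authors' method: it assumes "best relationship" is judged by AIC (lower is better), and the predictor names and data are made up, with each predictor coded low/high as -1/+1 and the third one constructed to carry no signal.

```python
from math import log

def rss(columns, y):
    """Residual sum of squares for y regressed on an intercept plus columns."""
    n = len(y)
    X = [[1.0] + [col[i] for col in columns] for i in range(n)]
    p = len(X[0])
    # Augmented normal equations: (X'X | X'y).
    A = [[sum(X[i][r] * X[i][c] for i in range(n)) for c in range(p)]
         + [sum(X[i][r] * y[i] for i in range(n))] for r in range(p)]
    # Gauss-Jordan elimination with partial pivoting.
    for k in range(p):
        pivot = max(range(k, p), key=lambda r: abs(A[r][k]))
        A[k], A[pivot] = A[pivot], A[k]
        for r in range(p):
            if r != k:
                f = A[r][k] / A[k][k]
                A[r] = [a - f * b for a, b in zip(A[r], A[k])]
    beta = [A[k][p] / A[k][k] for k in range(p)]
    return sum((y[i] - sum(b * x for b, x in zip(beta, X[i]))) ** 2
               for i in range(n))

def aic(names, predictors, y):
    """AIC up to an additive constant: n*ln(RSS/n) + 2*(number of coefficients)."""
    n = len(y)
    fit = rss([predictors[v] for v in names], y)
    return n * log(fit / n) + 2 * (len(names) + 1)

def forward_select(predictors, y):
    chosen, pool = [], list(predictors)
    best = aic(chosen, predictors, y)  # start from the intercept-only model
    while pool:
        # Score every candidate that could be added next.
        scores = {v: aic(chosen + [v], predictors, y) for v in pool}
        winner = min(scores, key=scores.get)
        if scores[winner] >= best:
            break  # no remaining variable strengthens the model
        chosen.append(winner)
        pool.remove(winner)
        best = scores[winner]
    return chosen

# Made-up data for eight plots (hypothetical names and values).
predictors = {
    "soil_n": [-1, -1, -1, -1, 1, 1, 1, 1],
    "light": [-1, -1, 1, 1, -1, -1, 1, 1],
    "litter": [-1, 1, -1, 1, -1, 1, -1, 1],
}
cover = [3.5, 4.5, 8.5, 7.5, 12.5, 11.5, 15.5, 16.5]

print(forward_select(predictors, cover))
```

Other criteria (adjusted R², partial F-tests) could be swapped in for AIC without changing the overall loop.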

Backward selection: (1) Start with all variables in the model. (2) Eject each one in turn and test the relationship. (3) Throw back into the pool the variable(s) that weaken, or fail to strengthen, the relationship.
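A minimal sketch of backward selection in plain Python is shown here. It is an illustration under stated assumptions rather than the authors' procedure: model strength is judged by AIC (lower is better), a variable is ejected when dropping it does not worsen AIC, and the predictor names and data are made up (each predictor coded low/high as -1/+1).

```python
from math import log

def rss(columns, y):
    """Residual sum of squares for y regressed on an intercept plus columns."""
    n = len(y)
    X = [[1.0] + [col[i] for col in columns] for i in range(n)]
    p = len(X[0])
    # Augmented normal equations: (X'X | X'y).
    A = [[sum(X[i][r] * X[i][c] for i in range(n)) for c in range(p)]
         + [sum(X[i][r] * y[i] for i in range(n))] for r in range(p)]
    # Gauss-Jordan elimination with partial pivoting.
    for k in range(p):
        pivot = max(range(k, p), key=lambda r: abs(A[r][k]))
        A[k], A[pivot] = A[pivot], A[k]
        for r in range(p):
            if r != k:
                f = A[r][k] / A[k][k]
                A[r] = [a - f * b for a, b in zip(A[r], A[k])]
    beta = [A[k][p] / A[k][k] for k in range(p)]
    return sum((y[i] - sum(b * x for b, x in zip(beta, X[i]))) ** 2
               for i in range(n))

def aic(names, predictors, y):
    """AIC up to an additive constant: n*ln(RSS/n) + 2*(number of coefficients)."""
    n = len(y)
    fit = rss([predictors[v] for v in names], y)
    return n * log(fit / n) + 2 * (len(names) + 1)

def backward_select(predictors, y):
    model = list(predictors)  # start with every variable in the model
    best = aic(model, predictors, y)
    while model:
        # Score the model with each variable ejected in turn.
        scores = {v: aic([m for m in model if m != v], predictors, y)
                  for v in model}
        weakest = min(scores, key=scores.get)
        if scores[weakest] > best:
            break  # every remaining variable strengthens the model
        model.remove(weakest)
        best = scores[weakest]
    return model

# Made-up data for eight plots (hypothetical names and values).
predictors = {
    "soil_n": [-1, -1, -1, -1, 1, 1, 1, 1],
    "light": [-1, -1, 1, 1, -1, -1, 1, 1],
    "litter": [-1, 1, -1, 1, -1, 1, -1, 1],
}
cover = [3.5, 4.5, 8.5, 7.5, 12.5, 11.5, 15.5, 16.5]

print(backward_select(predictors, cover))
```

In this toy data set `litter` explains none of the variation in cover, so it is the only variable ejected.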

A few more things to cover: How to evaluate models? What about correlated variables? What about categorical variables?

How to evaluate models? P-value, R², and the Akaike Information Criterion (AIC). AIC is a way of comparing the information content of different models. It does not provide a statistical test, per se, but rather provides a quantitative way to weigh model fit against model complexity. Among models fitted to the same data, the best model is the one with the lowest AIC.
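To make the fit-versus-complexity trade-off concrete, here is a small sketch. It assumes a least-squares model with normally distributed errors, for which AIC reduces (up to a constant shared by all models fit to the same data) to n·ln(RSS/n) + 2k, where RSS is the residual sum of squares and k the number of estimated coefficients. The function name and the numbers are hypothetical.

```python
from math import log

def aic_least_squares(rss, n, n_coefficients):
    """AIC for a least-squares fit, up to a constant shared by all models
    fit to the same n observations."""
    return n * log(rss / n) + 2 * n_coefficients

# Hypothetical results for three models fit to the same 8 plots.
simple = aic_least_squares(rss=34.0, n=8, n_coefficients=2)  # intercept + 1 predictor
larger = aic_least_squares(rss=2.0, n=8, n_coefficients=3)   # intercept + 2 predictors
padded = aic_least_squares(rss=2.0, n=8, n_coefficients=4)   # extra predictor, same fit

models = [("simple", simple), ("larger", larger), ("padded", padded)]
best = min(models, key=lambda m: m[1])  # lowest AIC wins
print(best[0])
```

The `padded` model fits exactly as well as `larger` but pays a penalty of 2 for its extra coefficient, which is how AIC discourages adding variables that do not improve the fit.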

What about correlated variables? Strongly correlated variables effectively contain the same information, and thus should not be inserted into the same model. The data analyst needs to assess “multicollinearity” among the variables in the model. One simple way to think about it is the correlation matrix. Formally, a model-building procedure generally includes calculating “Variance Inflation Factors” (VIFs) and ejecting from the model one of any two variables that are highly correlated.
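As a hedged illustration, the sketch below computes the variance inflation factor for the simplest case, a model with exactly two predictors, where the VIF of either predictor is 1 / (1 - r²) and r is their Pearson correlation. With more predictors, r² is replaced by the R² from regressing each predictor on all the others. The variable names and data are made up, and cutoffs such as 5 or 10 are commonly quoted rules of thumb, not values from the slides.

```python
def vif_two_predictors(x1, x2):
    """VIF shared by both predictors in a two-predictor model: 1 / (1 - r^2)."""
    n = len(x1)
    m1, m2 = sum(x1) / n, sum(x2) / n
    s12 = sum((a - m1) * (b - m2) for a, b in zip(x1, x2))
    s11 = sum((a - m1) ** 2 for a in x1)
    s22 = sum((b - m2) ** 2 for b in x2)
    r_squared = s12 ** 2 / (s11 * s22)
    return 1.0 / (1.0 - r_squared)

# Made-up measurements from four plots (illustrative values only).
soil_n = [1, 2, 3, 4]
light = [1, 3, 2, 4]    # moderately correlated with soil_n
soil_c = [2, 4, 6, 9]   # almost a rescaled copy of soil_n

moderate = vif_two_predictors(soil_n, light)
severe = vif_two_predictors(soil_n, soil_c)
print(round(moderate, 2), round(severe, 2))
```

A pair like `soil_n` and `soil_c` carries nearly the same information, so one of the two would be a candidate for ejection.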

What about categorical variables? Multiple regression models CAN incorporate yes/no (binary) variables or even categorical variables with several levels as predictors. (A yes/no response, by contrast, calls for logistic regression.) Examples: Burned vs. UnBurned; H vs. M vs. N invaded.
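One common way to feed such variables into a regression is indicator (dummy) coding: each level other than a chosen reference level becomes its own 0/1 column. The sketch below is illustrative only; the helper name, the choice of reference levels, and the five-plot data are assumptions, not from the slides.

```python
def indicator_columns(values, reference):
    """One 0/1 column per level other than the reference level."""
    levels = sorted(set(values) - {reference})
    return {level: [1 if v == level else 0 for v in values] for level in levels}

# Made-up classifications for five plots.
invaded = ["H", "M", "N", "M", "H"]
burned = ["Burned", "UnBurned", "Burned", "UnBurned", "UnBurned"]

# "N" and "UnBurned" serve as the reference levels here (an arbitrary choice).
invaded_cols = indicator_columns(invaded, reference="N")
burned_cols = indicator_columns(burned, reference="UnBurned")

print(invaded_cols)
print(burned_cols)
```

A three-level variable yields two columns and a yes/no variable yields one; the coefficient on each column is then read as the difference from the reference level.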
