A preliminary exploration into the Binomial Logistic Regression Models in R and their potential application Andrew Trant PPS Arctic - Labrador Highlands.

Presentation transcript:

A preliminary exploration into the Binomial Logistic Regression Models in R and their potential application Andrew Trant PPS Arctic - Labrador Highlands Research Group

One presentation in two parts
Part 1 (today)
- comparing binomial logistic regressions in R and Minitab
- binomial GLMs and odds ratios
Part 2 (next time)
- using GLMs in conservation biology
- an exploration of the past

Start off small… [Figure: before … and after. Source: Kettlewell, H. B. D. (1956)]

Research Question: Are the odds of survival higher for the dark form (Melanic) than for the light form (Typical)?
$\text{Odds} = e^{\beta_0 + \beta_{\text{Type}} \cdot \text{Type}}$ (link = logit, binomial errors)
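A minimal sketch of how a model of this kind can be fitted in R. The counts below are invented for illustration and are not Kettlewell's data; the column names (`type`, `Nrel`, `Nrecap`) follow the `glm()` call shown on the later slide. The point is only that the fitted coefficients live on the log-odds scale, so exponentiating them gives odds and odds ratios.

```r
# Hypothetical mark-recapture counts (NOT Kettlewell's data).
moth <- data.frame(
  type   = factor(c("Mel", "Typ")),  # Mel sorts first, so it is the reference level
  Nrel   = c(200, 100),              # moths released
  Nrecap = c(80, 20)                 # moths recaptured
)

# Binomial GLM (logit link by default): the response is the recaptured
# proportion, weighted by the number released.
fit <- glm(Nrecap / Nrel ~ type, family = binomial, weights = Nrel, data = moth)
b <- coef(fit)

# Odds computed directly from the counts, for comparison.
odds_mel <- 80 / (200 - 80)
odds_typ <- 20 / (100 - 20)

print(exp(b[["(Intercept)"]]))  # odds of recapture for the reference level (Mel)
print(exp(b[["typeTyp"]]))      # odds ratio, Typ relative to Mel
print(odds_typ / odds_mel)      # same odds ratio from the raw counts
```

With only two groups and two parameters the model is saturated, so the fitted odds reproduce the observed odds.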

In Minitab: In R…

Call: glm(formula = Nrecap/Nrel ~ type, family = binomial, data = moth, weights = Nrel)
Coefficients: (Intercept) typeTyp
Degrees of Freedom: 1 Total (i.e. Null); 0 Residual
Null Deviance:
Residual Deviance: 1.592e-13
AIC:

> summary(moth)
Deviance Residuals: [1] 0 0
Coefficients:
            Estimate  Std. Error  z value  Pr(>|z|)
(Intercept)                                e-14 ***
typeTyp                                    e-06 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
(Dispersion parameter for binomial family taken to be 1)
Null deviance: e+01 on 1 degrees of freedom
Residual deviance: e-13 on 0 degrees of freedom
AIC:
Number of Fisher Scoring iterations: 3

In Minitab: In R… Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

$\text{Odds} = e^{\beta_0 + \beta_{\text{Type}} \cdot \text{Type}}$
BUT R takes melanic, NOT typical, as the reference level, so $\beta_0$ describes the melanic form.
R: > exp( ) =
MINITAB: > exp( ) =
SAME(ish)
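The effect of the reference level can be sketched as follows. R orders factor levels alphabetically by default, so "Mel" becomes the baseline; `relevel()` switches it. The counts are invented for illustration and the name `type_typ_ref` is hypothetical. Under these assumptions the two fits give the same coefficient with opposite sign, which is why the exponentiated values are reciprocals of each other.

```r
# Hypothetical counts (NOT Kettlewell's data); column names follow the slides.
moth <- data.frame(
  type   = factor(c("Mel", "Typ")),
  Nrel   = c(200, 100),
  Nrecap = c(80, 20)
)

# Default: "Mel" is the reference level, the coefficient is for Typ vs Mel.
fit_mel_ref <- glm(Nrecap / Nrel ~ type, family = binomial,
                   weights = Nrel, data = moth)

# Same data with "Typ" as the reference level, the coefficient is for Mel vs Typ.
moth$type_typ_ref <- relevel(moth$type, ref = "Typ")
fit_typ_ref <- glm(Nrecap / Nrel ~ type_typ_ref, family = binomial,
                   weights = Nrel, data = moth)

b_typ_vs_mel <- coef(fit_mel_ref)[["typeTyp"]]
b_mel_vs_typ <- coef(fit_typ_ref)[["type_typ_refMel"]]

print(c(b_typ_vs_mel, b_mel_vs_typ))            # equal magnitude, opposite sign
print(c(exp(b_typ_vs_mel), exp(b_mel_vs_typ)))  # reciprocal odds ratios
```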

Odds Ratio
MINITAB: > exp(0.9332) =
R: > exp( ) =
   > 1/exp( ) = 2.5426
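The arithmetic on this slide can be checked directly. The estimate 0.9332 is taken from the slide; the assumption here (not stated explicitly in the transcript) is that the other package reports the same magnitude with the opposite sign because its reference level is reversed.

```r
beta_pos <- 0.9332      # estimate quoted on the slide
beta_neg <- -beta_pos   # assumption: same magnitude, opposite sign under the other baseline

or_direct   <- exp(beta_pos)      # odds ratio read off directly
or_inverted <- 1 / exp(beta_neg)  # odds ratio recovered by inverting

print(round(or_direct, 4))
print(round(or_inverted, 4))
```

Both lines agree with the 2.5426 shown on the slide, since $1 / e^{-\beta} = e^{\beta}$.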

Calculating 95% Confidence Intervals
$\text{CI} = e^{\text{Estimate} \pm (\text{SE} \times z)}$, with $z = 1.96$
> exp(0.9332 ± (0.2069*1.96))
Lower =   Upper =
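Since `±` is not an R operator, the interval has to be computed as two separate bounds. A small sketch using the estimate and standard error quoted on the slide; the variable names are hypothetical, and `qnorm(0.975)` is used in place of the rounded 1.96.

```r
estimate <- 0.9332   # log odds ratio from the slide
se       <- 0.2069   # its standard error from the slide
z        <- qnorm(0.975)  # two-sided 95% critical value, about 1.96

# Build the interval on the log-odds scale, then exponentiate both ends.
ci_log <- estimate + c(-1, 1) * z * se
ci_or  <- exp(ci_log)
names(ci_or) <- c("lower", "upper")

print(round(ci_or, 3))
```

This is a Wald interval. Because it is symmetric on the log-odds scale, it is asymmetric around the odds ratio after exponentiating. If the interval excludes 1, the odds ratio differs from 1 at the 5% level.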

You have reached the end of part one. But there is a preliminary stab at part two.

Dave’s Barnacles
Tetraclita squamosa
Acanthia sp.
> avthickathole <- glm(formula = Npartial/N ~ AvThickAtHole, family = binomial, weights = N, data = test)

[Figure: Average thickness at hole. Panels: LM, GLM]

Average thickness at hole
General Linear Model:
> lm.avthickathole <- lm(w.wbar ~ AvThickAtHole, data = test)
Generalized Linear Model:
> avthickathole <- glm(formula = Npartial/N ~ AvThickAtHole, family = binomial, weights = N, data = test)
Odds Ratio: 1.13 (remember… exp(0.1249))
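A rough sketch of the LM versus GLM comparison on simulated data. The values are invented and only the variable names (`AvThickAtHole`, `N`, `Npartial`) follow the slides. As a simplifying assumption the linear model is fitted to the raw proportion rather than to `w.wbar`, whose definition is not given in the transcript. The sketch shows one practical difference: the binomial GLM keeps predictions inside (0, 1), and its slope exponentiates to an odds ratio per unit of the predictor.

```r
set.seed(1)

# Simulated stand-in for the barnacle data (invented values).
test <- data.frame(AvThickAtHole = seq(0.5, 5, by = 0.5), N = 30)
p_true <- plogis(-1 + 0.4 * test$AvThickAtHole)
test$Npartial <- rbinom(nrow(test), size = test$N, prob = p_true)

# Linear model on the proportion (assumption: stands in for w.wbar).
lm_fit <- lm(Npartial / N ~ AvThickAtHole, data = test)

# Binomial GLM on the same proportion, weighted by the number of trials.
glm_fit <- glm(Npartial / N ~ AvThickAtHole, family = binomial,
               weights = N, data = test)

# Predict well outside the observed range to compare behaviour.
new_x    <- data.frame(AvThickAtHole = c(-5, 15))
lm_pred  <- predict(lm_fit, newdata = new_x)
glm_pred <- predict(glm_fit, newdata = new_x, type = "response")
print(rbind(lm = lm_pred, glm = glm_pred))

# Odds ratio per one-unit increase in thickness.
odds_ratio <- exp(coef(glm_fit)[["AvThickAtHole"]])
print(odds_ratio)
```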

[Figure: Average thickness residual plots. Panels: LM, GLM]

Summary of model comparison…GLM vs LM

Height of barnacle
General Linear Model:
> lm(w.wbar ~ Ht, data = height)
Generalized Linear Model:
> glm.height <- glm(Npartial/N ~ Ht, family = binomial, weights = N, data = height)
Odds Ratio: 1.26

[Figure: Height of barnacle residual plots. Panels: LM, GLM]

Summary of model comparisons…GLM vs LM

Max Diameter
General Linear Model:
> lm(w.wbar ~ MaxDiam, data = maxdiam)
Generalized Linear Model:
> glm(Npartial/N ~ MaxDiam, family = binomial, data = maxdiam, weights = N)
Odds Ratio: 1.073

[Figure: Max Diameter residual plots. Panels: LM, GLM]

Summary of model comparisons…GLM vs LM

Okay, that’s it…