(& Generalized Linear Models)

Nathaniel MacHardy Fall 2014

Why care about data management?
Everything in R is technically data:
  “Classic” data: tables
  Model results
  Functions

Cooking with Data
Simple recipes we’ll learn today:
  Subset according to some rule
  Code a variable into simpler categories
  Merge data on a key value (ID)
  Recode particular values of a variable

Recipe 1: Subset
object.to.subset[subset instructions]
By index:
  vector[c(1, 2, 3)]
  data[2, 4]
  data[3, ]
  data[, 2]
By name:
  cars[c("speed", "dist")]
By logical:
  (1:3)[c(TRUE, FALSE, TRUE)]
By expression:
  cars$dist[cars$speed == 7]
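The four subsetting styles above can be tried on the built-in `cars` data set. This is a minimal sketch: the vector `v` and the result names are made up for illustration.

```r
# Sketch only: `v` is a made-up vector; `cars` ships with base R (datasets package).
v <- c(10, 20, 30, 40)

by_index   <- v[c(1, 3)]                    # elements 1 and 3
one_cell   <- cars[2, 1]                    # row 2, column 1
one_row    <- cars[3, ]                     # row 3, all columns
one_col    <- cars[, 2]                     # all rows, column 2 (comes back as a vector)
by_name    <- cars[c("speed", "dist")]      # columns selected by name
by_logical <- (1:3)[c(TRUE, FALSE, TRUE)]   # keep the positions marked TRUE
by_expr    <- cars$dist[cars$speed == 7]    # dist for the rows where speed is 7
```

Leaving the row or column slot empty, as in `cars[3, ]`, means "keep everything" in that dimension.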

Operators: Generate TRUE/FALSE
Feed these into square brackets!
  foods %in% c("pizza", "pie")
  age > 18
  eyecolor == "red"
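As a small illustration of feeding these operators into square brackets, the sketch below builds three made-up vectors (the values are assumptions, not data from the course) and uses the resulting TRUE/FALSE vectors to subset.

```r
# Made-up example vectors, one element per person.
foods    <- c("pizza", "salad", "pie")
age      <- c(12, 18, 35)
eyecolor <- c("brown", "red", "blue")

is_treat <- foods %in% c("pizza", "pie")   # TRUE where the food is in the set
is_adult <- age > 18                       # strictly greater than 18
has_red  <- eyecolor == "red"              # exact match

treats       <- foods[is_treat]             # keep only the treats
adult_treats <- foods[is_treat & is_adult]  # combine conditions with &
```

Note that `age > 18` is FALSE for someone who is exactly 18; use `>=` to include the boundary.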

Basic Functions: Combine/Transform
c(): combine
data.frame(): bind into data frame
rbind(): stack data
merge(): merge data by key
t(): swap rows/columns
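A quick sketch of these functions working together, using invented values (`merge()` gets its own recipe, so it is left out here):

```r
# All values below are made up for illustration.
ages  <- c(33, 56)                                  # c(): combine values into a vector
first <- data.frame(id = c("a", "b"), age = ages)   # data.frame(): bind vectors as columns
extra <- data.frame(id = "c", age = 13)

stacked <- rbind(first, extra)   # rbind(): stack rows (column names must match)
flipped <- t(stacked)            # t(): swap rows and columns (the result is a matrix)
```

Because a matrix holds a single type, `t()` on a data frame with mixed columns converts everything to character.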

Functions II: Shortcuts
# the long way to classify
cat <- rep(NA, length(values)) # empty vector
cat[values < 30] <- "low"
cat[values >= 30 & values < 70] <- "medium"
cat[values >= 70] <- "high"
table(cat)
high low medium

Recipe 2: Classifying Data with Cut
cat <- cut(values, c(0, 30, 70, 100))
table(cat)
(0,30] (30,70] (70,100]
cat <- cut(values, c(0, 30, 70, 100), labels = c("low", "med", "high")) # if desired
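One detail worth checking is how `cut()` treats values that fall exactly on a break. By default its intervals are closed on the right, as the `(0,30]` labels show, so 30 lands in the lowest bin, whereas a rule written as `values >= 30` puts it in the middle one. The sketch below uses made-up scores to compare the two; `right = FALSE` and `include.lowest = TRUE` are standard `cut()` arguments, and the variable names are illustrative.

```r
values <- c(0, 5, 30, 45, 70, 100)   # made-up scores, including the boundaries

# Classifying by hand with explicit rules.
long_way <- rep(NA, length(values))
long_way[values < 30] <- "low"
long_way[values >= 30 & values < 70] <- "medium"
long_way[values >= 70] <- "high"

# Default cut(): intervals (0,30], (30,70], (70,100]; 0 falls outside and becomes NA.
default_cut <- cut(values, c(0, 30, 70, 100),
                   labels = c("low", "medium", "high"))

# Left-closed intervals [0,30), [30,70), [70,100] reproduce the hand-written rules.
matched_cut <- cut(values, c(0, 30, 70, 100),
                   labels = c("low", "medium", "high"),
                   right = FALSE, include.lowest = TRUE)

comparison <- data.frame(values, long_way, default_cut, matched_cut)
```

Printing `comparison` shows where the default bins and the hand-written rules disagree.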

Recipe 3: Merge Data
health <- data.frame(
  id = c("a", "b", "c"),
  age = c(33, 56, 13))
out <- data.frame(
  id = c("b", "c", "a"),
  outcome = c(0, 0, 1))
merged <- merge(health, out, by = "id")
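In the recipe every `id` appears in both tables. When keys do not line up, `merge()` drops unmatched rows unless told otherwise. A minimal sketch, using a variant of the recipe's data in which `out` has an invented extra id `"d"` and no row for `"a"`:

```r
health <- data.frame(id = c("a", "b", "c"), age = c(33, 56, 13))
out    <- data.frame(id = c("b", "c", "d"), outcome = c(0, 0, 1))  # "d" is hypothetical

inner <- merge(health, out, by = "id")                # only ids found in both tables
left  <- merge(health, out, by = "id", all.x = TRUE)  # keep every row of health
full  <- merge(health, out, by = "id", all = TRUE)    # keep every row of both tables
```

Rows kept without a match get `NA` in the columns that come from the other table, so it is worth comparing `nrow()` before and after a merge.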

Recipe 4: Replacing Data
What if you want to recode part of an object?
Easy: subset the “destination” of <-
x <- c(1, NA, 0, NA, 3, NA, 4, 6)
# replace NA
x[is.na(x)] <- 0
y <- c(1, 1, 4, 10, 99, 100)
# clip range
y[y > 10] <- 10
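The same pattern works on a column of a data frame. The sketch below assumes a made-up `visits` table in which 999 is a placeholder code for a missing age; both the table and the code are invented for illustration.

```r
# Hypothetical data: 999 stands for "age not recorded".
visits <- data.frame(id  = c("a", "b", "c", "d"),
                     age = c(33, 999, 13, NA))

# %in% returns FALSE (never NA) for missing values, so the logical index is safe.
visits$age[visits$age %in% 999] <- NA

n_missing <- sum(is.na(visits$age))   # count of missing ages after recoding
```

Using `%in%` rather than `==` avoids carrying `NA` into the subset instructions when the column already contains missing values.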

Part II: Generalized Linear Models

Why care about Generalized Linear Models (GLMs)?
Almost all Epi models are GLMs:
  Odds ratios (logistic)
  Risk ratios (log-binomial)
“Fancy” models are usually just extensions of GLMs:
  GWR (geographically weighted regression) adds a “weights matrix”
  MCMC (Markov Chain Monte Carlo) solves GLMs with priors

Basic Parts of GLMs
Y: Outcome, Dependent Variable
X: Predictor(s), Covariates, Exposure(s), Independent Variable(s)
Y = mX + b

Specifying a model in R
A “formula” with Outcome on left, Exposures on right:
Y ~ X
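A formula is an ordinary R object, so it can be stored, inspected, and modified before it is handed to a model function. A small sketch, using the column names of the built-in `CO2` data set (the object names `f` and `smaller` are illustrative):

```r
f <- uptake ~ Type + conc   # outcome on the left, two exposures on the right

f_class <- class(f)         # "formula"
f_vars  <- all.vars(f)      # variable names, outcome first

# update() edits a formula: "." stands for "whatever was there before".
smaller <- update(f, . ~ . - conc)   # drop conc, keep the rest
```

Nothing is fitted at this point; the formula only describes which variables go where.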

Example: Using glm()
data(CO2); attach(CO2)
glm(uptake ~ Type)
Call: glm(formula = uptake ~ Type)
Coefficients:
(Intercept) TypeMississippi
Degrees of Freedom: 83 Total (i.e. Null); 82 Residual
Null Deviance:
Residual Deviance:     AIC: 607.6

Example: Using glm()
data(CO2); attach(CO2)
glm(uptake ~ Type + conc)
Call: glm(formula = uptake ~ Type + conc)
Coefficients:
(Intercept) TypeMississippi conc
Degrees of Freedom: 83 Total (i.e. Null); 81 Residual
Null Deviance:
Residual Deviance:     AIC: 572.1
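The two models can also be fitted without `attach()` by passing the data frame through the `data` argument, which keeps the column names out of the global search path. A sketch of that alternative, with illustrative object names:

```r
# Same models as in the examples, using data = instead of attach().
m_type <- glm(uptake ~ Type, data = CO2)
m_both <- glm(uptake ~ Type + conc, data = CO2)

betas    <- coef(m_both)                    # named vector of coefficients
resid_df <- df.residual(m_both)             # residual degrees of freedom
aics     <- c(type = AIC(m_type), both = AIC(m_both))   # lower AIC is preferred
```

Note that variable names are case-sensitive: the column is `Type`, so `uptake ~ type` would fail with an "object not found" error.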

More info: glm() object
m1 <- glm(uptake ~ Type + conc)
summary(m1)
Coefficients:
                 Estimate  Std. Error  t value  Pr(>|t|)
(Intercept)                                     < 2e-16 ***
TypeMississippi                                   e-12 ***
conc                                              e-09 ***
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
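The printed summary is convenient to read, but for tables and reports it helps to pull the numbers out as data. One possible way is sketched below: `coef(summary(m1))` returns the coefficient table as a matrix. The `results` data frame and its column names are illustrative choices, and the model is fitted with `data = CO2` so the block runs on its own.

```r
m1 <- glm(uptake ~ Type + conc, data = CO2)

coef_table <- coef(summary(m1))   # matrix: one row per term, one column per statistic

# Repackage the matrix as a plain data frame for further formatting.
results <- data.frame(
  term      = rownames(coef_table),
  estimate  = coef_table[, "Estimate"],
  std_error = coef_table[, "Std. Error"],
  p_value   = coef_table[, "Pr(>|t|)"],
  row.names = NULL
)
```

From here `results` can be rounded, filtered, or written out with `write.csv()`.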

More info: glm() object
> confint(m1)
Waiting for profiling to be done...
                 2.5 %  97.5 %
(Intercept)
TypeMississippi
conc

Getting Epidemiological Measures
Specify the family argument:
glm(y ~ x) # default (gaussian): linear, risk difference
glm(y ~ x, family = binomial("logit")) # odds ratio
For odds ratios and risk ratios, make sure to exponentiate the output with exp() first: R gives you the beta coefficient, which is on the log scale.
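To see the three measures side by side, the sketch below builds a made-up cohort (100 exposed with 30 cases, 100 unexposed with 10 cases) and fits one model per measure. The data are invented. The log-binomial call, `family = binomial("log")`, is the usual way to specify the risk-ratio model in `glm()`, though such models can fail to converge on less tidy data.

```r
# Hypothetical cohort: x = exposure (1/0), y = outcome (1/0).
x <- rep(c(1, 0), each = 100)
y <- c(rep(c(1, 0), times = c(30, 70)),   # exposed: 30 cases, 70 non-cases
       rep(c(1, 0), times = c(10, 90)))   # unexposed: 10 cases, 90 non-cases

rd_fit <- glm(y ~ x)                               # linear: risk difference
or_fit <- glm(y ~ x, family = binomial("logit"))   # logistic: log odds ratio
rr_fit <- glm(y ~ x, family = binomial("log"))     # log-binomial: log risk ratio

risk_difference <- unname(coef(rd_fit)["x"])        # already on the risk scale
odds_ratio      <- unname(exp(coef(or_fit)["x"]))   # exponentiate the beta
risk_ratio      <- unname(exp(coef(rr_fit)["x"]))   # exponentiate the beta
```

With a single binary exposure these should agree with the hand calculations from the 2x2 table: 0.30 - 0.10 for the risk difference, (30 × 90) / (70 × 10) for the odds ratio, and 0.30 / 0.10 for the risk ratio.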

Lab: Practice with GLMs
You'll learn more about model-building in courses:
  What goes in the model (confounders)
  Choosing a “best” model
Practice data management:
  Preparing data for analyses
  Formatting/extracting results
