1 Welcome to the class!
library(tidyverse)

set.seed(843)
df <- tibble::data_frame(
  x = rnorm(100),
  y = x + rnorm(100),
  group = ifelse(x > 0, 1, 0),
  xs = scale(x) %>% as.numeric,
  ys = scale(y) %>% as.numeric
)
df

df %>% t.test(y ~ group, data = ., var.equal = TRUE)
df %>% lm(y ~ group, data = .) %>% summary()
df %>% aov(y ~ group, data = .) %>% summary()
df %>% cor.test(~ ys + xs, data = .)
df %>% lm(ys ~ xs, data = .) %>% summary()

2 Statistical Control & Linear Models
EDUC 7610, Chapter 1, Fall 2018
Tyson S. Barrett, PhD
Yᵢ = β₀ + β₁X₁ᵢ + εᵢ

3 Linear Models
We will discuss how to use linear models to understand relationships in data and how this may generalize to the population. Everything you learned in EDUC 6600 applies here, but we will extend it (and probably simplify it somewhat).

4 Linear Models
Most everything you've learned is a specific case of linear models: t-tests
[R output: a two-sample t-test of y by group (t on 98 df, p-value = 2.068e-09) shown next to the equivalent lm() summary; the group coefficient's test and the two-sample t-test give the same p-value (2.068e-09), with F on 1 and 98 DF]
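The equivalence the slide's output illustrates can be sketched in a few lines. This is a minimal reconstruction using the same seed as the opening slide, not the slide's exact code:

```r
# Sketch: a pooled-variance two-sample t-test and a regression of y on a
# 0/1 group variable give the same |t| and p-value.
set.seed(843)
x <- rnorm(100)
y <- x + rnorm(100)
group <- ifelse(x > 0, 1, 0)

tt  <- t.test(y ~ group, var.equal = TRUE)
fit <- lm(y ~ group)

# Signs can differ (t.test uses mean of group 0 minus mean of group 1),
# but the magnitudes match exactly
abs(unname(tt$statistic))
abs(summary(fit)$coefficients["group", "t value"])
```

The regression slope is just the difference in group means, which is why the two tests coincide.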

5 Linear Models
Most everything you've learned is a specific case of linear models: t-tests, ANOVA
[R output: the lm() summary for y ~ group shown next to the aov() table; both give the same p-value (2.068e-09), with the ANOVA F on 1 and 98 Df]
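The same point for ANOVA, as a small sketch (again with simulated data, not the slide's exact code):

```r
# Sketch: aov() and lm() on the same formula give the same F statistic,
# because a one-way ANOVA is a linear model with a categorical predictor.
set.seed(843)
x <- rnorm(100)
y <- x + rnorm(100)
group <- factor(ifelse(x > 0, 1, 0))

f_aov <- summary(aov(y ~ group))[[1]][1, "F value"]
f_lm  <- unname(summary(lm(y ~ group))$fstatistic["value"])
f_aov
f_lm
```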

6 Linear Models
Most everything you've learned is a specific case of linear models: t-tests, ANOVA, correlation
[R output: the lm() summary for ys ~ xs shown next to cor.test(); both give t on 98 df and p-value < 2.2e-16, and the slope on the standardized variables equals the sample correlation]
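And for correlation: with both variables standardized, the regression slope is the Pearson r. A minimal sketch on the same simulated data:

```r
# Sketch: on standardized variables, the lm() slope equals the Pearson
# correlation, and cor.test() and lm() test the same hypothesis.
set.seed(843)
x <- rnorm(100)
y <- x + rnorm(100)
xs <- as.numeric(scale(x))
ys <- as.numeric(scale(y))

ct  <- cor.test(xs, ys)
fit <- lm(ys ~ xs)

unname(coef(fit)["xs"])   # slope on standardized variables
unname(ct$estimate)       # Pearson r -- the same number
```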

7 So what are linear models?
โ€œModelโ€ is a simplification of how something works โ€œLinearโ€ says the relationships are linear Here, we use regression analysis (Ordinary Least Squares) to build linear models

8 So what are linear models?
Yᵢ = β₀ + β₁X₁ᵢ + … + εᵢ
Yᵢ = outcome variable; X₁ᵢ, … = covariates; β₀ = intercept (constant); β₁ = regression slope (coefficient, weight); εᵢ = error (random) term
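Each piece of the equation has a home in lm() output. A small sketch on simulated data (not from the slides):

```r
# Sketch: where each term of Y_i = b0 + b1*X_1i + e_i lives in an lm() fit
set.seed(843)
x <- rnorm(100)
y <- x + rnorm(100)

fit <- lm(y ~ x)
coef(fit)["(Intercept)"]  # estimate of the intercept, beta_0
coef(fit)["x"]            # estimate of the slope, beta_1
head(residuals(fit))      # estimates of the error term, e_i

# The model decomposes each observation exactly: y = fitted + residual
all.equal(unname(fitted(fit) + residuals(fit)), y)
```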

9 Requirements for linear models
Participants (rows in a rectangular data matrix)
Measures (in rectangular data matrix form); each variable a single column of numbers
A single dependent variable
Dependent variable is numeric (otherwise, GLMs)
Other assumptions for statistical inference
(Tidy Data; Page 9)

10 Tidy Data
The idea that rows are observations and columns are variables
The ideal format for data prep for all analyses (especially in R)
Pre-print:

11 Tidy Data
Some pragmatic guidelines for tidy data:
1. Be consistent
2. Choose good names for things
3. Write dates as YYYY-MM-DD
4. Put just one thing in a cell
5. Make it a rectangle
6. Create a data dictionary
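"Make it a rectangle" and the one-thing-per-cell rule amount to reshaping wide data into one row per observation. A dependency-free sketch with made-up data (tidyr::pivot_longer() is the usual tidyverse tool; base reshape() is used here so the example is self-contained):

```r
# Sketch of tidying: one row per (id, year) observation, one column per
# variable. Data and column names below are invented for illustration.
wide <- data.frame(id = 1:2,
                   score_2017 = c(10, 12),
                   score_2018 = c(11, 14))

tidy <- reshape(wide, direction = "long",
                varying = c("score_2017", "score_2018"),
                v.names = "score", timevar = "year",
                times = c(2017, 2018), idvar = "id")
tidy  # 4 rows: columns id, year, score
```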

12 Linear models are flexible
Predictor can be experimentally manipulated or naturally occurring
A dependent variable in one model can be a predictor in another
Predictors may be dichotomous, multi-categorical, or continuous
Predictors can be correlated
Predictors may interact
Can be extended to handle curvilinear relations
Assumptions are not all that limiting
Page 10

13 Some Vocabulary
Predictor = the variable that predicts the outcome
Independent Variable (IV) = the predictor(s) of interest
Covariates = the predictors that are not considered the IV
Control = a class of methods that remove (or control for) the effect of another variable
Page 10

14 Controlling for Confounders
Five types of control:
Random assignment on the independent variable
Manipulation of covariates
Other types of randomization
Exclusion of cases
Statistical control
(The first four are forms of experimental control)

15 Statistical Control
i.e., "adjust for", "correct for", "hold constant", "partial out"
Linear models allow us to control for the effects of covariates
We'll discuss more about how this works, but essentially it makes everyone statistically equal on each covariate
We can then assess how much of the differences in the outcome are uniquely attributable to the IV (once we take out the effect of the covariates)
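A hypothetical sketch of what "holding a covariate constant" does to a coefficient (variable names and effect sizes are invented, not from the slides):

```r
# Hypothetical sketch of statistical control: z confounds the iv -> y
# relation; adding z to the model adjusts for it and recovers the unique
# effect of iv (set here to 0.5 by construction).
set.seed(1)
n  <- 200
z  <- rnorm(n)              # confounder (covariate)
iv <- 0.8 * z + rnorm(n)    # IV is correlated with the confounder
y  <- 0.5 * iv + 1.0 * z + rnorm(n)

b_unadj <- unname(coef(lm(y ~ iv))["iv"])      # absorbs part of z's effect
b_adj   <- unname(coef(lm(y ~ iv + z))["iv"])  # near the true value, 0.5
b_unadj
b_adj
```

The adjusted slope is the effect of iv among people who are statistically equal on z, which is the sense in which regression "controls for" a covariate.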

16 https://calpolystat2.shinyapps.io/3d_regression/

