Generalized Additive Models: An Introduction and Example


William R. Shadish
University of California, Merced

This research was supported in part by grants R305D100046 and R305U070003 from the Institute of Education Sciences, U.S. Department of Education, and by a grant from the University of California Office of the President to the University of California Educational Evaluation Consortium. The opinions expressed here are those of the author and do not represent the opinions of the U.S. Department of Education or the UC Office of the President.

This Topic Is About Modeling Trend
Do the data trend up or down over time?
Do they trend in a straight line or a curve?
If the trend is nonlinear, what is the shape of the curve?

Sometimes Trend Looks Pretty Linear

Sometimes It Looks Pretty Nonlinear

Sometimes It Just Looks Like A Mess

How Do We Find the Right Shape of the Trend?
Most common statistics assume that trend is linear, or they require the researcher to know how to specify the nonlinearity correctly. For example, is it:
x + x^2 + x^3
log(x)
or something else?
But the researcher rarely (if ever) knows! If the guess is wrong, so are the estimates and inferences.

Generalized Additive Models to the Rescue
GAMs model trend with smoothing splines.
They let the data suggest the shape of the trend, while penalizing for over-fitting a curve to the data.
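
The idea of penalizing over-fitting can be written as a penalized least-squares criterion. The form below is the standard textbook smoothing-spline objective, given here as an illustration and not taken from the slides; the symbols f (the smooth trend), lambda (the smoothing parameter), and n (the number of observations) are assumptions introduced for this sketch.

```latex
\hat{f} \;=\; \arg\min_{f} \; \sum_{i=1}^{n} \bigl( y_i - f(x_i) \bigr)^2 \;+\; \lambda \int \bigl( f''(x) \bigr)^2 \, dx
```

The first term rewards closeness to the data and the second penalizes wiggliness. As lambda grows, the fitted curve is pushed toward a straight line; as lambda approaches zero, the curve is free to follow the data closely.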

What is a Smoother?

What Is a Spline?
A spline is a flexible strip that is fixed at certain points (knots) and then bent into a smooth curve. Splines are used to draw curves in drafting and carpentry.
Statistical splines improve on loess and related smoothers by having a stronger analytic basis, being better at preventing over-smoothing, having superior software implementations, and being easier to make part of GA(M)Ms.

More on Splines (Keele, 2008)
A very simple example: imagine modeling these data.
Clearly, linearity is not a good fit.
The point at which the line changes direction (x = 60) is called a knot (c).

A Tentative Model
A very simple case: joining two linear regressions together into a spline.
Predicted values (y) will change depending on whether the observations (x) are above or below the knot.

Spline Basis Functions
For a spline, the second column of the design matrix (X) is replaced (in this simple case) by two columns. The resulting design matrix is:

X for Simple Spline

This creates the following spline:
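
The construction in the preceding slides can be sketched in code. The example below is only an illustration under stated assumptions: the data are made up, the knot is placed at x = 60 as in the slide, and the two basis columns are coded as (x - c) below the knot and (x - c) at or above it, which is one plausible coding and not necessarily the one shown on the original slide. The helper names `spline_design_matrix`, `solve`, and `least_squares` are hypothetical.

```python
# Sketch of a linear spline with one knot (made-up data, not the slide's data).

def spline_design_matrix(xs, knot):
    """Rows are [1, left, right]: left = x - knot below the knot,
    right = x - knot at or above the knot, and 0 otherwise."""
    rows = []
    for x in xs:
        left = x - knot if x < knot else 0.0
        right = x - knot if x >= knot else 0.0
        rows.append([1.0, left, right])
    return rows

def solve(a, b):
    """Solve the square system a z = b by Gauss-Jordan elimination with pivoting."""
    n = len(a)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [v - factor * p for v, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]

def least_squares(X, y):
    """Ordinary least squares via the normal equations (X'X) beta = X'y."""
    p = len(X[0])
    xtx = [[sum(row[i] * row[j] for row in X) for j in range(p)] for i in range(p)]
    xty = [sum(row[i] * yi for row, yi in zip(X, y)) for i in range(p)]
    return solve(xtx, xty)

# Made-up data: y rises with slope 1 up to x = 60, then falls with slope -0.5.
knot = 60.0
xs = [float(x) for x in range(0, 101, 10)]
ys = [60.0 + (x - knot) if x < knot else 60.0 - 0.5 * (x - knot) for x in xs]

X = spline_design_matrix(xs, knot)
# beta = [fitted value at the knot, slope below the knot, slope above the knot]
beta = least_squares(X, ys)
```

With this coding the two line segments meet at the knot automatically, because both basis columns are zero at x = c and the intercept alone gives the fitted value there.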

More Advanced Matters
Many kinds of splines exist.
Bases are rarely linear.
Can do generalized additive mixed models.
Can compute autocorrelation or fit autoregressive models.
Can do Poisson, binomial, and other outcomes.
Can include parametric covariates.
GAMs are not limited to longitudinal data; for example, they are used in the analysis of regression discontinuity designs.

GAM Degree of Nonlinearity
Measured by estimated degrees of freedom (edf).
edf = trace(H), where H is the hat (smoother) matrix that maps the observed values to the fitted values.
edf is approximately (polynomial degree + 1).
Some examples of data and edf:

edf for linear data

edf for quadratic data

edf for very wiggly data
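
The relation edf = trace(H) can be checked numerically. The sketch below is a simplification and not how mgcv computes edf: it uses a plain polynomial basis on made-up x values, and the optional penalty is a simple ridge-type penalty on the non-intercept columns, which is an assumption standing in for the wiggliness penalty a real GAM would use. The function names `solve_many` and `edf` are hypothetical.

```python
# Sketch: edf = trace(H) for polynomial fits, with an optional ridge-type penalty.

def solve_many(a, b):
    """Solve a Z = B for square a and matrix B by Gauss-Jordan elimination."""
    n = len(a)
    m = [a[i][:] + b[i][:] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        scale = m[col][col]
        m[col] = [v / scale for v in m[col]]
        for r in range(n):
            if r != col:
                factor = m[r][col]
                m[r] = [v - factor * p for v, p in zip(m[r], m[col])]
    return [row[n:] for row in m]

def edf(xs, degree, penalty=0.0):
    """trace(H) with H = X (X'X + penalty * S)^-1 X'.
    S is an identity matrix with a zero in the intercept position (an assumption)."""
    p = degree + 1
    X = [[x ** k for k in range(p)] for x in xs]
    xtx = [[sum(row[i] * row[j] for row in X) for j in range(p)] for i in range(p)]
    penalized = [
        [xtx[i][j] + (penalty if i == j and i > 0 else 0.0) for j in range(p)]
        for i in range(p)
    ]
    # trace(X A^-1 X') equals trace(A^-1 X'X), so the n-by-n matrix H is never formed.
    z = solve_many(penalized, xtx)
    return sum(z[i][i] for i in range(p))

xs = [i / 10 for i in range(-10, 11)]         # 21 made-up x values in [-1, 1]
edf_linear = edf(xs, degree=1)                # unpenalized straight line
edf_quadratic = edf(xs, degree=2)             # unpenalized quadratic
edf_shrunk = edf(xs, degree=2, penalty=5.0)   # same basis, shrunk by the penalty
```

Without a penalty the trace equals the number of columns in X, which matches the (polynomial degree + 1) rule of thumb. With a penalty the trace falls below that count, which is why a penalized smooth can report a fractional edf.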

An Example: Lambert et al. (2006)
Number of intervals of disruptive behavior recorded during single-student responding (SSR) and response card treatment (RC) conditions.

Computations and Data
Computations were done in R using the mgcv package.
Data snapshot:

Some Output
A significant treatment effect.

Some Output
Cases differ significantly from each other in starting levels on the outcome.

Some Output
The treatment effect varies significantly over cases.

Some Output
7 of 9 cases show significant nonlinear trend.

Some Output
Case 2 shows significant linear trend.

Some Output
Case 6 shows no significant trend, linear or not.

Graphical Output

Autocorrelations
This was not an autoregressive model.
Small n makes autoregressive (AR) models difficult.
We are working on a Bayesian approach to autocorrelation (AC).
In the meantime, one can compute the AC on the residuals to get a sense of the size of the problem:

Autocorrelations Among Residuals
Only lag 1 is significant, but it is significant in 4 of 9 of them, so standard errors could be wrong.
gam cannot estimate autoregressive models, so we are looking at Bayesian GAMMs, which can do so.
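
Computing the autocorrelation of a residual series is straightforward. The sketch below is illustrative only: the residuals are made up (they are not the Lambert et al. data), the function names are hypothetical, and the significance check uses the rough large-sample bound of 1.96 / sqrt(n), which is an assumption and is unreliable for series as short as those in single-case designs.

```python
import math

def autocorrelation(residuals, lag):
    """Sample autocorrelation of a residual series at the given lag."""
    n = len(residuals)
    mean = sum(residuals) / n
    centered = [r - mean for r in residuals]
    denom = sum(c * c for c in centered)
    numer = sum(centered[t] * centered[t - lag] for t in range(lag, n))
    return numer / denom

def rough_bound(n):
    """Approximate 95% bound for a white-noise series: 1.96 / sqrt(n)."""
    return 1.96 / math.sqrt(n)

# Made-up residuals for one case.
residuals = [1.2, 0.8, 0.5, -0.3, -0.9, -1.1, -0.4,
             0.2, 0.9, 1.0, 0.1, -0.6, -1.0, -0.4]

acf = {lag: autocorrelation(residuals, lag) for lag in (1, 2, 3)}
# Lags whose autocorrelation exceeds the rough bound in absolute value.
flagged = [lag for lag, r in acf.items() if abs(r) > rough_bound(len(residuals))]
```

Running this separately on each case's residuals gives a per-case picture of which lags, if any, look large enough to cast doubt on the standard errors.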

Discussion
All these models can be implemented in regression or mixed models without smoothers.
For single-case designs (SCDs), GA(M)Ms provide information about level, trend, variability, overlap, immediacy of effect, and phase consistency that SCD researchers want when interpreting a functional relation.
GA(M)Ms probably have wide application in other longitudinal data sets.
I can send R syntax for using GAMs.

GAM Is a Method Whose Time Has Come
Further Readings (in order from least to most complex)
Shadish, W. R., Zuur, A. F., & Sullivan, K. J. (2014). Using generalized additive (mixed) models to analyze single case designs. Journal of School Psychology, 52, 149-178.
Sullivan, K. J., Shadish, W. R., & Steiner, P. M. (in press). Analyzing longitudinal data with generalized additive models: Applications to single-case designs. Psychological Methods.
Keele, L. (2008). Semiparametric Regression for the Social Sciences. Chichester, UK: Wiley.
Zuur, A. F. (2012). A beginner's guide to generalized additive models with R. Newburgh, UK: Highland Statistics.
Zuur, A. F., Saveliev, A. A., & Ieno, E. N. (2014). A beginner's guide to generalized additive mixed models with R. Newburgh, UK: Highland Statistics.
Wood, S. N. (2006). Generalized additive models: An introduction with R. Boca Raton, FL: Chapman and Hall/CRC.

THE END