1
GENERALIZED LINEAR MODELS
2012 Course in Neuroinformatics
Marine Biological Laboratory, Woods Hole, MA
Uri Eden, Sridevi V. Sarma, Emery N. Brown
BU Department of Mathematics and Statistics
MIT Dept. of Brain & Cognitive Sciences
MIT Division of Health Sciences and Technology
Massachusetts General Hospital
August 15, 2012
2
OBJECTIVES
Understand the theory of the Generalized Linear Model
Understand its relation to the General Linear Model
Understand how model analysis is conducted using the Generalized Linear Model
3
OUTLINE
Motivation: The Stimulus-Response Structure of Neuroscience Experiments
Theory of the Generalized Linear Model (GLM)
GLM Analysis of the Retinal Neuron Spike Train
GLM Analysis of Sub-thalamic Nucleus Spike Trains in Parkinson Patients and Healthy Primate
Summary
4
Spatial Receptive Fields of Hippocampal Pyramidal Neurons
[Figure: two panels, "Circular Environment" and "Spike Histogram"; axes x1 (cm) and x2 (cm), with ticks at 35 and 70]
5
Learning in Hippocampal Neurons
Single-cell recording in monkey hippocampus
Trial-and-error learning of association between picture and response
Taking as our motivation a study by Sylvia Wirth in Wendy Suzuki’s lab: 2 monkeys learned 290 scenes, with recordings from 145 hippocampal (HC) neurons. This is the experiment that we're extending in humans, but with fMRI and, unfortunately, no juice squirts. I'll take you through it, but the basic idea is that hippocampal activity increased at the same time that learning occurred.
Wirth et al. Science 2003
Smith et al. Neural Computation 2003
Smith et al. Journal of Neuroscience, 2004
6
Most neuroscience experiments are stimulus-response
Experiment            Stimulus/Covariate           Response
Free-behaving rat     Position                     Spikes
Monkey learning       Association                  Incorrect/correct responses, or spikes
Goldfish retina       Constant, spiking history
fMRI                  Visual/motor                 BOLD
Need a general framework that allows us to relate the stimuli in neuroscience experiments to their responses.
Actually, we want to do regression for all types of data so that we get the formal inference structure of maximum likelihood (optimality, goodness-of-fit, t-tests on coefficients)!
7
[Diagram: Stimulus enters the "System to Study" (marked "?"), which produces the Response; Noise (unpredictable) also enters the system]
How can we use numerical data to model how X ‘impacts’ Y?
8
From Data to Model
Data notation: the responses Y are random variables; the covariates X are typically non-random variables.
Parameter notation: a constant parameter vector, typically estimated from data.
9
From Data to Model cont…
Model is joint probability density function f:
10
From Linear Models to GLMs
Linear regression models are useful for relating Gaussian, continuous-valued observations to a set of covariates.
Many types of data cannot be described by a Gaussian additive noise model.
Generalized linear models extend a simple class of models to other data types. In particular:
Count data: e.g. number of arrivals in time T (Poisson)
Binary data: e.g. incorrect/correct response of a trial (Bernoulli)
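For reference, one common way to write the linear regression model is sketched here. The symbols ($y_i$, $x_{ij}$, $\beta_j$, $\varepsilon_i$, $\sigma^2$) are assumed notation, since the slide's own equation did not survive extraction.

```latex
y_i = \beta_0 + \sum_{j=1}^{p} \beta_j x_{ij} + \varepsilon_i,
\qquad \varepsilon_i \sim N(0, \sigma^2) \ \text{independently}
```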
11
The Linear Model: A Different Perspective
1. Y is Gaussian, which belongs to the exponential family of distributions:
Data and parameters are multiplicatively separable!
2. The likelihood function for the exponential family is:
Canonical link function
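A sketch of one standard way to write a one-parameter exponential family density follows. The symbols $T$, $\eta$, $A$ and $C$ are assumed notation and may differ from the slide's.

```latex
f(y \mid \theta) = \exp\{\, T(y)\,\eta(\theta) - A(\theta) + C(y) \,\}
```

In this form the data enter the first term only through $T(y)$ and the parameters only through $\eta(\theta)$, so the two are multiplicatively separable inside the exponent. The factor $\eta(\theta)$ plays the role that the slide labels the canonical link function.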
12
The Linear Model: A Different Perspective cont…
3. The likelihood for the Gaussian and its canonical link for the linear model:
Gaussian data
The canonical link function is then
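As a worked sketch, assuming a known variance $\sigma^2$ and mean $\mu$ (notation not taken from the slide), the Gaussian density can be expanded into exponential family form:

```latex
f(y \mid \mu)
  = \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left\{ -\frac{(y-\mu)^2}{2\sigma^2} \right\}
  = \exp\left\{ \frac{y\,\mu}{\sigma^2} - \frac{\mu^2}{2\sigma^2}
      - \frac{y^2}{2\sigma^2} - \tfrac{1}{2}\log(2\pi\sigma^2) \right\}
```

The term multiplying $y$ is proportional to $\mu$, so the canonical link is the identity and the linear model sets $\mu_i = x_i^{\top}\beta$.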
13
The Generalized Linear Model
1. Y belongs to the exponential family of distributions.
2. The canonical link function is a linear function of the parameters.
All the probability models we have studied (Bernoulli, binomial, Poisson, Gaussian, gamma, exponential, inverse Gaussian, beta) belong to the exponential family!
14
The Exponential Family
Poisson Data The canonical link function is
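A sketch of the Poisson case, with $\lambda$ as the assumed symbol for the mean count:

```latex
f(y \mid \lambda) = \frac{\lambda^{y} e^{-\lambda}}{y!}
  = \exp\{\, y \log\lambda - \lambda - \log y! \,\},
\qquad \log \lambda_i = x_i^{\top}\beta
```

The term multiplying $y$ is $\log\lambda$, so the canonical link is the log, and the GLM makes $\log\lambda_i$ linear in the parameters.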
15
The Exponential Family
Bernoulli Data The canonical link function
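A sketch of the Bernoulli case, with $p$ as the assumed symbol for the success probability:

```latex
f(y \mid p) = p^{y}(1-p)^{1-y}
  = \exp\left\{\, y \log\frac{p}{1-p} + \log(1-p) \,\right\},
\qquad \log\frac{p_i}{1-p_i} = x_i^{\top}\beta
```

The term multiplying $y$ is the log-odds, so the canonical link is the logit.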
16
Summary of Generalized Linear Models
[Table: rows Gaussian, Poisson, Bernoulli; columns Model, Link, Equation]
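The canonical links for the three models can be sketched in code as follows. This is an illustrative mapping only; the dictionary `LINKS` and the helper `mean_from_predictor` are hypothetical names, not part of the original slides.

```python
import math

# Canonical link (mean -> linear predictor) and its inverse, per model.
# Gaussian: identity, Poisson: log, Bernoulli: logit.
LINKS = {
    "gaussian": (lambda mu: mu, lambda eta: eta),
    "poisson": (lambda mu: math.log(mu), lambda eta: math.exp(eta)),
    "bernoulli": (
        lambda p: math.log(p / (1.0 - p)),
        lambda eta: 1.0 / (1.0 + math.exp(-eta)),
    ),
}


def mean_from_predictor(model, eta):
    """Map a linear predictor x'beta to the mean response for the model."""
    _, inverse_link = LINKS[model]
    return inverse_link(eta)
```

For example, a linear predictor of 0 corresponds to a Bernoulli probability of 0.5 and a Poisson mean of 1.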
17
Model Goodness-of-Fit and Analysis
A. Deviance (analog of the residual sum of squares): in the Gaussian case it is proportional to the residual sum of squares.
B. Akaike’s Information Criterion: for maximum likelihood estimates, it measures the trade-off between maximizing the likelihood (minimizing -2 log L) and the number of parameters p in the model.
C. Standard errors of the coefficients and t-tests: t-statistic = coefficient estimate / SE
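A minimal sketch of these quantities for a Poisson model, assuming observed counts `y` and fitted means `mu`. The function names are hypothetical, and the formulas are the standard textbook definitions rather than ones recovered from the slide.

```python
import math


def poisson_log_likelihood(y, mu):
    """Poisson log likelihood of counts y under fitted means mu (mu > 0)."""
    return sum(
        yi * math.log(mi) - mi - math.lgamma(yi + 1) for yi, mi in zip(y, mu)
    )


def poisson_deviance(y, mu):
    """Twice the log-likelihood gap between the saturated and fitted model."""
    total = 0.0
    for yi, mi in zip(y, mu):
        # y * log(y / mu) is taken as 0 when y == 0.
        term = yi * math.log(yi / mi) if yi > 0 else 0.0
        total += term - (yi - mi)
    return 2.0 * total


def aic(log_likelihood, n_params):
    """Akaike's Information Criterion: -2 log L + 2 p."""
    return -2.0 * log_likelihood + 2.0 * n_params
```

A lower AIC indicates a better trade-off between fit and number of parameters; the deviance is zero when the fitted means equal the observed counts.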
18
Generalized Linear Models (GLM)
Properties of the GLM
Convex likelihood surface
Estimators asymptotically have minimum MSE
All model estimation is efficient: iteratively reweighted least squares
[Diagram labels: Stochastic Models, Neural Spiking Models, Generalized Linear Models (GLM), Linear Regression]
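Iteratively reweighted least squares can be sketched for a Poisson GLM with log link as below. This is a minimal illustration assuming NumPy is available; `fit_poisson_irls` is a hypothetical name, and production code would normally use a statistics package instead.

```python
import numpy as np


def fit_poisson_irls(X, y, n_iter=25, tol=1e-10):
    """Fit a Poisson GLM (log link) by iteratively reweighted least squares."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        eta = X @ beta              # linear predictor
        mu = np.exp(eta)            # mean under the log link
        z = eta + (y - mu) / mu     # working response
        w = mu                      # working weights (Poisson variance = mean)
        xtw = X.T * w               # X'W without building a diagonal matrix
        beta_new = np.linalg.solve(xtw @ X, xtw @ z)
        converged = np.max(np.abs(beta_new - beta)) < tol
        beta = beta_new
        if converged:
            break
    return beta
```

Each pass solves a weighted least squares problem, which is why the fit is computationally cheap. With an intercept-only design the estimate converges to the log of the sample mean.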
19
GLM Neural Models
By selecting an appropriate set of basis functions we can capture arbitrary functional relations.
Analysis of relative contributions of components to spiking
Truccolo W, Eden UT, Fellows MR, Donoghue JP, Brown EN. (2004) J. Neurophys 93:
20
Summary of GLM Theory
Generalization of the Gaussian linear model (McCullagh and Nelder).
Can be used for any probability model in the exponential family.
Is a maximum likelihood analysis, with all its optimality properties.
An efficient computational framework using iteratively reweighted least squares.
GLM is available as a toolbox in all major statistics packages and Matlab.
21
Case 1: An Analysis of the Spiking Activity of Retinal Neurons in Culture
Retinal neurons are grown in culture under constant light and environmental conditions. The spontaneous spiking activity of these neurons is recorded. The objective is to develop a statistical model which accurately describes the stochastic structure of this activity. (Iyengar and Liu, 1997)
23
Retinal Ganglion Cell Example cont…
24
ISI Model Candidates
Exponential distribution
Gamma distribution
Inverse Gaussian distribution
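For reference, the standard densities of these three candidate models for an interspike interval $t > 0$ are sketched here. The parameter symbols ($\lambda$, $\alpha$, $\mu$) are assumed and may not match the slide's parameterization.

```latex
\text{Exponential:}\quad f(t) = \lambda e^{-\lambda t}
\\[4pt]
\text{Gamma:}\quad f(t) = \frac{\lambda^{\alpha}}{\Gamma(\alpha)}\, t^{\alpha-1} e^{-\lambda t}
\\[4pt]
\text{Inverse Gaussian:}\quad f(t) = \sqrt{\frac{\lambda}{2\pi t^{3}}}\,
  \exp\left\{ -\frac{\lambda (t-\mu)^{2}}{2\mu^{2} t} \right\}
```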
25
Interspike Interval ML Models
[Figure panels: Exponential, Gamma, Inverse Gaussian]
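Two of these models have closed-form maximum likelihood estimates, sketched below. The function names are hypothetical; the gamma model is omitted because its shape parameter requires a numerical solve.

```python
def fit_exponential(isis):
    """ML estimate of the exponential rate: 1 / (mean ISI)."""
    return len(isis) / sum(isis)


def fit_inverse_gaussian(isis):
    """Closed-form ML estimates (mu, lam) of the inverse Gaussian.

    Assumes the ISIs are positive and not all identical.
    """
    n = len(isis)
    mu = sum(isis) / n
    lam = n / sum(1.0 / t - 1.0 / mu for t in isis)
    return mu, lam
```

For ISIs of 1, 2 and 3 msec this gives an exponential rate of 0.5 per msec and inverse Gaussian estimates of mu = 2 and lam = 9.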
26
[Figure annotations: refractoriness; short-term history; long-term history/local effects]
27
Discrete Time Spike Train Data
[Figure: spike train divided into intervals with indicators dN1, ..., dN7; intervals containing a spike are marked 1]
dNk is the spike indicator function in interval k.
λk is the intensity of spiking at time k, which in the limit is given by
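Building the indicator sequence dNk from recorded spike times can be sketched as follows. The function name, the 1 msec default bin width, and the example spike times are assumptions made for illustration.

```python
def bin_spike_train(spike_times_ms, duration_ms, bin_ms=1.0):
    """Return a list dN where dN[k] is 1 if a spike falls in bin k, else 0."""
    n_bins = int(round(duration_ms / bin_ms))
    dN = [0] * n_bins
    for t in spike_times_ms:
        k = int(t // bin_ms)
        if 0 <= k < n_bins:
            # Bins are assumed narrow enough to hold at most one spike.
            dN[k] = 1
    return dN
```

With seven 1 msec bins and spikes at 0.5, 3.2 and 6.9 msec, the result is `[1, 0, 0, 1, 0, 0, 1]`.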
28
Point Process is Exponential
Link Function Data Component
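A sketch of the discrete-time point process log likelihood, in the form commonly used for small bin width $\Delta$. The notation ($\lambda_k$, $\Delta$, $K$) is assumed rather than recovered from the slide.

```latex
\log L = \sum_{k=1}^{K} \left[ dN_k \log(\lambda_k \Delta) - \lambda_k \Delta \right]
```

Here $\log(\lambda_k \Delta)$ multiplies the data component $dN_k$, mirroring the exponential family form, so modeling $\log(\lambda_k \Delta)$ as linear in the parameters gives a GLM.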
29
Question: Can we construct a history-dependent firing rate model to describe the retinal neuron spiking activity?
The ISI distribution models we constructed were:
We use the GLM to build a history-dependent model as a Poisson model:
How do we pick a model order?
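One common way to set up such a history-dependent model is to regress each bin's spike indicator on the indicators in the preceding bins, for example log(λk Δ) = β0 + Σj βj dN(k-j). That model form and the helper below are assumptions for illustration, not the slide's recovered equation.

```python
def history_design_matrix(dN, order):
    """Build rows [1, dN[k-1], ..., dN[k-order]] and targets dN[k].

    Rows start at k = order so that every row has a full spiking history.
    """
    rows, targets = [], []
    for k in range(order, len(dN)):
        rows.append([1] + [dN[k - j] for j in range(1, order + 1)])
        targets.append(dN[k])
    return rows, targets
```

The resulting rows and targets can be passed to any Poisson GLM fitting routine; the model order is the number of history bins included.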
30
Partial Correlogram-Visualization
[Figure: partial correlation coefficient (axis ticks from -0.04 to 0.1) versus model order (ticks from 10 to 50); annotation: "Order=14?"]
31
Generalized Linear Model Analysis
Exploratory Analysis:
We plotted the data as a time-series.
We computed the partial autocorrelation coefficients of order up to 50.
Confirmatory Analysis:
We fit GLM models of orders varying from 1 to 120 (msec).
We computed the deviance, AIC, KS plots and the significance of the coefficients.
32
AIC Model Order Analysis
[Figure: AIC versus GLM order]
33
GLM Coefficients
[Figure: two panels of coefficient value versus lag (msec), "Coefficient values" and "Stat. significant coeffs."; annotation: "Order=14?"]
34
Absolute Goodness-of-Fit
Time rescale
Time-Rescaling Theorem: the zi's are i.i.d. exponential with rate 1.
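In discrete time, the rescaled intervals can be approximated by summing the fitted intensity over the bins between consecutive spikes. The sketch below assumes `lam_delta[k]` holds the fitted value of λk Δ for bin k; the function name is hypothetical.

```python
def rescaled_isis(dN, lam_delta):
    """Sum lam_delta over the bins after one spike up to and including the next."""
    z = []
    acc = 0.0
    seen_first = False
    for spike, ld in zip(dN, lam_delta):
        if seen_first:
            acc += ld
        if spike:
            if seen_first:
                z.append(acc)
            seen_first = True
            acc = 0.0
    return z
```

If the fitted intensity is correct, the returned values should behave like independent exponential rate-1 draws.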
35
Kolmogorov-Smirnov Analysis
Kolmogorov-Smirnov (KS) plot: a graphical measure of goodness-of-fit, based on the time-rescaling theorem, that compares an empirical and a model cumulative distribution function. If the model is correct, then the rescaled ISIs are independent, identically distributed exponential rate-1 random variables (uniform after a probability transform), whose ordered quantiles should produce a 45° line.
[Figure labels: ECDF(zi), CDF(exp(1)), KS distance]
The KS distance is the maximum distance between the empirical and the theoretical cumulative distribution functions.
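A minimal sketch of the KS-plot distance follows, assuming the rescaled ISIs `z` have already been computed. The function name is hypothetical, and the 1.36/sqrt(n) value is the usual large-sample approximation to the 95% band.

```python
import math


def ks_statistic(z):
    """KS-plot distance between rescaled ISIs and the exponential(1) model."""
    # Exponential(1) values map to uniform(0, 1) under u = 1 - exp(-z).
    u = sorted(1.0 - math.exp(-zi) for zi in z)
    n = len(u)
    model_quantiles = [(i + 0.5) / n for i in range(n)]
    distance = max(abs(ui - qi) for ui, qi in zip(u, model_quantiles))
    bound_95 = 1.36 / math.sqrt(n)  # approximate 95% confidence band
    return distance, bound_95
```

A distance below the bound means the KS plot stays inside the 95% band around the 45° line.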
36
Kolmogorov-Smirnov Plots
[Figure: KS plots; axes: Model Quantiles, Empirical Quantiles]
37
KS Plots for Different Order GLMs
[Figure: KS plots for GLMs of different orders; axes: Model Quantiles, Empirical Quantiles]
38
Correlation Function for Rescaled ISIs
[Figure: correlation of rescaled ISIs versus ISI lag order, with 95% confidence bounds]
39
AIC and KS Statistics
Poisson-GLM
Order   AIC    KS
1       6589   0.2525
14      5931   0.0657
50      5892   0.0462
Parametric Models: KS Statistic
Exp           0.2330
Gamma         0.2171
Inv. Gauss.   0.1063
40
Interspike Interval Models
[Figure: probability density versus ISI (msec) for the exponential, gamma, inverse Gaussian, and order-50 GLM models]
41
Inferences and Conclusions
Iyengar and Liu showed that a generalized inverse Gaussian model described these data well.
The fit of the history-dependent GLM improves appreciably on the fits of the exponential, gamma and inverse Gaussian models, most notably in terms of KS plots.
Our analysis shows that the GLM describes the essential stochastic features in the data.
There is a significant history dependence in the retinal neural spiking data extending back 14 msec. There is another effect going back approximately 50 msec.
The shorter time-scale phenomena may reflect intrinsic dynamics of the individual neuron, whereas the longer time-scale effects may also include network dynamics.
42
Remarks
Only 14 parameters are used to fit ~30,000 data points!
This type of strong history-dependent effect is something we have seen in neurons from a number of different brain regions, animal models and experimental protocols. It was all simply described by GLM fitting.
Truccolo W, Eden UT, Fellows MR, Donoghue JP, Brown EN. A point process framework for relating neural spiking activity to spiking history, neural ensemble and covariate effects. Journal of Neurophysiology, 2005, 93:
Kass RE, Ventura V, Brown EN. Statistical issues in the analysis of neuronal data. Journal of Neurophysiology, 2005, 94: 8-25.
43
Case 2: An Analysis of the Spiking Activity of STN Neurons in Parkinson Patients and Healthy Primate
The spiking activity of sub-thalamic neurons of the basal ganglia in Parkinson patients and a healthy primate is recorded under identical experimental conditions. Subjects execute a center-out directed hand-movement task.
*Sarma, Cheng, Eden, Hu, Williams, Brown, Eskandar, 2008
44
Sub-Thalamic Nucleus
45
Neurophysiological Data
Single neuronal recordings from STN
1 primate: 96 neurons, 868 trials
8 patients: 3-15 neurons/patient, trials/patient
46
Behavioral Task
[Figure: task timeline with epochs Fixation (500 ms), Grey Array Appears, Target cue, Go cue, Move ("U", "R", "D", "L"), Reach & Hold (100 ms)]
47
GLM for STN
[Figure: trial timeline t marking GA, TC, GC and MV, with 350 ms and 700 ms windows]
Components of the intensity λ:
Effect of spiking history (short term: intrinsic dynamics; long term: network dynamics)
Period-specific movement planning and execution (stimulus) effect
Compute maximum likelihood estimates for parameters using glmfit.m
48
KS Plots of PD Models
49
GLM Coefficient Estimates
[Figure panels: PD Model, Primate Model]
50
Population Summary
[Figure: "Modeling STN Activity": spike train and rate function panels (250 msec scale) for HEALTHY PRIMATE and PD PATIENT. Annotations, in extracted order: 50 Hz, 50% oscillations (10-30 Hz), 40% bursting, 8% directional selectivity; 35%, 30 Hz, 15% oscillations (10-30 Hz), 12% bursting]
We’ve made progress on both phases. I’ll highlight some of our findings from stimuli-to-neural-activity models that we built from rare neural spiking data from PD patients and normal primates executing the same directed hand-movement task, with recordings from the STN of the basal ganglia (BG). We employed a point process modeling framework to estimate dynamic spike rate functions dependent on movement direction and spiking history. For the first time, we are able to quantify prevalent abnormalities found in the PD activity that are not significant in the normals, which include 10-30 Hz oscillations, bursting, and a loss of directional plurality, all of which may directly relate to the well-known PD motor symptoms. These models are necessary for designing a controller for phase 2.
“Parkinsonian” Motor Symptoms: akinesia/bradykinesia, resting tremor, rigidity
“Parkinsonian” STN Neural Symptoms: 10-30 Hz oscillations, bursting, directional selectivity, lower firing rate
51
Summary
GLM provides a computationally tractable generalization of the Gaussian linear model to non-Gaussian regression models.
Estimation is carried out using maximum likelihood. This analysis has all the properties of maximum likelihood.
AIC, deviance and parameter standard errors provide measures of goodness-of-fit and an inference framework analogous to regression.
Can be applied to other exponential family models.
Non-canonical link functions can also be used.
GLM is a standard tool in Matlab, Minitab, R, S, SAS, Splus, and SPSS.
52
Acknowledgments
We are grateful to Julie Scott for technical assistance.
Reference
McCullagh P, Nelder JA. Generalized Linear Models, 2nd Edition. Chapman and Hall, 1989.