Hierarchical models

Hierarchical with respect to:
– Response being modeled: outliers, zeros
– Parameters in the model: trends (Us), interactions (Bs), variances (R, Q)

Models for outliers

Ecological process variance may be asymmetric:
– What's the upper bound on positive changes in population size between t and t + 1?
– What's the lower bound on negative changes in population size between t and t + 1?
Many asset-return models also have fatter tails than a normal distribution.

1. Use non-normal errors
The Student-t distribution is one alternative:
– Can be written as a scale mixture of normal distributions
– Used in finance (Harvey et al. 1994)
(plot via Wikipedia)
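The "mixture of normals" property is easy to demonstrate by simulation: a Student-t draw is a normal draw whose variance is itself drawn from an inverse-gamma distribution. A sketch in Python (the course code is R/JAGS; names and values here are illustrative):

```python
import numpy as np

rng = np.random.default_rng(42)
df = 5            # degrees of freedom of the target t-distribution
n = 100_000

# Scale mixture of normals: V ~ inverse-gamma(df/2, df/2), then x ~ Normal(0, V).
# Marginally, x is Student-t with df degrees of freedom.
variances = 1.0 / rng.gamma(shape=df / 2, scale=2.0 / df, size=n)
t_samples = rng.normal(0.0, np.sqrt(variances))

# Compare with direct t draws: both have variance df/(df-2) = 5/3
direct = rng.standard_t(df, size=n)
```

Because the heavy tails come from occasionally drawing a large variance, this representation is also what makes the t easy to implement in Gibbs samplers.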

2. Use a mixture distribution model
– Involves specifying the distribution yourself
– Distribution is a composite of 2 or more components
– Several ways to do this

Example 1: catastrophes
Time series of pinniped pup counts, from Ward et al. 2008

Model each year as catastrophe or not:
– f() represents a normal distribution, with a unique mean and variance for each component
– I represents an indicator function (1 = normal, 0 = catastrophe)

Use the categorical sampler in JAGS:

```
p[1] ~ dunif(0,1);
p[2] <- 1 - p[1];
isCat ~ dcat(p[1:2])
```

Estimate the mean and variance of the process variations in each year
– Constrain the catastrophe variance to be greater than the regular variability
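The generative side of this catastrophe mixture is straightforward to simulate. A sketch in Python (parameter values are made up for illustration, not taken from the lecture):

```python
import numpy as np

rng = np.random.default_rng(1)
n_years = 200
p_cat = 0.1                              # Pr(a year is a catastrophe)
mu, sd_normal, sd_cat = 0.05, 0.1, 0.5   # catastrophe years get a much larger variance

# Indicator: 1 = normal year, 0 = catastrophe (matching the slide's convention)
I = (rng.random(n_years) > p_cat).astype(int)
sd = np.where(I == 1, sd_normal, sd_cat)
mean = np.where(I == 1, mu, -0.5)        # catastrophes pull growth sharply down

growth = rng.normal(mean, sd)            # process deviation in each year
log_pop = 3 + np.cumsum(growth)          # random walk with drift on log-abundance
```

Fitting the mixture then amounts to inferring the indicators I and the two components' means and variances from the observed series.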

This model doesn't have an autoregressive property, but it's easy to add one. Instead of treating the probability of a catastrophic year as constant, we could estimate:
– Pr(catastrophe in year t | no catastrophe in year t-1)
– Pr(no catastrophe in year t | catastrophe in year t-1)

Why include an autoregressive process?
– Is it biologically supported?
– Improved future predictions: if we estimate the 'state' at time t (0 or 1), both states aren't equally likely at time t+1
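Making the catastrophe state autoregressive means putting a 2-state Markov chain on the indicator. A Python sketch with made-up transition probabilities:

```python
import numpy as np

rng = np.random.default_rng(7)
n_years = 500
# Transition probabilities (illustrative values):
p_cat_given_normal = 0.05   # Pr(catastrophe in year t | not in t-1)
p_normal_given_cat = 0.60   # Pr(not a catastrophe in year t | catastrophe in t-1)

state = np.zeros(n_years, dtype=int)     # 0 = normal, 1 = catastrophe
for t in range(1, n_years):
    if state[t - 1] == 0:
        state[t] = int(rng.random() < p_cat_given_normal)
    else:
        state[t] = int(rng.random() >= p_normal_given_cat)

# Catastrophes now cluster in time: given year t is a catastrophe, year t+1
# is a catastrophe with probability 0.40, well above the stationary rate
# p_cn / (p_cn + p_nc) = 0.05 / 0.65 ≈ 0.08.
```

This is exactly the structure the slide describes: the two conditional probabilities replace the single constant catastrophe probability.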

Example 2: extreme fisheries catches
Thorson et al. – "extreme catch events": some schooling / shoaling species are caught in huge aggregations

Slightly different formulation:
– f() represents a normal distribution, with a unique mean and variance for each component
– (1 - p) represents the contribution of the extreme component

Hierarchical models for zeros
Common approaches to dealing with zeros:
– Add a small value (arbitrary)
– Use a negative binomial or Poisson distribution
– Use a 2-step distribution (delta-GLM)

1. Use a Tweedie distribution
– A Poisson mixture of Gamma distributions (compound Poisson-Gamma)
– The number of mixture components is random
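In its compound Poisson-Gamma form, a Tweedie draw is the sum of a Poisson-distributed number of Gamma components, so exact zeros occur whenever the Poisson count is zero — no ad-hoc small value needed. A Python sketch (parameter values illustrative):

```python
import numpy as np

rng = np.random.default_rng(3)
n = 10_000
lam = 1.2                 # Poisson mean: expected number of Gamma components
shape, scale = 2.0, 0.5   # Gamma parameters for each component

counts = rng.poisson(lam, size=n)   # random number of mixture components per draw
y = np.array([rng.gamma(shape, scale, k).sum() for k in counts])  # sum of 0 draws is 0.0

# Exact zeros arise naturally: Pr(y = 0) = exp(-lam) = exp(-1.2) ≈ 0.30
frac_zero = (y == 0).mean()
```

The continuous positive part and the point mass at zero come from one distribution, which is the appeal over two-step (delta-GLM) approaches.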

2. Implement a mixture model
– A 2-component mixture, like the mixture model for catastrophes
– For continuous data: (mixture density shown on the slide, not reproduced in the transcript)

For count data
– Applications: mark-recapture, sighting histories, presence-absence, etc.
– Zeros are possible (Bernoulli, Poisson, negative binomial, etc.)
– z = indicator function
References: Kery & Schaub 2012, Royle et al. 2005, etc.
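For counts, the two-component idea gives a zero-inflated model: the indicator z says whether an observation comes from the structural-zero component or from the count distribution. A zero-inflated Poisson sketch in Python (parameter values illustrative):

```python
import numpy as np

rng = np.random.default_rng(11)
n = 20_000
pi = 0.3       # Pr(structural zero), e.g. a site that is never occupied
lam = 4.0      # Poisson mean when the process is "on"

z = rng.random(n) >= pi                  # z = 1: draw from the Poisson component
y = np.where(z, rng.poisson(lam, n), 0)  # z = 0: forced zero

# Zeros come from two sources, structural (1 - z) and sampling (Poisson zeros):
#   Pr(y = 0) = pi + (1 - pi) * exp(-lam)
expected_zero = pi + (1 - pi) * np.exp(-lam)
```

Mark-recapture and occupancy models (Kery & Schaub 2012) have exactly this structure, with z as the latent occupancy or alive/dead state.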

Hierarchical with respect to:
– Response being modeled: outliers, zeros
– Parameters in the model: trends (Us), interactions (Bs), variances (R, Q)

Hierarchical models
Multivariate time series with MARSS:
– We've already fit models that can be considered hierarchical – shared parameters across time series as fixed effects
– Z = (1, 2, 1, 1, 2, 3), U = "equal", R and Q = "diagonal and equal"

Random effects
– Assumes a shared distribution for the 'population' of trends
– Upside / downside: increased complexity

Comparison of fixed versus random effects
– Same trend model applied to the harbor seal data we discussed last week (10 time series)
– R = "diagonal and equal", Q = "diagonal and equal"
– In the Bayesian model, compare inference from the fixed vs. random model for U
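The qualitative difference between the fixed- and random-effects versions of U is shrinkage: random effects pull each population's trend toward the shared mean, more strongly for noisier estimates. A toy normal-normal sketch in Python (not the lecture's JAGS model; all numbers are hypothetical):

```python
import numpy as np

# Hypothetical per-population trend estimates (u_hat) and their sampling SDs
u_hat = np.array([0.08, -0.02, 0.05, 0.12, -0.06, 0.03, 0.07, 0.01, 0.09, -0.01])
se = np.array([0.02, 0.05, 0.03, 0.06, 0.08, 0.02, 0.04, 0.03, 0.05, 0.07])

mu = u_hat.mean()   # shared mean trend
tau2 = 0.002        # assumed between-population variance of the true trends

# Posterior mean under a normal-normal model: a precision-weighted average
# that moves u_hat toward mu, with more shrinkage when se is large
w = tau2 / (tau2 + se**2)
u_shrunk = w * u_hat + (1 - w) * mu
```

Every shrunken estimate lies between its raw value and the shared mean — this borrowing of strength is where the improved precision in the posterior table comes from.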

Model file (from last week):

```r
jagsscript = cat("
model {
  # Populations are independent, so the Q matrix is diagonal. We assume
  # B = 1, and there are no scaling (A) parameters because each time series = 1 state.
  # Unlike MARSS, we can model the trends (U) as random effects - meaning we
  # estimate a shared mean and sd across populations, plus the deviations from that mean
  Umu ~ dnorm(0, 1);
  Usig ~ dunif(0, 100);
  Utau <- 1/(Usig*Usig);
  for(i in 1:nSites) {
    tauQ[i] ~ dgamma(0.001, 0.001);
    Q[i] <- 1/tauQ[i];
    U[i] ~ dnorm(Umu, Utau);  # For fixed effects, use U[i] ~ dnorm(0, 0.01);
  }
  # Estimate the initial state vector of population abundances
  for(i in 1:nSites) {
    X[1,i] ~ dnorm(3, 0.01);  # vague normal prior
  }
  # Autoregressive process for remaining years
  for(i in 2:nYears) {
    for(j in 1:nSites) {
      predX[i,j] <- X[i-1,j] + U[j];
      X[i,j] ~ dnorm(predX[i,j], tauQ[j]);
    }
  }
  # Observation model
  tauR ~ dgamma(0.001, 0.001);
  for(i in 1:nYears) {
    for(j in 1:nSites) {
      Y[i,j] ~ dnorm(X[i,j], tauR);
    }
  }
}
", file = "normal_independent.txt")
```

Posterior estimates: sd(U) and CV(U) for each population under the fixed and random effects models (table of values shown on the slide, not reproduced in the transcript).

Random effects on other parameters
– Coefficients for covariates: temperature, ocean acidification, contaminants, species interactions (e.g. shared across systems)
– x0 (initial states): is there a common initial state amongst populations?

Random effects on variances
More difficult because:
– Variances are not normally distributed
– Variances are constrained to be > 0
Options: normal random effects in log-space, or a non-normal distribution
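One concrete version of "normal random effects in log-space" is to place a normal distribution on log(sigma), which keeps every variance positive while still sharing information across populations. A Python sketch of the generative side (values illustrative):

```python
import numpy as np

rng = np.random.default_rng(5)
n_pops = 8

# Hierarchical prior on per-population process SDs via log-scale random effects:
#   log(sigma_i) ~ Normal(mu_log_sigma, tau)
mu_log_sigma, tau = np.log(0.2), 0.4
log_sigma = rng.normal(mu_log_sigma, tau, size=n_pops)

sigma = np.exp(log_sigma)   # guaranteed > 0; marginally lognormal (right-skewed)
Q = sigma**2                # per-population process variances
```

The price is that the implied distribution on the variance scale is skewed, which is the "non-normal distribution" trade-off the slide mentions.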

DLM parameters can also be treated hierarchically
– In the week 6 lab, we modeled the level and a covariate effect as potentially time-varying
– There was no real hierarchical structure there, because we focused on each series in isolation

Species with correlated dynamics

Compare:
– Univariate DLMs fit to each time series
– A hierarchical DLM with shared / correlated level terms
Is there a shared regime change / trend?

The code (also see comment box):

```r
jagsscript = cat("
model {
  # time varying level parameter
  tauQ ~ dgamma(0.001, 0.001);
  tauR ~ dgamma(0.001, 0.001);
  sigmaQ <- 1/sqrt(tauQ);
  sigmaR <- 1/sqrt(tauR);
  alpha[1] ~ dnorm(0, 0.01);
  for(i in 2:N) {
    alpha[i] ~ dnorm(alpha[i-1], tauQ);
  }
  for(i in 1:N) {
    Y[i] ~ dnorm(alpha[i], tauR);
  }
}
", file = "univariateDLM.txt")
model.loc = "univariateDLM.txt"
```

Univariate DLM

Diagnostics

Code for the multivariate DLM:

```r
jagsscript = cat("
model {
  # time varying level parameters, with correlated process errors
  priorQ[1,1] <- 1; priorQ[2,2] <- 1;
  priorQ[1,2] <- 0; priorQ[2,1] <- 0;
  tauQ ~ dwish(priorQ, 2);
  tauR ~ dgamma(0.001, 0.001);
  sigmaQ <- inverse(tauQ[1:2,1:2]);
  sigmaR <- 1/sqrt(tauR);
  for(pop in 1:2) {
    alpha[1,pop] ~ dnorm(0, 0.01);
  }
  for(i in 2:N) {
    alpha[i,1:2] ~ dmnorm(alpha[i-1,1:2], tauQ);
  }
  for(i in 1:N) {
    Y[i,1] ~ dnorm(alpha[i,1], tauR);
    Y[i,2] ~ dnorm(alpha[i,2], tauR);
  }
}
", file = "DLMcorrelated.txt")
```

Fits are still great – but how do we compare the univariate vs. multivariate models?

High uncertainty with missing data

Summary
– Treating the response variable hierarchically increases flexibility
– Modeling parameters hierarchically:
  – Can lead to improved precision of estimates
  – Increased complexity can lead to better predictions
  – But these approaches require lots of data