Hierarchical models

Hierarchical with respect to:
– Response being modeled: outliers, zeros
– Parameters in the model: trends (Us), interactions (Bs), variances (R, Q)

Models for outliers

Ecological process variance may be asymmetric:
– What's the upper bound on positive changes in population size between t and t + 1?
– What's the lower bound on negative changes in population size between t and t + 1?
Many asset-return models also have fatter tails than a normal distribution.

1. Use non-normal errors
The Student-t distribution is one alternative:
– Can be written as a scale mixture of normal distributions
– Used in finance (Harvey et al. 1994)
(plot via Wikipedia)
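The "mixture of normals" property is easy to demonstrate by simulation: a Student-t draw is a normal draw whose variance is itself drawn from an inverse-gamma distribution. A sketch in Python (the course code is R/JAGS; names and values here are illustrative):

```python
import numpy as np

rng = np.random.default_rng(42)
df = 5            # degrees of freedom of the target t-distribution
n = 100_000

# Scale mixture of normals: V ~ inverse-gamma(df/2, df/2), then x ~ Normal(0, V).
# Marginally, x is Student-t with df degrees of freedom.
variances = 1.0 / rng.gamma(shape=df / 2, scale=2.0 / df, size=n)
t_samples = rng.normal(0.0, np.sqrt(variances))

# Compare with direct t draws: both have variance df/(df-2) = 5/3
direct = rng.standard_t(df, size=n)
```

Because the heavy tails come from occasionally drawing a large variance, this representation is also what makes the t easy to implement in Gibbs samplers.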

2. Use a mixture distribution model
– Involves specifying the distribution yourself
– Distribution is a composite of 2 or more components
– Several ways to do this

Example 1: catastrophes
Time series of pinniped pup counts, from Ward et al. 2008

Model each year as catastrophe or not:
– f() represents a normal distribution, with a unique mean and variance for each component
– I represents an indicator function (1 = normal, 0 = catastrophe)

Use the categorical sampler in JAGS:

```
p[1] ~ dunif(0,1);
p[2] <- 1 - p[1];
isCat ~ dcat(p[1:2])
```

Estimate the mean and variance of the process variations in each year
– Constrain the catastrophe variance to be greater than the regular variability
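The generative side of this catastrophe mixture is straightforward to simulate. A sketch in Python (parameter values are made up for illustration, not taken from the lecture):

```python
import numpy as np

rng = np.random.default_rng(1)
n_years = 200
p_cat = 0.1                              # Pr(a year is a catastrophe)
mu, sd_normal, sd_cat = 0.05, 0.1, 0.5   # catastrophe years get a much larger variance

# Indicator: 1 = normal year, 0 = catastrophe (matching the slide's convention)
I = (rng.random(n_years) > p_cat).astype(int)
sd = np.where(I == 1, sd_normal, sd_cat)
mean = np.where(I == 1, mu, -0.5)        # catastrophes pull growth sharply down

growth = rng.normal(mean, sd)            # process deviation in each year
log_pop = 3 + np.cumsum(growth)          # random walk with drift on log-abundance
```

Fitting the mixture then amounts to inferring the indicators I and the two components' means and variances from the observed series.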

This model doesn't have an autoregressive property, but it's easy to add one. Instead of treating the probability of a catastrophic year as constant, we could estimate:
– Pr(catastrophe in year t | no catastrophe in year t-1)
– Pr(no catastrophe in year t | catastrophe in year t-1)

Why include an autoregressive process?
– Is it biologically supported?
– Improved future predictions: if we estimate the 'state' at time t (0 or 1), both states aren't equally likely at time t+1
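Making the catastrophe state autoregressive means putting a 2-state Markov chain on the indicator. A Python sketch with made-up transition probabilities:

```python
import numpy as np

rng = np.random.default_rng(7)
n_years = 500
# Transition probabilities (illustrative values):
p_cat_given_normal = 0.05   # Pr(catastrophe in year t | not in t-1)
p_normal_given_cat = 0.60   # Pr(not a catastrophe in year t | catastrophe in t-1)

state = np.zeros(n_years, dtype=int)     # 0 = normal, 1 = catastrophe
for t in range(1, n_years):
    if state[t - 1] == 0:
        state[t] = int(rng.random() < p_cat_given_normal)
    else:
        state[t] = int(rng.random() >= p_normal_given_cat)

# Catastrophes now cluster in time: given year t is a catastrophe, year t+1
# is a catastrophe with probability 0.40, well above the stationary rate
# p_cn / (p_cn + p_nc) = 0.05 / 0.65 ≈ 0.08.
```

This is exactly the structure the slide describes: the two conditional probabilities replace the single constant catastrophe probability.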

Example 2: extreme fisheries catches
Thorson et al. – "extreme catch events": some schooling / shoaling species are caught in huge aggregations

Slightly different formulation:
– f() represents a normal distribution, with a unique mean and variance for each component
– (1 - p) represents the contribution of the extreme component

Hierarchical models for zeros
Common approaches to dealing with zeros:
– Add a small value (arbitrary)
– Use a negative binomial or Poisson distribution
– Use a 2-step distribution (delta-GLM)

1. Use a Tweedie distribution
– A Poisson mixture of Gamma distributions (compound Poisson-Gamma)
– The number of mixture components is random
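In its compound Poisson-Gamma form, a Tweedie draw is the sum of a Poisson-distributed number of Gamma components, so exact zeros occur whenever the Poisson count is zero — no ad-hoc small value needed. A Python sketch (parameter values illustrative):

```python
import numpy as np

rng = np.random.default_rng(3)
n = 10_000
lam = 1.2                 # Poisson mean: expected number of Gamma components
shape, scale = 2.0, 0.5   # Gamma parameters for each component

counts = rng.poisson(lam, size=n)   # random number of mixture components per draw
y = np.array([rng.gamma(shape, scale, k).sum() for k in counts])  # sum of 0 draws is 0.0

# Exact zeros arise naturally: Pr(y = 0) = exp(-lam) = exp(-1.2) ≈ 0.30
frac_zero = (y == 0).mean()
```

The continuous positive part and the point mass at zero come from one distribution, which is the appeal over two-step (delta-GLM) approaches.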

2. Implement a mixture model
– A 2-component mixture, like the mixture model for catastrophes
– For continuous data: (mixture density shown on the slide, not reproduced in the transcript)

For count data
– Applications: mark-recapture, sighting histories, presence-absence, etc.
– Zeros are possible (Bernoulli, Poisson, negative binomial, etc.)
– z = indicator function
References: Kery & Schaub 2012, Royle et al. 2005, etc.
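For counts, the two-component idea gives a zero-inflated model: the indicator z says whether an observation comes from the structural-zero component or from the count distribution. A zero-inflated Poisson sketch in Python (parameter values illustrative):

```python
import numpy as np

rng = np.random.default_rng(11)
n = 20_000
pi = 0.3       # Pr(structural zero), e.g. a site that is never occupied
lam = 4.0      # Poisson mean when the process is "on"

z = rng.random(n) >= pi                  # z = 1: draw from the Poisson component
y = np.where(z, rng.poisson(lam, n), 0)  # z = 0: forced zero

# Zeros come from two sources, structural (1 - z) and sampling (Poisson zeros):
#   Pr(y = 0) = pi + (1 - pi) * exp(-lam)
expected_zero = pi + (1 - pi) * np.exp(-lam)
```

Mark-recapture and occupancy models (Kery & Schaub 2012) have exactly this structure, with z as the latent occupancy or alive/dead state.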

Hierarchical with respect to:
– Response being modeled: outliers, zeros
– Parameters in the model: trends (Us), interactions (Bs), variances (R, Q)

Hierarchical models
Multivariate time series with MARSS:
– We've already fit models that can be considered hierarchical – shared parameters across time series as fixed effects
– Z = (1, 2, 1, 1, 2, 3), U = "equal", R and Q = "diagonal and equal"

Random effects
– Assumes a shared distribution for the 'population' of trends
– Upside / downside: increased complexity

Comparison of fixed versus random effects
– Same trend model applied to the harbor seal data we discussed last week (10 time series)
– R = "diagonal and equal", Q = "diagonal and equal"
– In the Bayesian model, compare inference from the fixed vs. random model for U
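The qualitative difference between the fixed- and random-effects versions of U is shrinkage: random effects pull each population's trend toward the shared mean, more strongly for noisier estimates. A toy normal-normal sketch in Python (not the lecture's JAGS model; all numbers are hypothetical):

```python
import numpy as np

# Hypothetical per-population trend estimates (u_hat) and their sampling SDs
u_hat = np.array([0.08, -0.02, 0.05, 0.12, -0.06, 0.03, 0.07, 0.01, 0.09, -0.01])
se = np.array([0.02, 0.05, 0.03, 0.06, 0.08, 0.02, 0.04, 0.03, 0.05, 0.07])

mu = u_hat.mean()   # shared mean trend
tau2 = 0.002        # assumed between-population variance of the true trends

# Posterior mean under a normal-normal model: a precision-weighted average
# that moves u_hat toward mu, with more shrinkage when se is large
w = tau2 / (tau2 + se**2)
u_shrunk = w * u_hat + (1 - w) * mu
```

Every shrunken estimate lies between its raw value and the shared mean — this borrowing of strength is where the improved precision in the posterior table comes from.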

Model file (from last week):

```r
jagsscript = cat("
model {
  # Populations are independent, so the Q matrix is diagonal. We assume
  # B = 1, and there are no scaling (A) parameters because each time series = 1 state.
  # Unlike MARSS, we can model the trends (U) as random effects - meaning we
  # estimate a shared mean and sd across populations, plus the deviations from that mean
  Umu ~ dnorm(0, 1);
  Usig ~ dunif(0, 100);
  Utau <- 1/(Usig*Usig);
  for(i in 1:nSites) {
    tauQ[i] ~ dgamma(0.001, 0.001);
    Q[i] <- 1/tauQ[i];
    U[i] ~ dnorm(Umu, Utau);  # For fixed effects, use U[i] ~ dnorm(0, 0.01);
  }
  # Estimate the initial state vector of population abundances
  for(i in 1:nSites) {
    X[1,i] ~ dnorm(3, 0.01);  # vague normal prior
  }
  # Autoregressive process for remaining years
  for(i in 2:nYears) {
    for(j in 1:nSites) {
      predX[i,j] <- X[i-1,j] + U[j];
      X[i,j] ~ dnorm(predX[i,j], tauQ[j]);
    }
  }
  # Observation model
  tauR ~ dgamma(0.001, 0.001);
  for(i in 1:nYears) {
    for(j in 1:nSites) {
      Y[i,j] ~ dnorm(X[i,j], tauR);
    }
  }
}
", file = "normal_independent.txt")
```

Posterior estimates: sd(U) and CV(U) for each population under the fixed and random effects models (table of values shown on the slide, not reproduced in the transcript).

Random effects on other parameters
– Coefficients for covariates: temperature, ocean acidification, contaminants, species interactions (e.g. shared across systems)
– x0 (initial states): is there a common initial state amongst populations?

Random effects on variances
More difficult because:
– Variances are not normally distributed
– Variances are constrained to be > 0
Options: normal random effects in log-space, or a non-normal distribution
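One concrete version of "normal random effects in log-space" is to place a normal distribution on log(sigma), which keeps every variance positive while still sharing information across populations. A Python sketch of the generative side (values illustrative):

```python
import numpy as np

rng = np.random.default_rng(5)
n_pops = 8

# Hierarchical prior on per-population process SDs via log-scale random effects:
#   log(sigma_i) ~ Normal(mu_log_sigma, tau)
mu_log_sigma, tau = np.log(0.2), 0.4
log_sigma = rng.normal(mu_log_sigma, tau, size=n_pops)

sigma = np.exp(log_sigma)   # guaranteed > 0; marginally lognormal (right-skewed)
Q = sigma**2                # per-population process variances
```

The price is that the implied distribution on the variance scale is skewed, which is the "non-normal distribution" trade-off the slide mentions.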

DLM parameters can also be treated hierarchically
– In the week 6 lab, we modeled the level and a covariate effect as potentially time-varying
– There was no real hierarchical structure there, because we focused on each series in isolation

Species with correlated dynamics

Compare:
– Univariate DLMs fit to each time series
– A hierarchical DLM with shared / correlated level terms
Is there a shared regime change / trend?

The code (also see comment box):

```r
jagsscript = cat("
model {
  # time varying level parameter
  tauQ ~ dgamma(0.001, 0.001);
  tauR ~ dgamma(0.001, 0.001);
  sigmaQ <- 1/sqrt(tauQ);
  sigmaR <- 1/sqrt(tauR);
  alpha[1] ~ dnorm(0, 0.01);
  for(i in 2:N) {
    alpha[i] ~ dnorm(alpha[i-1], tauQ);
  }
  for(i in 1:N) {
    Y[i] ~ dnorm(alpha[i], tauR);
  }
}
", file = "univariateDLM.txt")
model.loc = "univariateDLM.txt"
```

Univariate DLM

Diagnostics

Code for the multivariate DLM:

```r
jagsscript = cat("
model {
  # time varying level parameters, with correlated process errors
  priorQ[1,1] <- 1; priorQ[2,2] <- 1;
  priorQ[1,2] <- 0; priorQ[2,1] <- 0;
  tauQ ~ dwish(priorQ, 2);
  tauR ~ dgamma(0.001, 0.001);
  sigmaQ <- inverse(tauQ[1:2,1:2]);
  sigmaR <- 1/sqrt(tauR);
  for(pop in 1:2) {
    alpha[1,pop] ~ dnorm(0, 0.01);
  }
  for(i in 2:N) {
    alpha[i,1:2] ~ dmnorm(alpha[i-1,1:2], tauQ);
  }
  for(i in 1:N) {
    Y[i,1] ~ dnorm(alpha[i,1], tauR);
    Y[i,2] ~ dnorm(alpha[i,2], tauR);
  }
}
", file = "DLMcorrelated.txt")
```

Fits are still great – but how do we compare the univariate vs. multivariate models?

High uncertainty with missing data

Summary
– Treating the response variable hierarchically increases flexibility
– Modeling parameters hierarchically:
  – Can lead to improved precision of estimates
  – Increased complexity can lead to better predictions
  – But these approaches require lots of data