What is the likelihood that your model is wrong? Generalized tests and corrections for overdispersion during model fitting and exploration James Thorson,

Slides:



Advertisements
Similar presentations
Modeling Recruitment in Stock Synthesis
Advertisements

Evaluation of standard ICES stock assessment and Bayesian stock assessment in the light of uncertainty: North Sea herring as an example Samu MäntyniemiFEM,
458 Quantifying Uncertainty using Classical Methods (Likelihood Profile, Bootstrapping) Fish 458, Lecture 12.
Gavin Using size increment data in age-structured stock assessment models CAPAM growth workshop: Nov 2014.
Modeling fisheries and stocks spatially for Pacific Northwest Chinook salmon Rishi Sharma, CRITFC Henry Yuen, USFWS Mark Maunder, IATTC.
Estimating Growth Within Size-Structured Fishery Stock Assessments ( What is the State of the Art and What does the Future Look Like? ) ANDRÉ E PUNT, MALCOLM.
An evaluation of alternative binning approaches for composition data in integrated stock assessments Cole Monnahan, Sean Anderson, Felipe Hurtado, Kotaro.
Growth in Age-Structured Stock Assessment Models R.I.C. Chris Francis CAPAM Growth Workshop, La Jolla, November 3-7, 2014.
C3: Estimation of size-transition matrices with and without molt probability for Alaska golden king crab using tag–recapture data M.S.M. Siddeek, J. Zheng,
Dealing with interactions between area and year Mark Maunder IATTC.
The current status of fisheries stock assessment Mark Maunder Inter-American Tropical Tuna Commission (IATTC) Center for the Advancement of Population.
Jun Zhu Dept. of Comp. Sci. & Tech., Tsinghua University This work was done when I was a visiting researcher at CMU. Joint.
Visual Recognition Tutorial
Resampling techniques Why resampling? Jacknife Cross-validation Bootstrap Examples of application of bootstrap.
458 More on Model Building and Selection (Observation and process error; simulation testing and diagnostics) Fish 458, Lecture 15.
Introduction to template model builder, an improved tool for maximum likelihood and mixed-effects models in R James Thorson.
CSC321: Introduction to Neural Networks and Machine Learning Lecture 20 Learning features one layer at a time Geoffrey Hinton.
Hui-Hua Lee 1, Kevin R. Piner 1, Mark N. Maunder 2 Evaluation of traditional versus conditional fitting of von Bertalanffy growth functions 1 NOAA Fisheries,
Age-structured assessment of three Aleutian fish stocks with predator-prey interactions Doug Kinzey School of Aquatic and Fishery Sciences University of.
Applications of Nonparametric Survey Regression Estimation in Aquatic Resources F. Jay Breidt, Siobhan Everson-Stewart, Alicia Johnson, Jean D. Opsomer.
Scot Exec Course Nov/Dec 04 Ambitious title? Confidence intervals, design effects and significance tests for surveys. How to calculate sample numbers when.
Single and Multiple Spell Discrete Time Hazards Models with Parametric and Non-Parametric Corrections for Unobserved Heterogeneity David K. Guilkey.
Gaussian process modelling
Kalman filtering techniques for parameter estimation Jared Barber Department of Mathematics, University of Pittsburgh Work with Ivan Yotov and Mark Tronzo.
Guide to Handling Missing Information Contacting researchers Algebraic recalculations, conversions and approximations Imputation method (substituting missing.
Estimating parameters in a statistical model Likelihood and Maximum likelihood estimation Bayesian point estimates Maximum a posteriori point.
Term 4, 2005BIO656 Multilevel Models1 Hierarchical Models for Pooling: A Case Study in Air Pollution Epidemiology Francesca Dominici.
Use of multiple selectivity patterns as a proxy for spatial structure Felipe Hurtado-Ferro 1, André E. Punt 1 & Kevin T. Hill 2 1 University of Washington,
Multilevel Data in Outcomes Research Types of multilevel data common in outcomes research Random versus fixed effects Statistical Model Choices “Shrinkage.
Spatial issues in WCPO stock assessments (bigeye and yellowfin tuna) Simon Hoyle SPC.
A retrospective investigation of selectivity for Pacific halibut CAPAM Selectivity workshop 14 March, 2013 Ian Stewart & Steve Martell.
Evaluation of a practical method to estimate the variance parameter of random effects for time varying selectivity Hui-Hua Lee, Mark Maunder, Alexandre.
Biostatistics: Sampling strategies Data collection for fisheries assessment: Monitoring and sampling strategies.
Maximum Likelihood - "Frequentist" inference x 1,x 2,....,x n ~ iid N( ,  2 ) Joint pdf for the whole random sample Maximum likelihood estimates.
Integrating archival tag data into stock assessment models.
The Stock Synthesis Approach Based on many of the ideas proposed in Fournier and Archibald (1982), Methot developed a stock assessment approach and computer.
Workshop on Stock Assessment Methods 7-11 November IATTC, La Jolla, CA, USA.
Simulated data sets Extracted from:. The data sets shared a common time period of 30 years and age range from 0 to 16 years. The data were provided to.
Flexible estimation of growth transition matrices: pdf parameters as non-linear functions of body length Richard McGarvey and John Feenstra CAPAM Workshop,
Modeling Growth in Stock Synthesis Modeling population processes 2009 IATTC workshop.
U.S. Department of Commerce | National Oceanic and Atmospheric Administration | NOAA Fisheries | Page 1 Model Misspecification and Diagnostics and the.
The effect of variable sampling efficiency on reliability of the observation error as a measure of uncertainty in abundance indices from scientific surveys.
Extending length-based models for data-limited fisheries into a state-space framework Merrill B. Rudd* and James T. Thorson *PhD Student, School of Aquatic.
Classification Ensemble Methods 1
September 28, 2000 Improved Simultaneous Data Reconciliation, Bias Detection and Identification Using Mixed Integer Optimization Methods Presented by:
Modeling biological-composition time series in integrated stock assessments: data weighting considerations and impact on estimates of stock status P. R.
Simulation of methods to account for spatial effects in the stock assessment of Pacific bluefin tuna Cast by: Hui-hua Lee (NOAA Fisheries, SWFSC) Kevin.
Using distributions of likelihoods to diagnose parameter misspecification of integrated stock assessment models Jiangfeng Zhu * Shanghai Ocean University,
Sampling Designs Outline
CAN DIAGNOSTIC TESTS HELP IDENTIFY WHAT MODEL STRUCTURE IS MISSPECIFIED? Felipe Carvalho 1, Mark N. Maunder 2,3, Yi-Jay Chang 1, Kevin R. Piner 4, Andre.
Capture-recapture Models for Open Populations “Single-age Models” 6.13 UF-2015.
Some Insights into Data Weighting in Integrated Stock Assessments André E. Punt 21 October 2015 Index-1 length-4.
Lecture 10 review Spatial sampling design –Systematic sampling is generally better than random sampling if the sampling universe has large-scale structure.
Anders Nielsen Technical University of Denmark, DTU-Aqua Mark Maunder Inter-American Tropical Tuna Commission An Introduction.
A bit of history Fry 1940s: ”virtual population”, “catch curve”
Hypothesis Testing. Statistical Inference – dealing with parameter and model uncertainty  Confidence Intervals (credible intervals)  Hypothesis Tests.
Data weighting and data conflicts in fishery stock assessments Chris Francis Wellington, New Zealand CAPAM workshop, “ Data conflict and weighting, likelihood.
Survey sampling Outline (1 hr) Survey sampling (sources of variation) Sampling design features Replication Randomization Control of variation Some designs.
Density Estimation in R Ha Le and Nikolaos Sarafianos COSC 7362 – Advanced Machine Learning Professor: Dr. Christoph F. Eick 1.
Day 4, Session 1 Abundance indices, CPUE, and CPUE standardisation
Population Dynamics and Stock Assessment of Red King Crab in Bristol Bay, Alaska Jie Zheng Alaska Department of Fish and Game Juneau, Alaska, USA.
Stockholm, Sweden24-28 March 2014 Introduction to MAR(1) models for estimating community interactions This lecture follows largely from: Ives AR, Dennis.
Ch 1. Introduction Pattern Recognition and Machine Learning, C. M. Bishop, Updated by J.-H. Eom (2 nd round revision) Summarized by K.-I.
An Introduction to AD Model Builder PFRP
CSC321: Neural Networks Lecture 22 Learning features one layer at a time Geoffrey Hinton.
Boosting and Additive Trees (2)
Linear Mixed Models in JMP Pro
How to handle missing data values
Where did we stop? The Bayes decision rule guarantees an optimal classification… … But it requires the knowledge of P(ci|x) (or p(x|ci) and P(ci)) We.
JABBA-Select: Simulation-Testing Henning Winker
Presentation transcript:

What is the likelihood that your model is wrong? Generalized tests and corrections for overdispersion during model fitting and exploration James Thorson, Kelli Johnson, Richard Methot, and Ian Taylor Oct. 20, 2015 NWFSC (Seattle)

Outline Likelihoods and random processes Surplus production model Steps for real-world assessments 1.Standardizing compositional data for input sample size 2.Estimating effective sample size 3.Exploring process errors for overdispersed fleets Plan moving forward – Probability of the model given data

Likelihoods

Implications 1.If you want to estimate fixed effects …. 2.… and there’s additional stochastic process that is unobservable… 3.… then you can estimate fixed-effects via a mixed- effects model! Benefits – Generic approach to correlations, heteroskedasticity, and heterogeneity – (Fixes most violations in statistical models)

Likelihoods Hierarchical models Why would you make a hierarchy of parameters? 1.Stein’s paradox and shrinkage – Pooling parameters towards a mean will be more accurate on average 2.Biological intuition – Formulate models based on knowledge of constituent parts 3.Variance partitioning – Separate different sources of variability (e.g., measurement errors!) More reading – Thorson, J.T., Minto, C., Mixed effects: a unifying framework for statistical modelling in fisheries biology. ICES J. Mar. Sci. J. Cons. 72, 1245–1256. –

Likelihoods

Surplus production model

Restrictions – Assume r is known – (Otherwise scale K is confounded with productivity r) Programming techniques – Explicit-F parameterization – Treat exploitation rate as random effect, with variance fixed at low value (CV=0.01) Code publicly available: Thorson/state_space_production_model

Estimating overdispersion Neglecting overdispersion Surplus production model

Accounting for overdispersion improves parameter estimates (estimate, ignore, true)

Step #1: Comp. standardization Three sampling “Strata”: latitude Difference in age-structure by depth Thorson, J.T., Standardizing compositional data for stock assessment. ICES J. Mar. Sci. J. Cons. 71, 1117–1128.

Step #1: Comp. standardization

Difference in age structure by depth Dotted: inshore Dashed: offshore Solid: combined

Step #1: Comp. standardization Four estimators Design-based Dirichlet-multinomial Normal approx. Normal w/ process error Performance for estimating proportion at age Top of panel – RMSE, low is good – (bias), close to zero is good

Step #1: Comp. standardization Estimating overdispersion Dirichlet-multinomial Normal approx. Normal w/ process error

Step #2: Estimating N effective Modify Stock Synthesis V3.3 – target release date: Jan New feature: Dirichlet-multinomial distribution – Turn-on for any fleet (fishery or survey) – Works for length/age comps – Should work for conditional age-at-length and length-at- age (but is not tested) – Allows mirroring (single parameter for multiple fleets) Useful for spatially stratified models (e.g., Canary rockfish)

Step #2: Estimating N effective

Linear effective sample size New parameter has similar action to iterative- reweighting factors Therefore… Compare its performance with McAllister-Ianelli iterative reweighting approach

Step #2: Estimating N effective Factorial design Three levels of overdispersion Three true sample sizes (N={25,100,400}) Conclusion Dirichlet-multinomial estimates N eff accurately – Small positive bias given high true sample size

Step #2: Estimating N effective Performance for estimating parameters? – Works similarly to McAllister-Ianelli method Estimation method True overdispersion

Step #2: Estimating N effective Case study: pacific hake – Four models: Unweighted, McAllister-Ianelli, Dirichlet-multinomial, no fishery ages Conclusion: Works similarly to McAllister-Ianelli

Step #2: Estimating N effective Benefits of internal estimation 1.Allows proper weighting during profiles and sensitivities – Profiles/sensitivities currently aren’t often tuned 2.Propagates uncertainty during standard errors and forecast intervals – Uncertainty in weighting currently not included in any confidence/credible intervals 3.Permits focus on other iterative model-fitting steps – Variance of process errors!

Step #3: Process errors Many methods to estimate process errors in assessment models 1.Add a penalty and “wing it” Just call it “penalized likelihood” and hope no one asks… Eye-ball fit to data Ad hoc tuning 2.First-order approximations Statistically motivated tuning (G. Thompson, pers. comm.) Sample variance plus estimation variance (Methot and Taylor 2014, CJFAS) 3.Clever model modifications Empirical weight-at-age 4.Statistical methods Laplace approximation (Thorson Hicks Methot 2014 ICESJMS) Bayesian estimation (e.g., Mäntyniemi et al CJFAS)

Step #3: Process errors Laplace approximation 1.ADMB – very slow Requires iterative re-fitting of model 2.TMB – fast and efficient Requires rebuilding code

Step #3: Process errors Template Model Builder – Permits faster estimation with more parameters Fully spatial delay-difference model – Accounts for spatial variation in spawning biomass Thorson, J.T., Ianelli, J., Munch, S., Ono, K., Spencer, P., In press. Spatial delay-difference modelling: a new approach to estimating spatial and temporal variation in recruitment and population abundance. Can. J. Fish. Aquat. Sci.

Step #3: Process errors Clever model modifications 1.Empirical weight-at-age Deals easily with time-varying growth Doesn’t account for uncertainty 2.Statistical VPA (MacCall and Teo 2013 Fish Res) Deals with time-varying selectivity Not easy in most software packages 3.Empirical maturity and fecundity schedules (I think this is still the norm most places…)

Plan moving forward Three practical steps: 1.Estimate input sample sizes from the data 2.Account for overdispersion as a parameter 3.Account for correlations via process errors Why not combine steps 2 and 3?

Plan moving forward Why not combine Steps 2 and 3? – Structure of correlation in fishery models is very complicated! Correlations among: 1.ages 2.lengths 3.sexes 4.fleets – There’s probably many ways to model correlations Different methods account for some correlations but not others! Francis (2014) account for correlations among lengths OR ages, but not simultaneously

Plan moving forward Obvious approach to modelling correlations… … is mixed effects! Benefits 1.Retains focus on modelling the process Doesn’t obfuscate correlations 2.Allows calculation of predicted compositions Hard to calculate using methods that use ad hoc corrections for correlations 3.Keeps us working with mainstream statistics Easier to work with ecologists E.g., U-CARE and E-SURGE as diagnostic for overdispersion in tag-resighting models

Plan moving forward Where can we start? Make a list of unmodeled processes that generate correlations in compositional data – Recruitment variation (usually modeled explicitly) – Time-varying selectivity (Actually caused by spatial variation in density and fishing rate) – Time-varying individual growth – Time- and age-varying natural mortality rates

Plan moving forward What if there’s multiple processes that can account for correlations? Multi-model inference 1.Model selection Only valid if each constituent model is admissible 2.Model averaging / Ensemble modelling Averaging results from multiple models 3.Multi-model decision theory Averaging decision from each model, with weights derived for performance in that decision

Plan moving forward Next steps 1.Methods and software for compositional standardization Big black-box in most assessments I’ve seen! 2.Additional research regarding internal estimation of overdispersion Does it affect estimates of uncertainty? Does it affect profiles? 3.Improved algorithms and software for mixed-effect assessment models

Acknowledgements Input on themes – Allan Hicks, Mark Maunder