An introduction to Dynamic Factor Analysis

An introduction to Dynamic Factor Analysis
Mark Scheuerell
FISH 507 – Applied Time Series Analysis
9 February 2017

Iterative approach to model building:
1. Postulate a general class of models
2. Identify a candidate model
3. Estimate its parameters
4. Run diagnostics: is the model adequate? If not, return to step 2
5. Use the model for forecasting or control

Observations of one or more states. We have learned several forms of models for analyzing time series. The general idea is to reduce a time series to trend(s), seasonal effect(s), and a stationary remainder. When we have multiple time series, we have modeled them as observations of one or more states (see Lecture 6).

Observations of one or more states. For example, recall our analyses of Pacific harbor seals, in which 5 observed time series were treated as observations of 3 “true” populations.

Finding common patterns in data. What if our observations were not simply of 1 state, but were instead a mixture of 2+ states (e.g., we sampled a haul-out in between 2 breeding sites)?

Observations of one or more states. Returning to our analyses of Pacific harbor seals: 3 “true” populations and 5 observed time series.

Finding common patterns in data. What if our observations were not simply of 1 state, but were instead a mixture of 2+ states (e.g., we sampled a haul-out in between 2 breeding sites)? What if some unknown & unmeasured environmental drivers created common patterns among our time series (e.g., adult salmon in the North Pacific)? We know how many observations we have, but we don't know the number of states! So, what are the dimensions of Z?

Finding common patterns in data. What if our observations were not simply of 1 state, but were instead a mixture of 2+ states (e.g., we sampled a haul-out in between 2 breeding sites)? What if some unknown & unmeasured environmental drivers created common patterns among our time series (e.g., adult salmon in the North Pacific)? Perhaps we could use a few common trends to describe much or most of the variance in the observations? Dynamic Factor Analysis (DFA) is an approach to time series modeling that does just that.

Let's start with PCA. PCA stands for Principal Component Analysis. The goal is to reduce a set of correlated variables to a smaller number of uncorrelated values, and the number of principal components retained is generally less than the number of original variables.
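
As a quick refresher, here is a minimal sketch of PCA in base R using prcomp() on simulated correlated data; the simulation and variable names are illustrative only, not part of the lecture.

```r
set.seed(123)

## simulate 3 correlated variables driven by 1 common signal
n <- 50
common <- cumsum(rnorm(n))
dat_pca <- cbind(y1 = common + rnorm(n, sd = 0.5),
                 y2 = 0.8 * common + rnorm(n, sd = 0.5),
                 y3 = -0.6 * common + rnorm(n, sd = 0.5))

## PCA on centered & scaled data
pca <- prcomp(dat_pca, center = TRUE, scale. = TRUE)

## proportion of variance explained by each component
summary(pca)

## loadings and scores for the first 2 principal components
pca$rotation[, 1:2]
head(pca$x[, 1:2])
```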

A graphical example

Adding in the first 2 PCs

And rotating the basis

What exactly is DFA? It's like PCA for time series. DFA can indicate whether there are any underlying common patterns or trends in the time series, whether there are interactions between the response variables, and what the effects of explanatory variables are. The mathematics are a bit complex; for details, see Zuur et al. (2003) in Environmetrics.

DFA in common terms: Data = Trend(s) + Covariate(s) + Errors

DFA in matrix form. The state equation describes the evolution of the common trends over time:

x_t = x_{t-1} + w_t, where w_t ~ MVN(0, Q)

The observation equation relates the trends (x) to the observations (y) via the loadings matrix Z:

y_t = Z x_t + a + v_t, where v_t ~ MVN(0, R)
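
In R, models of this form are commonly fit with the MARSS package. Below is a minimal sketch of fitting a DFA with 2 common trends, assuming dat is an n x T matrix with one time series per row (the object names are illustrative).

```r
library(MARSS)

## dat: n x T matrix, one row per observed time series
## 2 common trends; one shared observation variance
mod_list <- list(m = 2, R = "diagonal and equal")
fit <- MARSS(dat, model = mod_list, form = "dfa", z.score = TRUE)

## estimated loadings (Z) and common trends (x)
Z_est  <- coef(fit, type = "matrix")$Z
trends <- fit$states
```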

DFA with covariates. The state equation for the common trends is unchanged:

x_t = x_{t-1} + w_t, where w_t ~ MVN(0, Q)

The observation equation relates the trends (x) to the observations (y) via Z, and the covariates (d) to y via D:

y_t = Z x_t + a + D d_t + v_t, where v_t ~ MVN(0, R)
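
If memory serves, the MARSS DFA form accepts covariates through a covariates argument; treat this sketch as an assumption to verify against the MARSS User Guide. Here covars is a hypothetical p x T matrix whose columns line up with the columns of dat.

```r
## covars: hypothetical p x T covariate matrix (one row per covariate)
mod_list <- list(m = 2, R = "diagonal and unequal")
fit_cov <- MARSS(dat, model = mod_list, form = "dfa",
                 z.score = TRUE, covariates = covars)

## estimated covariate effects (the D matrix)
coef(fit_cov, type = "matrix")$D
```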

Examining the Z matrix. In the observation equation, how many states underlie the data: 1 state? 2 states? 3 states? We will use model selection criteria to decide.

Relationship between PCA & DFA. The similarity with PCA can be seen via the covariance structure implied by the loadings Z and the error covariance R. In PCA, however, R is constrained to be diagonal; not so in DFA (Zuur et al. 2003).

Various forms for R:
- diagonal & equal
- equal variance & covariance
- diagonal & unequal
- spp-specific variances & covariances
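
These structures map onto MARSS's shortcut strings for R; the sketch below lists my assumed mapping, which is worth double-checking against the MARSS documentation.

```r
## candidate observation error structures for R in MARSS
R_options <- c("diagonal and equal",    # diagonal & equal
               "diagonal and unequal",  # diagonal & unequal
               "equalvarcov",           # equal variance & covariance
               "unconstrained")         # spp-specific variances & covariances
```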

Some caveats in fitting DFA models. As written, the state and observation equations admit infinite combinations of Z and x that fit the data equally well, so the model is not identifiable (Harvey et al. 1989; Zuur et al. 2003).

Some caveats in fitting DFA models. To address this, we: 1) set Q equal to the identity matrix in the state equation, and 2) constrain portions of Z and a in the observation equation (Harvey et al. 1989; Zuur et al. 2003).

Constraining the a vector. Portions of a in the observation equation are fixed rather than estimated (illustrated on the slide for n = 5 and m = 3) (Harvey et al. 1989; Zuur et al. 2003).

Constraining the a vector (e.g., n = 5, m = 3). Note: this approach causes the EM algorithm to take a very long time to converge. Solution: we will demean our data and set a = 0 (Harvey et al. 1989; Zuur et al. 2003).
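
A minimal sketch of demeaning (or z-scoring) the data before fitting, assuming dat is an n x T matrix with one series per row; object names are illustrative.

```r
## subtract each row's mean so that a can be fixed at 0
dat_demeaned <- dat - rowMeans(dat, na.rm = TRUE)

## or z-score each series (demean and scale to unit variance);
## zscore() is provided by the MARSS package
dat_z <- zscore(dat)
```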

Constraining the Z matrix. Portions of Z in the observation equation are also fixed (illustrated on the slide for n = 5 and m = 3): specifically, the elements above the diagonal in the first m rows of Z are set to zero (Harvey et al. 1989; Zuur et al. 2003).

Rotation matrix H. We arbitrarily constrained Z to obtain just one of the infinitely many equivalent solutions to the observation equation, y_t = Z x_t + a + v_t (1) (Harvey et al. 1989).

Rotation matrix H. If H is any m x m non-singular matrix, these two models are equivalent:

y_t = Z x_t + a + v_t (1)
y_t = (Z H^{-1})(H x_t) + a + v_t (2)

We need to choose an appropriate H; we will use the “varimax” rotation (Harvey et al. 1989).

Varimax rotation for H. A “simple” solution means that each factor has a small number of large loadings and a large number of (near) zero loadings. After varimax rotation, each original variable tends to be associated with one (or a small number) of the factors. Varimax searches for a linear combination of the original factors that maximizes the variance of the loadings.
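
With MARSS, the rotation is typically applied to the estimated loadings using base R's varimax() function. Below is a minimal sketch using the fitted object fit from the earlier sketch (object names are illustrative; varimax requires at least 2 trends).

```r
## estimated loadings matrix from the fitted DFA
Z_est <- coef(fit, type = "matrix")$Z

## varimax rotation matrix (stats::varimax)
H_inv <- varimax(Z_est)$rotmat

## rotate the loadings and the corresponding trends
Z_rot      <- Z_est %*% H_inv
trends_rot <- solve(H_inv) %*% fit$states
```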

Varimax rotation for H. For example, PCA on wine descriptions.

Varimax rotation for H. After varimax rotation.

A note of caution for model selection. Some people compare data support for models with and without covariates, and with varying numbers of trends, but this has to be done in a well-reasoned manner. The trends are undetermined random walks, whereas a covariate is predetermined, so unless d_t is very highly correlated with y_t, the trend-only models will be selected.

Arctic-Yukon-Kuskokwim salmon. Poor returns of Chinook & chum salmon in the AYK region over the past decade have led to severe restrictions on commercial & subsistence harvest, and to repeated disaster declarations by the state and federal governments (nobody fished in 2012!). In response, Native regional organizations and state and federal agencies formed an innovative partnership to cooperatively address the problems (the AYK SSI).

The data: 15 stocks of Alaskan Chinook salmon; brood years 1976-2005; the response is log(recruits/spawner); covariates lagged by 1-5 years.
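
A minimal sketch of how the response and lagged covariates might be constructed in R; the data frame salmon, its column names, and the covariate vector cov_ts are hypothetical stand-ins, not the actual AYK data set.

```r
## salmon: hypothetical data frame with columns stock, brood_year, spawners, recruits
salmon$logRS <- log(salmon$recruits / salmon$spawners)

## cov_ts: hypothetical annual covariate (e.g., an ocean index), named by year
## build covariates lagged 1-5 years after each brood year
lags <- sapply(1:5, function(k) cov_ts[as.character(salmon$brood_year + k)])
colnames(lags) <- paste0("lag", 1:5)
```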

Environmental indicators

Examples of some indicators

The analysis. We varied the number of states/trends from 1 to 3 and varied the form of R among: diagonal and equal, diagonal and unequal, and equal variances and covariances. We used AICc to select the “best” model.
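
A minimal sketch of that model-comparison loop with MARSS, assuming dat holds the z-scored log(recruits/spawner) time series as an n x T matrix (object names and control settings are illustrative).

```r
library(MARSS)

R_structs <- c("diagonal and equal", "diagonal and unequal", "equalvarcov")
models <- expand.grid(m = 1:3, R = R_structs, stringsAsFactors = FALSE)
models$AICc <- NA

for (i in seq_len(nrow(models))) {
  mod_list <- list(m = models$m[i], R = models$R[i])
  fit_i <- MARSS(dat, model = mod_list, form = "dfa", z.score = TRUE,
                 silent = TRUE, control = list(maxit = 3000))
  models$AICc[i] <- fit_i$AICc
}

## most parsimonious model = lowest AICc
models[order(models$AICc), ]
```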

Statewide results. The most parsimonious model had 2 common trends and the North Pacific Gyre Oscillation as a covariate.

Statewide results

Statewide results – covariate effect

Regional results

Regional results – covariate effect

Fits to data

Forecasting ability

Topics for lab: fitting DFA models without covariates; fitting DFA models with covariates; doing factor rotations.