Lecture 18 Varimax Factors and Empirical Orthogonal Functions


Lecture 18 Varimax Factors and Empirical Orthogonal Functions

Syllabus
Lecture 01: Describing Inverse Problems
Lecture 02: Probability and Measurement Error, Part 1
Lecture 03: Probability and Measurement Error, Part 2
Lecture 04: The L2 Norm and Simple Least Squares
Lecture 05: A Priori Information and Weighted Least Squares
Lecture 06: Resolution and Generalized Inverses
Lecture 07: Backus-Gilbert Inverse and the Trade-Off of Resolution and Variance
Lecture 08: The Principle of Maximum Likelihood
Lecture 09: Inexact Theories
Lecture 10: Nonuniqueness and Localized Averages
Lecture 11: Vector Spaces and Singular Value Decomposition
Lecture 12: Equality and Inequality Constraints
Lecture 13: L1, L∞ Norm Problems and Linear Programming
Lecture 14: Nonlinear Problems: Grid and Monte Carlo Searches
Lecture 15: Nonlinear Problems: Newton's Method
Lecture 16: Nonlinear Problems: Simulated Annealing and Bootstrap Confidence Intervals
Lecture 17: Factor Analysis
Lecture 18: Varimax Factors, Empirical Orthogonal Functions
Lecture 19: Backus-Gilbert Theory for Continuous Problems; Radon's Problem
Lecture 20: Linear Operators and Their Adjoints
Lecture 21: Fréchet Derivatives
Lecture 22: Exemplary Inverse Problems, incl. Filter Design
Lecture 23: Exemplary Inverse Problems, incl. Earthquake Location
Lecture 24: Exemplary Inverse Problems, incl. Vibrational Problems

Purpose of the Lecture
Choose factors satisfying a priori information of spikiness (varimax factors)
Use factor analysis to detect patterns in data (EOFs)

Part 1: Creating Spiky Factors

can we find "better" factors than those returned by svd()?

mathematically, S = CF = C′F′, with F′ = MF and C′ = CM⁻¹, where M is any P×P matrix with an inverse; one must rely on prior information to choose M
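A quick numerical check of this non-uniqueness (a minimal sketch; S, C, and F are assumed to come from the svd() factorization of Lecture 17, and M here is just a random, almost surely invertible, matrix):

P = size(F, 1);             % number of factors
M = rand(P, P);             % an arbitrary invertible P-by-P matrix
Fprime = M * F;             % new factors
Cprime = C / M;             % new loadings, equal to C * inv(M)
norm(S - Cprime*Fprime)     % ~0: the fit to S is unchanged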

one possible type of prior information: factors should contain mainly just a few elements

example of rocks and minerals
rocks contain minerals; minerals contain elements

Mineral      Composition
Quartz       SiO2
Rutile       TiO2
Anorthite    CaAl2Si2O8
Forsterite   Mg2SiO4

most of these minerals (the factors) are "simple" in the sense that each contains just a few elements

spiky factors: factors containing mostly just a few elements

How to quantify spikiness?

variance as a measure of spikiness

modification for factor analysis

depends on the square, so positive and negative values are treated the same
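(The slide's formula is missing from the transcript; a standard form of this measure, and presumably the one intended, is the variance of the squared factor elements:)

σ²(f²) = (1/M) Σ_{i=1}^{M} (f_i²)² − [ (1/M) Σ_{i=1}^{M} f_i² ]²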

f^(1) = [1, 0, 1, 0, 1, 0]^T is much spikier than f^(2) = [1, 1, 1, 1, 1, 1]^T

f^(2) = [1, 1, 1, 1, 1, 1]^T is just as spiky as f^(3) = [1, -1, 1, -1, -1, 1]^T
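A quick check of this measure on the three example vectors (a minimal sketch, assuming the variance-of-squared-elements definition reconstructed above):

% spikiness = variance of the squared elements of a factor
spikiness = @(f) mean(f.^4) - mean(f.^2)^2;
f1 = [1, 0, 1, 0, 1, 0]';
f2 = [1, 1, 1, 1, 1, 1]';
f3 = [1, -1, 1, -1, -1, 1]';
spikiness(f1)    % 0.25: spiky
spikiness(f2)    % 0:    not spiky
spikiness(f3)    % 0:    the squares treat +1 and -1 the same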

"varimax" procedure: find spiky factors without changing P
start with the P svd() factors
rotate pairs of them in their plane by an angle θ to maximize the overall spikiness

[Figure: the pair of factors f_A and f_B rotated by angle θ into f′_A and f′_B]

determine θ by maximizing the combined spikiness of the rotated pair, σ²(f′_A²) + σ²(f′_B²)

after tedious trig the solution can be shown to be
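(The slide's solution is missing from the transcript; the standard varimax result, Kaiser's formula, is, with u_i = f_Ai² − f_Bi² and v_i = 2 f_Ai f_Bi:)

θ = (1/4) atan2( 2[ M Σ_i u_i v_i − (Σ_i u_i)(Σ_i v_i) ], M Σ_i (u_i² − v_i²) − [ (Σ_i u_i)² − (Σ_i v_i)² ] )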

and the new factors are f′_A = f_A cos θ + f_B sin θ and f′_B = −f_A sin θ + f_B cos θ (in this example, A = 3 and B = 5)

now one repeats for every pair of factors and then iterates the whole process several times until the whole set of factors is as spiky as possible
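A minimal MATLAB sketch of this sweep (an assumed implementation, not the course's verbatim code), taking the factors as the rows of a P×Melem matrix F with loadings C, so that S = C*F, and using Kaiser's angle formula above:

[P, Melem] = size(F);                 % P factors, Melem elements each
for iter = 1:5                        % several sweeps over all pairs
    for A = 1:(P-1)
        for B = (A+1):P
            fA = F(A,:);  fB = F(B,:);
            u = fA.^2 - fB.^2;        % Kaiser's auxiliary vectors
            v = 2 * fA .* fB;
            top = 2 * ( Melem*sum(u.*v) - sum(u)*sum(v) );
            bot = Melem*sum(u.^2 - v.^2) - ( sum(u)^2 - sum(v)^2 );
            q = 0.25 * atan2(top, bot);      % rotation angle theta
            % rotate the pair of factors ...
            F(A,:) =  cos(q)*fA + sin(q)*fB;
            F(B,:) = -sin(q)*fA + cos(q)*fB;
            % ... and counter-rotate the loadings so C*F is unchanged
            cA = C(:,A);  cB = C(:,B);
            C(:,A) =  cos(q)*cA + sin(q)*cB;
            C(:,B) = -sin(q)*cA + cos(q)*cB;
        end
    end
end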

example: Atlantic Rock dataset
[Figure: old svd() factors f2–f5 (left) versus new varimax factors f′2–f′5 (right), for the elements SiO2, TiO2, Al2O3, FeO-total, MgO, CaO, Na2O, and K2O; the old factors are not so spiky, the new factors are spiky]

Part 2: Empirical Orthogonal Functions

row number in the sample matrix could be meaningful; example: samples collected at a succession of times

column number in the sample matrix could be meaningful; example: concentration of the same chemical element at a sequence of positions

S = CF becomes
s(x_j, t_i) = Σ_{k=1}^{p} c_ki f^(k)(x_j)
with the time dependence carried by the loadings and the distance dependence carried by the factors

S = CF becomes:
each loading, a temporal pattern of variability of the corresponding factor
each factor, a spatial pattern of variability of the element

S = CF becomes: there are P patterns, and they are sorted in order of importance

S = CF becomes: the factors are now called EOFs (empirical orthogonal functions)

example 1: hypothetical mountain profiles
what are the most important spatial patterns that characterize mountain profiles?

this problem has space but not time:
s(x_j, i) = Σ_{k=1}^{p} c_ki f^(k)(x_j)
the factors are spatial patterns that add together to make mountain profiles

elements: elevations ordered by distance along profile
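As a reminder of the mechanics (a minimal sketch, assuming a sample matrix S with one profile per row and the svd()-based factor analysis of Lecture 17):

[U, SIGMA, V] = svd(S, 0);   % economy-size singular value decomposition
F = V';                      % factors (EOFs), one spatial pattern per row
C = U * SIGMA;               % loadings, so that S = C*F exactly
lambda = diag(SIGMA);        % singular values, in decreasing order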

[Figure: EOF 1, EOF 2, and EOF 3 plotted as f_i^(1), f_i^(2), f_i^(3) against index i, with singular values λ1 = 38, λ2 = 13, λ3 = 7]

[Figure: scatter plot of factor loading C_i2 against factor loading C_i3]

example 2: spatial-temporal patterns (synthetic data)

the data
[Figure: the data as a sequence of 2D spatial patterns in the (x, y) plane, one snapshot per time, beginning at t = 1]

need to unfold each 2D image into a vector

[Figure: singular values λ_i plotted against index i, showing p = 3 significant values]

[Figure: (A) EOFs 1–3, the recovered spatial patterns; (B) loadings 1–3 as functions of time, t]

example 3: spatial-temporal patterns (actual data)
sea surface temperature in the Pacific Ocean

[Figure: CAC sea surface temperature (black = warm) in the equatorial Pacific Ocean; latitude 29°S to 29°N, longitude 124°E to 290°E]

the image is 30 by 84 pixels in size, or 2520 pixels total; to use svd(), each image must be unwrapped into a vector of length 2520

the sample matrix has 399 times (rows) and 2520 positions in the equatorial Pacific Ocean (columns); here "element" means temperature
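A minimal unwrapping sketch (hypothetical variable names; D is assumed to hold the images as a 30 × 84 × 399 array, latitude by longitude by time):

[nlat, nlon, ntime] = size(D);        % 30 by 84 by 399
S = zeros(ntime, nlat*nlon);          % sample matrix: one time per row
for i = 1:ntime
    S(i,:) = reshape(D(:,:,i), 1, nlat*nlon);   % unwrap image i into a row
end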

[Figure: singular values Σ_ii plotted against index i]
no clear cutoff for P, but the first 12 singular values are considerably larger than the rest

using SVD to approximate data

S = C_M F_M: with all M EOFs, the data is fit exactly
S = C_P F_P: with P chosen to exclude only zero singular values, the data is fit exactly
S ≈ C_P′ F_P′: with P′ < P, small non-zero singular values are excluded too, and the data is fit only approximately
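A minimal sketch of the approximate fit, assuming C and F from an svd()-based factorization (as in the sketch above) and an illustrative cutoff P′ = 5:

Pprime = 5;                                 % keep only the largest-EOF terms
Sapprox = C(:, 1:Pprime) * F(1:Pprime, :);  % approximate reconstruction of S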

[Figure: (A) original data; (B) reconstruction based on the first 5 EOFs]