Published by Mitchell Porter. Modified over 9 years ago.
1
Oceanography 569 Oceanographic Data Analysis Laboratory
Kathie Kelly, Applied Physics Laboratory, 515 Ben Hall IR Bldg
Class web site: faculty.washington.edu/kellyapl/classes/ocean569_2014/
2
Exercise 3a: Error Estimates
Known errors for Q, T and H.
Need error estimates for $Q/(\rho c_p H)$ and $dT/dt$.
3
Exercise 3a: Error Estimates
Terms quite similar: $Q/(\rho c_p H)$ and $dT/dt$.
Compare the LHS variance with the error variance estimate.
4
Exercise 3a: Error Estimates
1) Compute the estimated relative error variance for the heating term.
2) Convert to error variance by multiplying by the variance of the heating term.
3) Compute the error variance for dT/dt (assuming uncorrelated errors).
4) The total error variance is the sum of these terms.
5) Compare with the variance of the LHS.
6) Error variance > LHS variance => other terms are negligible compared with the errors.
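The steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: all numbers are hypothetical, the relative error variance of the quotient Q/(ρ c_p H) is taken as the sum of the relative error variances of Q and H (small, uncorrelated errors, with ρ and c_p treated as exact), and dT/dt is taken as a simple two-point difference.

```python
# Hypothetical inputs, for illustration only (not the exercise data).
REL_ERR_Q = 0.10        # assumed relative error (std/value) of heat flux Q
REL_ERR_H = 0.20        # assumed relative error of mixed-layer depth H
VAR_HEATING = 4.0e-14   # assumed variance of Q/(rho*c_p*H), in (K/s)^2
SIGMA_T = 0.1           # assumed error standard deviation of T, in K
DT = 30 * 86400.0       # assumed time step, in s
VAR_LHS = 9.0e-15       # assumed variance of dT/dt, in (K/s)^2


def heating_error_variance(rel_err_q, rel_err_h, var_term):
    """Steps 1-2: relative error variance, then scale by the term's variance."""
    rel_var = rel_err_q ** 2 + rel_err_h ** 2   # uncorrelated errors add
    return rel_var * var_term


def derivative_error_variance(sigma_t, dt):
    """Step 3: (T2 - T1)/dt with uncorrelated errors of equal size."""
    return 2.0 * sigma_t ** 2 / dt ** 2


err_heating = heating_error_variance(REL_ERR_Q, REL_ERR_H, VAR_HEATING)
err_deriv = derivative_error_variance(SIGMA_T, DT)
err_total = err_heating + err_deriv            # step 4
errors_dominate = err_total > VAR_LHS          # steps 5-6

print("heating-term error variance:", err_heating)
print("dT/dt error variance:       ", err_deriv)
print("total error variance:       ", err_total)
print("errors exceed LHS variance: ", errors_dominate)
```

With real data, the hypothetical constants would be replaced by the known errors for Q, T and H and by variances computed from the time series.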
5
Exercise 3b: Significance Test
Is the recent global temperature change significantly different from the previous values?
Probability of year-to-year temperature differences.
6
Exercise 3b: Significance Test
Differences ΔT can be normalized to N(0,1) using the Z-transform: subtract the mean and divide by the standard deviation.
The probability p of a given Z score (or lower) can be found from a table, or the probability can be found in Matlab: p = normcdf(Z,0,1) [Matlab function].
Is the most recent temperature change significantly different from previous values at the 5% level? (Is the probability of the recent value smaller than 5%?)
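A minimal Python sketch of the same test, using `statistics.NormalDist` from the standard library as a stand-in for Matlab's `normcdf`. The temperature differences are made-up values, and the function name `z_score_probability` is hypothetical.

```python
from statistics import NormalDist, mean, stdev


def z_score_probability(previous_diffs, recent_diff):
    """Normalize recent_diff by the earlier differences and return (Z, p).

    p is the probability of a Z score this large or lower, i.e. the
    analogue of Matlab's normcdf(Z, 0, 1).
    """
    mu = mean(previous_diffs)
    sigma = stdev(previous_diffs)          # sample standard deviation
    z = (recent_diff - mu) / sigma
    p = NormalDist(0.0, 1.0).cdf(z)
    return z, p


# Hypothetical year-to-year differences (degrees C), for illustration only.
diffs = [0.05, -0.10, 0.12, -0.03, 0.08, -0.07, 0.02, -0.09, 0.11, -0.04]
z, p = z_score_probability(diffs, recent_diff=0.30)
upper_tail = 1.0 - p                       # chance of a change at least this large

print("Z =", round(z, 2), " p =", round(p, 4), " upper tail =", round(upper_tail, 4))
print("significant at 5%:", upper_tail < 0.05)
```

For a large positive change the relevant probability is the upper tail, 1 - p; for a large negative change it is p itself.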
7
Derivatives and Errors: strategy
The error variance in the derivative dT/dt is given by [equation not shown].
How is the error variance affected by the choice of time step Δt? Consider two cases:
1) errors are point-to-point random
2) errors have time scales long relative to Δt
8
Derivatives and Errors
1) For errors with little serial correlation, the error variance in the derivative decreases with increasing interval Δt.
2) For errors that are highly correlated (more like a bias), there is little advantage to increasing the interval size because the error terms nearly cancel.
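Both cases can be illustrated with a short Python sketch. It assumes a two-point difference (T2 - T1)/Δt with errors of equal standard deviation σ and correlation ρ between the two points, for which the error variance is 2σ²(1 - ρ)/Δt². The exponential decay of ρ with lag is an assumed model chosen only for illustration.

```python
import math


def derivative_error_variance(sigma, dt, rho):
    """Error variance of (T2 - T1)/dt for equal-size errors with correlation rho."""
    return 2.0 * sigma ** 2 * (1.0 - rho) / dt ** 2


def error_correlation(dt, tau):
    """Assumed error correlation at lag dt: exponential decay with time scale tau."""
    return math.exp(-dt / tau)


SIGMA = 1.0
for dt in (1.0, 2.0, 4.0):
    random_case = derivative_error_variance(SIGMA, dt, rho=0.0)
    biased_case = derivative_error_variance(SIGMA, dt, error_correlation(dt, tau=100.0))
    print(dt, random_case, biased_case)
```

For point-to-point random errors the variance falls as 1/Δt². For errors with a long time scale the variance is already small at every Δt, because the two error terms nearly cancel.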
9
Analysis of Variance (ANOVA)
Evaluate a dynamical or statistical model using observations [mean and periodic signals removed].
Example: linear function [equation not shown].
The error is the difference between the model and the observations.
The data variance NOT explained by the model (error variance) is $\sigma_\varepsilon^2$.
The fraction of data variance $\sigma^2$ that IS explained by the model is usually called the skill: $S = 1 - \sigma_\varepsilon^2/\sigma^2$.
Signal-to-noise ratio: [equation not shown].
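As a rough sketch, the error variance and skill can be computed as below. The data are invented, the means are assumed to have been removed already, and the signal-to-noise ratio is taken here as explained variance divided by error variance, which is one common definition and an assumption about the slide's formula.

```python
from statistics import fmean


def skill_and_snr(obs, model):
    """Return (error variance, skill, signal-to-noise ratio) for zero-mean series."""
    errors = [o - m for o, m in zip(obs, model)]
    var_data = fmean(o * o for o in obs)
    var_err = fmean(e * e for e in errors)
    skill = 1.0 - var_err / var_data            # fraction of variance explained
    snr = (var_data - var_err) / var_err        # assumed definition of SNR
    return var_err, skill, snr


# Hypothetical zero-mean observations and model values.
obs = [-2.0, -1.0, 0.0, 1.0, 2.0]
model = [-1.8, -1.1, 0.2, 0.9, 1.8]
var_err, skill, snr = skill_and_snr(obs, model)
print(var_err, skill, snr)
```

A perfect model gives zero error variance, so the SNR line would need a guard against division by zero in real use.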
10
Chi-Squared
A related measure of the goodness of fit of a least-squares estimator is chi-squared ($\chi^2$): [equation not shown].
For errors that are all about the same size, this is just the error variance divided by the data variance, which is the fraction of variance in the error, so in this case the skill is given by $S = 1 - \chi^2$.
11
Analysis of Variance: correlation vs. skill
What is the relationship between correlation and skill for [model not shown]?
By definition: [equation not shown]
In terms of variance: [equation not shown]
Eliminate y: [equation not shown], which is the skill S.
This is a special case where the amplitude of y is nearly the same as that of x and [condition not shown].
12
Analysis of Variance for a Model
What is the relationship between skill and correlation for [model not shown]?
In terms of variance: [equation not shown]
So the skill S is given by [equation not shown].
13
Exercise 4: Lagged correlations
Lag-correlate the SSH at one location with the SSH at every other location to get this image:
[Figure: SSH, longitude-time plot, with Δx and Δt marked]
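A minimal sketch of a lagged correlation between two time series, in plain Python. The function name `lagged_correlation` and the synthetic sine-wave "SSH" series are illustrative assumptions, not the exercise data.

```python
import math


def lagged_correlation(a, b, lag):
    """Correlation of a(t) with b(t + lag); lag is in samples and may be negative."""
    if lag >= 0:
        x, y = a[:len(a) - lag], b[lag:]
    else:
        x, y = a[-lag:], b[:len(b) + lag]
    n = min(len(x), len(y))
    x, y = x[:n], y[:n]
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((p - mean_x) * (q - mean_y) for p, q in zip(x, y))
    sxx = sum((p - mean_x) ** 2 for p in x)
    syy = sum((q - mean_y) ** 2 for q in y)
    return sxy / math.sqrt(sxx * syy)


# Synthetic example: series b is series a delayed by 3 samples.
a = [math.sin(2.0 * math.pi * t / 20.0) for t in range(100)]
b = [math.sin(2.0 * math.pi * (t - 3) / 20.0) for t in range(100)]

r_by_lag = {lag: lagged_correlation(a, b, lag) for lag in range(-5, 6)}
best_lag = max(r_by_lag, key=r_by_lag.get)
print("lag with highest correlation:", best_lag)
```

Repeating this for the reference location against every other longitude, and storing the correlation for each lag, gives a longitude-lag map of the kind the exercise asks for.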
14
Measuring Similarity
The correlation gives a measure of the similarity of two time series but not their magnitude.
Example: a model that is highly correlated with data but smaller in magnitude.
The fraction of variance explained by the model (skill) does not distinguish between correlations and magnitudes. Does any metric describe both?
15
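Taylor Diagram
To compare both correlations and relative magnitudes (Taylor, 2001):
Given a model $m$ of an observed value $o$ (means removed), the mean squared error is given by
$\varepsilon^2 = \langle (m - o)^2 \rangle$
which reduces to
(1) $\varepsilon^2 = \sigma_m^2 + \sigma_o^2 - 2\sigma_m\sigma_o\rho$
because $\langle m\,o \rangle = \rho\,\sigma_m\sigma_o$, where $\rho$ is the correlation between model and observations.
Compare with the law of cosines for a triangle:
(2) $c^2 = a^2 + b^2 - 2ab\cos\theta$
[Figure: triangle with labels $\sigma_m$, $\sigma_o$, $\varepsilon$, $\theta$ and "Truth"]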
16
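Taylor Diagram & Normalized Version
Eqns (1) and (2) match if we define $\cos\theta = \rho$, where $\rho$ is the correlation between model and observations (with $a = \sigma_m$, $b = \sigma_o$, $c = \varepsilon$).
The triangle shows the error contributions of both correlation and magnitude geometrically.
Normalized version: divide through by the variance of the observations $\sigma_o^2$:
$\varepsilon^2/\sigma_o^2 = \sigma_m^2/\sigma_o^2 + 1 - 2(\sigma_m/\sigma_o)\rho$
or
$\varepsilon^2/\sigma_o^2 = r^2 + 1 - 2r\rho$
where $r = \sigma_m/\sigma_o$ is the relative magnitude.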
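The quantities plotted on a Taylor diagram can be computed and checked against the law-of-cosines relation with a short Python sketch. The synthetic series and the function name `taylor_statistics` are assumptions for illustration; the correlation is written `rho` and the relative magnitude `r`.

```python
import math


def taylor_statistics(model, obs):
    """Return (sigma_m, sigma_o, rho, eps) after removing each series' mean."""
    n = len(obs)
    mean_m, mean_o = sum(model) / n, sum(obs) / n
    m = [v - mean_m for v in model]
    o = [v - mean_o for v in obs]
    sigma_m = math.sqrt(sum(v * v for v in m) / n)
    sigma_o = math.sqrt(sum(v * v for v in o) / n)
    rho = sum(p * q for p, q in zip(m, o)) / (n * sigma_m * sigma_o)
    eps = math.sqrt(sum((p - q) ** 2 for p, q in zip(m, o)) / n)
    return sigma_m, sigma_o, rho, eps


# Synthetic example: model has 60% of the observed amplitude and a phase lag.
obs = [math.sin(2.0 * math.pi * k / 24.0) for k in range(24)]
model = [0.6 * math.sin(2.0 * math.pi * k / 24.0 - 0.5) for k in range(24)]

sigma_m, sigma_o, rho, eps = taylor_statistics(model, obs)
r = sigma_m / sigma_o                          # relative magnitude
theta = math.acos(rho)                         # angle on the diagram

# Law-of-cosines form and its normalized version.
lhs = eps ** 2
rhs = sigma_m ** 2 + sigma_o ** 2 - 2.0 * sigma_m * sigma_o * rho
normalized = r ** 2 + 1.0 - 2.0 * r * rho
print(r, rho, theta, lhs, rhs, normalized)
```

On the diagram, a model point is placed at radius r and angle θ; its distance from the observation point at (1, 0) is the normalized error ε/σ_o.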
17
Analysis of Variance for a Model
What is the relationship between skill and correlation for [model not shown]?
In terms of variance: [equation not shown]
So the skill S is given by [equation not shown], where [definition not shown].
For large a, S can be negative. (For small a the correlation will be small and the assumption is violated.)
18
Skill: Empirical vs. Model
A good estimator has a skill near 1 (small squared error).
A linear regression is an empirical fit to data that minimizes the squared error (least-squares fit). Its skill is positive by design.
Skill, or fraction of variance explained, can also be used to evaluate a model d_m(x,t) = f[u(x,t)].
However, models do not optimize the fit to the observations (unless data assimilation is used), so the skill can be negative. For example, a model that predicts the seasonal cycle of ocean temperature with a good match of phase but a large overestimate of amplitude could give a negative skill.
Therefore, Taylor diagrams are frequently used to evaluate models.
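A small Python sketch shows how the skill of a seasonal-cycle model depends on its amplitude and phase. It assumes skill is defined as one minus the ratio of error variance to observed variance, and it uses an idealized 12-month cosine as the "observations"; the amplitude factor `amp` and the phase error `phase` are illustrative parameters.

```python
import math


def skill(obs, model):
    """Skill = 1 - (error variance) / (observed variance), for zero-mean series."""
    n = len(obs)
    var_obs = sum(o * o for o in obs) / n
    var_err = sum((o - m) ** 2 for o, m in zip(obs, model)) / n
    return 1.0 - var_err / var_obs


def seasonal_model(amp, phase, months=12):
    """Idealized seasonal cycle with relative amplitude amp and phase error phase."""
    return [amp * math.cos(2.0 * math.pi * k / months - phase) for k in range(months)]


obs = seasonal_model(amp=1.0, phase=0.0)
skills = {amp: skill(obs, seasonal_model(amp, phase=0.0)) for amp in (0.5, 1.0, 2.5)}
for amp, s in skills.items():
    print("amplitude factor", amp, "-> skill", round(s, 3))
```

For this idealized case the skill works out to 2·amp·cos(phase) - amp², so with a good phase match it is negative only when the model amplitude is more than about twice the observed amplitude.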
19
Lowpass Filter
Smooth data to remove errors. What is the assumption?
The filter removes half the power at the specified time (or space) scale.
Input to the function “butter” in the Signal Processing Toolbox: Wn = 2*Δt/half_power, where 2*Δt is twice the grid spacing (the period corresponding to the Nyquist frequency).
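The normalized cutoff can be computed as in the sketch below. The helper name `butter_cutoff` and the example values are hypothetical; the result is the cutoff frequency expressed as a fraction of the Nyquist frequency, which is the form a Butterworth design routine such as Matlab's `butter` expects.

```python
def butter_cutoff(grid_spacing, half_power_scale):
    """Normalized cutoff Wn = 2*dt/half_power, as a fraction of the Nyquist frequency.

    grid_spacing and half_power_scale must be in the same units
    (e.g. days, or km for a spatial filter).
    """
    wn = 2.0 * grid_spacing / half_power_scale
    if not 0.0 < wn < 1.0:
        raise ValueError("half-power scale must exceed twice the grid spacing")
    return wn


# Hypothetical example: daily data, half-power point at 30 days.
wn = butter_cutoff(grid_spacing=1.0, half_power_scale=30.0)
print("Wn =", wn)
# In Python the same value could be passed to scipy.signal.butter(order, wn);
# that call is not run here so the sketch stays free of third-party imports.
```

A half-power scale shorter than twice the grid spacing cannot be resolved by the data, which is why the helper rejects values of Wn at or above 1.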
20
Other Filters
Highpass filter: removes low frequencies (or large spatial scales).
Bandpass filter: removes low and high frequencies (large and small scales).
21
Filtering and Correlations
Lowpass filter: what happens to the integral length scale?
How does this affect N*?
How does this affect the significance level for correlations?
22
Linear Algebra Review (2)
23
Linear Algebra Review (3) B=A\C
24
Linear Algebra Review (1)
25
Linear Regression
27
Linear Regression (cont’d)
[Figure: axes x, y, z, with labels: observation, estimate, estimate error, best estimate, minimum error]
28
Linear Regression (cont’d)
y = Xβ, β = X\y
29
Linear Regression: limiting the number of variables
Note: fitting data to a curve is a simple form of linear regression in which the variables X are 1, x, x², x³, ... Coefficients are optimized to give the best fit to the data.
For each variable X added to the regression, the squared error decreases, because the coefficients also fit random components (“noise”). However, on another set of data the same coefficients will not fit the random components and the fit may be worse. The amount by which the estimator overfits the data is sometimes called “artificial skill.”
Minimize “artificial skill” by limiting the regression to only significant variables, which are determined using an F test on the variance reduction (skill) of the estimator.
We check the estimator by comparing the errors from the regression with the errors from another set of data.
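The effect can be seen in a small, self-contained Python experiment. Everything here is an illustrative assumption: the data are synthetic (a straight line plus Gaussian noise), the regression variables are powers of x, and the least-squares fit is done through the normal equations with a hand-written solver rather than Matlab's backslash operator.

```python
import random


def solve(a, b):
    """Solve the square system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        partial = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - partial) / m[r][r]
    return x


def polyfit(x, y, degree):
    """Least-squares coefficients for variables 1, x, ..., x**degree."""
    cols = [[xi ** p for xi in x] for p in range(degree + 1)]
    xtx = [[sum(u * v for u, v in zip(ci, cj)) for cj in cols] for ci in cols]
    xty = [sum(u * v for u, v in zip(ci, y)) for ci in cols]
    return solve(xtx, xty)


def mean_squared_error(x, y, coef):
    """Mean squared difference between y and the polynomial estimate."""
    total = 0.0
    for xi, yi in zip(x, y):
        estimate = sum(c * xi ** p for p, c in enumerate(coef))
        total += (yi - estimate) ** 2
    return total / len(x)


rng = random.Random(0)


def sample(n):
    """Synthetic data: y = 2x plus Gaussian noise (hypothetical)."""
    x = [rng.uniform(-1.0, 1.0) for _ in range(n)]
    return x, [2.0 * xi + rng.gauss(0.0, 0.5) for xi in x]


x_fit, y_fit = sample(30)      # data used for the regression (hindcast)
x_new, y_new = sample(30)      # independent data (forecast)
hindcast, forecast = [], []
for degree in range(1, 6):
    coef = polyfit(x_fit, y_fit, degree)
    hindcast.append(mean_squared_error(x_fit, y_fit, coef))
    forecast.append(mean_squared_error(x_new, y_new, coef))
    print(degree, hindcast[-1], forecast[-1])
```

The hindcast error can only go down as variables are added, while the forecast error on the independent data need not follow it; the gap between the two is the artificial skill.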
30
Significance of Linear Regression
k is the number of additional parameters.
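One common form of the test for k additional parameters is the partial F statistic, sketched below in Python. This is the textbook form and may differ from the expression on the slide; the argument names are hypothetical, and for serially correlated data an effective number of degrees of freedom would be used in place of the raw sample count.

```python
def partial_f_statistic(sse_reduced, sse_full, k, n, p_full):
    """F statistic for adding k parameters to a regression.

    sse_reduced: sum of squared errors without the k additional parameters
    sse_full:    sum of squared errors with them
    n:           number of (independent) observations
    p_full:      total number of parameters in the larger regression
    """
    explained_per_parameter = (sse_reduced - sse_full) / k
    residual_per_dof = sse_full / (n - p_full)
    return explained_per_parameter / residual_per_dof


# Hypothetical numbers, for illustration only.
f_value = partial_f_statistic(sse_reduced=12.0, sse_full=10.0, k=1, n=50, p_full=3)
print("F =", f_value)
```

The value is then compared with the critical F for (k, n - p_full) degrees of freedom, from a table or a statistics package; the additional variables are kept only if F exceeds it.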
31
Code for Linear Regression (1)
32
Code for Linear Regression (2)
33
Exercise 6: Linear Regression
Create a linear estimator for heat flux.
34
Linear regression
Test each variable (hindcast).
Regression on single variables for latent heat flux [use only half the data].
[Figure panels: cosine; air-sea temp.; wind speed; humidity diff.]
35
Multiple Regression
Test combinations of variables (hindcast).
Find the variable(s) least correlated with the best single variable (correlated = redundant).
F test for evaluating additional variables.
[Figure panels: humidity; humidity + wind; humidity + wind + cosine]
36
Multiple Regression
Examine residuals: plot the residual and check its histogram (nearly normal?).
Is there a pattern in the residual?
[Figure: residual for humidity + wind + cosine]
37
Linear regression
Test regressions on new data (forecast).
Compare hindcast and forecast errors. Do the estimators perform as predicted?
Check for patterns in the residual.
[Figure: humidity + wind]