ESTIMATION METHODS
We know how to calculate confidence intervals for estimates of $\mu$ and $\sigma^2$. Now we need procedures to calculate $\mu$ and $\sigma^2$ themselves. There are several methods to do this; we'll look at only one: MAXIMUM LIKELIHOOD.
First, define the likelihood: $L(y_1, y_2, \ldots, y_N)$, where $y_1, y_2, \ldots, y_N$ are sample observations of the random variables $Y_1, Y_2, \ldots, Y_N$, is the joint probability density function (PDF) of $Y_1, Y_2, \ldots, Y_N$ evaluated at the observations $y_i$.
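MAXIMUM LIKELIHOOD METHOD
Choose the parameter values that maximize $L(y_1, y_2, \ldots, y_N)$.
Example: apply the method to estimate $\mu$ and $\sigma^2$ for a normal population. Let $y_1, y_2, \ldots, y_N$ be a random sample from the normal population. Find the maximum of the likelihood:
$$L = \prod_{i=1}^{N} \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left[-\frac{(y_i-\mu)^2}{2\sigma^2}\right] = (2\pi\sigma^2)^{-N/2} \exp\left[-\frac{1}{2\sigma^2}\sum_{i=1}^{N}(y_i-\mu)^2\right]$$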
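Simplify by taking the natural logarithm, $\ln L$:
$$\ln L = -\frac{N}{2}\ln(2\pi\sigma^2) - \frac{1}{2\sigma^2}\sum_{i=1}^{N}(y_i-\mu)^2$$
Taking derivatives with respect to $\mu$ and $\sigma^2$:
$$\frac{\partial \ln L}{\partial \mu} = \frac{1}{\sigma^2}\sum_{i=1}^{N}(y_i-\mu)$$
$$\frac{\partial \ln L}{\partial \sigma^2} = -\frac{N}{2\sigma^2} + \frac{1}{2\sigma^4}\sum_{i=1}^{N}(y_i-\mu)^2$$
Setting both derivatives equal to zero locates the maximum of the likelihood.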
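Setting the derivatives of $\ln L$ equal to zero gives the maximum likelihood estimators of the mean and variance:
$$\hat{\mu} = \frac{1}{N}\sum_{i=1}^{N} y_i = \bar{y}$$
and, substituting $\hat{\mu}$ into the condition for $\sigma^2$,
$$\hat{\sigma}^2 = \frac{1}{N}\sum_{i=1}^{N}(y_i-\bar{y})^2$$
$\hat{\mu}$ and $\hat{\sigma}^2$ are the Maximum Likelihood estimators of $\mu$ and $\sigma^2$.
$\hat{\mu}$ is an unbiased estimator of $\mu$, but $\hat{\sigma}^2$ is not unbiased for $\sigma^2$.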
$\hat{\sigma}^2$ can be adjusted to the unbiased estimator:
$$s^2 = \frac{N}{N-1}\,\hat{\sigma}^2 = \frac{1}{N-1}\sum_{i=1}^{N}(y_i-\bar{y})^2$$
So, for a normally distributed oceanographic data set, we can readily obtain Maximum Likelihood estimates of $\mu$ and $\sigma^2$.
This technique (ML) is really useful for variables that are not normally distributed. Spectral energy values from current velocities or sea level show a $\chi^2$ rather than a normal distribution. Following the ML procedure, we find that the mean of the spectral values is $\nu$ and the variance is $2\nu$, where $\nu$ is the number of degrees of freedom.
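Returning to the normal case, the estimators can be checked numerically. The following is a minimal MATLAB sketch; the sample values in `y` are made up for illustration and do not come from any data set in these notes.

```matlab
% Hypothetical sample (illustrative values only)
y = [2.1 1.9 2.4 2.0 1.6 2.3 2.2 1.5];
N = numel(y);

mu_hat   = sum(y)/N;                 % ML estimator of the mean
sig2_hat = sum((y - mu_hat).^2)/N;   % ML estimator of the variance (biased)
s2       = sig2_hat*N/(N-1);         % adjusted, unbiased estimator

% For comparison: var(y,1) normalizes by N, var(y) by N-1
fprintf('mu_hat = %.4f, sig2_hat = %.4f, s2 = %.4f\n', mu_hat, sig2_hat, s2);
fprintf('var(y,1) = %.4f, var(y) = %.4f\n', var(y,1), var(y));
```

The difference between `sig2_hat` and `s2` shrinks as N grows, since the two differ only by the factor N/(N-1).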
So, with the ML approach you can calculate the parameters that best fit certain models. For instance, you can apply it to a pulse of current velocity data to obtain the best dissipation value $\varepsilon$ and fitting coefficient in the inertial subrange, on the basis of Kolmogorov's law for turbulence, in which the spectral density varies as $\varepsilon^{2/3} k^{-5/3}$ with wavenumber $k$.
As another example, you can apply it to a segment of temperature gradient in a profile to obtain the best Batchelor length scale (or wavenumber $k_B$) and dissipation of temperature variance $\chi_T$, to get dissipation values on the basis of the Batchelor spectrum for turbulence (Steinbuck et al., 2009).
So in general, to apply the ML method to a sample:
- Determine the appropriate PDF for the sample values
- Find the joint likelihood function
- Take natural logs
- Differentiate with respect to the parameter of interest
- Set the derivative = 0 to find the maximum
- Obtain the value of the parameter
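When the derivative cannot be solved by hand, the same steps can be carried out numerically by minimizing the negative log-likelihood. The sketch below does this for a normal PDF so that the result can be compared with the analytical estimators. The sample values, the starting guess and the use of `fminsearch` are assumptions made for illustration; they are not the procedure used in the studies cited above.

```matlab
% Hypothetical sample, assumed to be normally distributed
y = [2.1 1.9 2.4 2.0 1.6 2.3 2.2 1.5];
N = numel(y);

% Negative log-likelihood of the normal PDF.
% p(1) = mu, p(2) = log(sigma^2); the log keeps the variance positive.
negLogL = @(p) 0.5*N*log(2*pi*exp(p(2))) + sum((y - p(1)).^2)/(2*exp(p(2)));

% Maximize ln(L) by minimizing -ln(L), starting from a rough guess
opts = optimset('TolX',1e-10,'TolFun',1e-10,'MaxFunEvals',5000,'MaxIter',5000);
p = fminsearch(negLogL, [1, 0], opts);
mu_num   = p(1);
sig2_num = exp(p(2));

% Analytical ML estimators for comparison
mu_hat   = mean(y);
sig2_hat = mean((y - mu_hat).^2);
fprintf('numerical:  mu = %.4f, sigma^2 = %.4f\n', mu_num, sig2_num);
fprintf('analytical: mu = %.4f, sigma^2 = %.4f\n', mu_hat, sig2_hat);
```

For a non-normal PDF only the expression inside `negLogL` would change; the rest of the sketch stays the same.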
LINEAR ESTIMATION (REGRESSION)
Consider the values y of a random variable Y, called the dependent variable. The values y are a function of one or more non-random variables $x_1, x_2, \ldots, x_N$, called independent variables. For a single independent variable x, the random variable can be modeled (represented) by a simple linear regression:
$$Y = b_0 + b_1 x + \varepsilon$$
The random variable $\varepsilon$ (not to be confused with the dissipation $\varepsilon$ used before) gives the departure from linearity and has a specific PDF with a mean of zero.
If N independent variables are involved, then we have a multiple linear regression:
$$Y = b_0 + b_1 x_1 + b_2 x_2 + \cdots + b_N x_N + \varepsilon$$
A powerful method to fit the independent variables $x_1, x_2, \ldots, x_N$ to the dependent variable y is the method of least squares.
The simplest case is to fit a straight line to a set of points (x, y) using the "best" coefficients $b_0$, $b_1$. The method of least squares does what we do by eye, i.e., it minimizes the deviations (residuals) between the data points and the fitted line.
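Let:
$$y_i = \hat{y}_i + \varepsilon_i$$
where:
$\hat{y}_i = b_0 + b_1 x_i$ is the deterministic portion of the data
$\varepsilon_i$ is the residual or error
To find $b_0$, $b_1$, minimize the sum of the squared errors (SSE):
$$\mathrm{SSE} = \sum_{i=1}^{N} \varepsilon_i^2 = \sum_{i=1}^{N} (y_i - b_0 - b_1 x_i)^2$$
Sum of Squares Total (data variance):
$$\mathrm{SST} = \sum_{i=1}^{N} (y_i - \bar{y})^2$$
Sum of Squares Regression (variance explained by regression):
$$\mathrm{SSR} = \sum_{i=1}^{N} (\hat{y}_i - \bar{y})^2$$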
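To minimize the sum of the squared errors (SSE), set its derivatives with respect to $b_0$ and $b_1$ to zero:
$$\frac{\partial\,\mathrm{SSE}}{\partial b_0} = -2\sum_{i=1}^{N} (y_i - b_0 - b_1 x_i) = 0$$
$$\frac{\partial\,\mathrm{SSE}}{\partial b_1} = -2\sum_{i=1}^{N} x_i\,(y_i - b_0 - b_1 x_i) = 0$$
Two equations, two unknowns; solve for the parameters:
$$b_1 = \frac{\sum_{i=1}^{N} (x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^{N} (x_i - \bar{x})^2}, \qquad b_0 = \bar{y} - b_1 \bar{x}$$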
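The two conditions for a minimum of the SSE form a 2 x 2 linear system that can be solved directly. The following MATLAB sketch does so for made-up data (the values of `x` and `y` are hypothetical) and compares the result with the equivalent closed-form expressions and with `polyfit`.

```matlab
% Hypothetical data (illustrative values only)
x = [0 1 2 3 4 5];
y = [1.1 2.9 5.2 7.1 8.8 11.1];
N = numel(x);

% Normal equations from dSSE/db0 = 0 and dSSE/db1 = 0:
%   N*b0      + sum(x)*b1    = sum(y)
%   sum(x)*b0 + sum(x.^2)*b1 = sum(x.*y)
A   = [N, sum(x); sum(x), sum(x.^2)];
rhs = [sum(y); sum(x.*y)];
b   = A\rhs;
b0  = b(1);
b1  = b(2);

% Equivalent closed form in terms of deviations from the means
b1_cf = sum((x - mean(x)).*(y - mean(y)))/sum((x - mean(x)).^2);
b0_cf = mean(y) - b1_cf*mean(x);

% polyfit of degree 1 returns [slope, intercept]
p = polyfit(x, y, 1);
fprintf('b0 = %.4f, b1 = %.4f\n', b0, b1);
```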
The regression line always goes through $(\bar{x}, \bar{y})$.
The regression line splits the scatter of observations such that the positive residuals cancel out the negative residuals, i.e., $\sum \varepsilon_i = 0$.
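Goodness of Fit (Coefficient of Determination)
Fraction of variance explained, $R^2$ (multiply by 100 for percent explained variance):
$$R^2 = \frac{\mathrm{SSR}}{\mathrm{SST}} = 1 - \frac{\mathrm{SSE}}{\mathrm{SST}}$$
where SST is the Sum of Squares Total (data variance), SSR is the Sum of Squares Regression (variance explained by regression) and SSE is the sum of the squared errors.
Least squares can be used to fit any curve; we'll see it in harmonic analysis.
Least squares can be considered a Maximum Likelihood Estimator when the errors are normally distributed.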
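As a check on the sums of squares, the short MATLAB sketch below fits a line to made-up data (the values of `x` and `y` are hypothetical) and computes SST, SSR, SSE and $R^2$.

```matlab
% Hypothetical data (illustrative values only)
x = [0 1 2 3 4 5];
y = [1.1 2.9 5.2 7.1 8.8 11.1];

p    = polyfit(x, y, 1);          % least-squares line, p = [b1, b0]
yhat = polyval(p, x);             % deterministic portion of the data

SST = sum((y - mean(y)).^2);      % total sum of squares (data variance)
SSR = sum((yhat - mean(y)).^2);   % explained by the regression
SSE = sum((y - yhat).^2);         % residual sum of squares

R2 = SSR/SST;                     % fraction of variance explained
fprintf('R^2 = %.4f, SST - (SSR + SSE) = %.2e\n', R2, SST - (SSR + SSE));
```

For a straight-line fit that includes the intercept $b_0$, SST = SSR + SSE, so the last printed value should be zero to within rounding error.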
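Rotation of axes
The rotation angle $\theta$ can be obtained from linear regression of the scatter diagram, as the angle of the regression line from the x axis, $\theta = \arctan b_1$. In the rotated axes (x', y'), a point (x, 0) has components $x\cos\theta$ and $-x\sin\theta$, and a point (0, y) has components $y\sin\theta$ and $y\cos\theta$, so that:
$$x' = x\cos\theta + y\sin\theta$$
$$y' = -x\sin\theta + y\cos\theta$$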
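A rotation of axes based on the regression slope can be sketched in MATLAB as follows. The velocity components `x` and `y` are made-up values, and taking the angle as the arctangent of the slope from `polyfit` is one way to read "obtained from linear regression"; other choices of principal direction exist.

```matlab
% Hypothetical velocity components (illustrative values only)
x = [-2.0 -1.0 -0.5 0.0 0.6 1.1 1.9];
y = [-1.1 -0.4 -0.3 0.1 0.2 0.6 0.9];

p     = polyfit(x, y, 1);    % p(1) is the regression slope b1
theta = atan(p(1));          % angle of the regression line from the x axis

% Components in axes rotated counterclockwise by theta
xp =  x*cos(theta) + y*sin(theta);   % along the regression-line direction
yp = -x*sin(theta) + y*cos(theta);   % across the regression-line direction

fprintf('theta = %.2f degrees\n', theta*180/pi);
```

The rotation changes only the orientation of the axes, so the magnitude of each (x, y) pair is preserved.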
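CORRELATION
Concept linked to time series analysis.
Correlation coefficient: determines how well two variables co-vary in time or space. For two random variables x and y, the correlation coefficient is:
$$r = \frac{C_{xy}}{s_x s_y}$$
where $C_{xy}$ is the covariance of x and y,
$$C_{xy} = \frac{1}{N-1}\sum_{i=1}^{N} (x_i - \bar{x})(y_i - \bar{y})$$
and $s_x$ and $s_y$ are the standard deviations of x and y.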
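AUTOCORRELATION
$$r_x(L) = \frac{\sum_{k=1}^{N-L} (x_k - \bar{x})(x_{k+L} - \bar{x})}{\sum_{k=1}^{N} (x_k - \bar{x})^2}$$
where:
x are the measurements
L represents a lag
N is the total number of measurements
the overbar represents the mean over the N measurements
$r_x$ is the autocorrelation coefficient for x
$r_x$ oscillates between -1 and 1
$r_x$ equals 1 at L = 0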
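The autocorrelation coefficient can be computed directly from the quantities defined above. The MATLAB sketch below uses a synthetic sinusoid (a hypothetical series, not measured data) and assumes the common normalization in which the lagged sum of products is divided by the lag-0 sum of squares.

```matlab
% Hypothetical series: sinusoid with a period of 12 samples, 10 full cycles
N = 120;
x = sin(2*pi*(1:N)/12);

maxLag = 24;
xa = x - mean(x);                 % deviations from the mean
rx = zeros(1, maxLag+1);
for L = 0:maxLag
    % lagged products normalized by the lag-0 sum of squares
    rx(L+1) = sum(xa(1:N-L).*xa(1+L:N))/sum(xa.^2);
end
lags = 0:maxLag;

figure; plot(lags, rx); xlabel('lag L (samples)'); ylabel('r_x');
```

For this series $r_x$ is 1 at L = 0, negative near half a period (L = 6) and positive again near one full period (L = 12).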
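Examples of Covariance, Correlation, Cross-Correlation & Autocorrelation

load('adcp_08_2009.mat')                  % expected to provide time and u
n1 = 35;                                  % number of bins used
n = length(time);
v = u(1:n1,:)';                           % rows: time, columns: bins
figure; contourf(time, (1:n1)*0.5, v', 'LineStyle', 'none');
datetick('x', 'yy/mm/dd'); axis tight

% covariance matrix between bins
c = cov(v);
figure; contourf(c); colorbar
% manual checks: c1 matches c(1,1) and c2 matches c(1,2) when v has n rows
c1 = sum((v(:,1)-mean(v(:,1))).^2)/(n-1);
c2 = sum((v(:,1)-mean(v(:,1))).*(v(:,2)-mean(v(:,2))))/(n-1);

% correlation matrix r and p-values p
[r, p] = corr(v);
figure; contourf(r, 0.95:0.001:1); colorbar; %caxis([-.2 1])
figure; contourf(p, [.0001,.001,.01,.05,.1,.5,.75,1]); colorbar; caxis([.0001 1])

% autocorrelation
% note: xcorr does not remove the mean; subtract mean(v1) first to match
% the definition of rx in terms of deviations from the mean
v1 = v(:,30);
[xc, lags] = xcorr(v1, v1, 'coeff');
figure; plot(lags/6, xc)                  % lag axis scaled by 6, as in the original
hold on; ef = 1 - exp(-1); plot([0,18], [ef,ef], 'r'); hold off   % reference level 1 - 1/e

% cross-correlation
v2 = v(:,1);
[xc, lags] = xcorr(v1, v2, 'coeff');
figure; plot(lags/6, xc)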