Methods for dealing with spurious covariances arising from small samples in ensemble data assimilation. Jeff Whitaker (jeffrey.s.whitaker@noaa.gov), NOAA Earth System Research Lab, Boulder. Outline: What is ensemble data assimilation? What are the consequences of sampling error? Covariance localization. Alternatives to covariance localization.
Ensemble data assimilation: Parallel forecast and analysis cycles. Background errors are estimated from sample covariances and depend on the weather situation. Ensemble forecasting is well established in NWP, but analysis schemes are still mostly deterministic (only a single background forecast is evolved). Use the dynamically evolved ensemble to estimate the ‘errors of the day’ - examples.
Ensemble Kalman Filter: The ensemble mean is updated via the Kalman filter equations. H is the operator that takes the model state vector and converts it to predicted observations.
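For reference, the Kalman filter update of the mean that this slide refers to is usually written as below. Only H and Pb are named in the transcript; the other symbols (background and analysis means, the observation vector y^o, the observation-error covariance R, the gain K) are assumed here to follow the usual notation.

```latex
\bar{x}^a = \bar{x}^b + K\,\bigl(y^o - H\bar{x}^b\bigr),
\qquad
K = P^b H^T \bigl(H P^b H^T + R\bigr)^{-1}
```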
Ensemble Kalman Filter: k ensemble members from a forecast model; ensemble (sample) mean. Instead of propagating the covariance matrix, the individual samples that are used to construct the covariance are evolved individually (essentially a ‘square-root’ formulation).
Ensemble Kalman Filter: k ensemble members from a forecast model; ensemble (sample) mean; background-error (sample) covariance. We don’t actually need to compute the entire matrix (it would not fit in memory).
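A minimal sketch of why the full matrix is never needed, assuming an observation operator that simply picks out one state element (a simplification chosen for illustration). The function names, ensemble size and state size are hypothetical. Only one column of Pb is formed, directly from the perturbations.

```python
import numpy as np

def ensemble_stats(ens):
    """Return the mean and perturbations of a (k, n) ensemble array."""
    mean = ens.mean(axis=0)
    return mean, ens - mean

def pb_ht(ens, obs_index):
    """Sample covariance between every state variable and one observed element.

    This is Pb H^T for an H that selects state element obs_index, computed
    from the perturbations without forming the n x n matrix Pb.
    """
    k = ens.shape[0]
    _, pert = ensemble_stats(ens)
    hx_pert = pert[:, obs_index]          # H x' for each member, shape (k,)
    return pert.T @ hx_pert / (k - 1)     # shape (n,)

rng = np.random.default_rng(0)
ens = rng.standard_normal((25, 1000))     # hypothetical: 25 members, 1000 variables
col = pb_ht(ens, obs_index=10)
```

The cost is O(kn) per observation instead of the O(n^2) storage needed for the full matrix.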
Ensemble Kalman Filter: k ensemble members from a forecast model; ensemble (sample) mean; background-error (sample) covariance; analysis-error covariance. The update of the ‘perturbations’ (deviations from the mean) is computed so that the analysis-error covariance is what you expect from the KF equations.
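One common way to meet this requirement is a serial square-root update, in which the perturbations are updated with a reduced gain. The sketch below assumes a single observation of one state element with error variance `r_var`; the function name and arguments are hypothetical, and the slides do not say that this exact form is the one used.

```python
import numpy as np

def ensrf_single_ob(ens, obs_index, y_ob, r_var):
    """Update a (k, n) ensemble with one observation of state element obs_index."""
    k = ens.shape[0]
    mean = ens.mean(axis=0)
    pert = ens - mean
    hx_pert = pert[:, obs_index]                 # H x' for each member
    hpht = hx_pert @ hx_pert / (k - 1)           # H Pb H^T (scalar)
    pbht = pert.T @ hx_pert / (k - 1)            # Pb H^T, shape (n,)
    gain = pbht / (hpht + r_var)                 # Kalman gain for the mean
    # Reduced gain factor for the perturbations, chosen so that the sample
    # analysis covariance equals (I - K H) Pb.
    alpha = 1.0 / (1.0 + np.sqrt(r_var / (hpht + r_var)))
    mean_a = mean + gain * (y_ob - mean[obs_index])
    pert_a = pert - np.outer(hx_pert, alpha * gain)
    return mean_a + pert_a

rng = np.random.default_rng(0)
ens_b = rng.standard_normal((20, 5))             # hypothetical 20 members, 5 variables
ens_a = ensrf_single_ob(ens_b, obs_index=2, y_ob=1.0, r_var=0.5)
```

Updating the perturbations with the full gain and no perturbed observations would make the analysis spread too small, which is why the factor `alpha` appears.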
Consequences of Sampling Error: Top: mean SLP and the sample covariance between SLP everywhere and a point in East Asia for a 25-member ensemble. 2nd: same for a 400-member ensemble. 3rd: function that is 1 at the ob location and tapers to zero several thousand km away. 4th: 25-member covariance multiplied by the taper function (looks more like the 400-member covariance).
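The effect in the four panels can be reproduced in one dimension. The sketch below assumes a Gaussian true covariance on a 200-point grid and uses the Gaspari-Cohn fifth-order function as the taper; the slides do not name the taper they use, so that choice, the grid and the length scales are all assumptions.

```python
import numpy as np

def gaspari_cohn(dist, c):
    """Compactly supported taper: 1 at dist = 0, exactly 0 for |dist| >= 2c."""
    r = np.abs(np.atleast_1d(np.asarray(dist, dtype=float))) / c
    taper = np.zeros_like(r)
    inner = r <= 1.0
    outer = (r > 1.0) & (r < 2.0)
    ri, ro = r[inner], r[outer]
    taper[inner] = (-0.25 * ri**5 + 0.5 * ri**4 + 0.625 * ri**3
                    - (5.0 / 3.0) * ri**2 + 1.0)
    taper[outer] = (ro**5 / 12.0 - 0.5 * ro**4 + 0.625 * ro**3
                    + (5.0 / 3.0) * ro**2 - 5.0 * ro + 4.0 - 2.0 / (3.0 * ro))
    return taper

rng = np.random.default_rng(0)
n, k, ob = 200, 25, 100                          # grid size, members, ob location
x = np.arange(n)
dist = np.abs(x[:, None] - x[None, :])
true_cov = np.exp(-0.5 * (dist / 8.0) ** 2)      # assumed true covariance

# Draw k members with covariance true_cov via an eigen-decomposition.
w, v = np.linalg.eigh(true_cov)
sqrt_cov = v * np.sqrt(np.clip(w, 0.0, None))
ens = rng.standard_normal((k, n)) @ sqrt_cov.T

pert = ens - ens.mean(axis=0)
raw = pert.T @ pert[:, ob] / (k - 1)             # sample covariance with the ob point
localized = gaspari_cohn(x - ob, c=20.0) * raw   # tapered sample covariance
far = np.abs(x - ob) >= 40                       # points beyond the taper support
```

Far from the observation the true covariance is essentially zero, so everything in `raw[far]` is sampling noise, and the taper removes it exactly.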
Mis-specification of background-error covariance: Sampling error results in large errors in the sample covariance ‘far’ from the observation location, which results in inappropriate updates to the state vector. Simple example of how errors in Pb can affect the state update in the KF: 2-D state vector (x1 and x2), single ob only for x1 (x2 unobserved). The heavy line is the prior covariance (marginal distributions on the axes). The dot on the x1 axis is the value of the ob, and the light solid line is the marginal distribution for the ob. The dashed line is the posterior covariance. Note that in the true background, x1 is uncorrelated with x2. Underestimating the covariance causes the ob not to be used enough, so the posterior covariance is too similar to the prior. Over-estimating correlations between state variables causes the state vector to be incremented too much in that direction: the posterior variance is too small in the x2 direction and the x2 mean is biased.
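The over-estimated-correlation case can be worked through numerically. The sketch below assumes unit prior variances, an ob-error variance of 1, an ob value of 1 and a spurious correlation of 0.6; all of these numbers are illustrative and not taken from the slide.

```python
import numpy as np

def kf_update_2d(xb, pb, y_ob, r_var):
    """Kalman update of a 2-element state when only x1 (index 0) is observed."""
    h = np.array([[1.0, 0.0]])
    gain = pb @ h.T / (h @ pb @ h.T + r_var)       # shape (2, 1)
    xa = xb + (gain * (y_ob - h @ xb)).ravel()
    pa = (np.eye(2) - gain @ h) @ pb
    return xa, pa

xb = np.zeros(2)
pb_true = np.eye(2)                                # x1 and x2 truly uncorrelated
pb_spurious = np.array([[1.0, 0.6], [0.6, 1.0]])   # spurious sample correlation

xa_true, pa_true = kf_update_2d(xb, pb_true, y_ob=1.0, r_var=1.0)
xa_bad, pa_bad = kf_update_2d(xb, pb_spurious, y_ob=1.0, r_var=1.0)
```

With the true covariance, x2 is left alone (mean 0, variance 1). With the spurious correlation, the x2 mean moves to 0.3 and its variance drops to 0.82, even though the observation carries no information about x2.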
Effect of localization in a simplified GCM (1): 2-layer PE model on a sphere, 46 observations over the globe. A few years ago, we wrote a paper looking at the effect of covariance localization in a simplified GCM (no model error). We varied the ensemble size and the severity of the localization. For small ensembles, if not enough localization was used, the filter diverged. With more ensemble members, less localization is needed. There’s a ‘sweet spot’ which minimizes ensemble mean error for a given ensemble size (and observation network).
Effect of localization in a simplified GCM (2): Rank histograms show what’s happening to the ensemble as the filter length scale is varied. There is a trend toward too much population at the extreme ranks when the length scale is increased (then filter divergence). When the filter length scale is too short, there is not enough population at the extreme ranks (too much variance, because the ensemble is not corrected enough far away from the ob).
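For readers unfamiliar with the diagnostic, a rank histogram counts where the verifying value falls within the sorted ensemble. The sketch below uses synthetic scalar data with a deliberately under-dispersive and over-dispersive ensemble; the sample sizes and spread factors are hypothetical.

```python
import numpy as np

def rank_histogram(ens, truth):
    """ens: (cases, k) ensemble values; truth: (cases,). Returns k+1 bin counts."""
    k = ens.shape[1]
    ranks = np.sum(ens < truth[:, None], axis=1)   # rank of truth, 0..k
    return np.bincount(ranks, minlength=k + 1)

rng = np.random.default_rng(0)
cases, k = 20000, 10
truth = rng.standard_normal(cases)
under = 0.5 * rng.standard_normal((cases, k))      # too little ensemble variance
over = 2.0 * rng.standard_normal((cases, k))       # too much ensemble variance

hist_under = rank_histogram(under, truth)          # U-shaped
hist_over = rank_histogram(over, truth)            # dome-shaped
extreme_under = (hist_under[0] + hist_under[-1]) / cases
extreme_over = (hist_over[0] + hist_over[-1]) / cases
```

A well-calibrated ensemble would put a fraction 2/(k+1) of the cases in the two extreme bins; the under-dispersive ensemble puts more there and the over-dispersive one fewer.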
Effect of localization in a simplified GCM (3): Eigen-analysis of the prior ensemble (prior to localization). The spectrum is too steep for small ensembles. Assume 400 members represents ‘truth’. Applying localization to the 25-member ensemble flattens the spectrum, adding new directions (increasing the rank of Pb). In the limit of a delta function, the spectrum is flat. Without localization, the ensemble is updated in too small a subspace. With localization, updates depend on distance from the ob (more degrees of freedom).
Covariance localization increases rank of Pb: If the ensemble has k members, then Pb describes nonzero uncertainty only in at most a k-dimensional subspace. The analysis is only adjusted in this subspace. If the system is high-dimensionally unstable (if it has more than k positive Lyapunov exponents) then forecast errors will grow in directions not accounted for by the ensemble, and these errors will not be corrected by the analysis. Sampling error manifests itself directly in the form of spurious long-range covariances. Alternatively, one can think of the sampling error as a manifestation of rank-deficiency in the ensemble (if k < the number of degrees of freedom in the dynamical model). The analysis can’t correct the missing directions, so errors grow and ensemble variance shrinks, leading to filter divergence.
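The rank argument can be checked directly. The sketch below builds a sample covariance from k = 5 random members on a 40-point line and multiplies it element-wise by a Gaussian-shaped taper; the sizes and the taper shape are stand-ins chosen for brevity, not values from the slides. Note that removing the sample mean costs one degree of freedom, so the sample covariance has rank at most k - 1.

```python
import numpy as np

rng = np.random.default_rng(1)
n, k = 40, 5                                   # hypothetical state size and members
ens = rng.standard_normal((k, n))
pert = ens - ens.mean(axis=0)
pb = pert.T @ pert / (k - 1)                   # sample covariance

dist = np.abs(np.arange(n)[:, None] - np.arange(n)[None, :])
taper = np.exp(-0.5 * (dist / 3.0) ** 2)       # assumed Gaussian-shaped taper
pb_loc = taper * pb                            # Schur (element-wise) product

rank_raw = np.linalg.matrix_rank(pb)
rank_loc = np.linalg.matrix_rank(pb_loc)
eig_raw = np.linalg.eigvalsh(pb)[::-1]         # descending eigenvalues
eig_loc = np.linalg.eigvalsh(pb_loc)[::-1]
```

Because the taper has ones on its diagonal, the total variance (the trace) is unchanged; it is simply spread over many more eigen-directions, which is the flattening of the spectrum described on the previous slide.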
Alternative to localization: Localizing covariances works because it increases the dimensionality…. So, one can instead compute updates in local regions where the error dynamics evolve in a lower-dimensional subspace (< k) (LETKF - Hunt et al, 2007). This interpretation leads naturally to another way of solving the problem. Instead of updating the entire state vector at once, update localized pieces where the error dynamics can be described by the small ensemble.
Two EnKF approaches: Serial approach - for each observation, update each model variable (tapering the influence of the observation to zero at a specified distance). Used in NCAR DART. Local approach - update each model variable one at a time, using all observations within a specified radius (increasing R with distance between observation and model variable) - we use this approach since it scales well on massively parallel computers. So, we have two ways of solving the problem. The serial approach makes the most sense when you have few obs and a large state vector. The local approach makes more sense when you have lots of obs - then it’s much easier to parallelize the problem and it scales well on an MP system. Mathematically, there are differences between the two (the local approach involves some approximations), but practical experience has shown there is little or no difference in accuracy.
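The mathematical difference is easiest to see for one model variable and one observation. In the sketch below, `rho` is the taper value at the distance between the two, `pbht` is the sample covariance between the variable and the observed quantity, and `hpht` is the background variance in observation space; the function names are hypothetical and the reduction to a single scalar ob is an assumption made for illustration.

```python
import numpy as np

def gain_cov_localized(pbht, hpht, r_var, rho):
    """Serial approach: taper the covariance between the variable and the ob."""
    rho = np.asarray(rho, dtype=float)
    return rho * pbht / (hpht + r_var)

def gain_r_localized(pbht, hpht, r_var, rho):
    """Local approach: inflate the ob-error variance from r_var to r_var / rho.

    Written as rho * pbht / (rho * hpht + r_var), which is algebraically
    pbht / (hpht + r_var / rho) and stays finite when rho = 0.
    """
    rho = np.asarray(rho, dtype=float)
    return rho * pbht / (rho * hpht + r_var)

rho = np.array([0.0, 0.5, 1.0])
g_serial = gain_cov_localized(0.8, 1.0, 1.0, rho)
g_local = gain_r_localized(0.8, 1.0, 1.0, rho)
```

The two gains agree at `rho = 1` (no localization) and at `rho = 0` (ob ignored), and differ in between, which is consistent with the remark that the approaches are not mathematically identical.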
Outstanding issues: Both methods assume a priori that covariance is maximized at the observation location - problematic for non-local and time-lagged obs. Both methods are flow-independent (assume the same degree of locality for every situation). Localization can destroy balance. Both methods have deficiencies - the localization is the same all the time, and geostrophic balance is not maintained.
Localization and Balance: Analysis of a single zonal wind observation, using idealized nondivergent and geostrophically balanced covariances. Control imbalance by time-filtering the first-guess forecast. Mid-latitude geophysical flows are nearly in geostrophic balance at the scales of cyclonic storms and fronts. That is, the wind field is proportional to the gradient of the geopotential height field. Localization can mess this up. Here’s an example of the increment associated with a single wind ob, where Pb is in exact geostrophic balance (ageostrophic wind increments are zero). Applying localization to an ensemble sampled from that perfect Pb results in ageostrophic winds. In practice, this means gravity waves will be excited during the forecast cycle. Houtekamer showed that a measure of this gravity wave activity decreases as the localization is relaxed. In practice, the gravity waves tend not to interact very much with the balanced flow, so they can be filtered out of the forecast.
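A one-dimensional analogue shows where the imbalance comes from. The sketch below departs from the slide in several labeled ways: it assimilates a single height observation rather than a wind observation, uses exact (not sampled) covariances, sets the geostrophic constant to 1 so that the balanced wind is v = dh/dx, and uses a Gaussian-shaped taper. All length scales are hypothetical.

```python
import numpy as np

x = np.linspace(-5000.0, 5000.0, 401)           # distance from the ob in km
length = 800.0                                  # assumed correlation length (km)
c_loc = 1000.0                                  # assumed localization scale (km)

cov_hh = np.exp(-0.5 * (x / length) ** 2)       # cov(h(x), h(0)), unit variance
cov_vh = -(x / length**2) * cov_hh              # cov(v(x), h(0)) for v = dh/dx

r_var, innov = 1.0, 1.0                         # ob-error variance and innovation

def increments(taper):
    """Height and wind increments from the single height ob at x = 0."""
    dh = taper * cov_hh / (1.0 + r_var) * innov
    dv = taper * cov_vh / (1.0 + r_var) * innov
    return dh, dv

def ageostrophic(dh, dv):
    """Wind increment minus the wind implied by the height increment."""
    return dv - np.gradient(dh, x)

dh0, dv0 = increments(np.ones_like(x))                   # no localization
dh1, dv1 = increments(np.exp(-0.5 * (x / c_loc) ** 2))   # with localization
res_none = np.max(np.abs(ageostrophic(dh0, dv0)))
res_loc = np.max(np.abs(ageostrophic(dh1, dv1)))
```

Without the taper the residual is only finite-difference error. With the taper, the height increment picks up an extra gradient from the taper itself that the tapered wind increment does not have, so the two no longer satisfy the balance relation.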
Flow Dependent Localization (Hodyss and Bishop, QJR): [Figure: stable-flow and unstable-flow error correlations versus distance (km). Ensembles give flow-dependent, noisy correlations.] Recently, some new algorithms have been proposed to make localization flow dependent. When the flow is stable, one can expect the background-error correlations in the atmosphere to be larger scale (errors are larger scale). When the flow is unstable (e.g. lots of small-scale turbulence, convection), the errors will also be smaller scale. There is more sampling error for a finite ensemble in the unstable situation.
Flow Dependent Localization: [Figure: stable-flow and unstable-flow error correlations with a fixed moderation function, versus distance (km). Today’s fixed moderation functions limit adaptivity.] Current ensemble DA techniques reduce noise by multiplying the ensemble correlation function by a fixed moderation function (green line). The resulting correlations (blue line) are too thin when the true correlation is broad and too noisy when the true correlation is thin. If the same localization is used for both cases, we may unduly sharpen the correlations in the stable case, while not removing enough of the noise in the unstable case. What is needed is ‘adaptivity’.
Flow Dependent Localization: [Figure: stable-flow and unstable-flow error correlations with SENCORP moderation, versus distance (km). SENCORP moderation functions adapt.] Smoothed ENsemble Correlations Raised to a Power (SENCORP) moderation functions provide flow-adaptive moderation functions. In this scheme, the tapering function is broad in the stable case, and much narrower in the unstable case, resulting in better use of the observations in both situations.
“SENCORP” Recipe: 1) Smooth Pb = P1b. 2) Element-wise cube of P1b = P2b. 3) Normalized matrix product of P2b with itself = P3b. 4) Use the element-wise square of P3b to compute K. Here’s a recipe for how their adaptive scheme works. [Figure: true covariance shown in top right (temporal separation, two peaks); other panels: 1) spatially smooth, 2) cube, 3) matrix product.]
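The four steps can be sketched as below, with several details filled in by assumption because the slide does not specify them: the recipe is applied to the sample correlation matrix (following the acronym), the smoother is a simple boxcar average applied to rows and columns, and "normalized" is taken to mean rescaling to a unit diagonal. The function names and the `half_width` parameter are hypothetical, and this should not be read as the authors' implementation.

```python
import numpy as np

def smoothing_matrix(n, half_width):
    """Row-normalized boxcar smoother (an assumed, illustrative choice)."""
    idx = np.arange(n)
    s = (np.abs(idx[:, None] - idx[None, :]) <= half_width).astype(float)
    return s / s.sum(axis=1, keepdims=True)

def sencorp_like_moderation(ens, half_width=2):
    """Return an (n, n) moderation matrix from a (k, n) ensemble."""
    k, n = ens.shape
    pert = ens - ens.mean(axis=0)
    std = pert.std(axis=0, ddof=1)
    corr = (pert / std).T @ (pert / std) / (k - 1)   # sample correlations
    s = smoothing_matrix(n, half_width)
    p1 = s @ corr @ s.T                 # step 1: smooth
    p2 = p1 ** 3                        # step 2: element-wise cube
    p3 = p2 @ p2                        # step 3: matrix product with itself ...
    d = np.sqrt(np.diag(p3))
    p3 = p3 / np.outer(d, d)            # ... rescaled to unit diagonal (assumption)
    return p3 ** 2                      # step 4: element-wise square

rng = np.random.default_rng(0)
ens = rng.standard_normal((25, 60))     # hypothetical 25 members, 60 variables
moderation = sencorp_like_moderation(ens)
```

The result would then multiply the sample covariance element-wise, in place of a fixed distance-based taper, before the gain K is computed.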
Hierarchical Ensemble Filter: Proposed by Jeff Anderson (NCAR). Evolve K coupled N-member ensemble filters. Use differences between the sample covariances to design a situation-dependent localization function. Asymptotes to an optimally localized N-member ensemble (not K*N).
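One way to turn the differences between groups into a localization factor is sketched below. Each group supplies its own regression coefficient `beta` between an observed quantity and a state variable, and a factor `alpha` is chosen to minimize the summed squared difference between `alpha * beta_i` and `beta_j` over all pairs of different groups. That least-squares criterion and its closed-form solution are stated here as an assumption about how such a scheme can work; the slide itself gives no formula.

```python
import numpy as np

def regression_confidence_factor(betas):
    """Localization factor in [0, 1] from per-group regression coefficients.

    Minimizing sum_j sum_{i != j} (alpha * beta_i - beta_j)^2 over alpha gives
    alpha = ((sum beta)^2 / sum beta^2 - 1) / (n_groups - 1), clipped at zero.
    """
    betas = np.asarray(betas, dtype=float)
    n_groups = betas.size
    sum_sq = np.sum(betas ** 2)
    if n_groups < 2 or sum_sq == 0.0:
        return 0.0
    alpha = (np.sum(betas) ** 2 / sum_sq - 1.0) / (n_groups - 1)
    return max(0.0, float(alpha))

agree = regression_confidence_factor([1.0, 1.0, 1.0, 1.0])       # groups agree
noise = regression_confidence_factor([1.0, -1.0, 1.0, -1.0])     # groups disagree
close = regression_confidence_factor([0.5, 0.4, 0.6])            # mostly agree
```

When the groups agree, the factor is 1 and the observation is used fully; when the coefficients are dominated by sampling noise, the factor drops to 0, so no distance-based taper has to be specified in advance.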
Conclusions: Localization (tapering the impact of observations with distance from the analysis grid point) makes ensemble data assimilation feasible with large NWP models. Both model errors and localization make filter performance suboptimal. Right now model error is the bigger problem, but improvements in localization are needed.