P2.5 Sensitivity of Surface Air Temperature Analyses to Background and Observation Errors
Daniel Tyndall (dan.tyndall@utah.edu) and John Horel
Department of Atmospheric Sciences, University of Utah

Introduction
This study presents a methodology for assessing overfitting and analysis accuracy in a local variational data assimilation system, using a Hilbert curve to select the observations withheld for data denial
The methodology is tested using a detailed case study as well as analyses from two additional days

Case Study
Examination of a 4° × 4° area over the Shenandoah Valley, VA
Strong surface inversion situation analyzed: 0900 UTC 22 October 2007
Constraining the analysis (b and c) too tightly to the observations results in an unphysical temperature feature near the spine of the Blue Ridge Mtns.
Increasing σo²/σb² and the decorrelation length scales (e and f) results in greater lateral and vertical influence of the observations
RMSE values illustrate overfitting when the analysis is drawn too tightly to the observations (e.g., Exp. 7 in table):
  Difference between RMSE at withheld and all observations becomes very large
  Sensitivity also increases
Goal isn’t to minimize RMSE, but to minimize the difference between RMSE at withheld and all observations (want similar analysis error in data voids and data-rich areas)

RMSE and Sensitivity – 0900 UTC 22 October 2007
# | Experiment                           | RMSE Using All Observations (°C) | RMSE Using Withheld Observations (°C) | Sensitivity (°C)
B | Background                           | 2.15 |      | -
1 | R = 40 km, Z = 100 m, σo²/σb² = 1    | 1.62 | 1.93 | 0.26
2 | R = 80 km, Z = 200 m, σo²/σb² = 1    | 1.80 | 1.98 | 0.29
3 | R = 20 km, Z = 50 m, σo²/σb² = 1     | 1.41 | 1.89 | 0.20
4 | R = 40 km, Z = 100 m, σo²/σb² = 0.5  | 1.54 | 1.94 | 0.34
5 | R = 40 km, Z = 100 m, σo²/σb² = 2    | 1.70 |      | 0.19
6 | R = 80 km, Z = 200 m, σo²/σb² = 2    | 1.83 |      | 0.22
7 | R = 20 km, Z = 50 m, σo²/σb² = 0.5   | 1.67 | 1.90 |

Local Surface Analysis (LSA)
2D variational surface analysis system that uses a downscaled Rapid Update Cycle (RUC) 1-hr forecast for its background
Unphysical features can develop in background
field during temperature inversions from downscaling
2-m air temperature observations assimilated from various mesonets and METAR sites
Observations must fall within a ±12 min time window about the analysis hour, or a −30/+12 min time window for RAWS networks
Background error covariance specified using the background error variance and the horizontal and vertical decorrelation length scales

Further Evaluation
Two additional days of analyses examined to further evaluate the method:
  20 May 2009 – synoptically quiescent period with morning inversion
  26 May 2009 – synoptically active period with regressing and progressing front through domain
Additional analyses confirm that the difference between RMSE values at all and withheld observations increases when the analysis is constrained too tightly
Sensitivity across the grid also increases (compared to other experiments)

[Figure, panels a-f. Panel titles: Domain Topography (labeled features: Washington, D.C.; Blue Ridge Mtns.; Shenandoah Valley; Appalachian Mtns.); Background; Analysis, R = 40 km, Z = 100 m, σo²/σb² = 1; Increments, R = 40 km, Z = 100 m, σo²/σb² = 1; Analysis, R = 80 km, Z = 200 m, σo²/σb² = 2; Increments, R = 80 km, Z = 200 m, σo²/σb² = 2]

RMSE and Sensitivity – 20 May 2009 – Synoptically quiescent period
# | Experiment                           | RMSE Using All Observations (°C) | RMSE Using Withheld Observations (°C) | Sensitivity (°C)
B | Background                           | 2.41 | 2.41 | -
1 | R = 40 km, Z = 100 m, σo²/σb² = 1    | 1.99 | 2.23 | 0.24
2 | R = 80 km, Z = 200 m, σo²/σb² = 1    | 2.12 | 2.22 | 0.19
3 | R = 20 km, Z = 50 m, σo²/σb² = 1     | 1.81 | 2.25 | 0.23
4 | R = 40 km, Z = 100 m, σo²/σb² = 0.5  | 1.94 |      | 0.30
5 | R = 40 km, Z = 100 m, σo²/σb² = 2    | 2.04 |      | 0.18
6 | R = 80 km, Z = 200 m, σo²/σb² = 2    | 2.14 |      | 0.16
7 | R = 20 km, Z = 50 m, σo²/σb² = 0.5   | 1.72 | 2.28 | 0.31

Covariance Estimation
Original σo²:σb² suggested for background field was 1:1, with horizontal (R) and vertical (Z)
decorrelation length scales of 40 km and 100 m, respectively
Covariances determined through a statistical analysis of a month-long sample of observations across CONUS (see Myrick and Horel (2006) for details on technique)
Results of statistical analysis show σo²:σb² should be doubled (2:1)
Observation errors remain strongly correlated beyond original decorrelation length scales

RMSE and Sensitivity – 26 May 2009 – Synoptically active period
# | Experiment                           | RMSE Using All Observations (°C) | RMSE Using Withheld Observations (°C) | Sensitivity (°C)
B | Background                           | 2.25 |      | -
1 | R = 40 km, Z = 100 m, σo²/σb² = 1    | 1.81 | 2.01 | 0.21
2 | R = 80 km, Z = 200 m, σo²/σb² = 1    | 1.92 |      | 0.17
3 | R = 20 km, Z = 50 m, σo²/σb² = 1     | 1.65 | 2.03 |
4 | R = 40 km, Z = 100 m, σo²/σb² = 0.5  | 1.77 |      | 0.27
5 | R = 40 km, Z = 100 m, σo²/σb² = 2    | 1.85 | 1.93 | 0.16
6 | R = 80 km, Z = 200 m, σo²/σb² = 2    | 1.94 |      | 0.14
7 | R = 20 km, Z = 50 m, σo²/σb² = 0.5   | 1.58 | 2.06 |

Data Denial Methodology
Observations were randomly withheld using the Hilbert curve
Hilbert curve uniformly removes observations from non-randomly distributed observation networks
Two error measures computed:
  Root-mean-square error (RMSE) at observation gridpoints
  Root-mean-square sensitivity at all gridpoints

Summary
Overfitting occurs when the analysis is drawn too tightly to observations; this results in a large difference between RMSE computed at all locations and RMSE computed at withheld locations
Preferable to have analyses with comparable errors in data-rich regions and data-void areas

[Figure: axis scaled from 0.0 to 1.0; legend entries R = 40 km, Z = 100 m and R = 80 km, Z = 200 m]

Using the Hilbert curve to withhold observations. (1) The analysis domain is converted into a unit square. (2) The unit square is repeatedly subdivided into smaller squares until each smaller square contains at most one observation (some squares may be empty).
(3) The Hilbert curve is drawn through the domain, with one vertex of the curve in each of the smaller squares. (4) Vertices of the Hilbert curve without observations are removed, condensing the curve. (5) Observations are withheld based on their order along the curve.
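
The five steps in the caption can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the code used in the study. The function names (`hilbert_index`, `hilbert_order`, `withhold_every_kth`), the `bounds` argument, the `max_order` cap, and in particular the rule of withholding every k-th observation along the curve are all hypothetical; the poster states only that observations are withheld according to their order on the curve.

```python
def hilbert_index(order, x, y):
    """Distance along the Hilbert curve of integer cell (x, y) on a
    2**order by 2**order grid (standard bit-manipulation conversion)."""
    n = 1 << order
    d = 0
    s = n >> 1
    while s > 0:
        rx = 1 if (x & s) else 0
        ry = 1 if (y & s) else 0
        d += s * s * ((3 * rx) ^ ry)
        # Rotate/flip the quadrant so the sub-curve has the right orientation.
        if ry == 0:
            if rx == 1:
                x = n - 1 - x
                y = n - 1 - y
            x, y = y, x
        s >>= 1
    return d


def hilbert_order(points, bounds, max_order=16):
    """Return observation indices sorted by position along the Hilbert curve.

    points: list of (x, y) observation locations (e.g., lon, lat).
    bounds: ((x_min, x_max), (y_min, y_max)) of the analysis domain (assumed).
    """
    (x0, x1), (y0, y1) = bounds
    # Step 1: map the analysis domain onto the unit square.
    unit = [((x - x0) / (x1 - x0), (y - y0) / (y1 - y0)) for x, y in points]
    # Step 2: subdivide until each cell holds at most one observation
    # (capped at max_order so co-located observations cannot loop forever).
    for order in range(1, max_order + 1):
        n = 1 << order
        cells = [(min(int(u * n), n - 1), min(int(v * n), n - 1))
                 for u, v in unit]
        if len(set(cells)) == len(cells):
            break
    # Steps 3-4: index each occupied cell along the curve; empty cells never
    # appear in the list, which is equivalent to removing empty vertices.
    keys = [hilbert_index(order, cx, cy) for cx, cy in cells]
    return sorted(range(len(points)), key=keys.__getitem__)


def withhold_every_kth(points, bounds, k=10, offset=0):
    """Step 5 (assumed rule): withhold every k-th observation along the curve."""
    ranked = hilbert_order(points, bounds)
    withheld = [i for rank, i in enumerate(ranked) if rank % k == offset]
    kept = [i for rank, i in enumerate(ranked) if rank % k != offset]
    return kept, withheld


# Hypothetical usage: four observations, one per quadrant of the domain.
obs = [(0.75, 0.25), (0.25, 0.25), (0.75, 0.75), (0.25, 0.75)]
kept, withheld = withhold_every_kth(obs, ((0.0, 1.0), (0.0, 1.0)), k=2)
```

Because neighbouring positions on the Hilbert curve are also neighbours in space, thinning along the curve removes observations roughly in proportion to local station density. Varying `offset` from 0 to k-1 would give k disjoint withheld sets, should repeated data denial experiments be wanted.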