Robin Hogan, Ewan O'Connor, Anthony Illingworth, University of Reading, UK. Clouds radar collaboration meeting, 17 Nov 09. Ground-based evaluation of cloud forecasts.

Presentation transcript:

Robin Hogan, Ewan O'Connor, Anthony Illingworth, University of Reading, UK. Clouds radar collaboration meeting, 17 Nov 09. Ground-based evaluation of cloud forecasts.

Project Aim: to retrieve and evaluate the crucial cloud variables in forecast and climate models
–8+ models: global, mesoscale and high-resolution forecast models
–Variables: cloud fraction, LWC, IWC, plus a number of others
–Sites: 4 across Europe plus worldwide ARM sites
–Period: several years, to avoid unrepresentative case studies
Current status:
–Funded by the US Department of Energy Climate Change Prediction Program to apply to ARM data worldwide
–Application to FP7 Infrastructure, bid 3 Dec 09, joint with EUSAAR (trace gases) and EARLINET (lidar/aerosol): ACTRIS, the Aerosols, Clouds and Trace gases Research Infrastructure Network.

Level 1b: minimum instrument requirements at each site
–Cloud radar, lidar, microwave radiometer, rain gauge, model or sondes
(Figure: example radar and lidar quicklooks.)

Level 1c: Instrument Synergy product
–Example of target classification (ice, liquid, rain, aerosol) and data quality fields:

Level 2a/2b: cloud products on (L2a) the observational grid and (L2b) the model grid
–Water content and cloud fraction
(Figures: L2a IWC on the radar/lidar grid; L2b cloud fraction on the model grid.)

(Figure: cloud fraction time-height sections from Chilbolton observations and the Met Office mesoscale, ECMWF global, Meteo-France ARPEGE, KNMI RACMO and Swedish RCA models.)

Cloud fraction in 7 models: mean and PDF for 2004 for Chilbolton, Paris and Cabauw, 0-7 km (Illingworth et al., BAMS 2007)
–All models except DWD underestimate mid-level cloud
–Some have separate "radiatively inactive" snow (ECMWF, DWD); the Met Office has combined ice and snow but still underestimates cloud fraction
–Wide range of low cloud amounts in the models
–Not enough overcast boxes, particularly in the Met Office model

Comparison of NAE (12 km) and 4 km model: 3 months of data; ideally the global and 1.5 km models as well. Compare 12 km with 4 km, and also with 4 km averaged over 3x3 boxes. Is the performance any better at 4 km? Can it make more overcast skies? Any improvement in mid-level cloud? What about low-level clouds? And what about getting the right cloud in the right place at the right time, i.e. skill scores?

NAE 12 km: mean fraction too low. Equitable threat score falls with cloud-fraction threshold and with the age of the forecast.

4 km (3x3): mean fraction too low. Equitable threat score shows spin-up over 0-5 hrs.

4 km (each box): mean fraction too low. Equitable threat score: same as for 3x3.

CLOUD FRACTION: NAE 12 km, 6-11 hr. Can't make overcast skies.

CLOUD FRACTION: 4 km (3x3), 6-11 hr. Can't make overcast skies.

CLOUD FRACTION: 4 km (each box), 6-11 hr. Can't make overcast skies; BL clouds are worse!

IWC: NAE 12 km.

IWC: 4 km (3x3). Improved PDF for mid-level cloud, but still missing the higher IWC values.

IWC: 4 km (each box).

Diurnal cycle composite of clouds (Barrett, Hogan & O'Connor, GRL 2009)
–Radar and lidar provide cloud boundaries and cloud properties above the site
–Meteo-France: local mixing scheme, too little entrainment
–SMHI: prognostic TKE scheme, no diurnal evolution
–All other models have a non-local mixing scheme in unstable conditions and an explicit formulation for entrainment at cloud top: better performance over the diurnal cycle

Contingency tables. Example from the DWD model at Murgtal:

                    Observed cloud     Observed clear-sky
Model cloud         a = 7194 (hit)     b = 4098 (false alarm)
Model clear-sky     c = 4502 (miss)    d = ... (clear-sky hit)

For a given set of observed events there are only 2 degrees of freedom in all possible forecasts (e.g. a and b), because 2 quantities are fixed:
–Number of events that occurred, n = a + b + c + d
–Base rate (observed frequency of occurrence), p = (a + c)/n
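These quantities map directly onto a few lines of code. A minimal Python sketch (not from the talk); the value of d below is illustrative only, since it is truncated in the transcript:

```python
def contingency_stats(a, b, c, d):
    """Derived quantities from a 2x2 contingency table of cloud occurrence."""
    n = a + b + c + d              # number of events
    p = (a + c) / n                # base rate (observed frequency of occurrence)
    hit_rate = a / (a + c)         # H: fraction of observed events forecast
    false_alarm_rate = b / (b + d)
    return n, p, hit_rate, false_alarm_rate

# Counts from the DWD model at Murgtal; d is illustrative only
n, p, H, F = contingency_stats(a=7194, b=4098, c=4502, d=35000)
print(f"n = {n}, p = {p:.3f}, H = {H:.3f}, FAR = {F:.3f}")
```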

Desirable properties of verification measures:
1. Equitable: all random forecasts receive expected score zero
–Constant forecasts of occurrence or non-occurrence also score zero
–Note that forecasting the right cloud climatology versus height, but with no other skill, should also score zero
2. Useful for rare events
–Almost all measures are degenerate, in that they asymptote to 0 or 1 for vanishingly rare events

Extreme dependency score. Stephenson et al. (2008) explained this behaviour:
–Almost all scores have a meaningless limit as the base rate p -> 0
–HSS tends to zero and LOR tends to infinity
They proposed the Extreme Dependency Score,
EDS = 2 ln[(a + c)/n] / ln(a/n) - 1,
where n = a + b + c + d. It can be shown that this score tends to a meaningful limit as p -> 0.

Symmetric extreme dependency score
EDS problems:
–Easy to hedge (unless calibrated)
–Not equitable
Solved by defining a symmetric version,
SEDS = ln[(a + b)(a + c)/n^2] / ln(a/n) - 1:
–All the benefits of EDS, none of the drawbacks!
Hogan, O'Connor and Illingworth (2009, QJRMS)
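Both scores are simple functions of the contingency-table entries. A sketch using the published formulas, demonstrated on Finley's (1884) tornado counts, the rare-event example cited later in the talk:

```python
import numpy as np

def eds(a, b, c, d):
    """Extreme Dependency Score (Stephenson et al. 2008)."""
    n = a + b + c + d
    return 2.0 * np.log((a + c) / n) / np.log(a / n) - 1.0

def seds(a, b, c, d):
    """Symmetric Extreme Dependency Score (Hogan et al. 2009)."""
    n = a + b + c + d
    return np.log((a + b) * (a + c) / n**2) / np.log(a / n) - 1.0

# Finley's (1884) tornado forecasts: a rare event with base rate ~2%
a, b, c, d = 28, 72, 23, 2680
print(f"EDS = {eds(a, b, c, d):.2f}, SEDS = {seds(a, b, c, d):.2f}")
```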

Skill versus height (panels: LOR, HSS, LBSS, EDS, SEDS)
–Most scores are not reliable near the tropopause, because cloud fraction tends to zero
The new score reveals:
–Skill tends to decrease slowly at the tropopause
–Mid-level clouds (4-5 km) are the most skilfully predicted, particularly by the Met Office
–Boundary-layer clouds are the least skilfully predicted

What is the origin of the term ETS?
First use of the Equitable Threat Score: Mesinger & Black (1992)
–A modification of the Threat Score, a/(a+b+c)
–They cited Gandin and Murphy's equitability requirement that constant forecasts score zero (which ETS satisfies), although it doesn't satisfy the requirement that non-constant random forecasts have expected score 0
–ETS is now one of the most widely used verification measures in meteorology
An example of rediscovery:
–Gilbert (1884) discussed a/(a+b+c) as a possible verification measure in the context of Finley's (1884) tornado forecasts
–Gilbert noted the deficiencies of this and also proposed exactly the same formula as ETS, 108 years before!
We suggest that ETS be referred to as the Gilbert Skill Score (GSS)
–Or use the Heidke Skill Score, which is unconditionally equitable and is uniquely related to it: ETS = HSS / (2 - HSS)
Hogan, Ferro, Jolliffe and Stephenson (WAF, in press)
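A sketch of the two scores and the quoted identity between them, using the standard formulas (the counts are from the contingency-table slide, with d again illustrative):

```python
def hss(a, b, c, d):
    """Heidke Skill Score: (x - E(x)) / (n - E(x)) with x = a + d."""
    n = a + b + c + d
    e_x = ((a + b) * (a + c) + (c + d) * (b + d)) / n  # correct by chance
    return (a + d - e_x) / (n - e_x)

def gss(a, b, c, d):
    """Gilbert Skill Score (the 'ETS'): (a - E(a)) / (a + b + c - E(a))."""
    n = a + b + c + d
    e_a = (a + b) * (a + c) / n   # expected hits for unbiased random forecasts
    return (a - e_a) / (a + b + c - e_a)

a, b, c, d = 7194, 4098, 4502, 35000   # d illustrative
h = hss(a, b, c, d)
assert abs(gss(a, b, c, d) - h / (2.0 - h)) < 1e-12  # GSS = HSS / (2 - HSS)
```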

THUS FAR DISCUSSED: Cloudnet; clouds in the 4 km v 12 km NAE; the diurnal cycle of BL clouds in various models; problems with the ETS (now GSS), so use SEDS. Now DRIZZLE! BL clouds in models drizzle all the time. New observations from CloudSat/CALIPSO compared with the forward-modelled fields from ECMWF.

A-Train vs ECMWF. Observations of Z versus LWP: at LWP = 100 g/m^2 the observed Z is -22 dBZ. ECMWF forward model: at LWP = 100 g/m^2, Z is 0 dBZ, i.e. 160 times too much drizzle! Model drizzle rate 0.03 mm/hr {20 W/m^2: a 300 m layer cools at 0.3 K/hr}. The observed -22 dBZ corresponds to 0.4 g/m^3, or 0.001 mm/hr (1 mm per month: 0.6 W/m^2).

ECMWF rain flux parameterisation. Autoconversion of cloud mixing ratio q_cl to rain mixing ratio q_r: dq_r/dt = K q_cl, with a threshold term that turns off autoconversion for values below q_cl,crit = 0.3 g/kg. Without the threshold term, dq_r/dt is proportional to q_cl, so the implied drizzle rate scales directly with LWP. Adding the threshold and assuming adiabatic clouds gives 0.03 mm/hr (0 dBZ). So why not increase q_cl,crit to stop all the drizzle forming? NO! This would increase the LWP of all water clouds, make them too bright and destroy the global radiation balance.
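A minimal sketch of an autoconversion rate with this general behaviour; the smooth exponential cut-off and the rate constant K are assumptions for illustration, not the exact ECMWF formulation:

```python
import numpy as np

def autoconversion_rate(q_cl, k=1e-4, q_crit=0.3e-3):
    """dq_r/dt = K * q_cl, suppressed below the threshold q_cl,crit.

    q_cl, q_crit in kg/kg (0.3 g/kg as on the slide); k in 1/s (assumed).
    """
    # Exponential threshold: rate -> 0 for q_cl << q_crit, -> K*q_cl above it
    return k * q_cl * (1.0 - np.exp(-(q_cl / q_crit) ** 2))

for q_gkg in (0.05, 0.3, 0.6):
    rate = autoconversion_rate(q_gkg * 1e-3)
    print(f"q_cl = {q_gkg:.2f} g/kg -> dq_r/dt = {rate:.2e} kg/kg/s")
```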

Evidence that the clouds in ECMWF are more adiabatic than observed? Observed: 25% adiabatic? Modelled: 50% adiabatic? Model autoconversion for LWP = 100 g/m^2:
–100% adiabatic: 0.03 mm/hr (0 dBZ); 300 m deep, max LWC 0.6 g/m^3
–50% adiabatic: 0.02 mm/hr; 450 m deep, max LWC 0.45 g/m^3
–25% adiabatic: 0.01 mm/hr (-8 dBZ); 700 m deep, max LWC 0.3 g/m^3
CloudSat gate 500 m; cloud amount >80%.

LWC: NAE 12 km.

LWC: 4 km (3x3).

LWC: 4 km (each box).

How skilful is a forecast? Most model evaluations of clouds test the cloud climatology
–What about individual forecasts?
The standard measure shows an ECMWF forecast half-life of ~6 days in 1980 and ~9 days in 2000
–But it is virtually insensitive to clouds!
(Figure: ECMWF 500-hPa geopotential anomaly correlation.)

Overview
The Cloudnet processing of ground-based radar and lidar observations
–Continuous evaluation of the climatology of clouds in models
–Evaluation of the diurnal cycle of boundary-layer clouds
Desirable properties of verification measures (skill scores)
–Usefulness for rare events: the Symmetric Extreme Dependency Score
–Equitability: is the Equitable Threat Score equitable?
Testing the skill of cloud forecasts from seven models
–Skill versus cloud fraction, height, scale, forecast lead time, season...
–Estimating the forecast half-life
Testing the skill of cloud forecasts from space
–Evaluation of the ECMWF model with ICESat/GLAS lidar
Most results taken from these papers:
–Hogan, O'Connor & Illingworth (QJ 2009)
–Hogan, Ferro, Jolliffe & Stephenson (WAF, in press)

Joint PDFs of cloud fraction
–Raw (1 hr) resolution: 1 year from Murgtal, DWD COSMO model
–6-hr averaging
...or use a simple contingency table (cells a, b, c, d).

Skill-bias diagrams. (Figure: schematic of all possible forecasts of a given reality with n = 16 and p = 1/4, spanning best possible forecast, positive skill, random forecast, negative skill and worst possible forecast on one axis, and constant forecast of non-occurrence, random unbiased forecast and constant forecast of occurrence on the other, i.e. under-prediction, no bias, over-prediction.)

Hedging: issuing a forecast that differs from your true belief in order to improve your score (e.g. Jolliffe 2008)
Hit rate H = a/(a+c):
–Fraction of events correctly forecast
–Easily hedged by randomly changing some forecasts of non-occurrence to occurrence (H = 0.5 -> H = 0.75 -> H = 1)
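The hedge described here is easy to reproduce with synthetic data: a forecast that starts at H = 0.5 is "improved" to H = 0.75 and then H = 1 simply by randomly flipping forecasts of non-occurrence to occurrence. A sketch (all numbers synthetic):

```python
import numpy as np

rng = np.random.default_rng(0)

def hit_rate(fcst, obs):
    """H = a / (a + c): fraction of observed events that were forecast."""
    return np.sum(fcst & obs) / np.sum(obs)

n = 100_000
obs = rng.random(n) < 0.25              # synthetic events, base rate 0.25
fcst = obs & (rng.random(n) < 0.5)      # a forecast that catches half of them

for flip in (0.0, 0.5, 1.0):
    hedged = fcst | (rng.random(n) < flip)  # flip 'no' to 'yes' at random
    print(f"flip fraction {flip:.1f}: H = {hit_rate(hedged, obs):.2f}")
```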

Equitability, as defined by Gandin and Murphy (1992):
Requirement 1: an equitable verification measure awards all random forecasting systems, including those that always forecast the same value, the same expected score
–Inequitable measures rank some random forecasts above skilful ones
Requirement 2: an equitable verification measure S must be expressible as a linear weighted sum of the elements of the contingency table, i.e. S = (S_a a + S_b b + S_c c + S_d d) / n
–This can safely be discarded: it is incompatible with other desirable properties, e.g. usefulness for rare events
Gandin and Murphy reported that only the Peirce Skill Score, and linear transforms of it, is equitable by their requirements
–PSS = hit rate minus false alarm rate = a/(a+c) - b/(b+d)
–What about all the other measures reported to be equitable?

Some reportedly equitable measures:
–HSS = [x - E(x)] / [n - E(x)], with x = a + d
–ETS = [a - E(a)] / [a + b + c - E(a)]
–LOR = ln[ad/bc]
–ORSS = [ad/bc - 1] / [ad/bc + 1]
where E(a) = (a+b)(a+c)/n is the expected value of a for an unbiased random forecasting system.
Random and constant forecasts all score zero, so these measures are all equitable, right?
Simple attempts to hedge will fail for all these measures.

Skill versus cloud-fraction threshold. Consider 7 models evaluated over 3 European sites in 2004: LOR implies that skill increases for a larger cloud-fraction threshold, while HSS implies that skill decreases significantly for a larger cloud-fraction threshold. (Panels: LOR, HSS.)

Extreme dependency score. Stephenson et al. (2008) explained this behaviour:
–Almost all scores have a meaningless limit as the base rate p -> 0
–HSS tends to zero and LOR tends to infinity
They proposed the Extreme Dependency Score,
EDS = 2 ln[(a + c)/n] / ln(a/n) - 1,
where n = a + b + c + d. It can be shown that this score tends to a meaningful limit:
–Rewrite in terms of hit rate H = a/(a+c) and base rate p = (a+c)/n, using a/n = Hp: EDS = 2 ln p / (ln H + ln p) - 1
–Then assume a power-law dependence of H on p as p -> 0: H ~ p^delta
–In the limit p -> 0 we find EDS = (1 - delta)/(1 + delta)
–This is useful because random forecasts have hit rate converging to zero at the same rate as the base rate: delta = 1, so EDS = 0
–Perfect forecasts have constant hit rate with base rate: delta = 0, so EDS = 1

Skill versus cloud-fraction threshold. SEDS has much flatter behaviour for all models (except for the Met Office, which underestimates high cloud occurrence significantly). (Panels: LOR, HSS, SEDS.)

A surprise? Is mid-level cloud well forecast???
–The frequency of occurrence of these clouds is commonly too low (e.g. from Cloudnet: Illingworth et al. 2007)
–Specification of cloud phase is cited as a problem
–Higher skill could be because large-scale ascent has its largest amplitude here, so the cloud response to large-scale dynamics is clearest at mid levels
–Higher skill for the Met Office models (global and mesoscale) because they have arguably the most sophisticated microphysics, with separate liquid and ice water content (Wilson and Ballard 1999)?
Low skill for boundary-layer cloud is not a surprise!
–A well-known problem for forecasting (Martin et al. 2000)
–Occurrence and height are a subtle function of subsidence rate, stability, free-troposphere humidity, surface fluxes, entrainment rate...

Key properties for estimating the half-life
We wish to model the score S versus forecast lead time t as S(t) = S_0 2^(-t/t_1/2), where t_1/2 is the forecast half-life and S_0 the initial score.
We need linearity:
–Some measures saturate at the high-skill end (e.g. Yule's Q / ORSS)
–This leads to a misleadingly long half-life
...and equitability:
–The formula above assumes that the score tends to zero for very long forecasts, which will only occur if the measure is equitable
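Estimating the half-life is then a two-parameter curve fit. A sketch with made-up scores, purely to show the mechanics of the fit:

```python
import numpy as np
from scipy.optimize import curve_fit

def decay(t, s0, half_life):
    """S(t) = S0 * 2**(-t / half_life): inverse-exponential skill decay."""
    return s0 * 2.0 ** (-t / half_life)

t_days = np.array([0.25, 0.5, 1.0, 1.5, 2.0, 3.0])      # forecast lead time
score = np.array([0.42, 0.40, 0.35, 0.31, 0.27, 0.21])  # hypothetical SEDS
(s0, half_life), _ = curve_fit(decay, t_days, score, p0=(0.4, 3.0))
print(f"S0 = {s0:.2f}, half-life = {half_life:.1f} days")
```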

Which measures are equitable? A random forecasting system may score zero at the expected values of a-d:
–S[E(a), E(b), E(c), E(d)] = 0
But the expected score may not be zero!
–E[S(a,b,c,d)] = sum over (a,b,c,d) of P(a,b,c,d) S(a,b,c,d)
The width of the random probability distribution decreases for larger sample size n
–A measure is only equitable if positive and negative scores cancel
(Figure: score distributions for n = 16 and n = 80; ETS and ORSS are asymmetric.)
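The distinction between S[E(a), E(b), E(c), E(d)] = 0 and E[S] = 0 is easy to check by Monte Carlo. A sketch that estimates the expected ETS of unbiased random forecasts for several sample sizes; it is clearly non-zero at small n, anticipating the next slide:

```python
import numpy as np

rng = np.random.default_rng(1)

def ets(a, b, c, d):
    n = a + b + c + d
    e_a = (a + b) * (a + c) / n
    return (a - e_a) / (a + b + c - e_a)

def expected_random_score(score, n, p, trials=20_000):
    """Monte Carlo E[S] for unbiased random forecasts with base rate p."""
    obs = rng.random((trials, n)) < p
    fcst = rng.random((trials, n)) < p
    a = np.sum(fcst & obs, axis=1)
    b = np.sum(fcst & ~obs, axis=1)
    c = np.sum(~fcst & obs, axis=1)
    d = n - a - b - c
    with np.errstate(invalid="ignore", divide="ignore"):
        return np.nanmean(score(a, b, c, d))  # NaN: degenerate tables

for n in (16, 30, 80, 300):
    print(f"n = {n:4d}: E[ETS] = {expected_random_score(ets, n, 0.5):+.4f}")
```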

Asymptotic equitability
Consider first unbiased forecasts of events that occur with probability p = 1/2:
–The expected value of the Equitable Threat Score for a random forecasting system decreases below 0.01 only when n > 30
–This behaviour we term "asymptotic equitability"
–Other measures are never equitable, e.g. the Critical Success Index CSI = a/(a+b+c), also known as the Threat Score

What about rarer events?
–The Equitable Threat Score is still virtually equitable for n > 30
–ORSS, EDS and SEDS approach zero much more slowly with n
–For events that occur 2% of the time (e.g. Finley's tornado forecasts), we need n > 25,000 before the magnitude of the expected score is less than 0.01
–But these measures are supposed to be useful for rare events!

Possible solutions:
1. Ensure n is large enough that E(a) > 10
2. Inequitable scores can be scaled to make them equitable
–This opens the way to a new class of non-linear equitable measures
3. Report confidence intervals and p-values (the probability of a score being achieved by chance)
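The transcript does not show the rescaling formula in option 2; one plausible form, consistent with the generalized skill score used later in the deck, subtracts the expected random score and renormalises. A sketch, reusing expected_random_score from the Monte Carlo example above:

```python
def equitably_transformed(score, a, b, c, d, p, perfect=1.0):
    """(S - E[S]) / (S_perfect - E[S]): expected value 0 for random
    forecasts and 1 for perfect ones, at this sample size and base rate."""
    n = a + b + c + d
    e_s = expected_random_score(score, n, p)
    return (score(a, b, c, d) - e_s) / (perfect - e_s)
```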

Properties of various measures (Y = yes, N = no, ~ = only asymptotically/approximately):

Measure                                                Equitable  Useful for rare events  Linear
Peirce Skill Score (PSS); Heidke Skill Score (HSS)         Y               N                Y
Equitably Transformed SEDS                                 Y               Y                ~
Symmetric Extreme Dependency Score (SEDS)                  ~               Y                ~
Log of Odds Ratio (LOR)                                    ~               ~                ~
Odds Ratio Skill Score, ORSS (also known as Yule's Q)      ~               ~                N
Gilbert Skill Score, GSS (formerly ETS)                    ~               N                N
Extreme Dependency Score (EDS)                             N               Y                ~
Hit rate (H); false alarm rate (FAR)                       N               N                Y
Critical Success Index (CSI)                               N               N                N

Grouping: truly equitable (top two rows), asymptotically equitable (~ in the Equitable column), not equitable (N).

Skill versus lead time. Only possible for the UK Met Office 12-km model and the German DWD 7-km model:
–Steady decrease of skill with lead time
–Both models appear to improve between 2004 and 2007
Generally, the UK model is best over the UK and the German model best over Germany
–An exception is Murgtal in 2007 (the Met Office model wins)

Forecast half-life. Fit an inverse-exponential, S(t) = S_0 2^(-t/t_1/2):
–S_0 is the initial score and t_1/2 is the half-life
A noticeably longer half-life is fitted after 36 hours:
–The same was found for Met Office rainfall forecasts (Roberts 2008)
–The first timescale is due to data assimilation and convective events
–The second is due to more predictable large-scale weather systems
(Figure annotations: fitted half-lives between 2.4 and 4.3 days for the Met Office and DWD models.)

Different spatial scales? Convection?
–Average temporally before calculating skill scores: the absolute score and the half-life increase with the number of hours averaged
Why is the half-life less for clouds than for pressure?
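Averaging temporally before calculating skill scores can be sketched as a block average followed by thresholding. The 0.05 cloud-fraction threshold follows the monthly-skill slide later in the deck; the function and variable names are illustrative:

```python
import numpy as np

def averaged_contingency(obs_cf, mod_cf, hours, thresh=0.05):
    """Block-average hourly cloud fractions over `hours`, then threshold
    to a binary event and count the contingency-table entries."""
    m = len(obs_cf) // hours * hours
    o = obs_cf[:m].reshape(-1, hours).mean(axis=1) > thresh
    f = mod_cf[:m].reshape(-1, hours).mean(axis=1) > thresh
    a, b = np.sum(f & o), np.sum(f & ~o)
    c, d = np.sum(~f & o), np.sum(~f & ~o)
    return a, b, c, d
```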

Cloud is noisier than geopotential height Z because it is separated from it by around two orders of differentiation:
–Cloud ~ vertical wind ~ relative vorticity ~ grad^2 of the streamfunction ~ grad^2 of pressure
–This suggests cloud observations should be used routinely to evaluate models
(Figures: geopotential height anomaly and vertical velocity.)
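A small numerical illustration, not from the talk, of why two orders of differentiation make a field noisier: differentiating twice multiplies each Fourier mode by (2*pi*k)^2, shifting variance towards small scales:

```python
import numpy as np

rng = np.random.default_rng(2)

# Smooth 'pressure-like' 1D field: red spectrum, amplitude ~ k**-2
n = 512
k = np.fft.rfftfreq(n, d=1.0)
spec = np.zeros(k.size, dtype=complex)
spec[1:] = k[1:] ** -2.0 * np.exp(2j * np.pi * rng.random(k.size - 1))
pressure_like = np.fft.irfft(spec, n)

# Apply d2/dx2 spectrally: multiply each mode by -(2*pi*k)**2
cloud_like = np.fft.irfft(-((2 * np.pi * k) ** 2) * spec, n)

def high_k_variance_fraction(x):
    s = np.abs(np.fft.rfft(x)) ** 2
    return s[s.size // 2:].sum() / s[1:].sum()

print(f"smooth field: {high_k_variance_fraction(pressure_like):.1%} at high k")
print(f"after d2/dx2: {high_k_variance_fraction(cloud_like):.1%} at high k")
```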

Satellite observations: ICESat
–Cloud observations from the ICESat 0.5-micron lidar (first data Feb 2004)
–Global coverage, but the lidar is attenuated by thick clouds: direct model comparison is difficult
–Optically thick liquid cloud obscures the view of any clouds beneath
–Solution: forward-model the measurements (including attenuation) using the ECMWF variables
(Figure: lidar apparent backscatter coefficient (m^-1 sr^-1) versus latitude.)

Global cloud fraction comparison (Wilkinson, Hogan, Illingworth and Benedetti, MWR 2008)
(Panels: ECMWF raw cloud fraction; ECMWF processed cloud fraction; ICESat cloud fraction.)
Results for October 2003:
–Tropical convection peaks too high
–Too much polar cloud
–Elsewhere agreement is good
Results can be ambiguous:
–An apparent low-cloud underestimate could be a real error, or could be due to the high cloud above being too thick

Testing the model skill from space. Clearly we need to apply SEDS to cloud estimated from lidar and radar!
–Lowest skill: tropical boundary-layer clouds
–Tropical skill appears to peak at mid-levels, but cloud is very infrequent there (an unreliable region)
–Highest skill in the northern mid-latitude and polar upper troposphere
–Is some of the reduction of skill at low levels because of lidar attenuation?
Wilkinson, Hogan, Illingworth and Benedetti (MWR 2008)

CCPP project. The US Dept of Energy Climate Change Prediction Program recently funded a 5-year consortium project centred at Brookhaven, NY:
–Implement an updated Cloudnet processing system at Atmospheric Radiation Measurement (ARM) radar-lidar sites worldwide
–Ingests ARM's cloud boundary diagnosis, but uses Cloudnet for statistics
–New diagnostics being tested
Testing of NWP models:
–NCEP, ECMWF, Met Office, Meteo-France...
–Over a decade of data at several sites: have cloud forecasts improved over this time?
Single-column model testbed:
–SCM versions of many GCMs will be run over ARM sites by Roel Neggers
–Different parameterization schemes tested
–Verification measures can be used to judge improvements

US Southern Great Plains 2004

Winter 2004

Summer 2004

Summary and outlook
Model comparisons reveal:
–The half-life of a cloud forecast is between 2.5 and 4 days, much less than the ~9 days for the ECMWF 500-hPa geopotential height forecast
–In Europe, higher skill for mid-level cloud and lower for boundary-layer cloud, but a larger seasonal contrast in the Southern US
Findings applicable to other verification problems:
–The Symmetric Extreme Dependency Score is a reliable measure of skill for both common and rare events (given a large enough sample)
–Many measures regarded as equitable are only so for very large samples, including the Equitable Threat Score, but they can be rescaled
Future work (in addition to CCPP):
–CloudSat & CALIPSO: what is the skill of cloud forecasts globally?
–What is the half-life of ECMWF cloud forecasts? (Need more data!)
–Near-real-time evaluation for rapid feedback to NWP centres?
–Dept of Meteorology Lunchtime Seminar, 1pm Tuesday 3rd Nov: "Faster and more accurate representation of clouds and gases in GCM radiation schemes"

Monthly skill versus time. A measure of the skill of forecasting cloud fraction > 0.05:
–Comparing models using similar forecast lead times
–Compared with the persistence forecast (yesterday's measurements)
Lower skill in summer convective events.

Statistics from the AMF. Murgtal, Germany, 2007:
–140-day comparison with the Met Office 12-km model
Dataset released to the COPS community:
–Includes the German DWD model at multiple resolutions and forecast lead times

Possible skill scores
Contingency table:

                     Observed cloud   Observed clear-sky
Modelled cloud       a (hit)          b (false alarm)
Modelled clear-sky   c (miss)         d (correct negative)

DWD model: a = 7194, b = 4098, c = 4502, d = ...
Perfect forecast: a_p = ..., b_p = 0, c_p = 0, d_p = ...
Random forecast: a_r = 2581, b_r = 8711, c_r = 9115, d_r = ...

To ensure equitability and linearity, we can use the concept of the generalized skill score, (x - x_random) / (x_perfect - x_random):
–where x is any number derived from the joint PDF
–The resulting scores vary linearly from random = 0 to perfect = 1
The simplest example, the Heidke skill score (HSS), uses x = a + d
–We will use this as a reference to test other scores
The Brier skill score uses x = mean squared cloud-fraction difference; the Linear Brier skill score (LBSS) uses x = mean absolute difference
–Sensitive to errors in the model for all values of cloud fraction
Cloud is deemed to occur when the cloud fraction f is larger than some threshold f_thresh.
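The generalized skill score is one line of code, and HSS drops out by setting x = a + d. A sketch (d is illustrative, since its value is truncated in the transcript):

```python
def generalized_skill(x, x_random, x_perfect):
    """(x - x_random) / (x_perfect - x_random): random = 0, perfect = 1."""
    return (x - x_random) / (x_perfect - x_random)

a, b, c, d = 7194, 4098, 4502, 35000   # d illustrative
n = a + b + c + d
x_random = ((a + b) * (a + c) + (c + d) * (b + d)) / n  # E(a + d), random
print("HSS =", generalized_skill(a + d, x_random, n))   # x_perfect = n
```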

Alternative approach. How valid is it to estimate 3D cloud fraction from a 2D slice?
–Henderson and Pincus (2009) imply that it is reasonable, although presumably not in convective conditions
Alternative: treat cloud fraction as a probability forecast
–Each time the model forecasts a particular cloud fraction, calculate the fraction of time that cloud was observed instantaneously over the site
–This leads to a reliability diagram: Jakob et al. (2004)
(Figure: reliability diagram with "perfect", "no skill" and "no resolution" lines.)
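Treating cloud fraction as a probability forecast needs only a binning of forecast values against observed occurrence. A sketch of the two curves of a reliability diagram (names illustrative; empty bins yield NaN):

```python
import numpy as np

def reliability_curve(fcst_cf, obs_cloud, n_bins=10):
    """Mean forecast cloud fraction and observed cloud frequency per bin:
    the x and y axes of a reliability diagram."""
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    idx = np.clip(np.digitize(fcst_cf, edges) - 1, 0, n_bins - 1)
    mean_fcst = np.array([fcst_cf[idx == i].mean() for i in range(n_bins)])
    obs_freq = np.array([obs_cloud[idx == i].mean() for i in range(n_bins)])
    return mean_fcst, obs_freq
```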

Simulating the lidar backscatter:
–Create subcolumns with maximum-random overlap
–Forward-model the lidar backscatter from the ECMWF water content and particle size
–Remove signals below the lidar sensitivity
(Panels: ECMWF raw cloud fraction; ECMWF cloud fraction after processing; ICESat cloud fraction.)
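A sketch of the subcolumn step, assuming the usual maximum-random rule: cloud in adjacent cloudy layers is maximally overlapped, while cloud blocks separated by an entirely clear layer are randomly overlapped. This is a generic generator, not the code used for the paper:

```python
import numpy as np

rng = np.random.default_rng(3)

def subcolumns_max_rand(cf, n_sub=10_000):
    """Binary cloud subcolumns from profile cloud fractions (top to bottom)."""
    cloudy = np.zeros((len(cf), n_sub), dtype=bool)
    rank = rng.random(n_sub)
    for k, f in enumerate(cf):
        if k > 0 and cf[k - 1] == 0.0:
            rank = rng.random(n_sub)   # clear layer above: new random ranks
        cloudy[k] = rank > 1.0 - f     # shared ranks -> maximum overlap
    return cloudy

cols = subcolumns_max_rand(np.array([0.0, 0.3, 0.5, 0.0, 0.2]))
print("layer cloud fractions reproduced:", cols.mean(axis=1).round(2))
```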

Testing the model climatology
–The reduction in the model is due to lidar attenuation
–Error due to the uncertain extinction-to-backscatter ratio
