Shaocai Yu , Brian Eder ++, Robin Dennis* ++, Shao-hang Chu, Stephen Schwartz* *Atmospheric Sciences Modeling Division, National Exposure Research.

Shaocai Yu *, Brian Eder* ++, Robin Dennis* ++, Shao-hang Chu**, Stephen Schwartz*** *Atmospheric Sciences Modeling Division, National Exposure Research Laboratory, ** Office of Air Quality Planning and Standards, U.S. EPA, NC 27711 *** Atmospheric Sciences Division, Brookhaven National Laboratory, Upton, NY 11973 ++ Air Resources Laboratory, National Oceanic and Atmospheric Administration, RTP, NC 27711 In this study, we propose new metrics to solve the symmetrical problem between overprediction and underprediction following the concept of factor. Theoretically, factor is defined as ratio of model prediction to observation if the model prediction is higher than the observation, whereas it is defined as ratio of observation to model prediction if the observation is higher than the model prediction. Following this concept, the mean normalized factor bias (M NFB ), mean normalized gross factor error (M NGFE ), normalized mean bias factor (N MBF ) and normalized mean error factor (N MEF ) are proposed and defined as follows: where M i and O i are values of model (prediction) and observation at time and/or location i, respectively, N is number of samples (by time and/or location). The values of M NFB, and N MBF are linear and not bounded (range from -  to +  ). Like M NB and M NGE, M NFB and M NGFE can have another general problem when some observation values (denomination) are trivially low and they can significantly influence the values of those metrics. N MBF and N MEF can avoid this problem because the sum of the observations is used to normalize the bias and error,. The above formulas of N MBF and N MEF can be rewritten for case as follows: N MBF and N MEF are actually the results of summaries of normalized bias (M NB ) and error (M NGE ) with the observational concentrations as a weighting function, respectively. N MBF and N MEF have both advantages of avoiding dominance by the low values of observations in normalization like N MB and N ME and maintaining adequate evaluation symmetry. The meanings of N MBF can be interpreted as follows: if N MBF  0, for example, N MBF =1.2, this means that the model overpredicts the observation by a factor of 2.2 (i.e., N MBF +1=1.2+1=2.2); if N MBF  0, for example, N MBF =-0.2, this means that the model underpredicts the observation by a factor of 1.2 (i.e., N MBF -1=-0.2-1=-1.2). Table 3 and Figures 3 and 4: (1) For weekly data from CASTNet, both N MBF (0.03 to 0.08) and N MEF (0.24 to 0.27) for SO 4 2- are lower than the performance criteria. (2) For 24-hour data from IMPROVE, SEARCH and STN, both N MBF (-0.19 to 0.22) and N MEF (0.42 to 0.46) for SO 4 2- are slightly higher than the performance criteria. The model performed better on the weekly data of CASTNet and better over the eastern region than western region for SO 4 2-. This is consistent with the results of Dennis et al. [1993]. (3) For PM 2.5 NO 3 -, both N MBF (-0.96 to 0.59) and N MEF (0.80 to 1.70) for SEARCH, CASTNet, and IMPROVE data are larger than the performance criteria. Although the N MBF (0.01) for STN data in 2002 is very small, its N MEF (0.77) is still larger than the performance criteria. (4) CMAQ performed well over the Northeastern region but overpredicted most of observed NO 3 - over the mid-Atlantic region by more than a factor of 2 and underpredicted most of observed NO 3 - over the west region and southeast by more than a factor of 2. Both N MBF and N MEF of winter of 2002 are smaller than those of summer of 1999. This is because the NO 3 - concentrations in winter of 2002 were 7 times higher than those in summer of 1999 although M B values of winter of 2002 were higher than those of summer of 1999. (5) One of major reasons for poor performance of model on PM 2.5 NO 3 - is that the PM 2.5 NO 3 - concentrations are very low and are very sensitive to the errors in PM 2.5 SO 4 2- and NH 4 +, and temperature and RH when thermodynamic model (such as ISSORROPIA model here) partitions total nitrate (i.e., aerosol NO 3 - +gas HNO 3 ) between the gas and aerosol phases. 4. Summary and conclusions A set of new statistical quantitative metrics on the basis of concept of factor has been proposed that can provide a rational operational evaluation of air quality model for the relative difference between modeled and observed results. The new metrics (i.e., N MBF, N MEF ) have advantages of both avoiding dominance by the low values of observations and maintaining adequate evaluation symmetry. The application of these new quantitative metrics in operational evaluation of CMAQ performance on PM 2.5 SO 4 2- and NO 3 - shows that the new quantitative metrics are useful and their meanings are also very clear and easy to explain. Figure 1: a dataset for a real case of model and observation for aerosol NO 3 - was separated into four regions, (1) region 1 for model/observation  0.5, (2) region 2 for 0.5  model/observation  1.0, (3) region 3 for 1.0<model/observation  2.0, (4) region 4 for 2.0  model/observation. Results in Table 2: (1) for the only data in region 1 with model/observation  0.5, M NB, N MB, F B, N MFB, M NFB and N MBF are – 0.82, -0.78, -1.43, -1.28, -36.67, and – 3.58, respectively. Obviously, only normalized mean bias factor (N MBF ) gives reasonable description of model performance, i.e., the model underpredicted the observations by a factor of 4.58 in this case. (2) For the only data in region 4 with model/observation>2, M NB, N MB, F B, N MFB, M NFB and N MBF are 4.27, 2.25, 1.12, 1.06, 4.27 and 2.25, respectively. The results of N MBF and N MB reasonably indicate that the model overpredicted the observations by a factor of 3.25. (3) For the results of each metrics on combination case of regions 1 and 4 data, M NB, N MB, F B, N MFB, M NFB and N MBF are 1.50, 0.06, -0.27, 0.06, - 18.02 and 0.06, respectively. Both N MB and N MBF show that the model slightly overpredicted the observations by a factor of 1.06, while F B (-0.27) shows that the model underpredicted the observations. This shows that the value of F B can sometimes result in a misleading conclusion as well. This specific case shows that it is not wise to use F B as an evaluation metric. N ME and N MEF : gross error between observations and model results is 1.19 times of mean observation. The good model performance can be concluded only under the condition that both relative bias (N MBF ) and relative gross error (N MEF ) meet the certain performance standards. (4) For the all data in Figure 1 (combination case 1+2+3+4 in Table 2), M NB, N MB, F B, N MFB, M NFB and N MBF are 0.96, 0.09, -0.13, 0.09, -10.75 and 0.09, respectively. Both N MB and N MBF show that the mean model only overpredicted the mean observation by a factor of 1.09. However, the gross error (N MGE ) between the model and observation is 0.77 times as high as observation. See Figure 1. On the basis of the above analyses and test, it can be concluded that our proposed new statistical metrics (i.e., N MBF and N MEF ) on the basis of concept of factor can show the model performance reasonably. These new metrics use observational data as only reference for the model evaluation and their meanings are also very clear and easy to explain. 3. Applications of new metrics over the US CMAQ model on PM 2.5 SO 4 2- and NO 3 - over the US: 6/15 to 7/17, 1999 and 1/8 to 2/18, 2002. PM 2.5 SO 4 2- and NO 3 - observational data: (1) at 61 rural sites from IMPROVE. Two 24-hour samples are collected on quartz filters each week, on Wednesday and Saturday, beginning at midnight local time. (2) at 73 rural sites from CASNet. Weekly (Tuesday to Tuesday) samples are collected on Teflon filters. (3) at 8 sites from SEARCH (Southeastern Aerosol Research and Characterization project). Daily samples are collected on quartz filters. (4) at 153 urban sites from STN (Speciated Trends Network). 24-hour samples are usually taken once every six days. The recommended performance criteria for O 3 by US EPA (1991): normalized bias (M NB )  5 to  15%; normalized gross error (M NGE ) 30% to 35%. +15% of M NB corresponds to +15% of N MBF while – 15% of M NB corresponds to – 18% of N MBF. New unbiased symmetric metrics for evaluation of the air quality model 1. INTRODUCTION Model performance evaluation is a problem of longstanding concern, and has two important objectives, i.e., (1) determine a model’s degree of acceptability and usefulness for a specific task, and (2) establish that one is getting good results for the right reason (Russell and Dennis, 2000). These objectives are accomplished by two different types of model evaluation, i.e., operational evaluation and diagnostic evaluation (or model physics evaluation) (Fox, 1981; EPA, 1991; Weil et al., 1992; Russell and Dennis, 2000), respectively. Although the operational evaluations for different air quality models have been intensively performed for regulatory purposes in the past years, the resulting array of statistical metrics are so diverse and numerous that it is difficult to judge the overall performance of the models (EPA, 1991; Seigneur et al., 2000; Yu et al., 2003). Some statistical metrics can cause misleading conclusions about the model performance (Seigneur et al., 2000). In this paper, a new set of quantitative metrics for the operational evaluation is proposed and applied in real evaluation cases. The other quantitative metrics frequently used in the operational evaluation are tested and their shortcomings are examined. 2. Quantitative metrics related to the operational evaluation and their examinations Table 1: traditional quantitative metrics,mathematical expressions used in the operational evaluation. Differences between the model and observations: mean bias (M B ) and mean absolute gross error (M AGE ) (or root mean square error). Relative differences between the model and observations: The traditional metrics (such as M NB, M NGE, N MB,and N ME ) two problems that may mislead conclusions with this approach: (1) the values of M NB and N MB can grow disproportionately for overpredictions and underpredicitons because both values of M NB and N MB are bounded by –100% for underprediction; (2) the values of M NB and M NGE can be significantly influenced by some points with trivially low values of observations (denomination). To solve this asymmetric evaluation problem between overpredictions and underpredictions,  fractional bias (F B ) and fractional gross error (F GE ) (Irwin and Smith, 1984; Kasibhatla et al., 1997; Seigneur et al., 2000).  Problems for F B and F GE: (1)what the metrics F B and F GE measure is not clear because the model prediction is not evaluated against observation but average of observation and model prediction, which is contradict to the traditional definition of evaluation. (2)the scales of F B and F GE are not linear and are seriously compressed beyond  1 as F B and F GE are bounded by  2 and +2, respectively. Acknowledgements The authors wish to thank other members at ASMD of EPA for their contributions to the 2002 release version of EPA Models-3/CMAQ during the development and evaluation. This work has been subjected to US Environmental Protection Agency peer review and approved for publication. Mention of trade names or commercial products does not constitute endorsement or recommendation for use. REFERENCES Cox, W.M., Tikvart, J.A., 1990. Atmospheric Environment 24, 2387-2395. Eder, B., Shaocai, Yu, etc., 2003. Atmospheric Environment (in preparation). EPA, 1991. Guideline for regulatory application of the urban airshed model. USEPA Report No. EPA-450/4-91-013. U.S. EPA, Office of Air Quality Planning and Standards, Research Triangle Park, North Carolina. Dennis, R.L., J.N. McHenry, W.R.Barchet, F.S. Binkowski, and D.W. Byun, Atmos. Environ., 26 A(6), 975-997, 1993. Fox, D.G., 1981. Bulletin American Meteorological Society 62, 599-609. Irwin, J., and M. Smith, 1984, Bulletin American Meteorological Society 65, 559-568 Russell, A., and R. Dennis, Atmos. Environ., 34, 2283-2324, 2000. Seigneur, C., et al.,., 2000. Journal of the Air & Waste Management Association 50, 588-599. Weil, J.C., R.I. Sykes, and A. Venkatram, 1992, Journal of Applied Meteorology, 31, 1121-1145. Yu, S.C., Kasibhatla, P.S., Wright, D.L., Schwartz, S.E., McGraw, R., Deng, A., 2003.Journal of Geophysical Research 108(D12), 4353, doi:10.1029/2002JD002890.

Shaocai Yu , Brian Eder ++, Robin Dennis* ++, Shao-hang Chu, Stephen Schwartz* *Atmospheric Sciences Modeling Division, National Exposure Research.

Similar presentations

Presentation on theme: "Shaocai Yu , Brian Eder ++, Robin Dennis* ++, Shao-hang Chu, Stephen Schwartz* *Atmospheric Sciences Modeling Division, National Exposure Research."— Presentation transcript:

Similar presentations

About project

Feedback

Log in

Auth with social network:

Shaocai Yu *, Brian Eder* ++, Robin Dennis* ++, Shao-hang Chu**, Stephen Schwartz*** *Atmospheric Sciences Modeling Division, National Exposure Research.

Similar presentations

Presentation on theme: "Shaocai Yu *, Brian Eder* ++, Robin Dennis* ++, Shao-hang Chu**, Stephen Schwartz*** *Atmospheric Sciences Modeling Division, National Exposure Research."— Presentation transcript:

Similar presentations

About project

Feedback

Shaocai Yu , Brian Eder ++, Robin Dennis* ++, Shao-hang Chu, Stephen Schwartz* *Atmospheric Sciences Modeling Division, National Exposure Research.

Presentation on theme: "Shaocai Yu , Brian Eder ++, Robin Dennis* ++, Shao-hang Chu, Stephen Schwartz* *Atmospheric Sciences Modeling Division, National Exposure Research."— Presentation transcript: