1
Intercalibration Option 3 results: what is acceptable and what is not?
Sandra Poikane
Joint Research Centre, Institute for Environment and Sustainability
2
Aims
Review current IC results
Set rules for future IC
Keep consistency between the GIGs, water categories, IC options…
A learning-by-doing exercise: nothing similar before!
3
Outline
Option 3 comparability indicators, or: how to evaluate performance?
Current Option 3 results
Thoughts for the future
4
Option 3
Already developed national assessment methods
Compare directly using a common data set
Evaluate differences in classification results: “checking whether there are major differences in the results”
5
Option 3 – direct comparison

Site     Member state A   Member state B
Site 1   Good             High
Site 2   High             High
Site 3   Moderate         Good
Site 4   Moderate         Moderate
Site 5   Good             Good

How to evaluate major differences?
6
Option 3 comparability indicators
% of agreement using 5 classes
% of agreement using 3 classes
Absolute average class difference
Average class difference
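For illustration only (not part of the original slides): a minimal Python sketch of how these four indicators could be computed for two methods classifying the same set of sites. The ordinal class coding (High = 1 … Bad = 5) and the function name are assumptions made for this example.

```python
# Illustrative sketch (assumed coding): WFD classes as ordinal numbers,
# lower number = better status.
CLASS_CODE = {"High": 1, "Good": 2, "Moderate": 3, "Poor": 4, "Bad": 5}

def to_3_class(code):
    # Collapse to High / Good / "Moderate or worse", so that only the
    # High-Good and Good-Moderate boundaries influence the agreement.
    return min(code, 3)

def comparability_indicators(method_a, method_b):
    """Compute the four Option 3 comparability indicators for paired classifications."""
    a = [CLASS_CODE[c] for c in method_a]
    b = [CLASS_CODE[c] for c in method_b]
    n = len(a)
    diffs = [x - y for x, y in zip(a, b)]  # signed class differences
    return {
        "agreement_5_class": sum(x == y for x, y in zip(a, b)) / n,
        "agreement_3_class": sum(to_3_class(x) == to_3_class(y)
                                 for x, y in zip(a, b)) / n,
        "abs_avg_class_diff": sum(abs(d) for d in diffs) / n,
        "avg_class_diff": sum(diffs) / n,  # captures systematic differences only
    }
```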
7
The simplest: % of agreement using 5 classes
All class boundaries are taken into account (not only the High/Good and Good/Moderate boundaries)
8
Focus on the High/Good and Good/Moderate boundaries: % of agreement using 3 classes
9
% of agreement using 3 classes: sensitive to how the data are distributed over the EQR scale
Many low-quality data points: high agreement
Few low-quality data points: low agreement
10
Average absolute class difference

System A    System B    Absolute class difference
Good        Good        0
High        Moderate    2
High        Good        1
Moderate    Good        1
High        High        0
Good        Good        0
High        High        0

Average absolute class difference: 0.57
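As a check of the worked example above (hypothetical code, using the ordinal coding High = 1, Good = 2, Moderate = 3): the absolute differences are 0, 2, 1, 1, 0, 0, 0, giving 4/7 ≈ 0.57.

```python
# The seven sites from the table, coded ordinally (High = 1, Good = 2, Moderate = 3).
system_a = [2, 1, 1, 3, 1, 2, 1]  # Good, High, High, Moderate, High, Good, High
system_b = [2, 3, 2, 2, 1, 2, 1]  # Good, Moderate, Good, Good, High, Good, High

abs_diffs = [abs(a - b) for a, b in zip(system_a, system_b)]  # [0, 2, 1, 1, 0, 0, 0]
print(sum(abs_diffs) / len(abs_diffs))  # 4 / 7 ≈ 0.57
```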
11
Average absolute class difference
This metric is very similar to the classification agreement, but it also takes into account the magnitude of the classification difference
12
Average class difference

System A    System B    Class difference
Good        Good        0
High        Moderate    -2
High        Good        -1
Moderate    Good        +1
High        High        0
Good        Good        0
High        High        0

Average class difference: -0.29
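The same example data give the signed result (again a hypothetical sketch): the differences are 0, -2, -1, +1, 0, 0, 0, so the average class difference is -2/7 ≈ -0.29, i.e. a small systematic tendency of System A to assign better classes than System B.

```python
# Signed class differences for the same seven sites; negative values mean
# System A assigns the better class, positive values mean System A is
# more precautionary (assigns the worse class).
system_a = [2, 1, 1, 3, 1, 2, 1]  # Good, High, High, Moderate, High, Good, High
system_b = [2, 3, 2, 2, 1, 2, 1]  # Good, Moderate, Good, Good, High, Good, High

diffs = [a - b for a, b in zip(system_a, system_b)]  # [0, -2, -1, 1, 0, 0, 0]
print(sum(diffs) / len(diffs))  # -2 / 7 ≈ -0.29
```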
13
Average class difference
This metric quantifies only the systematic differences between the methods
Is one system “worse” or “better” than the others?
14
Outline
Option 3 comparability indicators, or: how to evaluate performance?
Current Option 3 results
Thoughts for the future
15
Summary
October 2007: Lake/Coast GIGs used Option 3
Different approaches
Different criteria
Not possible to compare results
Request to perform calculations in the same way (automated Excel sheets by JRC)
16
Differences between GIGs in criteria for comparability
Data from (almost) all GIGs were re-analysed, calculating common comparability indicators
–This was not possible for Coastal Mediterranean angiosperms and Baltic macroalgae
This makes it possible to compare the GIGs better
17
Information was collected:
Coastal Baltic GIG: Benthic invertebrates (separately for 3 types)
Coastal North East Atlantic GIG: Benthic invertebrates
Coastal Mediterranean GIG: Macroalgae
Coastal Mediterranean GIG: Benthic invertebrates
Lake Alpine GIG (2 types separately)
Lake Central Baltic GIG: Macrophytes (2 types separately)
Lake Central Baltic GIG: Phytoplankton
18
GIG results: 5-class agreement
30–70 % agreement
19
GIG results: 5-class agreement
Random: 22 %
Maximum expected: 72 %
20
GIG results: 3-class agreement
Random: 45 %
Maximum expected: 85 %
50–85 % agreement
21
GIG results: absolute average class difference
50–85 % agreement
0.3–0.9 class difference
22
Average class difference
Expresses the difference of one method versus the other(s):
A positive value means that the method is more precautionary (gives a lower assessment class compared to the others)
A negative value means that the method is less precautionary (gives a higher assessment class compared to the others)
23
Which criteria?
Main criterion: the absolute average class difference
–Shows to what extent the Member States’ methods give different classification results
–The most comprehensive criterion
The other indicators serve as complementary criteria
24
Level of sufficient comparability?
The most important task is to set the “borderline”: which level of the criteria is acceptable and which is not
Difficult to set this on a purely objective basis…
It is proposed to use less than half a class as the criterion for sufficient comparability
–In line with what was acceptable for Option 2 results
25
GIG results: absolute average class difference
50–85 % agreement
0.3–0.9 class difference
26
Discussions – summary
Case 1: Methods are “similar” (approach, metrics, habitat, pressure, type, ecoregion, etc.)
We expect the same assessment results
Absolute average class difference or % agreement: no major differences
Case 2: Methods are “different” (approach, metrics, habitat, pressure, type, ecoregion, etc.)
We expect different assessment results
27
Case 1
1. Assessment methods are similar
Metrics similar
Pressure the same
Types well defined
2. Absolute average class difference < 0.5
% of agreement > 50 %
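A hypothetical sketch of how the Case 1 thresholds quoted above might be applied; the function name, and the choice of which agreement indicator is meant, are assumptions.

```python
def case1_sufficiently_comparable(abs_avg_class_diff, agreement):
    # Thresholds from the slide: absolute average class difference below
    # half a class and more than 50 % agreement between the methods.
    return abs_avg_class_diff < 0.5 and agreement > 0.5

# Example values (illustrative only): 0.39 class difference, 60 % agreement.
print(case1_sufficiently_comparable(0.39, 0.60))  # True
```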
28
Case 2
Assessment methods are different
Different aspects
Different habitats
Types broad
Ecological differences
Not possible to have the same results
Possible to minimise the difference
No systematic difference
29
More work to be done:
Comparison of Option 2/3
–Option 2 has hidden “pitfalls”
–Set clear rules for Option 2 and Option 3 – new Guidance document
What is a WFD-compliant method?
–Not a simple issue (e.g. how much expert judgement can be allowed?)
–Checking before the start of IC, common rules
Common BSP – before the IC
30
The last slide: key issues – the definition of rigorous reference conditions and boundary setting
Two sources of differences:
1) Differences because of ecological factors (climate, geology, water chemistry, etc.) – our lakes are different!
2) Interpretations of ‘reference conditions’ (the best we found, expert judgement, …)
The challenge is to differentiate between those differences in “national reference states” that reflect genuine ecological variability among Member States and those that reflect differences in classification approach
31
Thank you!
33
Random
Proposal: 0.5 class difference
Accepted: 0.39 class difference
Low agreement: 0.8 class difference
34
Causes of differences:
- Different metrics measure different aspects of the QE: if this is the case, Option 3 is not feasible
- Increased uncertainty in the assessment outcome (r² values in the regression between methods): if uncertainty is too high, Option 3 is not feasible
- Systematic differences between methods: most GIGs have worked to minimise this; a minor difference is allowable (harmonisation band)