Confidence in classification
Paper 3: Technical guidance on achieving adequate confidence in classification
CIS Working Group 2A ECOSTAT, 1 July 2003
the Directive requires us to achieve an adequate level of confidence and to report this ... Annex V, Section 1.3 and Section 1.3.4
consequence ... we need an estimate of the error in the values of metrics used to classify ... e.g. value (plus or minus 15%)
many quality elements ?
QE 1 QE 2 QE 3 QE 4 QE 5 QE 6 QE 7 QE 8 etc
[Diagram: classification decision tree over QE 1, QE 2, QE 3 ... QE 8 etc]
No QE is worse than High-Good limit: true = high; false = go to the next test
No QE is worse than Good-Mod limit: true = good; false = go to the next test
etc for mod and poor; otherwise bad
one-out all-out
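The one-out, all-out rule can be sketched in a few lines: the overall class is simply the worst class reported by any quality element. This is a minimal illustration, not the guidance's own implementation; the function name `one_out_all_out` and the example site are hypothetical.

```python
# Class order from best to worst, as on the slides.
CLASSES = ["high", "good", "mod", "poor", "bad"]


def one_out_all_out(qe_classes):
    """Return the worst class found among the quality elements.

    qe_classes maps a QE name to one of the names in CLASSES.
    """
    if not qe_classes:
        raise ValueError("at least one quality element is required")
    # The class with the largest index in CLASSES is the worst one.
    return max(qe_classes.values(), key=CLASSES.index)


# Hypothetical site: names and classes are illustrative only.
site = {"QE 1": "high", "QE 2": "good", "QE 3": "mod", "QE 4": "high"}
print(one_out_all_out(site))
```

A single QE in "mod" is enough to put the whole site in "mod", whatever the other QEs report.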
Single element approach: one metric per quality element (metric 1 for QE 1, metric 2 for QE 2, ... metric 8 for QE 8, etc)
Multi-metric approach: several metrics per quality element (metric 1 to metric 10 grouped under the QEs)
[Diagram: the same decision tree (high, good, mod, poor, bad; tests against the High-Good and Good-Mod limits), with metric 1 to metric 10 feeding the quality elements]
effect of error from monitoring on these models
mean of 12 samples plus or minus 50%
number of taxa: 12 (11 - 15)
principles apply to all metrics
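A statement such as "mean plus or minus 50%" can be produced from the samples themselves. The sketch below assumes independent samples and uses a normal approximation for the confidence interval of the mean. With only 12 samples a Student's t interval would be somewhat wider, so treat the result as indicative. The function name `mean_with_error` and the sample values are hypothetical.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev


def mean_with_error(samples, confidence=0.95):
    """Return (mean, half_width, half_width as % of the mean).

    Normal approximation; assumes independent samples and a non-zero mean.
    """
    n = len(samples)
    m = mean(samples)
    std_error = stdev(samples) / sqrt(n)
    # Two-sided interval: e.g. 95% confidence uses the 97.5% quantile.
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    half_width = z * std_error
    return m, half_width, 100 * half_width / m


# Hypothetical metric values from 12 samples.
samples = [8, 12, 8, 12, 8, 12, 8, 12, 8, 12, 8, 12]
m, half_width, percent = mean_with_error(samples)
print(f"mean {m:.1f} plus or minus {half_width:.2f} ({percent:.0f}%)")
```

The same calculation applies to any metric for which repeated samples are available, which is the sense in which the principles apply to all metrics.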
leads to mis-classification 20% per QE
~ 20% of sites: a site that is truly good is put wrongly into high, or into mod, poor or bad
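How a mis-classification rate of about 20% can arise from sampling error is easy to illustrate. The sketch assumes the observed metric value is normally distributed around the true value; the class limits, the true value and the standard deviation are hypothetical numbers chosen only to give a risk close to 20%.

```python
from statistics import NormalDist


def misclassification_risk(true_value, sd, lower, upper):
    """Return (P(observed below lower), P(observed above upper)).

    Assumes a normal error with standard deviation sd around true_value.
    """
    observed = NormalDist(mu=true_value, sigma=sd)
    p_too_low = observed.cdf(lower)
    p_too_high = 1 - observed.cdf(upper)
    return p_too_low, p_too_high


# Hypothetical "good" class spanning 0.6 to 0.8, site truly at 0.7.
p_worse, p_better = misclassification_risk(0.7, sd=0.078, lower=0.6, upper=0.8)
print(f"wrongly worse than good:  {p_worse:.0%}")
print(f"wrongly better than good: {p_better:.0%}")
print(f"total mis-classification: {p_worse + p_better:.0%}")
```

A site closer to a class limit, or a metric with a larger error, gives a higher risk than this mid-class example.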
wrong change of class 30%
wrong difference between biological and chemical class 30%
lots of quality elements each with 20% error one-out / all-out
[Chart: % of waters reported to fail and % that truly fail (10% true), against number of QE’s]
one-out / all-out ... is vulnerable to errors ...
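The vulnerability can be made concrete with a small calculation. The sketch assumes that each QE independently has the same chance of wrongly reporting a fail, and that truly failing waters are always reported as fail. Both are simplifying assumptions, and the function names are hypothetical. The 20% error rate and the 10% of truly failing waters are taken from the slides; whether all of the 20% acts in the fail direction is not stated, so `p_wrong_fail` is left as a parameter.

```python
def false_fail_rate(n_qes, p_wrong_fail):
    """Chance that at least one of n independent QEs wrongly reports a fail."""
    return 1 - (1 - p_wrong_fail) ** n_qes


def reported_fail_rate(n_qes, p_wrong_fail, true_fail=0.10):
    """Fraction of waters reported as fail under one-out, all-out.

    Simplification: every truly failing water is reported as fail.
    """
    return true_fail + (1 - true_fail) * false_fail_rate(n_qes, p_wrong_fail)


for n in (1, 2, 4, 8):
    rate = reported_fail_rate(n, p_wrong_fail=0.20)
    print(f"{n} QEs: {rate:.0%} of waters reported as fail")
```

Under these assumptions the reported failure rate climbs steadily with the number of QEs even though the true failure rate does not change.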
extremely high high
controls ... 1 averaging 2 significance test 3 exclude QE’s all needed
1 averaging: Multi-metric approach
[Diagram: decision tree (high, good, mod, poor, bad) with metric 1 to metric 10 averaged within quality elements; one build of the slide carries the annotation 8%]
[Chart: % of waters reported to fail and % that truly fail, against number of QE’s]
limits to averaging ...
averaging will reduce mis-classification: metrics grouped by pressure (hydrology, nutrient, organic enrichment), averaging within each group, one-out, all-out between groups
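Grouping by pressure can be sketched as follows: metrics are averaged within each pressure group, each group mean is turned into a class, and one-out, all-out is applied across the groups. The 0 to 1 scale, the class limits and the metric values are all hypothetical, as are the function names.

```python
from statistics import mean

# Hypothetical class limits on a 0-1 scale where higher is better.
BOUNDARIES = [(0.8, "high"), (0.6, "good"), (0.4, "mod"), (0.2, "poor")]
ORDER = ["high", "good", "mod", "poor", "bad"]


def to_class(score):
    """Map a score to the first class whose lower limit it reaches."""
    for limit, name in BOUNDARIES:
        if score >= limit:
            return name
    return "bad"


def classify_by_pressure(groups):
    """Average within each pressure group, then take the worst group class."""
    group_classes = {p: to_class(mean(vals)) for p, vals in groups.items()}
    overall = max(group_classes.values(), key=ORDER.index)
    return overall, group_classes


# Hypothetical metric scores grouped by the pressure they respond to.
groups = {
    "hydrology": [0.85, 0.75, 0.82],
    "nutrient": [0.65, 0.58, 0.70],
    "organic enrichment": [0.90, 0.88],
}
overall, per_group = classify_by_pressure(groups)
print(overall, per_group)
```

In this example the nutrient metric at 0.58 would be "mod" on its own, but the group mean is "good", so the single low value no longer decides the class.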
But averaging can hide impacts ... Sensitive metric
Composition and abundance undisturbed – no impacts 3 species of fish are present
abundance disturbed but composition unaffected: abundance has changed but 3 species are still present. Composition AND abundance must be no more than slightly changed for good status to be achieved.
2 significance test
[Diagram: decision tree over QE 1, QE 2, QE 3 ... QE 8 etc, now with a significance test]
No QE is significantly worse than High-Good limit: true = high; false = go to the next test
No QE is significantly worse than Good-Mod limit: true = good; false = go to the next test
etc for mod and poor; otherwise bad
[Chart: % of waters reported to fail and % that truly fail, against number of QE’s]
what is significant? at least 95% confidence? (for serious consequences)
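One way to read "significantly worse" is as a one-sided test: a QE counts as worse than a class limit only when we are at least 95% confident that its true value lies below that limit. The sketch below assumes a normal error around the measured value and a known standard error. The function name and the numbers are hypothetical, and the 95% level is the one the slide raises as a question rather than a fixed requirement.

```python
from statistics import NormalDist


def significantly_worse(value, std_error, limit, confidence=0.95):
    """True when confidence that the true value is below limit is high enough.

    Assumes a normal error with standard deviation std_error (> 0).
    """
    p_below = NormalDist(mu=value, sigma=std_error).cdf(limit)
    return p_below >= confidence


# Hypothetical Good-Mod limit of 0.6 on a 0-1 scale.
print(significantly_worse(0.55, std_error=0.05, limit=0.6))  # one standard error below
print(significantly_worse(0.50, std_error=0.05, limit=0.6))  # two standard errors below
```

With this rule a value only slightly below the limit does not trigger a downgrade, which is why the test needs an estimate of the error in each metric.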
consequence ... monitoring must produce an estimate of the error in the values of metrics used to classify ... e.g. value (plus or minus 15%)
monitoring where we cannot do this should not be used to classify ...
controls ... 1 averaging 2 significance test 3 exclude QE’s
Annex II, Section 1.3 Where it is not possible to establish reliable ... reference conditions for a quality element ... due to high ... natural variability ... then that element may be excluded ...
Annex V 1.3.2 Design of Operational Monitoring ... to assess the impact of ... pressure Member States shall monitor ... parameters indicative of the biological quality element, or elements, most sensitive to the pressures ... parameters indicative of the hydromorphological quality element most sensitive to the pressure
exclude QE if ... no reliable estimate of reference conditions; QE not sensitive to the pressures; pressure covered by other QEs
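The exclusion conditions can be written as a simple filter applied before classification. The sketch treats the three conditions as alternatives, so that any one of them excludes a QE. This is an assumption, since the slide does not say how they combine. The class, field and function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class QualityElement:
    name: str
    reliable_reference: bool         # reference conditions can be estimated reliably
    sensitive_to_pressure: bool      # QE responds to the pressures on the water body
    pressure_covered_by_other: bool  # another QE already covers the same pressure


def is_relevant(qe):
    """Keep a QE only if none of the exclusion conditions applies."""
    return (
        qe.reliable_reference
        and qe.sensitive_to_pressure
        and not qe.pressure_covered_by_other
    )


# Hypothetical quality elements for one water body.
elements = [
    QualityElement("QE 1", True, True, False),
    QualityElement("QE 2", False, True, False),  # no reliable reference
    QualityElement("QE 3", True, False, False),  # not sensitive
    QualityElement("QE 4", True, True, True),    # pressure covered elsewhere
]
relevant = [qe.name for qe in elements if is_relevant(qe)]
print(relevant)
```

Only the QEs that pass the filter would then go into the classification.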
Exclude Quality Elements
[Diagram: some of QE 1 ... QE 8 etc are excluded; the decision tree is applied only to the relevant QEs and their metrics]
No relevant QE is significantly worse than High-Good limit: true = high; false = go to the next test
No relevant QE is significantly worse than Good-Mod limit: true = good; false = go to the next test
etc for mod and poor; otherwise bad
[Chart: % of waters reported to fail and % that truly fail, against number of QE’s]
summary
[Chart: % of waters reported to fail and % that truly fail, against number of QE’s]
controls ... 1 averaging 2 significance test 3 exclude QE’s
Exclude Quality Elements: no relevant QE is significantly worse than High-Good limit = high; no relevant QE is significantly worse than Good-Mod limit = good; etc for mod, poor, bad
significant? at least 95% confidence? (for serious consequences)
we need an estimate of the error in the values of metrics used to classify ... e.g. value (plus or minus 15%)
[Chart: % of waters reported to fail and % that truly fail, against number of QE’s]