SEPA Bathing Waters Signage Calum McPhail Environmental Quality Unit manager Ruth Stidson Bathing Waters Signage Officer.

Presentation transcript:

Contents
• SEPA beach signage – overview and results
• Development of the SEPA Signage Prediction Tool
• Development of future modelling systems

Background on Bathing Waters
• Scotland has had problems of poor quality bathing water in some areas
• Combination of diffuse pollution, especially on the west coast, and CSO discharges
• For some sites meeting the potential new Directive will be challenging

Signage Overview
• SEPA makes a daily water quality prediction, relating to the EU standards for bathing water, at the 10 signage sites throughout the bathing season
• This is based on relevant environmental (mainly rainfall) events from the previous two days
• This information is then displayed at the beach via an electronic variable message sign, and on the web and phone line

Example of electronic beach sign (at Prestwick) and alternative sign face legends

SEPA Bathing Waters Signage
• Scottish Executive initiated & funded
• Run as a project in 2003 & 2004
• Now in place at 10 beaches
[Map legend: EC Bathing Water with signage; EC Bathing Water]

Based on 683 samples

Signage validation results

Aberdeen Signage results 2004 & 2005 (excellent, good, or poor status predictions)

First year sign management
• Signs set to predict poor quality if:
  • 24 hr rain greater than 10 mm, or
  • 48 hr rain greater than 15 mm
• These were known as ‘decision trigger levels’
• This worked well but there was scope for improvement
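
The first-year rule can be written as a small function. This is a minimal sketch in Python, not the SEPA implementation: the function and parameter names are illustrative assumptions, and only the 10 mm and 15 mm thresholds come from the slide.

```python
def predicts_poor(rain_24h_mm: float, rain_48h_mm: float,
                  trigger_24h_mm: float = 10.0,
                  trigger_48h_mm: float = 15.0) -> bool:
    """Return True if either decision trigger level is exceeded.

    The defaults are the first-year trigger levels quoted on the slide.
    "Greater than" is read strictly, so rain exactly at a trigger level
    does not set the sign to poor.
    """
    return rain_24h_mm > trigger_24h_mm or rain_48h_mm > trigger_48h_mm
```

Keeping the trigger levels as parameters rather than constants makes it straightforward to try site-specific values later.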

Development of the SEPA Signage Prediction Tool
• Known relationship between rainfall and coliform levels
• SEPA archives for historical datasets (e.g. water quality results and environmental drivers)
• Understanding site response to inputs, or recent infrastructure improvements/schemes
• Predicting diffuse pollution:
  • rain events
  • run-off from fields
  • increased coliform levels

Further developing the relationship between rain & coliforms
• For each of the signage sites:
  • Relevant rain and river gauging stations were identified
  • At each rain gauge, 1 to 5 days' rain was correlated against faecal and non-faecal coliforms and faecal streptococci
  • Strongest relationships at each site were identified
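
The correlation step can be sketched as follows. This is a hedged illustration only: the slides do not say which correlation statistic was used, so plain Pearson correlation is an assumption here, and the function names and data layout are hypothetical.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences.

    Raises ZeroDivisionError if either sequence has zero variance.
    """
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def strongest_window(rain_by_window, indicator_counts):
    """Pick the rain accumulation window that correlates most strongly.

    rain_by_window maps a window length in days (1 to 5 in the slides)
    to the rain total for that window at each sample date, in the same
    order as indicator_counts (e.g. faecal coliform results).
    Returns (best_window_days, {window_days: correlation}).
    """
    scores = {days: pearson(totals, indicator_counts)
              for days, totals in rain_by_window.items()}
    best = max(scores, key=lambda days: abs(scores[days]))
    return best, scores
```

The same function would be called once per rain gauge and per indicator organism, and the strongest pairing kept for each site.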

Conversion into a useful tool
• Possible to use relationship to predict coliform levels on any given day
• Use this information to predict if the coliform levels will be in exceedance of EC guidelines
• Development of signage prediction tool to refine decision trigger levels

What does the tool do?
• Site specific
• Enables the testing of potential decision trigger levels against actual data from 2000/1 onwards
• Instantly allows the user to see the outcomes of trial decision trigger levels
• Allows the user to alter the coliform exceedance limits in anticipation of the new Directive
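
Testing a trial trigger level against past samples can be sketched as below. This is an illustrative Python sketch under stated assumptions, not the tool itself: the record layout, the function name and the outcome labels "correct", "precautionary" and "missed" are assumptions, and the coliform limit is passed in by the caller because the slides do not state a value.

```python
from collections import Counter


def evaluate_trigger(records, trigger_mm, coliform_limit):
    """Tally how a trial rain trigger would have performed historically.

    records: iterable of (rain_mm, coliform_count) pairs, one per sample.
    A prediction is 'poor' when rain_mm exceeds trigger_mm; a sample
    actually fails when its count exceeds coliform_limit.
    """
    outcomes = Counter()
    for rain_mm, coliform_count in records:
        predicted_poor = rain_mm > trigger_mm
        actually_poor = coliform_count > coliform_limit
        if predicted_poor == actually_poor:
            outcomes["correct"] += 1
        elif predicted_poor:
            outcomes["precautionary"] += 1  # sign said poor, sample passed
        else:
            outcomes["missed"] += 1         # sign said not poor, sample failed
    return outcomes
```

Calling this for a range of `trigger_mm` values, or with a stricter `coliform_limit`, mirrors the two kinds of trial the slide describes.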

SEPA Signage Prediction Tool
• Copy and paste in data set:
  • Coliform values (up to 3 types)
  • Rainfall (up to 4 gauges and 5 time periods)
  • River flow (up to 2 gauges)
• Years to include
• Microbiology values
• Start testing rain and river values
• Immediately see past results

Strengths of the current tool
• Very effective at predicting compliance against the mandatory standard in Scotland
  • 98% correct or precautionary in 2003 & 2004, 99% in 2005
• Simple to:
  • Use
  • Update
  • Apply to additional sites
• Transparent

Easy adaptations
• Very easy to adapt for other factors IF they can be considered as a single variable
• E.g. if sunshine is a major driver, a test can be added in the same way as for river flow
  • Input sunshine (e.g.) as hours
  • Use test such as ‘if sunshine < x hours predict poor’
  • Use tool to test different values of x
• Can use similar technique for wind, tide, telemetered CSO spills etc.

More challenging adaptations
• It is possible to consider combined factors
  • IF rain > 10 mm → poor
  • IF rain > 8 mm AND wind = onshore OR tide = incoming → poor
• Rapidly becomes more complex!
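
One source of the added complexity is that a mixed AND/OR rule can be read in two ways. The sketch below, in Python with hypothetical function names, writes out both readings of the second rule; the slide does not say which one was intended.

```python
def poor_reading_1(rain_mm, wind_onshore, tide_incoming):
    """(rain > 8 AND wind onshore) OR tide incoming.

    Under this reading an incoming tide alone predicts poor.
    """
    if rain_mm > 10:
        return True
    return (rain_mm > 8 and wind_onshore) or tide_incoming


def poor_reading_2(rain_mm, wind_onshore, tide_incoming):
    """rain > 8 AND (wind onshore OR tide incoming).

    Under this reading at least 8 mm of rain is always required.
    """
    if rain_mm > 10:
        return True
    return rain_mm > 8 and (wind_onshore or tide_incoming)
```

On a dry day with an incoming tide the two readings disagree, so any combined rule needs explicit brackets before it is coded.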

Bathing Water Future Models
Colin Gray, Data Analyst Modeller, SEPA
• Aim:
  • To try and improve current models
  • To develop models for future, more stringent EU directive
  • To utilise new developments and software within SEPA

Data Available to Models
• Rainfall data for relevant gauges per beach
• River flow data
• High tide times & sample times
• Weather
• Salinity
  • Cannot be used in predictions currently due to sampling methods
• Wind direction and speed
• Beach usage

Conclusions from Data Analysis
• No clear-cut splits between all fails and passes for current or future EU rules
  • Although more extreme levels of rainfall and river flow tend to be failures, there is a large amount of overlap at more moderate (normal) levels
• Similar results from several beaches
• Important factors are:
  • Rainfall over time periods
  • Total rain
  • River flow
  • High tide time
  • Salinity
• No trends seen in weather, wind speed or direction, beach usage or other miscellaneous data
• Very difficult to visualise multiple factor data and trends
  • E.g. if x is over this, and y is under this while z is this, then beach will fail
• IMiner and S-plus modelling techniques can assist

Software and Techniques
• SEPA Statistical and Modelling software
  • S-plus: ideal for data manipulation and graphing
  • Insightful Miner (IMiner): designed for producing work flows and modelling large amounts of data
  • Both are closely integrated
• Models Used
  • Scoring Method
  • Classification Trees (a.k.a. Decision Trees)
  • Neural Networks
  • Classification Regression
  • Naïve Bayes

General Principles
• Performance will be very dependent on true trends being present
• Lack of failure data can lead to models using incorrect assumptions
• Computers know nothing of science!
• All models need human validation and adjustment to ensure they make sensible assumptions and relationships

Decision trees
• Uses method called RPART (recursive partitioning)
• Builds branches which represent relationships between factors
• Helps highlight key factors affecting a bathing water
• Easy to interpret and adjust
• Very fast to generate and utilise
• Widely used in other industries e.g. pharmaceutical
• Easy to implement as an everyday prediction tool
• Well suited to bathing water predictions
• Uses predefined conditions to determine prediction

Decision Trees: Irvine, current EU Directive

Performance for Irvine
• For Current EU directive:
  • Perfect prediction
  • Tree is very simple and scientifically reasonable

Performance for Irvine
• For Future EU directive:
  • Tree becomes very complex

Performance for Irvine
• For Future EU directive, although the tree performs well:
  • By default there is no way of reflecting that it is preferable to predict a ‘Pass’ as a ‘Fail’ rather than vice versa
    • Have altered the method to allow for weighting
  • Some final splits in the tree are likely not to be based on the actual reason for failing
    • Splits will highlight differences in the data for results and may have no scientific relevance
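
The effect of weighting can be illustrated with a single split. The sketch below is not RPART and not the altered method referred to above: it is a one-variable threshold search in Python, with hypothetical names and cost values, that shows how charging more for a missed fail than for a false alarm moves the chosen split towards the precautionary side.

```python
def best_threshold(samples, cost_missed_fail, cost_false_alarm):
    """Find the rain threshold with the lowest total misclassification cost.

    samples: list of (rain_mm, failed) pairs, where failed is a bool.
    A sample is predicted 'fail' when rain_mm > threshold.
    Returns (threshold, total_cost); ties keep the lowest threshold.
    """
    values = sorted({rain for rain, _ in samples})
    # Candidate thresholds: one below every observation, then each observed value.
    candidates = [values[0] - 1.0] + values
    best = None
    for threshold in candidates:
        cost = 0.0
        for rain, failed in samples:
            predicted_fail = rain > threshold
            if failed and not predicted_fail:
                cost += cost_missed_fail    # predicted pass, actually failed
            elif predicted_fail and not failed:
                cost += cost_false_alarm    # predicted fail, actually passed
        if best is None or cost < best[1]:
            best = (threshold, cost)
    return best
```

With invented toy data such as `[(2, False), (5, True), (6, False), (8, False), (12, True), (15, True)]`, equal costs select a threshold of 8 mm, while a missed fail costing five times a false alarm selects 2 mm. A full tree applies the same idea at every split.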

Summary of Decision Trees
• Decision trees appear to provide a good method of modelling beaches
  • Easier to interpret and adjust than other methods
  • Better performance than scoring, neural networks or logistic regression
• However, careful manipulation of the weighting may be required
• Care needed to ensure final splits are scientifically valid
• Missing data needs to be handled in a standard way

Future Work
• Decision tree models
  • Derive models for all bathing waters using 2003 and 2004 data
  • Then use 2005 data to assess performance
• Ongoing project to assess usage of rain radar to improve predictions
• Potential network expansion to new sites

A View from the Sun - Nov 2004

Models Developed
• Scoring Method
  • Attempt to score factors
  • Add scores at the end and if over a certain number then it is predicted a fail
  • Very similar to current method but more flexible
  • Could improve predictions at Irvine but more difficult at Saltcoats
  • Very time consuming to develop and very ad hoc
• Neural Networks
  • Uses IMiner internal neural network method
  • Produces very complex relationships between factors
  • Can be very powerful and highly predictive
  • Is very dependent on quality of training data
  • Almost impossible to adjust or interpret
  • Is unlikely to perform well in new circumstances
• Logistic Regression
  • Produces an equation to represent the bathing water
  • Can weight the outcome of pass or fail
  • May over-generalise factors as it applies one coefficient to each
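
The scoring method described above can be sketched as follows. This is a minimal Python illustration, not the SEPA model: the factor names, thresholds and point values in `EXAMPLE_RULES` are invented for the example.

```python
def score_prediction(observations, rules, fail_score):
    """Score each factor, add the scores, and predict from the total.

    observations: dict mapping factor name to its observed value.
    rules: list of (factor_name, condition, points); points are added
           when condition(observed_value) is true.
    Predicts 'fail' when the total is over fail_score.
    """
    total = 0
    for factor, condition, points in rules:
        if condition(observations[factor]):
            total += points
    prediction = "fail" if total > fail_score else "pass"
    return prediction, total


# Hypothetical rules: names, thresholds and points are illustrative only.
EXAMPLE_RULES = [
    ("rain_24h_mm", lambda mm: mm > 10, 3),
    ("rain_48h_mm", lambda mm: mm > 15, 2),
    ("river_flow_m3s", lambda flow: flow > 5.0, 2),
]
```

Because every threshold and point value has to be chosen by hand for each beach, the sketch also shows why the slide calls the approach time consuming and ad hoc.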

Signage Roles 2003 – 2004
• Scottish Executive: initiated and funded the project as a pilot
• SEPA: determined the daily water quality predictions
  • SEPA to run beach signage
• Faber Maunsell: installing and maintaining the electronic signs and communication linkages
• Local Authorities, Clean Coast Scotland: public involvement and understanding of the signage project

BW Signage working: SEPA systems
[Diagram labels: website, phone line, phone text]

Incorporating into predictive tool
• If tide / wind / sunshine is to be incorporated into the tool, it needs to be a secondary consideration to rainfall
• Say statistical tests show that an onshore wind at Irvine significantly increases coliform concentrations
• If the trigger level for Irvine is set at 12 mm of rain within 24 hours, can code into Excel that:
  • IF rain is x % lower than the trigger level AND the wind is onshore, then OVERRIDE to POOR
  • IF rain is x % above the trigger level AND the wind is NOT onshore, then OVERRIDE to GOOD
• Can potentially code for tide, wind and sunshine for multiple triggers, however this does considerably increase the complexity of the tool
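
The override logic can be sketched in code. The slide describes an Excel formula; Python is used here only for illustration, and the function and parameter names are hypothetical. It also assumes that "x % lower" and "x % above" mean rainfall within x % of the trigger level on either side, which the slide does not spell out. The value of x is left to the caller.

```python
def predict_with_wind(rain_24h_mm, wind_onshore, band_pct, trigger_mm=12.0):
    """Rainfall-led prediction with a secondary wind override.

    trigger_mm defaults to the 12 mm example given for Irvine.
    band_pct is the 'x %' margin around the trigger level.
    """
    lower = trigger_mm * (1 - band_pct / 100)
    upper = trigger_mm * (1 + band_pct / 100)

    # Just under the trigger, but an onshore wind tips it to poor.
    if lower <= rain_24h_mm <= trigger_mm and wind_onshore:
        return "poor"
    # Just over the trigger, but no onshore wind, so relax to good.
    if trigger_mm < rain_24h_mm <= upper and not wind_onshore:
        return "good"
    # Outside the band, rainfall alone decides.
    return "poor" if rain_24h_mm > trigger_mm else "good"
```

Rainfall stays the primary driver: the wind only changes the outcome inside the narrow band around the trigger level.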