Automated detection errors validation

Is your project feasible? Non-automated analysis is too costly and too inconsistent, so we need automation, with the level of automated false positives << the difference being tested:
- If you are comparing use of sites, the error level needs to be smaller than the larger of the sampling error or the differences found.
- If you want to detect a population trend of 5% in 1 year, the error level needs to be less than, say, 1%.
The aim: validation becomes quality control, i.e. 'meets the standard', rather than editing the data. Errors are not removed.
A pilot study may be needed in difficult places. This was done by Jay Barlow et al. for the Vaquita monitoring, and a power analysis was then used to produce a spatial design. The performance prediction proved accurate.
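A minimal sketch of the feasibility criterion above, in Python. The 0.2 margin (error level at most one fifth of the effect size) is an assumed rule of thumb chosen for illustration, not a published threshold:

    def is_feasible(false_positive_rate: float, effect_size: float,
                    margin: float = 0.2) -> bool:
        """True if the automated false-positive rate is small enough
        relative to the effect (site difference or trend) being tested.
        `margin` is an assumed rule of thumb, not a published value."""
        return false_positive_rate < margin * effect_size

    # From the slide: to detect a 5% annual trend, aim for ~1% error or less.
    print(is_feasible(false_positive_rate=0.01, effect_size=0.05))  # True
    print(is_feasible(false_positive_rate=0.04, effect_size=0.05))  # False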

What kind of thing is train detection? Equations: always published.
1. Pure maths: exact, theoretical.
2. Applied maths: introduces approximations.
3. Statistics: deals with extrapolation from samples; introduces sampling error.
4. Multivariate and clustering methods: introduce classifiers.
5. Pattern recognition: handles 'polluted' data, often composed mainly of data that do not belong to the focal subject(s) ...face recognition, train detection, etc.
6. Intelligent recognition: uses a wide and adaptable conceptual structure of classifiers.
Computer code: rarely published; 'black boxes' by virtue of complexity and unknown input.
Machine learning, neural networks, and 'hand-crafted' algorithms are generally in (5). Humans, as detectors, are a long way ahead at (6).

Human visual validation can improve on the KERNO classifiers ... because a human observer takes a wider temporal view of the data and recognises patterns such as:
- recurring inter-click intervals from boat sonar;
- 'chance trains' resembling ambient noise;
- unusual click rate profiles with no 'normal' porpoise data (weak unknown train sources).
Improving automated rejection of these errors often comes at a cost: wrongly excluding some true positives. A simple regularity check on inter-click intervals is sketched below.
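The first pattern above, the suspiciously regular inter-click intervals of boat sonar, is easy to illustrate in code. This sketch flags a train whose intervals vary too little; the 5% coefficient-of-variation cutoff and the function name are assumptions for illustration, not KERNO logic:

    import numpy as np

    def sonar_like(click_times_s: np.ndarray, cv_threshold: float = 0.05) -> bool:
        """Flag a click train as sonar-like when its inter-click intervals
        are nearly constant. Mechanical pingers repeat on a fixed cycle;
        echolocating animals vary their click rate far more."""
        ici = np.diff(np.sort(click_times_s))   # inter-click intervals
        if ici.size < 5:                        # too short to judge
            return False
        cv = ici.std() / ici.mean()             # coefficient of variation
        return cv < cv_threshold

    print(sonar_like(np.arange(0.0, 10.0, 0.5)))              # periodic pings -> True
    rng = np.random.default_rng(0)
    print(sonar_like(np.cumsum(rng.uniform(0.02, 0.2, 50))))  # jittered train -> False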

Encounter classifiers: a third level of classifier.
1. Click classification
2. Train classification
3. Encounter classification
Encounter classifiers look at a wider span of data than the train classifier and see the character of a whole encounter. They can be designed for a specific task in a specific location: e.g. the Hel1 classifier was developed for the Baltic Sea, with the help of a large bowl of strawberries. Three new sources of false positives were found: mini-bursts, chink spikes, and replays. It has been validated on the whole, huge, SAMBAH data set.
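As a toy illustration of what a third-level classifier looks at, the sketch below aggregates train-level features over a whole encounter. The feature names and thresholds are invented for illustration and bear no relation to the real Hel1 classifier:

    from dataclasses import dataclass

    @dataclass
    class Train:
        n_clicks: int
        mean_ici_s: float
        peak_khz: float

    def classify_encounter(trains: list[Train]) -> str:
        """Toy encounter classifier: judges the character of the whole
        encounter rather than any single train. Thresholds are invented."""
        if not trains:
            return "no encounter"
        mean_peak = sum(t.peak_khz for t in trains) / len(trains)
        # NBHF clicks (porpoises) cluster around ~130 kHz
        if mean_peak > 100 and len(trains) >= 3:
            return "NBHF encounter"
        return "unclassed"

    enc = [Train(40, 0.05, 131.0), Train(25, 0.06, 128.5), Train(60, 0.04, 133.2)]
    print(classify_encounter(enc))  # NBHF encounter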

Test case: SAMBAH, a very low density population. 1,343 data files:
- 61% had no detections
- 15% had 1 to 20 DPM (detection positive minutes)
False positives found:
- 76 boat sonars
- 65 playbacks
- 10 WUTS
- 5 minibursts or chink spikes
- 2 porpoises!
157 of 176,000 DPM were false = 0.1% = < 1 error / year.

SAMBAH: should we remove the errors? ... Not needed for population estimation, as the false-positive rate is far below other sources of error; but a few per cent of files with low positive rates had only false positives, and removing those improves the presence/absence distribution map. A sketch of that cleanup follows.
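A minimal sketch of that cleanup, assuming a per-file summary table with invented column names ('dpm' for detection positive minutes, 'false_dpm' for those validated as false), not an actual C-POD export format:

    import pandas as pd

    files = pd.DataFrame({
        "file": ["a", "b", "c"],
        "dpm": [120, 3, 0],       # detection positive minutes
        "false_dpm": [1, 3, 0],   # of which validated as false positives
    })

    # Drop files whose positive minutes were all false before mapping
    # presence/absence.
    all_false = (files["dpm"] > 0) & (files["dpm"] == files["false_dpm"])
    presence = files.loc[~all_false].assign(present=lambda d: d["dpm"] > 0)
    print(presence[["file", "present"]])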

Q classes: Hi, Mod, Lo, ?. These represent the algorithm's confidence that a train came from a real train source (cetacean, boat sonar, or WUTS) and was not just a 'chance train'. Normally we use Q classes Hi and Mod.
Species classes: NBHF, other cet, sonar, unclassed.
Q and species are more or less independent.
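In practice this amounts to a simple filter on exported detections: keep Q classes Hi and Mod, then select the species class of interest. The column names here ('q_class', 'species') are assumed for a generic tabular export, not the exact C-POD field names:

    import pandas as pd

    det = pd.DataFrame({
        "q_class": ["Hi", "Mod", "Lo", "?", "Hi"],
        "species": ["NBHF", "NBHF", "sonar", "unclassed", "other cet"],
    })

    # Keep high- and moderate-confidence NBHF (porpoise) detections only.
    keep = det[det["q_class"].isin(["Hi", "Mod"]) & (det["species"] == "NBHF")]
    print(keep)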

Validating exercises

Workshop files and tasks:
1. CPOD1372_teste_da_gaiola (Guiana Dolphins and Franciscana; Renan Paitach). Good data. Task: should an encounter classifier be used?
2. Gulf of Alaska WUTS (WUTS; Kate Stafford). Bad classification. Task: this is how bad they get.
3. Kawda, Sarjekot 2016 05 10 (Humpback Dolphins and Finless Porpoise; Ketki Jog). Task: find some errors.
4. Kawda (Finless Porpoise). Bad data. Task: examples of errors.
5. Kawda. One species or two? Task: check species classification.
6. Some errors (sonars and Irrawaddy River Dolphins; Danielle Kreb). Task: find the first correct dolphin. This file is a compilation of errors, including a coal barge.