How to Describe Accuracy, and Why It Matters. Jon Proctor, PhotoTopo. GIS in the Rockies: October 10, 2013
Introduction: Accuracy. We think we understand it, but there are more details than you might expect. This discussion will help you understand what is needed to describe accuracy, and why it matters.
Horizontal Accuracy: Horizontal accuracy shall be tested by comparing the planimetric coordinates of well-defined points in the dataset with coordinates of the same points from an independent source of higher accuracy. (NSSDA) projects/accuracy/part3/chapter3
What to measure: Measure and record the x and y coordinates of the feature in the product. Ensure that the coordinates are in the same projection as the control. [Figure labels: product; control chip; GCP (control) coordinate; observed location]
Accuracy Spreadsheet
For each point, record the control ID and the observed (product) x and y coordinates.
List the control x and y coordinates.
Calculate the difference (observed - control) for x and y.
Plot the errors.
Calculate the radius.
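The spreadsheet steps above can be sketched in a few lines of Python. This is a minimal illustration, not the author's worksheet: the point IDs and coordinates are made up, and the function name `error_table` is hypothetical.

```python
import math

# Hypothetical check points: (control ID, observed x, observed y, control x, control y),
# all in metres and in the same projection. The values are made up for illustration.
check_points = [
    ("GCP01", 500003.0, 4400004.0, 500000.0, 4400000.0),
    ("GCP02", 501000.0, 4401006.0, 501000.0, 4401000.0),
    ("GCP03", 502995.0, 4402012.0, 503000.0, 4402000.0),
]

def error_table(points):
    """Return one row per point: (ID, dx, dy, radius), with d = observed - control."""
    rows = []
    for point_id, obs_x, obs_y, ctl_x, ctl_y in points:
        dx = obs_x - ctl_x
        dy = obs_y - ctl_y
        rows.append((point_id, dx, dy, math.hypot(dx, dy)))
    return rows

for point_id, dx, dy, radius in error_table(check_points):
    print(f"{point_id}: dx={dx:+.2f} m  dy={dy:+.2f} m  radius={radius:.2f} m")
```

The radius column is the straight-line distance between the observed and control locations, which is the quantity the radial statistics (RMSE r, CE90) are built on.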
Is this the accuracy? So far, we only know the errors of the points that were measured. We don't have the accuracy for the product. How do we describe the accuracy?
What is Positional Accuracy?
Positional accuracy is a statistical measure of a feature's location. It represents the probability that a feature is within a given distance of its true location: a probability and a distance, not a specific direction and not a specific offset.
We are not saying that the product is off by 4 meters to the East. Rather, we have 4 meters of uncertainty, and we need to describe that uncertainty.
This statistical description can be represented in numerous ways, such as CE90, CE95, RMSE, StDev, and many more.
What is needed to describe accuracy? We need 3 descriptors:
Measure: 4, 8.5, 10
Unit: meter, feet
Statistical description: RMSE r, StDev r, CE90
Definitions
RMSE vs StDev
RMSE r: measures the radial distance from the control (0,0) to each data point.
StDev r: measures the radial distance from the cluster center to each data point.
RMSE vs StDev: For an unbiased dataset, where the center of the scatter plot is near (0,0), the two measures are almost exactly the same: RMSE ≈ StDev.
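A small sketch can make the difference between the two measures concrete. It assumes the radial definitions given on the slides (distances from (0,0) for RMSE r, distances from the cluster center for StDev r) and, as a labeled assumption, divides by n rather than n - 1. The error values are made up.

```python
import math

def rmse_r(errors):
    """Radial RMSE: distances measured from the control location (0, 0)."""
    return math.sqrt(sum(dx * dx + dy * dy for dx, dy in errors) / len(errors))

def stdev_r(errors):
    """Radial standard deviation: distances measured from the cluster centre.

    Assumption: divides by n (population form); a sample estimate would use n - 1.
    """
    n = len(errors)
    mean_x = sum(dx for dx, _ in errors) / n
    mean_y = sum(dy for _, dy in errors) / n
    return math.sqrt(sum((dx - mean_x) ** 2 + (dy - mean_y) ** 2 for dx, dy in errors) / n)

# Made-up (dx, dy) errors in metres.
unbiased = [(3.0, 0.0), (-3.0, 0.0), (0.0, 4.0), (0.0, -4.0)]
biased = [(dx + 4.0, dy) for dx, dy in unbiased]   # same scatter, shifted 4 m east

for name, errors in (("unbiased", unbiased), ("biased", biased)):
    print(f"{name}: RMSEr={rmse_r(errors):.3f} m  StDevr={stdev_r(errors):.3f} m")
```

With the n-divisor form used here, RMSE r squared equals StDev r squared plus the squared length of the bias vector, so the two measures agree only when the bias is zero.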
(more) Definitions
CE90: A CE90 of 10.0 meters is an accuracy in which 90% of the well-defined, measured image points are statistically expected to be within 10.0 meters of their surveyed locations. This assumes that the survey or truth point is (much) more accurate than the dataset being sampled.
CE95: The distance for a 95% probability.
CEP: Circular Error Probable, or Circle of Equal Probability. The distance for a 50% probability; also written CE50.
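As a rough illustration of these definitions, the sketch below estimates CEP, CE90, and CE95 directly from a list of measured radial errors. It uses a nearest-rank rule, which is only one of several percentile conventions; the function name `circular_error` and the radii are made up for this example.

```python
import math

def circular_error(radii, probability):
    """Nearest-rank estimate: smallest radius containing at least `probability` of points."""
    if not radii or not 0.0 < probability <= 1.0:
        raise ValueError("need at least one radius and 0 < probability <= 1")
    ordered = sorted(radii)
    # round() guards against floating-point noise such as 0.95 * 20 = 18.999999...
    rank = math.ceil(round(probability * len(ordered), 9))   # 1-based rank
    return ordered[rank - 1]

# Made-up radial errors in metres for 20 check points.
radii = [1.2, 2.5, 3.1, 3.8, 4.0, 4.4, 4.9, 5.3, 5.8, 6.1,
         6.6, 7.0, 7.4, 7.9, 8.3, 8.8, 9.2, 9.7, 10.5, 12.0]

cep = circular_error(radii, 0.50)    # 10th of 20 sorted radii
ce90 = circular_error(radii, 0.90)   # 18th of 20 sorted radii
ce95 = circular_error(radii, 0.95)   # 19th of 20 sorted radii
print(f"CEP={cep} m  CE90={ce90} m  CE95={ce95} m")
```

With only 20 points, each estimate rests on a single ranked observation, so it should be read as a sample statistic rather than the accuracy of the whole product.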
Example of 10 m CE90: But what if we forget the statistical descriptor of CE90? If the accuracy was only described as 10 m, what would the scatter look like?
Why it matters: These plots are all at 10 m. The scatter size changes based on the statistical descriptor. [Plot panels: CE99.99, CE99, CE95, CE90, CE50, RMSEr, RMSExy] Without the descriptor, we would not know the tolerance for error on the project.
Why it matters (2): Another way to show why the statistical description matters is shown here. These are equivalent values for this error scatter plot, expressed with different descriptors: five CE levels, RMSEr, and RMSExy (4.66 m RMSExy).
xy compared to r
xy Compared to r offset with biased dataset
xy compared to r offset with unbiased dataset: With an unbiased dataset, the x values are centered on 0. Radial values are all positive.
Types of Error
Blunder: A gross error; a careless mistake; an obvious error. Blunders are individual errors that affect each measurement differently. Blunders that can be identified should be disregarded, or removed from the report or solution. Examples: marking a control point at the wrong corner of an intersection; bad image correlation.
Systematic: An error that tends to shift all measurements in a systematic way. If the systematic error is identified, it can be removed from the results, or the project can be re-processed with the corrected information. Examples: a bold or shadowed line used to depict a feature; a burr at the end of a measuring stick; sensor calibration error; incorrect interior orientation; incorrect re-projection parameters.
Random: An error that shifts measurements in a haphazard or inconsistent manner. Examples: rounding of significant figures; coarse DEM postings that do not capture terrain change; noise in signal processing.
The goal is to identify and remove the blunders and the systematic errors. As a result, only random errors are left in the project.
If we could say the positional accuracy was 4 meters to the east, then we could just shift the dataset to correct that offset and improve the accuracy. But in a good project, you will already have removed the blunders and the bias. All that is left is random error.
Random error at 10 m CE90: Example of random error. Observations = 10,000. Distribution is circular and unbiased.
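A scatter like the one described can be approximated with a short simulation. The sketch assumes a circular normal distribution (independent, zero-mean Gaussian errors in x and y with equal spread), which the slide does not state explicitly; under that assumption a 10 m CE90 corresponds to a per-axis sigma of about 4.66 m. The function name `simulate_errors` and the seed are arbitrary.

```python
import math
import random

def simulate_errors(ce90, count, seed=0):
    """Draw (dx, dy) errors from a circular normal distribution with the given CE90."""
    # Assumption: circular normal errors, so CE90 = sigma * sqrt(-2 * ln(0.10)).
    sigma = ce90 / math.sqrt(-2.0 * math.log(1.0 - 0.90))   # per-axis sigma
    rng = random.Random(seed)
    return [(rng.gauss(0.0, sigma), rng.gauss(0.0, sigma)) for _ in range(count)]

errors = simulate_errors(ce90=10.0, count=10_000)
inside = sum(1 for dx, dy in errors if math.hypot(dx, dy) <= 10.0)
fraction_inside = inside / len(errors)
mean_dx = sum(dx for dx, _ in errors) / len(errors)
print(f"{fraction_inside:.1%} of simulated points fall within 10 m; mean dx = {mean_dx:+.3f} m")
```

Roughly nine in ten simulated points should land inside the 10 m circle, and the mean offset should sit close to zero, which is what "circular and unbiased" means in practice.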
Conclusions
Specify a measure: 4, 8.5, 10
Specify a unit: meter, feet
Use a statistical description: RMSE, StDev, CE50, CE90, etc.
And be precise: RMSE xy vs RMSE r, StDev xy vs StDev r. This is important because the conversion factor from X r to CE90 is different from the conversion factor from X xy to CE90.
Conversion factors: Since these descriptors all describe 2-dimensional error offsets, we can convert between the different values.
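One common way to obtain such conversion factors is sketched below. It assumes a circular normal error distribution with no bias and equal spread in x and y, and it treats RMSExy as the per-axis RMSE. Those assumptions are not stated on the slides, and the function names are made up for this example.

```python
import math

def ce_factor_from_xy(probability):
    """Multiplier from per-axis RMSE (RMSExy) to the circular error at `probability`.

    Assumption: circular normal errors, where the radius containing a fraction p
    of the points is sigma * sqrt(-2 * ln(1 - p)).
    """
    return math.sqrt(-2.0 * math.log(1.0 - probability))

def ce_factor_from_r(probability):
    """Multiplier from radial RMSE (RMSEr) to the circular error at `probability`."""
    return ce_factor_from_xy(probability) / math.sqrt(2.0)   # RMSEr = sqrt(2) * RMSExy

for label, p in (("CE50", 0.50), ("CE90", 0.90), ("CE95", 0.95)):
    print(f"{label}: {ce_factor_from_xy(p):.4f} x RMSExy = {ce_factor_from_r(p):.4f} x RMSEr")
```

Under these assumptions the CE90 factor is about 2.1460 from RMSExy but about 1.5174 from RMSEr, so 4.66 m RMSExy corresponds to roughly 10 m CE90. Mixing up the xy and r forms changes the result by a factor of the square root of 2.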
Contact Info Jon Proctor
Backup Slides
3 methods for measuring CE90
Direct rank
XY
Radius
Show the difference with 5 check points, 10, 20.
Really, we should say "I am 73% confident that 90% of the well-defined points are within 10 meters of their true location", where my confidence level is based on the sample size and the variance measured from the population.
Variation in estimated CE90: These plots show variation in the estimated CE90 based on 3 different methods and a sample size of 20. Estimated CE90 ranges from 6.5 to 12 m in these 5 samples.
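The kind of sample-to-sample variation described above can be reproduced with a small simulation. The slides name the three methods but do not define them, so the definitions here are assumptions: "direct rank" takes the nearest-rank 90th percentile of the radii, "XY" scales the average of the per-axis RMSE values, and "radius" scales the radial RMSE. The population is assumed to be circular normal with a true CE90 of 10 m, and the seed is arbitrary.

```python
import math
import random

K90 = math.sqrt(-2.0 * math.log(1.0 - 0.90))   # about 2.1460, circular normal assumption

def ce90_direct_rank(errors):
    """Assumed 'direct rank' method: nearest-rank 90th percentile of the radii."""
    radii = sorted(math.hypot(dx, dy) for dx, dy in errors)
    rank = math.ceil(round(0.90 * len(radii), 9))
    return radii[rank - 1]

def ce90_from_xy(errors):
    """Assumed 'XY' method: scale the mean of the per-axis RMSE values."""
    n = len(errors)
    rmse_x = math.sqrt(sum(dx * dx for dx, _ in errors) / n)
    rmse_y = math.sqrt(sum(dy * dy for _, dy in errors) / n)
    return K90 * 0.5 * (rmse_x + rmse_y)

def ce90_from_radius(errors):
    """Assumed 'radius' method: scale the radial RMSE."""
    n = len(errors)
    rmse_r = math.sqrt(sum(dx * dx + dy * dy for dx, dy in errors) / n)
    return K90 / math.sqrt(2.0) * rmse_r

rng = random.Random(2013)
sigma = 10.0 / K90          # population with a true CE90 of 10 m
estimates = []
for _ in range(5):          # 5 independent samples of 20 check points each
    sample = [(rng.gauss(0.0, sigma), rng.gauss(0.0, sigma)) for _ in range(20)]
    estimates.append((ce90_direct_rank(sample), ce90_from_xy(sample), ce90_from_radius(sample)))

for rank_est, xy_est, r_est in estimates:
    print(f"direct rank={rank_est:5.2f} m  xy={xy_est:5.2f} m  radius={r_est:5.2f} m")
```

Every sample is drawn from the same 10 m CE90 population, yet the estimates differ from sample to sample and from method to method, which is why a reported CE90 is more informative when it comes with the sample size.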