What you don’t know can hurt you: uncertainties in georeferencing
John Wieczorek
Museum of Vertebrate Zoology
University of California, Berkeley
Uncertainties
What comes out of a system depends on:
a) what goes into it
b) what you ask of it
c) what happens in between
What species occur where? Basis for: conservation, bio-prospecting, entertainment, survival?
What species occur where? species identification
What species occur where? occurrence location
What species occur where? occurrence location
Problem: most original data are in textual form
Problem: collection resources are scarce and can’t support large-scale digitization
What species occur where? What can Biodiversity Informatics do? Taxonomic Resolution Services
What can Biodiversity Informatics do? Taxonomic Resolution Services What species occur where? Georeferencing Services
What we have: Localities we can read
ID  Species         Locality
1   Lynx rufus      Dawson Rd. N Whitehorse
2   Pudu puda       cerca de Valdivia
3   Canis lupus     20 mi NW Duluth
9   Ursus arctos    Bear Flat, Haines Junction
4   Felis concolor  Pichi Trafúl
5   Lama alpaca     near Cuzco
6   Panthera leo    San Diego Zoo
7   Sorex lyelli    Lyell Canyon, Yosemite
8   Orcinus orca    1 mi W San Juan Island
What we want: Localities we can map
Integration – Species Pages
What is a georeference? A numerical description of a place that can be mapped.
“Davis, Yolo County, California” “point method” Coordinates: Horizontal Geodetic Datum: NAD27
What is an acceptable georeference? A numerical description of a place that can be mapped and that describes the spatial extent of a locality and its associated uncertainties.
Sources of uncertainty
1) Map inaccuracy
2) Extent of the reference
3) Coordinate imprecision
4) Undocumented datum
5) Distance imprecision
6) Direction imprecision
Map inaccuracy by scale (scale denominators and foot values are truncated in the source):
Scale     Uncertainty (ft)  Uncertainty (m)
1:1,…     … ft              1.0 m
1:2,…     … ft              2.0 m
1:4,…     … ft              4.1 m
1:10,…    … ft              8.5 m
1:12,…    … ft              10.2 m
1:24,…    … ft              12.2 m
1:25,…    … ft              12.8 m
1:63,…    … ft              32.2 m
1:100,…   … ft              50.9 m
1:250,…   … ft              127 m
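Of the six sources listed on this slide, coordinate imprecision is the easiest to quantify mechanically. The sketch below is one possible way to do it, not a method stated in the talk: it treats the recording precision as an error in both latitude and longitude and takes the diagonal, then adds the components into a conservative upper bound. The function names, the spherical metres-per-degree constant, and the decision to simply sum the components are all assumptions made for illustration.

```python
import math

# Assumption: spherical Earth; approximate metres spanned by one degree of latitude.
METERS_PER_DEGREE = 111_195.0


def coordinate_imprecision_m(lat_deg: float, precision_deg: float) -> float:
    """Uncertainty (m) implied by coordinates recorded to `precision_deg`."""
    lat_err = precision_deg * METERS_PER_DEGREE
    # A degree of longitude shrinks with the cosine of the latitude.
    lon_err = lat_err * math.cos(math.radians(lat_deg))
    # Diagonal of the rectangle spanned by the two errors.
    return math.hypot(lat_err, lon_err)


def max_uncertainty_m(components: dict) -> float:
    """Conservative upper bound: add every component (hypothetical rule)."""
    return sum(components.values())


# Illustrative inputs. 12.2 m is one of the metre values in the slide's
# scale table; the latitude and the 0.01 degree precision are made up.
components = {
    "map_inaccuracy": 12.2,
    "coordinate_imprecision": coordinate_imprecision_m(38.5, 0.01),
}
total = max_uncertainty_m(components)
print(f"maximum uncertainty: {total:.0f} m")
```

Summing is deliberately pessimistic: it can overstate the radius, but it should not understate it, which suits a value labelled "maximum uncertainty".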
“Davis, Yolo County, California” “bounding-box method” Coordinates: Horizontal Geodetic Datum: NAD27
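A bounding box stores a locality as two corner coordinates, which is why spatial queries against it are simple. The sketch below shows the idea under stated assumptions: the class name, its fields, and the corner values are hypothetical (the slide's actual coordinates did not survive in this text), and the box is assumed not to cross the 180 degree meridian.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BoundingBox:
    """Georeference stored as two corners, in decimal degrees."""
    min_lat: float
    min_lon: float
    max_lat: float
    max_lon: float
    datum: str = "NAD27"

    def contains(self, lat: float, lon: float) -> bool:
        # Four comparisons; assumes the box does not cross the antimeridian.
        return (self.min_lat <= lat <= self.max_lat
                and self.min_lon <= lon <= self.max_lon)

    def intersects(self, other: "BoundingBox") -> bool:
        # Two boxes overlap unless one lies entirely to one side of the other.
        return not (other.min_lat > self.max_lat
                    or other.max_lat < self.min_lat
                    or other.min_lon > self.max_lon
                    or other.max_lon < self.min_lon)


# Hypothetical corners roughly enclosing Davis; not values from the slide.
davis = BoundingBox(38.52, -121.80, 38.58, -121.68)
print(davis.contains(38.54, -121.74))
```

The box says nothing about how the corners were derived, so a user cannot tell a carefully measured extent from a rough guess without extra metadata.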
“Davis, Yolo County, California” “point-radius method” Coordinates: Horizontal Geodetic Datum: NAD27 Maximum Uncertainty: 8325 m
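A point-radius georeference can be held as a coordinate pair, a datum, and a single maximum uncertainty in metres. The following is a minimal sketch, assuming a spherical Earth and great-circle (haversine) distances; the class and function names are hypothetical, the centre coordinates are approximate stand-ins for Davis rather than the slide's values, and only the 8325 m radius and the NAD27 datum come from the slide. The distance calculation ignores datum shifts.

```python
import math
from dataclasses import dataclass

EARTH_RADIUS_M = 6_371_008.8  # mean Earth radius; spherical approximation


def haversine_m(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in metres between two points in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    # Clamp to 1.0 to guard against floating-point overshoot.
    return 2 * EARTH_RADIUS_M * math.asin(min(1.0, math.sqrt(a)))


@dataclass(frozen=True)
class PointRadius:
    lat: float
    lon: float
    datum: str
    max_uncertainty_m: float

    def may_contain(self, lat: float, lon: float) -> bool:
        # True if the query point lies inside the uncertainty circle.
        return haversine_m(self.lat, self.lon, lat, lon) <= self.max_uncertainty_m


# Centre coordinates are approximate and illustrative, not from the slide.
davis = PointRadius(38.5449, -121.7405, "NAD27", 8325.0)
print(davis.may_contain(38.5949, -121.7405))
```

Because the radius travels with the record, a user can filter on it directly, for example by discarding records whose uncertainty exceeds the resolution of their analysis.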
What is an ideal georeference? A numerical description of a place that can be mapped and that describes the spatial extent of a locality and its associated uncertainties as well as possible.
“Davis, Yolo County, California” “shape method”
“20 mi E Hayfork, California” “probability method”
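A locality such as “20 mi E Hayfork” is an offset from a named place, so whichever method is used, a nominal centre has to be computed first. The sketch below does only that step and does not build the probability surface itself. It assumes a spherical Earth and movement along the parallel of latitude; the function name and the Hayfork coordinates are hypothetical approximations, not values from the slide.

```python
import math

EARTH_RADIUS_M = 6_371_008.8   # mean Earth radius; spherical approximation
METERS_PER_MILE = 1609.344     # international mile


def offset_due_east(lat_deg: float, lon_deg: float, distance_m: float):
    """Move `distance_m` east along the parallel; latitude is unchanged."""
    # The radius of a parallel shrinks with the cosine of the latitude.
    parallel_radius = EARTH_RADIUS_M * math.cos(math.radians(lat_deg))
    dlon = math.degrees(distance_m / parallel_radius)
    # No wrap-around handling for crossings of the 180 degree meridian.
    return lat_deg, lon_deg + dlon


# Hypothetical, approximate coordinates for Hayfork; not from the slide.
hayfork = (40.55, -123.18)
center = offset_due_east(*hayfork, 20 * METERS_PER_MILE)
print(center)
```

The distance and direction imprecision listed among the sources of uncertainty then describe how far the true location may lie from this nominal centre.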
Method Comparison
Method        Advantage                Disadvantage
point         easy to produce          no data quality
bounding-box  simple spatial queries   difficult quality assessment
point-radius  easy quality assessment  difficult spatial queries
shape         accurate representation  complex, uniform
probability   accurate representation  complex, non-uniform
Global Biodiversity Information Facility (GBIF) Point-radius Method
“Manual” Georeferencing Tools
Semi-automated Georeferencing Tools
Figure, panels (a) to (d): Rowe, Elevational gradient analysis of historical museum specimens: a cautionary tale
What species occur where?
Conclusions:
1) We can help users find relevant records
2) We can help users assess data quality and fitness for use
3) In the end, users must exercise due diligence. Without 1) and 2), they can’t.