Georeferencing Introduction: Collaboration to Automation
John Wieczorek
Museum of Vertebrate Zoology
University of California, Berkeley
Georeferencing, Collaborations, Automation
What is a georeference?
A numerical description of a place that can be mapped.
In other words…
What we have: Localities we can read
  ID  Species         Locality
  1   Lynx rufus      Dawson Rd. N Whitehorse
  2   Pudu puda       cerca de Valdivia
  3   Canis lupus     20 mi NW Duluth
  4   Felis concolor  Pichi Trafúl
  5   Lama alpaca     near Cuzco
  6   Panthera leo    San Diego Zoo
  7   Sorex lyelli    Lyell Canyon, Yosemite
  8   Orcinus orca    1 mi W San Juan Island
  9   Ursus arctos    Bear Flat, Haines Junction
Darwin Core Location Terms
  higherGeography
  waterbody, island, islandGroup
  continent, country, countryCode, stateProvince, county, municipality
  locality
  minimumElevationInMeters, maximumElevationInMeters, minimumDepthInMeters, maximumDepthInMeters
What we want: Localities we can map
Darwin Core Georeference Terms
  decimalLatitude, decimalLongitude
  geodeticDatum
  coordinateUncertaintyInMeters
  georeferencedBy, georeferenceProtocol
  georeferenceSources
  georeferenceVerificationStatus
  georeferenceRemarks
  coordinatePrecision
  pointRadiusSpatialFit
  footprintWKT, footprintSRS, footprintSpatialFit
What is a georeference? A numerical description of a place that can be mapped.
“point method”: “Davis, Yolo County, California”
  Coordinates: 38.5463, -121.7425
  Horizontal Geodetic Datum: NAD27
Data Quality
  Data have the potential to be used in ways unforeseen when collected.
  The value of the data is directly related to the fitness for a variety of uses.
  “as data become more accessible many more uses become apparent.” – Chapman 2005
  The GBIF Best Practices (Chapman and Wieczorek 2006) promote data quality and fitness for use.
What is an acceptable georeference? A numerical description of a place that can be mapped and that describes the spatial extent of a locality and its associated uncertainties.
“bounding-box method”: “Davis, Yolo County, California”
  Coordinates: 38.5486, -121.7542 and 38.545, -121.7394
  Horizontal Geodetic Datum: NAD27
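Why a bounding box makes spatial queries simple can be shown in a few lines of Python. This is only a sketch: the `BoundingBox` type and its `contains` method are illustrative names, not part of any standard. It treats the two coordinate pairs on the slide as the northwest and southeast corners, and assumes the box does not cross the antimeridian.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BoundingBox:
    """Bounding-box georeference (illustrative type, not a standard API)."""
    north: float  # decimal degrees
    west: float
    south: float
    east: float
    datum: str

    def contains(self, lat: float, lon: float) -> bool:
        # Four comparisons are enough; assumes no antimeridian crossing.
        return (self.south <= lat <= self.north
                and self.west <= lon <= self.east)


# Corner coordinates from the "Davis, Yolo County, California" slide.
davis_box = BoundingBox(north=38.5486, west=-121.7542,
                        south=38.545, east=-121.7394, datum="NAD27")

inside = davis_box.contains(38.5468, -121.7469)   # a point within the corners
outside = davis_box.contains(38.6000, -121.7469)  # a point north of the box
```

The query costs four comparisons, but the box by itself says nothing about how uncertain the locality is.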
“point-radius method”: “Davis, Yolo County, California”
  Coordinates: 38.5468, -121.7469
  Horizontal Geodetic Datum: NAD27
  Maximum Uncertainty: 8325 m
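As a rough illustration, the point-radius georeference above can be written with Darwin Core term names and used to test whether another coordinate falls inside the uncertainty circle. The sketch makes two assumptions that are not on the slide: distances are computed with the haversine formula on a sphere with the mean Earth radius, and datum differences are ignored. The function names are hypothetical.

```python
import math

# Point-radius georeference from the slide, keyed by Darwin Core terms.
davis = {
    "locality": "Davis, Yolo County, California",
    "decimalLatitude": 38.5468,
    "decimalLongitude": -121.7469,
    "geodeticDatum": "NAD27",
    "coordinateUncertaintyInMeters": 8325,
}

EARTH_RADIUS_M = 6_371_008.8  # assumption: mean Earth radius, spherical model


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two points in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def within_uncertainty(georef, lat, lon):
    """True if (lat, lon) lies inside the georeference's uncertainty circle."""
    d = haversine_m(georef["decimalLatitude"], georef["decimalLongitude"],
                    lat, lon)
    return d <= georef["coordinateUncertaintyInMeters"]


# The "point method" coordinates for the same locality fall inside the circle.
point_method_inside = within_uncertainty(davis, 38.5463, -121.7425)
```

The single radius makes quality assessment easy, since localities can be filtered on `coordinateUncertaintyInMeters`. The containment query needs a distance calculation rather than simple comparisons.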
What is an ideal georeference? A numerical description of a place that can be mapped and that describes the spatial extent of a locality and its associated uncertainties as well as possible.
“shape method”: “Davis, Yolo County, California”
“probability method”: “20 mi E Hayfork, California”
Method Comparison
  point          easy to produce           no data quality
  bounding-box   simple spatial queries    difficult quality assessment
  point-radius   easy quality assessment   difficult spatial queries
  shape          accurate representation   complex, uniform
  probability    accurate representation   complex, non-uniform
MaNIS/HerpNET/ORNIS (MHO) Guidelines
http://manisnet.org/GeorefGuide.html
  uses point-radius representation of georeferences
  circle encompasses all sources of uncertainty about the location
  methodology formalizes assumptions, algorithms, and documentation standards that promote reproducible results
  methods are universally applicable
Georeferencing, Collaborations, Automation
Collaborative Distributed Databases for Vertebrates
Collaborations
MaNIS Localities Georeferenced
  n = 326k localities (1.4M specimens)
  r = 14 localities/hr (point-radius method)
  t = 3 yrs (~40 georeferencers)
ORNIS Localities Georeferenced
  n = 267k localities (1.4M specimens)
  r = 30 localities/hr (point-radius method)
  t = 2 yrs (~30 georeferencers)
Scope of the Problem for Natural History Collections
  ~2.5 × 10^9 records
  ~6 records per locality*
  ~14 (30) localities per hour*
  ~15,500 (7,233) years
  * based on the MaNIS (ORNIS) Project
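The arithmetic behind the year estimates can be reproduced with a short Python sketch. The slide does not say how many working hours make up a year; 1,920 hours (48 weeks of 40 hours) is an assumption chosen here because it reproduces the slide's figures to within about a year. The constant and function names are illustrative.

```python
RECORDS = 2.5e9             # specimen records, from the slide
RECORDS_PER_LOCALITY = 6    # from the slide (MaNIS-based estimate)
HOURS_PER_YEAR = 1920       # ASSUMPTION: 48 weeks x 40 hours, not on the slide


def person_years(localities_per_hour):
    """Person-years needed to georeference every locality by hand."""
    localities = RECORDS / RECORDS_PER_LOCALITY
    hours = localities / localities_per_hour
    return hours / HOURS_PER_YEAR


manis_years = person_years(14)  # MaNIS rate
ornis_years = person_years(30)  # ORNIS rate

print(f"MaNIS rate: about {manis_years:,.0f} person-years")
print(f"ORNIS rate: about {ornis_years:,.0f} person-years")
```

Even at the faster ORNIS rate the estimate is in the thousands of person-years, which is the motivation for automation.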
Georeferencing, Collaborations, Automation
Automation: Combining the Best in Georeferencing
  GADM
  GeoLocate
  BioGeomancer
  MaNIS Georeferencing Calculator
Global Administrative Boundaries: GADM
http://www.museum.tulane.edu/geolocate
http://www.biogeomancer.org
Georeferencing Calculator: