Georeferencing Introduction: Collaboration to Automation. Dave Bloom, Museum of Vertebrate Zoology, University of California, Berkeley
Georeferencing Collaborations Automation
What is a georeference? A numerical description of a place that can be mapped. In other words…
What we have: Localities we can read

ID  Species         Locality
1   Lynx rufus      Dawson Rd. N Whitehorse
2   Pudu puda       cerca de Valdivia
3   Canis lupus     20 mi NW Duluth
4   Felis concolor  Pichi Trafúl
5   Lama alpaca     near Cuzco
6   Panthera leo    San Diego Zoo
7   Sorex lyelli    Lyell Canyon, Yosemite
8   Orcinus orca    1 mi W San Juan Island
9   Ursus arctos    Bear Flat, Haines Junction
Darwin Core Location Terms
– higherGeography
– waterbody, island, islandGroup
– continent, country, countryCode, stateProvince, county, municipality
– locality
– minimumElevationInMeters, maximumElevationInMeters, minimumDepthInMeters, maximumDepthInMeters
What we want: Localities we can map
Darwin Core Georeference Terms
– decimalLatitude, decimalLongitude
– geodeticDatum
– coordinateUncertaintyInMeters
– georeferencedBy, georeferenceProtocol
– georeferenceSources
– georeferenceVerificationStatus
– georeferenceRemarks
– coordinatePrecision
– pointRadiusSpatialFit
– footprintWKT, footprintSRS, footprintSpatialFit
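As a rough illustration of how these terms fit together, one georeferenced locality could be held as a mapping keyed by Darwin Core term names. This is only a sketch: the coordinate values are approximate placeholders for Davis, California, not data from the slides, and the helper `missing_point_radius_terms` is a hypothetical name introduced here.

```python
# Term names come from the Darwin Core list above. The values are
# illustrative placeholders, not data from any collection.
POINT_RADIUS_TERMS = (
    "decimalLatitude",
    "decimalLongitude",
    "geodeticDatum",
    "coordinateUncertaintyInMeters",
)

record = {
    "locality": "Davis, Yolo County, California",
    "decimalLatitude": 38.545,     # assumed approximate value
    "decimalLongitude": -121.740,  # assumed approximate value
    "geodeticDatum": "NAD27",
    "coordinateUncertaintyInMeters": 8325,
    "georeferenceProtocol": "MaNIS/HerpNET/ORNIS Georeferencing Guidelines",
}


def missing_point_radius_terms(rec):
    """Return the point-radius terms that are absent or empty in rec."""
    return [term for term in POINT_RADIUS_TERMS if rec.get(term) in (None, "")]


print(missing_point_radius_terms(record))  # []
print(missing_point_radius_terms({"locality": "near Cuzco"}))
```

A locality that can only be read, such as "near Cuzco", reports all four terms as missing; a locality that can be mapped reports none.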
What is a georeference? A numerical description of a place that can be mapped.
“Davis, Yolo County, California”, “point method”
Coordinates:
Horizontal Geodetic Datum: NAD27
Data Quality
– Data have the potential to be used in ways unforeseen when collected.
– The value of the data is directly related to their fitness for a variety of uses.
– “As data become more accessible many more uses become apparent.” (Chapman 2005)
– The GBIF Best Practices (Chapman and Wieczorek 2006) promote data quality and fitness for use.
What is an acceptable georeference? A numerical description of a place that can be mapped and that describes the spatial extent of a locality and its associated uncertainties.
“Davis, Yolo County, California”, “bounding-box method”
Coordinates:
Horizontal Geodetic Datum: NAD27
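A bounding box makes spatial queries simple: testing whether a point falls inside it takes four comparisons. The sketch below assumes a box that does not cross the antimeridian; the `BoundingBox` class and the corner values around Davis are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BoundingBox:
    """Latitude/longitude extent in decimal degrees (hypothetical helper)."""
    min_lat: float
    min_lon: float
    max_lat: float
    max_lon: float

    def contains(self, lat: float, lon: float) -> bool:
        # Four comparisons; assumes the box does not cross the antimeridian.
        return (self.min_lat <= lat <= self.max_lat
                and self.min_lon <= lon <= self.max_lon)


# Illustrative box around Davis, California; the corner values are assumed.
davis_box = BoundingBox(38.52, -121.80, 38.58, -121.68)

print(davis_box.contains(38.545, -121.740))  # True
print(davis_box.contains(38.58, -121.49))    # False
```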
“Davis, Yolo County, California”, “point-radius method”
Coordinates:
Horizontal Geodetic Datum: NAD27
Maximum Uncertainty: 8325 m
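To show what a point-radius georeference allows, the sketch below tests whether a candidate point lies inside the uncertainty circle. It uses the haversine formula on a spherical Earth, which ignores datum differences such as NAD27 versus WGS84. The center coordinates are an assumed approximation for Davis; only the 8325 m radius comes from the slide. The function names are hypothetical.

```python
import math

EARTH_RADIUS_M = 6_371_000  # mean Earth radius; spherical approximation


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two points in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def within_point_radius(center_lat, center_lon, uncertainty_m, lat, lon):
    """True if (lat, lon) lies inside the circle of radius uncertainty_m."""
    return haversine_m(center_lat, center_lon, lat, lon) <= uncertainty_m


CENTER = (38.545, -121.740)  # assumed approximate center, not from the slide
UNCERTAINTY_M = 8325         # maximum uncertainty given on the slide

print(within_point_radius(*CENTER, UNCERTAINTY_M, 38.55, -121.70))
print(within_point_radius(*CENTER, UNCERTAINTY_M, 38.58, -121.49))
```

The first point is a few kilometers from the assumed center and falls inside the circle; the second is more than 20 km away and falls outside.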
What is an ideal georeference? A numerical description of a place that can be mapped and that describes the spatial extent of a locality and its associated uncertainties as well as possible.
“Davis, Yolo County, California”, “shape method”
“20 mi E Hayfork, California”, “probability method”
Method Comparison

Method        Strength                 Drawback
point         easy to produce          no data quality
bounding-box  simple spatial queries   difficult quality assessment
point-radius  easy quality assessment  difficult spatial queries
shape         accurate representation  complex, uniform
probability   accurate representation  complex, non-uniform
MaNIS/HerpNET/ORNIS (MHO) Guidelines
– Use a point-radius representation of georeferences.
– The circle encompasses all sources of uncertainty about the location.
– The methodology formalizes assumptions, algorithms, and documentation standards that promote reproducible results.
– The methods are universally applicable.
Georeferencing Collaborations Automation
Collaborative Distributed Databases for Vertebrates
Collaborations
MaNIS Localities Georeferenced: n = 326k localities (1.4M specimens); r = 14 localities/hr (point-radius method); t = 3 yrs (~40 georeferencers)
ORNIS Localities Georeferenced: n = 267k localities (1.4M specimens); r = 30 localities/hr (point-radius method); t = 2 yrs (~30 georeferencers)
Scope of the Problem for Natural History Collections
– ~2.5 × 10^9 records
– ~6 records per locality*
– ~14 (30) localities per hour*
– ~15,500 (7,233) years
* based on the MaNIS (ORNIS) Project
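The year figures follow from the record count, the records-per-locality ratio, and the georeferencing rate. The slides do not say how many working hours make up a year; the sketch below assumes 1,920 hours (40 hours × 48 weeks) for a single georeferencer, a value chosen because it reproduces the slide's numbers. The function name is hypothetical.

```python
RECORDS = 2.5e9            # ~2.5 × 10^9 records (from the slide)
RECORDS_PER_LOCALITY = 6   # from the slide, based on MaNIS
HOURS_PER_YEAR = 1920      # ASSUMPTION: 40 h/week × 48 weeks, not on the slide


def years_to_georeference(localities_per_hour):
    """Person-years needed to georeference every locality at the given rate."""
    localities = RECORDS / RECORDS_PER_LOCALITY
    hours = localities / localities_per_hour
    return hours / HOURS_PER_YEAR


for project, rate in (("MaNIS", 14), ("ORNIS", 30)):
    print(f"{project}: ~{years_to_georeference(rate):,.1f} person-years")
```

Under that assumption the MaNIS rate gives just under 15,501 person-years and the ORNIS rate just under 7,234, consistent with the slide's ~15,500 and 7,233.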
Georeferencing Collaborations Automation
Combining the Best in Georeferencing: GeoLocate, GADM, MaNIS Georeferencing Calculator
GADM Global Administrative Boundaries:
Georeferencing Calculator: