Automated Georeferencing of Natural History Museum Data
Nelson E. Rios

Introduction

The Tulane University Fish Collection, with 7.1 million fluid-preserved specimens in over 190,000 lots collected from over 15,000 locations worldwide, is one of the largest collections in the world and is recognized as a "National Center of Ichthyology Resource Collection". During the early 1990s, the entire collection was computerized and georeferenced. Georeferencing the collection took nearly 2 years and required labor-intensive lookups in both paper and digital maps. This experience, along with the resulting dataset of georeferenced information, became a test bed for the development of GEOLocate, an automated georeferencing system for natural history information. GEOLocate is a software tool that enables researchers to easily assign geographic coordinates to a descriptive string of locality information, visualize the location, and make corrections as necessary.

Abstract

It is estimated that the number of biological specimens in US museums and herbaria exceeds 750 million. In the vast majority of instances, the collection location is recorded as a string of text and typically lacks geographic coordinates. We have developed a tool that interprets the descriptive locality text associated with natural history collections data, determines geographic coordinates, and allows the user to verify and correct the coordinates. Traditional methods for georeferencing collection data from text descriptions are tedious and time-consuming, typically involving finding the locality on a hardcopy or digital map, plotting the locality, and determining the coordinates. Using our tool, GEOLocate, considerably reduces the time required to georeference locality information. It took 1 staff member approximately 1.5 years to georeference the 15,000 unique locality descriptions within the Tulane fish collection.
Time trials with GEOLocate suggest that this job could have been accomplished in under 6 months.

Design: Natural Language Processing

A locality description, along with country, state and county information, is input into GEOLocate. Georeferencing begins by standardizing the locality description string into a common terms format. For example, distances mentioned in a locality string are converted to miles. Once standardized, the locality string is parsed into key geographic identifiers. Examples of the geographic identifiers used by GEOLocate include named places, navigable river miles, highway names, water body names, legal locations and displacement patterns. These identifiers are used to determine geographic coordinates from database lookups and geographic calculations. The resulting coordinates are ranked based on the type of information found within the string and plotted on the digital map display for user verification, correction and error determination.

Application: Test Bed Results

unique locality descriptions containing geographic coordinates were extracted from the TUMNH database and imported into GEOLocate. Of these, records were auto-assigned coordinates by GEOLocate within a 3 hour period. 36% of the georeferenced records were within 1 mile of the original coordinates. 83% of the records were within 15 miles of the original, permitting easy verification and correction on the map display. Time trials using GEOLocate average seconds to georeference, verify and correct a locality record.

Discussion

Using GEOLocate can significantly reduce the time required to georeference natural history data. GEOLocate was able to assign coordinates to over 98% of the locality data tested. This initial assignment of coordinates should only be considered a "rough" pass at the data, and each record should be visually inspected and corrected as necessary.
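
The standardization and displacement-pattern steps described under Design can be illustrated with a small sketch. This is not GEOLocate's code: the regular expression, the unit table, the function names and the flat-earth offset approximation (about 69 miles per degree of latitude) are all assumptions made for illustration, and only one identifier type (a displacement such as "5 km NW of Covington") is handled.

```python
import math
import re

# Hypothetical unit table: unit spelling -> miles per unit.
MILES_PER_UNIT = {
    "mi": 1.0, "mile": 1.0, "miles": 1.0,
    "km": 0.621371, "kilometer": 0.621371, "kilometers": 0.621371,
}

# (east, north) unit vectors for eight compass directions.
_DIAG = math.sqrt(0.5)
DIRECTION_VECTORS = {
    "N": (0.0, 1.0), "S": (0.0, -1.0), "E": (1.0, 0.0), "W": (-1.0, 0.0),
    "NE": (_DIAG, _DIAG), "NW": (-_DIAG, _DIAG),
    "SE": (_DIAG, -_DIAG), "SW": (-_DIAG, -_DIAG),
}

# Assumed displacement pattern: "<distance> <unit> <direction> of <place>".
DISPLACEMENT = re.compile(
    r"(?P<dist>\d+(?:\.\d+)?)\s*(?P<unit>[A-Za-z]+)\.?\s+"
    r"(?P<dir>NE|NW|SE|SW|N|S|E|W)\s+of\s+(?P<place>.+)",
    re.IGNORECASE,
)

# Rough approximation; adequate only for a first-pass estimate.
MILES_PER_DEGREE_LAT = 69.0


def parse_displacement(locality):
    """Return distance (standardized to miles), direction and place, or None."""
    match = DISPLACEMENT.search(locality)
    if match is None:
        return None
    factor = MILES_PER_UNIT.get(match.group("unit").lower())
    if factor is None:
        return None  # unrecognized unit
    return {
        "miles": float(match.group("dist")) * factor,
        "direction": match.group("dir").upper(),
        "place": match.group("place").strip(),
    }


def offset_coordinates(lat, lon, miles, direction):
    """Shift a (lat, lon) point by a distance in miles along a compass direction."""
    east, north = DIRECTION_VECTORS[direction]
    dlat = miles * north / MILES_PER_DEGREE_LAT
    dlon = miles * east / (MILES_PER_DEGREE_LAT * math.cos(math.radians(lat)))
    return lat + dlat, lon + dlon
```

In this sketch, `parse_displacement("Pearl River, 5 km NW of Covington")` yields a distance of about 3.1 miles, the direction `NW` and the place name `Covington`. The coordinates of the named place would come from a gazetteer lookup, which is not shown, and `offset_coordinates` would then move that point by the parsed displacement.
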
Locality records with incorrect or missing county information typically have greater error in the resulting coordinates, because a larger area must be searched when county information is absent. Depending on the quality of the original locality data, georeferencing results can be improved by checking the locality dataset beforehand for misspelled, missing, incorrect, and/or ambiguous information.

% of Total    Accuracy
36%           within 1 mile
66%           within 5 miles
77%           within 10 miles
83%           within 15 miles
17%           greater than 15 miles

Features

Drag and drop coordinate correction
Option to ‘snap’ to found waterbody (U.S. only)
Bridge crossing detection (U.S. only)
Batch georeferencing
File input via .xml, .csv or delimited .txt
Polygon error determination
Multiple coordinate determination
Supports entire United States, Mexico and Canada
Street level mapping for United States
Overview plotting of input datasets

Acknowledgements

I would like to thank the following for reviewing early versions of GEOLocate: James S. Albert, Jonathan Armbruster, Jeremy Bartley, Andy Bentley, Stephanie Coste, Paul David, Bud Freeman, John Friel, Tom Giermakowski, Robert Glaubitz, Sara J. Gottlieb, Brendan Haley, Chad Hargrave, Dean Hendrickson, Mikaela Howie, Denny Hugg, Janeen Jones, Edie Marsh, Kris McNyset, Jonathon Rothman, Barbara Scudder, Steph Smith and John Wieczorek. This research was supported by a grant from the National Science Foundation (DBI ).
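
The accuracy table reports the cumulative share of records whose auto-assigned coordinates fall within 1, 5, 10 and 15 miles of the original coordinates. A comparison of that kind could be computed along the following lines. This is a sketch under stated assumptions, not the procedure used for the poster: the haversine great-circle formula, the Earth radius constant, the function names and the use of "less than or equal to" at each threshold are illustrative choices.

```python
import math

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius (assumed constant)


def haversine_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance in miles between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(a))


def cumulative_accuracy(pairs, thresholds=(1, 5, 10, 15)):
    """Percentage of records within each distance threshold (miles).

    pairs: list of ((orig_lat, orig_lon), (auto_lat, auto_lon)) tuples.
    """
    distances = [haversine_miles(*orig, *auto) for orig, auto in pairs]
    if not distances:
        return {}
    total = len(distances)
    return {
        t: 100.0 * sum(d <= t for d in distances) / total
        for t in thresholds
    }
```

Records that fall beyond the largest threshold correspond to the "greater than 15 miles" row, which is 100% minus the 15 mile figure.
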