Historical Gazetteer Integration: CHGIS, Regnum Francorum & GeoNames. Working Digitally with Historical Maps, AAG 2012. Merrick Lex Berman & Johan Åhlfeldt.

Presentation transcript:

Historical Gazetteer Integration: CHGIS, Regnum Francorum & GeoNames
Working Digitally with Historical Maps, AAG 2012
Merrick Lex Berman (Harvard University) & Johan Åhlfeldt (Regnum Francorum)

Geographic Info Retrieval [GIR]
- Search for place names, features, and/or locations
- Queries to existing gazetteer services work for: names, feature types, footprints, administrative districts…
GeoTemporal Info Retrieval [GTIR]
- Search for place names, events, and/or locations with dates
- Queries to existing gazetteers don't work because dates are optional or entirely missing

Extend GeoNames Schema By Adding Dates

Testbed Methodology
- Automate process for matching placenames via open API
- Test methods for geonomial and geospatial matching
- Discover problems in current datasets, schemas
- Evaluate matching results
- Propose a schema for exchange of historical placename data
- Recommend the types of attributes needed to refine matching

Existing Gazetteers as Web Services

Linked Geographic Data on the Semantic Web

Existing Historical Gazetteers
CHGIS
- Gazetteer of 50,000+ historical placenames
- Administrative hierarchy
- Points have description of “present location”
Regnum Francorum Online
- Gazetteer of 10,000+ historical placenames
- Administrative hierarchy
- Includes explicit links to GeoNames ID
Vision of Britain
- Gazetteer of 120,000+ placenames
- Administrative Unit Ontology
- Query API or tabular datasets?

State of the Art - Locations in Classical Texts and Atlases
- Perseus Project
- Pleiades
- Google Ancient Places
- Barrington Atlas -> Digital Atlas of Roman and Medieval Civilization

State of the Art - Web Services for historical placenames
- Unlock: geoparsing (place name text mining and mapping)
- Chalice: creating linked data historic gazetteer through text mining
- DEEP: extend Chalice by digitizing the 86 volumes of the English Place Name Survey
Limitations
- Unlock: web service only, UK only
- Chalice: schema not published
- DEEP: plans to develop schema for historical place name queries

Gazetteer Augmentation
- Queries directed to different resources
- Results augment existing records
- Geofeature Set

Sample data for testbed: CHGIS
- Historical placename: Tengyue Ting
- Present placename: 云南腾冲县
- Romanized present placename: yunnantengchongxian
- Time series county seats: 403 placename records

Sample data for testbed: Regnum Francorum, France
- Historical placename: Éclance
- GeoNames ID:
- France places in RFO: 4164 placename records

GIS sampling of CHGIS names and GeoNames
- Distribution of study sample
- 2km and 8km buffers + GeoNames

GIS sampling of France names and GeoNames
- Distribution of study sample
- 2km buffers + GeoNames

Matching process

Matching algorithm steps, including pre-processing
- Trim white spaces, apostrophes, & a few other problem chars
- Trim the source string to the first FIVE chars
- Test the string match in BOTH directions
- Reverse geocode x, y with Google API
- Save locality name from Google API JSON results
- Save administrative district info from Google API JSON results
- Haversine distance calculations between points (and to Google Point)
- Check geospatial results within chosen buffer distances (2km, 8km)
- Check geonomial results for string matches in both directions
- Augment the original gazetteer record with the information gathered
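The distance and buffer steps above can be sketched in Python. This is a minimal illustration, not the authors' code: the function names `haversine_km` and `buffer_hits`, the mean Earth radius constant, and the dictionary result shape are all assumptions; only the Haversine calculation and the 2km and 8km buffers come from the slide.

```python
import math

# Mean Earth radius in km (an assumed constant; the slides do not state one).
EARTH_RADIUS_KM = 6371.0088


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    # Clamp to 1.0 to guard against floating-point overshoot near antipodes.
    return 2 * EARTH_RADIUS_KM * math.asin(min(1.0, math.sqrt(a)))


def buffer_hits(distance_km, buffers_km=(2.0, 8.0)):
    """Report, for each buffer radius, whether the distance falls inside it."""
    return {radius: distance_km <= radius for radius in buffers_km}
```

A candidate GeoNames point would be kept for the string-matching step only if it falls inside the chosen buffer around the historical gazetteer point.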

Why trim to five characters? Why match in both directions?
- Original strings: "Henansheng Puyang Shi Puyang Xian Qinghetou Xiang" to "Puyang"
- After special chars trim: "HenanshengPuyangShiPuyangXianQinghetouXiang" to "Puyang"
- After first five chars trim:
  - Source to Target: "Henan" to "Puyang": no match
  - Target to Source: "Puyan" to "HenanshengPuyangShiPuyangXianQinghetouXiang": match
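The example above can be reproduced with a short Python sketch. It assumes that "match" means the five-character prefix of one string occurs anywhere inside the other string, that the "problem chars" are whitespace, apostrophes, and hyphens, and that comparison is case-insensitive; the slides do not spell out any of these details, and the function names are hypothetical.

```python
import re

PREFIX_LEN = 5  # the slide trims to the first five characters


def normalize(name):
    """Strip whitespace, apostrophes, and hyphens (assumed problem chars), then lowercase."""
    return re.sub(r"[\s'’\-]", "", name).lower()


def prefix_match(source, target, n=PREFIX_LEN):
    """True if the first n chars of the normalized source occur in the normalized target."""
    src, tgt = normalize(source), normalize(target)
    if not src or not tgt:
        return False
    return src[:n] in tgt


def match_both_directions(a, b):
    """Run the prefix test from a to b and from b to a."""
    return {"a_to_b": prefix_match(a, b), "b_to_a": prefix_match(b, a)}


result = match_both_directions(
    "Henansheng Puyang Shi Puyang Xian Qinghetou Xiang", "Puyang"
)
print(result)
```

Testing in one direction only would miss this pair, because the long administrative string begins with the province name rather than the county name.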

Results based on distance filtering

Results based on string matching within buffer distances
Main discovery:
- Historical placenames are not often found in GeoNames alternate names
- A current GeoNames name is much more likely to be found in the historical gazetteer record when a “present location” is attested

Linda Hill - ADL Schema for the augmented gazetteer

Augmented gazetteer for faceted geo-temporal queries
- Placenames
- Dates of existence (or related time periods)
- Administrative jurisdictions (past and present)
- Alternative footprints
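One way to picture a record carrying these four facets is the Python sketch below. It is not the ADL schema or the authors' proposed exchange schema: the class `AugmentedPlace`, its field names, the use of integer years, and the sample values are all hypothetical placeholders chosen for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class AugmentedPlace:
    """Hypothetical record holding the four facets listed on the slide."""
    name: str
    alt_names: List[str] = field(default_factory=list)
    begin_year: Optional[int] = None   # dates of existence (None = unknown)
    end_year: Optional[int] = None
    jurisdictions: List[str] = field(default_factory=list)  # past and present
    footprints: List[Tuple[float, float]] = field(default_factory=list)  # (lat, lon)


def existed_during(place, start_year, end_year):
    """True if the place's dated lifespan overlaps the query interval."""
    if place.begin_year is None and place.end_year is None:
        return False  # undated records cannot answer a temporal query
    begin = place.begin_year if place.begin_year is not None else float("-inf")
    end = place.end_year if place.end_year is not None else float("inf")
    return begin <= end_year and end >= start_year


def facet_query(places, start_year, end_year, jurisdiction=None):
    """Filter by time overlap and, optionally, by administrative jurisdiction."""
    return [
        p for p in places
        if existed_during(p, start_year, end_year)
        and (jurisdiction is None or jurisdiction in p.jurisdictions)
    ]


# Placeholder data, invented for the example only.
sample = [
    AugmentedPlace("Placeholder Town", begin_year=1700, end_year=1900,
                   jurisdictions=["Placeholder Province"]),
    AugmentedPlace("Undated Village"),
]
```

The undated record is excluded from every temporal query, which mirrors the problem the presentation raises: without dates, a gazetteer entry cannot take part in geo-temporal retrieval.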

Publications & Resources