Published by Ashlie King. Modified over 9 years ago.
1
Geographic Parsing Gem: Bplgeo
Presented By: Steven Anderson
Boston Public Library (BPL)
sanderson@bpl.org
2
Why?
Geographic data is useful for placing items on a map, for faceting, and for general searching.
Unfortunately, most geographic data is unstructured and lacks standardization:
Horses--Massachusetts--Hadley--Hockanum
Congregational Church in Halifax (Halifax, Mass.)
Washington DC (airport)
This data becomes very chaotic when you harvest from multiple institutions, as we do.
We need structured data... so we made a gem.
3
Example Parsing LCSH Subject
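The example from this slide did not survive in the transcript. As a rough illustration of the problem only, here is a minimal sketch that splits an LCSH-style subject string on its `--` subdivision separator and pulls out a trailing parenthetical qualifier. The helper names `split_lcsh_subject` and `trailing_qualifier` are hypothetical and are not Bplgeo's API; the gem's actual parsing is likely more involved.

```ruby
# Hypothetical helpers for illustration; these are NOT part of Bplgeo.

# Split an LCSH-style subject string on the "--" subdivision separator.
def split_lcsh_subject(subject)
  subject.split('--').map(&:strip).reject(&:empty?)
end

# Return the text of a trailing parenthetical qualifier, or nil if none.
def trailing_qualifier(term)
  match = term.match(/\(([^()]+)\)\s*\z/)
  match ? match[1].strip : nil
end

parts = split_lcsh_subject('Horses--Massachusetts--Hadley--Hockanum')
qualifier = trailing_qualifier('Congregational Church in Halifax (Halifax, Mass.)')

puts parts.inspect
puts qualifier
```

Splitting is the easy part. Deciding which of the resulting pieces are places ("Massachusetts", "Hadley") and which are topics ("Horses") still requires a lookup against an authority such as TGN or Geonames.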
4
Getting TGN or Geonames entry
5
So Given Nothing More Than: “Faneuil Hall, Boston, Massachusetts”
6
North and Central America > United States > Massachusetts > Suffolk > Boston
42.35,-71.05
Faneuil Hall, Boston, Massachusetts
42.3600619,-71.056103
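One way to picture a result like this is as a small record holding the hierarchy from broadest to most specific place, plus a coordinate pair. The sketch below uses only the values shown on this slide. The `ParsedPlace` struct and its field names are assumptions made for illustration, not the structure Bplgeo actually returns.

```ruby
# Hypothetical result shape for illustration; NOT Bplgeo's actual return value.
ParsedPlace = Struct.new(:hierarchy, :coordinates, keyword_init: true) do
  # The most specific level of the hierarchy (the last entry).
  def most_specific
    hierarchy.last
  end

  # The hierarchy as a single readable string.
  def to_breadcrumb
    hierarchy.join(' > ')
  end
end

# Values copied from the slide.
boston = ParsedPlace.new(
  hierarchy: ['North and Central America', 'United States',
              'Massachusetts', 'Suffolk', 'Boston'],
  coordinates: [42.35, -71.05]
)

puts boston.to_breadcrumb
puts boston.most_specific
```

Keeping the hierarchy as an ordered list makes it straightforward to facet on any level (country, state, county, city) without re-parsing the original string.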
7
Geographic Parsing Caveat
8
International LCSH Subject
9
International LCSH Subject 2
10
Big Caveat Zero support for locations that have changed over time. For example:
11
Geonames
Geonames has bounding boxes!
Very specific locations (like Boston Public Library) are represented!
Potential linked data to other sources like LOC Subject Headings or Wikipedia!
But... specific places (like BPL) aren't associated with a city in the hierarchy.
Limited testing showed that roughly 25% of Geonames entries contain some piece of bad data. Even BPL itself is listed in the wrong county.
Linked data to other sources is only there if you manually add it.
12
Getty Thesaurus of Geographic Names (TGN)
TGN is stable and well-researched!
Very detailed hierarchy!
Rarely is anything more specific than a neighborhood represented.
Only point coordinate data is available for locations.
It cannot be edited to contain more information than it currently provides.
13
A Flower By Any Other Name...
Moved to projecthydra-labs per some email indications this would be alright?
http://github.com/projecthydra-labs/Bplgeo
Unsure on name now... but even better may be to integrate this into Questioning Authority?
Would need a developer sit-down to figure out a “parsing interface” for that gem.
14
Caveats
Should you attempt to use this on all text fields in your system just in case they contain geographic data? Short answer: no.
One must first review the source content to decide what is appropriate.
It still requires constant spot checking of results whenever new collections are processed.
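Spot checking can be as simple as pulling a random handful of parsed results from each new collection for a person to review. A minimal sketch, assuming the results are already in an array; the helper name `spot_check_sample` and its parameters are hypothetical, not something the gem provides.

```ruby
# Hypothetical helper for illustration; NOT part of Bplgeo.
# Returns up to `size` randomly chosen results for manual review.
# Passing a seed makes the selection repeatable.
def spot_check_sample(results, size: 25, seed: nil)
  rng = seed ? Random.new(seed) : Random.new
  results.sample(size, random: rng)
end

# Stand-in data: in practice these would be parsed records from a new collection.
records = (1..100).map { |i| "record-#{i}" }
to_review = spot_check_sample(records, size: 5, seed: 42)

puts to_review.length
```

A fixed seed is useful when a reviewer needs to come back to the same sample later. Without one, each run draws a fresh selection.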
15
Contacts / Final Links
Steven Anderson (sanderson@bpl.org)
Eben English (eenglish@bpl.org)
Github: http://github.com/projecthydra-labs/Bplgeo
http://github.com/boston-library/
Digital Repository: https://www.digitalcommonwealth.org