
1 The GeoParser

2 Overview
What is a geoparser?
– Software for the automated extraction of place names from text
Why would you want one?
– Document characterisation
– Explicit geocoding of metadata, making documents inherently geographically searchable
How?
– ‘Brute force’
– Rule based
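The two approaches named under "How?" can be contrasted with a minimal sketch. Everything here is an assumption made for illustration: the toy gazetteer, the cue-phrase rule and the function names are hypothetical and are not taken from the GeoParser itself.

```python
import re

# Hypothetical toy gazetteer; a real one would be far larger.
GAZETTEER = {"Cullen", "Banff", "Aberdeen", "Fordyce"}

def brute_force_places(text, gazetteer=GAZETTEER):
    """Brute force: test every gazetteer name against the text as a whole word."""
    found = []
    for name in sorted(gazetteer):
        if re.search(r"\b" + re.escape(name) + r"\b", text):
            found.append(name)
    return found

# Assumed rule: a capitalised word that follows a cue phrase is a candidate place name.
CUE_RULE = re.compile(r"\b(?:Parish of|County of|Burn of|town of)\s+([A-Z][a-z]+)")

def rule_based_places(text):
    """Rule based: return the names picked out by the cue-phrase rule, in text order."""
    return CUE_RULE.findall(text)

sample = "Parish of Cullen, County of Banff. It stands upon the bank of the Burn of Cullen."
print(brute_force_places(sample))  # ['Banff', 'Cullen']
print(rule_based_places(sample))   # ['Cullen', 'Banff', 'Cullen']
```

The brute-force version can only find names already in the gazetteer, while the rule-based version can propose names the gazetteer lacks, at the cost of false positives.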

3 Geo-spatial data
“data that have some form of spatial or geographic reference that enables them to be located in two- or three-dimensional space”
Statistical Account of Scotland
NUMBER XIII. PARISH OF CULLEN. (COUNTY OF BANFF, SYNOD OF ABERDEEN, PRESBYTERY OF FORDYCE.) By the Rev. Mr. ROBERT GRANT.
Royalty, Extent, Climate, etc.
CULLEN, as appears from old charters, was originally called Inverculan, because it stands upon the bank of the Burn of Cullen, which, at the N. end of the town, falls into the sea: but now it is known by the name of Cullen only. Cullen is a royal burgh, formerly a constabulary, of which the Earl of Findlater was hereditary constable. The set, as it is called, of the council, consists of 19, in which number are included the Earl of Findlater, hereditary preses, 3 bailies, a treasurer, a dean-of-guild, and 13 counsellors. The parish extends from the sea southward, about 2 English miles in length.

4 Geoparsing Flowline
Input document → Geoparse → Review → Output document

5 Geoparser architecture
Stages: 1. Inputs, 2. Geoparse, 3. Review, 4. Outputs
Components:
– Web Interface
– geoXwalk Database
– Text docs / web pages
– Parser: rule-based place name identification
– Results table / map preview
– Downloadable metadata record (xml, gml?)
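The four stages can be read as a chain of functions. The sketch below is one possible reading, not the actual implementation: the assignment of components to stages is an assumption, the dictionary stands in for the geoXwalk database, and the record ids are made up.

```python
import re

# Hypothetical stand-in for the geoXwalk database: place name -> made-up record id.
GAZETTEER = {"Cullen": "id-1", "Banff": "id-2"}

def read_input(raw):
    """Stage 1 (Inputs): normalise whitespace in a text document."""
    return " ".join(raw.split())

def geoparse(text):
    """Stage 2 (Geoparse): find gazetteer names that occur as whole words."""
    return [
        {"name": name, "gazetteer_id": gid}
        for name, gid in GAZETTEER.items()
        if re.search(r"\b" + re.escape(name) + r"\b", text)
    ]

def review(candidates, rejected_names):
    """Stage 3 (Review): drop the candidates the user has rejected."""
    return [c for c in candidates if c["name"] not in rejected_names]

def write_output(accepted):
    """Stage 4 (Outputs): produce a simple metadata record."""
    return {"places": [c["name"] for c in accepted]}

raw = "Cullen is a royal burgh\nin the county of Banff."
record = write_output(review(geoparse(read_input(raw)), rejected_names={"Banff"}))
print(record)  # {'places': ['Cullen']}
```

Keeping the review step as a separate function makes it easy to vary how much user intervention is required without touching the parser.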

6 Demonstration

7 Broad Issues
What’s a geoparser for?
– Geo-referencing tool for enhancing metadata?
– Text analysis tool?
Areas for improvement
– Need for more reliable geoparsing algorithms to disambiguate multiple occurrences of the same place name in the same text
– Need to develop automated feature typing
Degree of user intervention: how ‘semi’ should semi-automatic be?
– Interface design depends largely on the ‘accuracy’ of the parser and the user’s motivations

8 (An aside - Possible Solutions)
Implement a variety of parsing methods
– User selects depending on use, e.g. a context-based approach, or definitive place name matching against the gazetteer
Tools made available to the user depend on the type and number of documents and the intended use.
– Need to find a balance between text analysis and user interaction, e.g. a batch facility limited to certain document types and a user-selected parsing method, with minimal user intervention.
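A batch facility with a user-selected parsing method might look roughly like the sketch below. The method names, the cue words, the allowed document types and the sample documents are all hypothetical choices made for illustration.

```python
import re

GAZETTEER = {"Cullen", "Banff"}  # hypothetical toy gazetteer

def gazetteer_match(text):
    """Definitive matching: keep only capitalised words present in the gazetteer."""
    return [w for w in re.findall(r"[A-Z][a-z]+", text) if w in GAZETTEER]

def context_match(text):
    """Context-based matching: capitalised word after an assumed cue word."""
    return re.findall(r"\b(?:in|at|near)\s+([A-Z][a-z]+)", text)

# The user picks one of these method names.
PARSERS = {"gazetteer": gazetteer_match, "context": context_match}
BATCH_TYPES = {"txt", "html"}  # assumed document types allowed in batch mode

def batch_parse(docs, method):
    """Run the selected method over every document of an allowed type."""
    parse = PARSERS[method]
    return {
        name: parse(text)
        for name, (doc_type, text) in docs.items()
        if doc_type in BATCH_TYPES
    }

docs = {
    "a": ("txt", "He lived near Portsoy, not in Cullen."),
    "b": ("pdf", "A fair is held at Banff."),
}
print(batch_parse(docs, "gazetteer"))  # {'a': ['Cullen']}
print(batch_parse(docs, "context"))    # {'a': ['Portsoy', 'Cullen']}
```

Document "b" is skipped in both runs because its type is not in the batch list, which is how the sketch keeps user intervention minimal.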

9 Specific Issues
The distinction between parser-selected locations and gazetteer locations needs to be more explicit
– No. of occurrences in text following geo-referencing?
Users will be able to search the gazetteer and add records to output
Addition of ‘rogue’ place names to the gazetteer
– (Quality assurance issues)
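One way to make the parser/gazetteer distinction explicit is to carry a source label and an occurrence count on every result record. This is only a sketch: the record layout, the "of" cue rule and the one-entry gazetteer are assumptions, not the GeoParser's actual data model.

```python
import re
from collections import Counter
from dataclasses import dataclass

@dataclass
class PlaceRecord:
    name: str
    source: str       # "gazetteer", "parser" or "user"
    occurrences: int  # how often the name was picked out of the text

GAZETTEER = {"Cullen"}  # hypothetical toy gazetteer

def build_records(text):
    """Count candidate names (capitalised word after 'of') and label their source."""
    counts = Counter(re.findall(r"\bof\s+([A-Z][a-z]+)", text))
    return [
        PlaceRecord(name, "gazetteer" if name in GAZETTEER else "parser", n)
        for name, n in counts.items()
    ]

text = "the Burn of Cullen, the Earl of Findlater, the town of Cullen"
records = build_records(text)

# A record the user found by searching the gazetteer and added to the output by hand.
records.append(PlaceRecord("Banff", "user", 0))

for r in records:
    print(r.name, r.source, r.occurrences)
```

Names labelled "parser" are the candidates for 'rogue' additions to the gazetteer, so a quality assurance step could be restricted to those records.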

10 Continued...
Implementation of sorting functions for the results table
Output options
– Currently: preview of results table
– Map view for geo-referenced place names
– File download: required formats (xml, gml?)
Original document marked up in html(?)
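Sorting and an XML download could be sketched as follows, using Python's standard xml.etree.ElementTree module. The rows, their counts and the element and attribute names are invented for illustration; the slides do not specify an output schema.

```python
import xml.etree.ElementTree as ET

# Made-up results table rows.
rows = [
    {"name": "Cullen", "occurrences": 5},
    {"name": "Banff", "occurrences": 1},
    {"name": "Aberdeen", "occurrences": 1},
]

def sort_rows(rows, key, descending=False):
    """Return a new list of rows sorted by the chosen column."""
    return sorted(rows, key=lambda r: r[key], reverse=descending)

def to_xml(rows):
    """Serialise the rows as a small XML document (element names are assumptions)."""
    root = ET.Element("places")
    for r in rows:
        place = ET.SubElement(root, "place", occurrences=str(r["occurrences"]))
        place.text = r["name"]
    return ET.tostring(root, encoding="unicode")

by_name = sort_rows(rows, "name")
print([r["name"] for r in by_name])  # ['Aberdeen', 'Banff', 'Cullen']
print(to_xml(by_name))
```

Because sorted() is stable, sorting by name first and then by occurrences keeps names in alphabetical order within each count.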


