Geocoding Web Service Example Nancy Read Metropolitan Mosquito Control District
What is Geocoding? Address string → Location coordinates 2099 University Ave. W ° St. Paul, MN ° Coordinates used to put point on map:
How we started... MMCD web map site Address look-up
Searches Parcels Gives Choices
Problems Engine, method – Spelling or order errors (West 5th vs 5th West) Data – Some parcels not addressed Maintenance – Load and process updates of parcel layer
Identified Need Robust geocoding engine Usable with web site (service) Cascading data sets – address points (when available) – parcels – streets (interpolation) Host for service, data Data maintenance plan
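The cascading idea above can be sketched as follows: try the most precise layer first (address points), fall back to parcel centroids, then interpolate along a street segment's address range. This is only a minimal illustration, not the project's implementation. The layer contents, coordinates, and function names (`geocode`, `interpolate`) are all hypothetical.

```python
from typing import Optional, Tuple

LatLon = Tuple[float, float]

# Hypothetical reference layers; every address and coordinate is made up.
ADDRESS_POINTS = {"100 MAIN ST": (45.0000, -93.0000)}
PARCEL_CENTROIDS = {"200 MAIN ST": (45.0010, -93.0000)}
# Street segment: (name, low house number, high house number, start, end)
STREET_SEGMENTS = [("MAIN ST", 300, 398, (45.0020, -93.0000), (45.0030, -93.0000))]


def interpolate(number: int, low: int, high: int, start: LatLon, end: LatLon) -> LatLon:
    """Place a house number proportionally along a segment's address range."""
    fraction = 0.5 if high == low else (number - low) / (high - low)
    return (start[0] + fraction * (end[0] - start[0]),
            start[1] + fraction * (end[1] - start[1]))


def geocode(address: str) -> Optional[Tuple[str, LatLon]]:
    """Try each layer from most to least precise; return (layer, location) or None."""
    key = address.upper().strip()
    if key in ADDRESS_POINTS:
        return "address_point", ADDRESS_POINTS[key]
    if key in PARCEL_CENTROIDS:
        return "parcel", PARCEL_CENTROIDS[key]
    # Last resort: split "349 MAIN ST" into house number and street name.
    number_text, _, street = key.partition(" ")
    if number_text.isdigit():
        number = int(number_text)
        for name, low, high, start, end in STREET_SEGMENTS:
            if name == street and low <= number <= high:
                return "street", interpolate(number, low, high, start, end)
    return None


if __name__ == "__main__":
    for query in ("100 Main St", "200 Main St", "349 Main St", "999 Elm St"):
        print(query, "->", geocode(query))
```

Returning the layer name along with the location lets a caller judge how precise the match is: an interpolated street position is an estimate, while an address point is a surveyed location.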
Internal resources available? MMCD's IT Department (note: ½ people!) Need to contract out
Who else has need? solution?
Who needs a Geocoder? Web map applications – MapQuest – Local Government – King Maps Other applications – Public Safety Batch – State Government – Businesses
Find others with need, skills, interest Dave Bitner (MAC) Jim Maxwell (TLG) Mark Kotz (Met.Co.) Gordy Chinander (MESB) Bob Basquez (St. Paul) Chris Cialek & Jim Dickerson (LMIC) Kent Treichel (MN Dept. of Revenue) Nancy Read (MMCD)- Project Manager
Find resources Data – Managers of streets, parcels involved Hosting – MN Land Management Info. Center (LMIC) Engine, web service – Possible MetroGIS project funding
MetroGIS Geocoder Project Identified need Applied for MetroGIS project funding, received $14,000 Defined requirements RFP
Existing Geocoding Services Proprietary ($$) – ESRI – Envinsa – PxPoint Online Open Source – Geocoder.us – uses Tiger data ersity+Ave.+W.%2C+St.+Paul%2C+MN ersity+Ave.+W.%2C+St.+Paul%2C+MN+55104
PAGC Geocoder engine Used on large research tasks (e.g., geocoding all fast food retailers in Canada) Matching routines very good, better than others commonly available Handles the tough ones others can't, doesn't cascade too quickly to general layers like postal code Can handle problems with street ranges
PAGC geocoding engine Address Matching Algorithms Match addresses with reference address-ranged street network shapefile Starts with a rule-based, Aho-Corasick-driven standardization of both data sources. Reference data indexed using – BerkeleyDB b-trees for exact key lookups and soundex lookups – pointerless trie indexing scheme for edit distance lookups, adapted from ideas of Shang and Merrett. Match data records using standard Fellegi-Sunter method with modifications to permit similarity measures.
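To make the last step concrete, here is a minimal sketch of Fellegi-Sunter scoring in which an edit-distance similarity replaces the usual agree/disagree decision. It is not PAGC's code: the linear interpolation between disagreement and agreement weights is one common way to admit similarity measures and is an assumption here, as are the field names and the m/u probabilities.

```python
import math


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance by dynamic programming, one row at a time."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def similarity(a: str, b: str) -> float:
    """Scale edit distance to [0, 1], where 1.0 means identical strings."""
    longest = max(len(a), len(b))
    return 1.0 if longest == 0 else 1.0 - edit_distance(a, b) / longest


# Illustrative (assumed) probabilities per field:
# m = P(field agrees | true match), u = P(field agrees | non-match).
FIELD_PROBABILITIES = {
    "street_name": (0.95, 0.01),
    "suffix_type": (0.90, 0.20),
    "city": (0.98, 0.10),
}


def field_weight(m: float, u: float, sim: float) -> float:
    """Interpolate between the classic disagreement and agreement weights."""
    agree = math.log2(m / u)
    disagree = math.log2((1 - m) / (1 - u))
    return disagree + sim * (agree - disagree)


def match_score(candidate: dict, reference: dict) -> float:
    """Sum the per-field weights; a higher total means a more likely match."""
    return sum(field_weight(m, u, similarity(candidate[field], reference[field]))
               for field, (m, u) in FIELD_PROBABILITIES.items())


if __name__ == "__main__":
    reference = {"street_name": "UNIVERSITY", "suffix_type": "AVE", "city": "ST PAUL"}
    typo = dict(reference, street_name="UNIVERSTIY")
    other = dict(reference, street_name="SNELLING")
    for label, record in (("exact", reference), ("typo", typo), ("other", other)):
        print(label, round(match_score(record, reference), 2))
```

With similarity 1.0 the weight equals the classic agreement weight log2(m/u), and with similarity 0.0 it equals the disagreement weight, so a misspelled street name lowers the score without discarding the candidate outright.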
PAGC Geocoding engine Written in ANSI C Open Source, freely distributable (LGPL – Lesser General Public License) Has supporting web site, documentation and community Can be used on different resource data files (not just TIGER or proprietary)
MetroGIS Geocoder Project Rework PAGC from command line to Service – Pre-process resource street or other data files to build index files with standardized addresses, location data – Handle concurrent requests – Run under Apache, on Linux or Windows – Input – URL-encoded – Output – XML, JSON, CSV
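The input and output formats listed above might look like this from a client's point of view. This is only a sketch: the endpoint `http://example.org/geocode`, the parameter names `address` and `format`, and the result fields are hypothetical placeholders, not the project's actual interface.

```python
import csv
import io
import json
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

BASE_URL = "http://example.org/geocode"  # hypothetical endpoint


def build_request_url(address: str, output_format: str = "json") -> str:
    """URL-encode the address and requested format into a query string."""
    return BASE_URL + "?" + urlencode({"address": address, "format": output_format})


def format_result(result: dict, output_format: str) -> str:
    """Serialize one geocoding result as JSON, CSV, or XML."""
    if output_format == "json":
        return json.dumps(result, sort_keys=True)
    if output_format == "csv":
        buffer = io.StringIO()
        writer = csv.DictWriter(buffer, fieldnames=sorted(result), lineterminator="\n")
        writer.writeheader()
        writer.writerow(result)
        return buffer.getvalue()
    if output_format == "xml":
        root = ET.Element("result")
        for key in sorted(result):
            ET.SubElement(root, key).text = str(result[key])
        return ET.tostring(root, encoding="unicode")
    raise ValueError("unsupported format: " + output_format)


if __name__ == "__main__":
    print(build_request_url("2099 University Ave. W, St. Paul, MN"))
    sample = {"lat": 45.0, "lon": -93.0, "match": "street"}  # made-up result
    for fmt in ("json", "csv", "xml"):
        print(format_result(sample, fmt))
```

`urlencode` turns spaces into `+` and commas into `%2C`, which is the encoding a browser form submission produces.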
MetroGIS Geocoder Project Use applicable standards for parameter names, structures – OGC's OpenLS Location Utility Service specification, supported by ArcWeb and Oracle (XML-based) – FGDC Street Address Data Standard for structuring data (note: splits more than OpenLS)
MetroGIS Geocoder Project Geocoder engine returns location: – Latitude / Longitude – Decimal Degrees Conversion to UTM or state plane etc. to be done by downstream utility
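As a small illustration of the first thing a downstream conversion utility needs, the sketch below picks the UTM zone for a decimal-degree location. It uses the standard 6-degree zone rule and ignores the special zones around Norway and Svalbard. The full projection to easting and northing is left to a projection library and is not shown. The sample longitude is an approximate, assumed value for the St. Paul area.

```python
import math


def utm_zone(longitude: float) -> int:
    """Standard UTM zone number (1-60) for a longitude in decimal degrees."""
    if not -180.0 <= longitude <= 180.0:
        raise ValueError("longitude out of range")
    # Zones are 6 degrees wide starting at 180 W; longitude 180 E folds into zone 60.
    return min(int(math.floor((longitude + 180.0) / 6.0)) + 1, 60)


def utm_central_meridian(zone: int) -> int:
    """Central meridian of a UTM zone, in degrees."""
    return zone * 6 - 183


def hemisphere(latitude: float) -> str:
    """UTM hemisphere designator: N at or above the equator, S below."""
    return "N" if latitude >= 0 else "S"


if __name__ == "__main__":
    latitude, longitude = 44.96, -93.19  # approximate, assumed values
    zone = utm_zone(longitude)
    print(str(zone) + hemisphere(latitude), "central meridian", utm_central_meridian(zone))
```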
Capabilities Cascading (points, parcels, streets) Geocode to Intersections – will be part of this project – e.g., University&Snelling Not included, to be added later: Landmarks Reverse Geocoding (Location → Address)
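One way a service could tell an intersection query such as University&Snelling from an ordinary address is sketched below. The `&` separator comes from the example above. The function name `parse_query` and the returned dictionary layout are hypothetical.

```python
def parse_query(text: str) -> dict:
    """Classify a query as an intersection ("A&B") or a plain address."""
    if "&" in text:
        first, second = (part.strip().upper() for part in text.split("&", 1))
        if first and second:
            # Sort the names so "A&B" and "B&A" describe the same intersection.
            return {"type": "intersection", "streets": sorted((first, second))}
    return {"type": "address", "address": text.strip().upper()}


if __name__ == "__main__":
    print(parse_query("University&Snelling"))
    print(parse_query("2099 University Ave. W"))
```

Sorting the two street names gives a single lookup key per intersection regardless of the order the user typed them in.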
MetroGIS Geocoder Project Set up service with Metro data – TLG Streets – 7-county Parcel Layer (centroids) – Address Points (as available) Host – LMIC – TLG – Can also install locally – need to pre-process data Expect demo by April 1
Testing planned Engine – Accuracy and hit rate – Performance and speed – Reliability as web service Hosting – Ease of set-up Data – Maintenance
Challenges Legal agreements – funding – licensing (engine, data) Code maintenance – Open Source may help Possible hosting issues if service is popular