1
Testing the spatial adjacency match of the Intiendo address matching tool for geocoding of addresses with misleading suburb or place names
by Serena Coetzee (scoetzee@cs.up.ac.za) and Magnus Rademeyer (magnus@afrigis.co.za)
presented at the ICC 2009, Santiago, Chile, November 2009
2
Overview
Why Geocode?
The Address Lifecycle
Problem statement
Address matching with a spatial adjacency match
Test runs
Results
Conclusion
3
Why Geocode?
We geocode addresses to link attribute data to physical positions for purposes such as logistics, governance (elections, rates and taxes), customer database analysis (risk, trade area analytics) and many more.
4
The Address Lifecycle
5
Problem statement
Alphanumeric matching: 101 Rubida Street, Murrayfield incorrectly matched to 110 Rubida Street, Murrayfield
6
Problem statement
Alphanumeric matching by itself can cause errors (previous slide)
Potential solution: attribute relaxation (i.e. ignore suburb)
Most common cause of errors (Goldberg et al. 2007)
7
With spatial adjacency match
Intiendo = alphanumeric matching + spatial adjacency match
Improves geocoding results
Alphanumeric match: propose a matched address from the reference dataset
Above threshold?
Yes: the proposed matched address is an acceptable result
No: search for the street number in a radius around the proposed address
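The decision flow on this slide can be sketched in Python. This is a minimal illustration under stated assumptions, not Intiendo's implementation: the `Candidate` record, the `THRESHOLD` and `RADIUS_M` values, the planar distance, the requirement that the street name also agrees, and the closest-candidate tie-break are all hypothetical, since the slides do not specify them. The coordinates and the second suburb name in the usage lines are made up.

```python
from dataclasses import dataclass
from math import hypot

# Hypothetical values: the slides state neither the threshold nor the radius.
THRESHOLD = 0.97
RADIUS_M = 500.0


@dataclass(frozen=True)
class Candidate:
    street_number: str
    street_name: str
    suburb: str
    x: float            # projected coordinates in metres (assumed)
    y: float
    score: float = 0.0  # alphanumeric match score in [0, 1] (assumed scale)


def spatial_adjacency_match(input_number, input_street, proposed, reference,
                            threshold=THRESHOLD, radius_m=RADIUS_M):
    """Return the accepted reference address, or None if nothing qualifies."""
    # Above threshold? Yes: the proposed address is an acceptable result.
    if proposed.score >= threshold:
        return proposed

    def distance(c):
        return hypot(c.x - proposed.x, c.y - proposed.y)

    # No: search for the input street number in a radius around the proposal.
    # Requiring the same street name is an assumption of this sketch.
    nearby = [c for c in reference
              if c.street_number == input_number
              and c.street_name == input_street
              and distance(c) <= radius_m]
    # Assumed tie-break: prefer the candidate closest to the proposal.
    return min(nearby, key=distance, default=None)


# Usage with made-up coordinates; "Neighbouring suburb" is a placeholder.
reference = [
    Candidate("110", "Rubida Street", "Murrayfield", 0.0, 0.0),
    Candidate("101", "Rubida Street", "Neighbouring suburb", 120.0, 50.0),
]
proposed = Candidate("110", "Rubida Street", "Murrayfield", 0.0, 0.0, score=0.9)
result = spatial_adjacency_match("101", "Rubida Street", proposed, reference)
print(result.street_number, result.street_name, result.suburb)
```

The suburb is deliberately left out of the radius search: a misleading suburb name is the case the spatial step is meant to recover from.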
8
With spatial adjacency match
(Flowchart; only the branch labels YES and NO are recoverable)
9
With spatial adjacency match
10
With spatial adjacency match
1. Geocode without SpatialAdjacencyMatch (Non-spatial run)
2. Geocode with SpatialAdjacencyMatch enabled (Spatial run)
Compare results
11
With spatial adjacency match
Sample input address data
14,670 address records
The test targets misleading names, therefore only addresses for which province, suburb, street name and street number are populated are included

Province  Town          Suburb          Street Name      Street Number
Gauteng   Johannesburg  Saxonwold       Engelwold Road   19
Gauteng   Pretoria      Atteridgeville  Sekukuni Street  104
Gauteng   Midrand       Noordwyk        Sagewood Avenue  637
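The inclusion rule on this slide (keep only records in which province, suburb, street name and street number are populated) could be applied as a simple filter. The sketch below is illustrative only: the dictionary layout and the field names are assumptions, since the slides do not describe the input file format, and the second record has its suburb blanked purely to show a rejected row.

```python
# Field names are hypothetical; town is not required by the rule on the slide.
REQUIRED_FIELDS = ("province", "suburb", "street_name", "street_number")


def is_fully_populated(record, required=REQUIRED_FIELDS):
    """True if every required field is present and not blank."""
    return all(str(record.get(field) or "").strip() for field in required)


records = [
    {"province": "Gauteng", "town": "Johannesburg", "suburb": "Saxonwold",
     "street_name": "Engelwold Road", "street_number": "19"},
    # Suburb blanked for the example, so this record is excluded.
    {"province": "Gauteng", "town": "Pretoria", "suburb": "  ",
     "street_name": "Sekukuni Street", "street_number": "104"},
]

sample = [r for r in records if is_fully_populated(r)]
print(len(sample))
```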
12
With spatial adjacency match
Intiendo hierarchy database
Reference dataset: AfriGIS address data
(Hierarchy diagram; labels: Gauteng; Johannesburg; Braamfontein; Sandton; Pretoria; Die Wilgers; Rubida Street; 101, 110, 111, 112; Lynnwood Road; Waterkloof)
13
Test runs
Intiendo settings
14
Results
15
Results

                              Spatial run   Non-spatial run
Customer address records      14,670 (same input for both runs)
Matched address records       8,905 (61%)   8,514 (58%)
Non-matched address records   5,765 (39%)   6,156 (42%)

An improvement of 3 percentage points is low, but the improvement on bigger address sets can be significant (next slide), e.g. for addresses on different sides of a highway
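The percentages in the results table can be reproduced from the record counts. The short Python check below uses only the counts given on this slide; the variable and function names are illustrative.

```python
TOTAL_RECORDS = 14_670
MATCHED = {"spatial": 8_905, "non_spatial": 8_514}


def percent(part, whole):
    """Percentage rounded to a whole number, as on the slide."""
    return round(100 * part / whole)


for run, matched in MATCHED.items():
    unmatched = TOTAL_RECORDS - matched
    print(run, matched, f"{percent(matched, TOTAL_RECORDS)}%",
          unmatched, f"{percent(unmatched, TOTAL_RECORDS)}%")

# Records matched only when the spatial adjacency match is enabled.
extra_matched = MATCHED["spatial"] - MATCHED["non_spatial"]
print(extra_matched)
```

The spatial run matches 391 more records than the non-spatial run. The 3% quoted on the slide is the difference between the two rounded match rates (61% and 58%).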
16
Results
Subsequent real-life implementations on bigger datasets have yielded significantly improved results. In a dataset recently analysed for a major credit bureau, 21 million records were examined. Without spatial adjacency, 3.87 million were successfully geocoded automatically; with spatial adjacency on, an additional 0.95 million were geocoded, for a total of 4.82 million. Thus the spatial adjacency match yielded a 24.5% improvement.
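The 24.5% figure is a relative improvement: the additional geocoded records divided by the number geocoded without spatial adjacency. A small Python check, using the rounded figures from the slide (the exact counts are not given, and the names are illustrative):

```python
# Rounded figures from the slide, expressed as record counts.
EXAMINED = 21_000_000
GEOCODED_WITHOUT = 3_870_000
ADDITIONAL_WITH = 950_000

geocoded_with = GEOCODED_WITHOUT + ADDITIONAL_WITH

# Relative improvement over the run without spatial adjacency.
relative_gain = round(100 * ADDITIONAL_WITH / GEOCODED_WITHOUT, 1)

# Share of all examined records geocoded in each case.
rate_without = round(100 * GEOCODED_WITHOUT / EXAMINED, 1)
rate_with = round(100 * geocoded_with / EXAMINED, 1)

print(geocoded_with, relative_gain, rate_without, rate_with)
```

Measured against all 21 million examined records, the automatic geocoding rate rises from about 18.4% to about 23.0%, so the 24.5% relative gain should not be compared directly with a difference in match rates.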
17
Results
Specific example

#  Source  Province        Town             Suburb              Street Name              Street Number
1  Input   Gauteng         Alberton         New Redruth         Voortrekker Road         16
2  NSR     Gauteng (100%)  Alberton (100%)  New Redruth (100%)  Voortrekker Road (100%)  35 (96%)
3  SR      Gauteng (100%)  Alberton (100%)  South Crest (44%)   Voortrekker Road (100%)  16 (100%)

NSR = non-spatial run; SR = spatial run
18
Results
(Figure; labels: 35 Voortrekker Road; 16 Voortrekker Road)
19
Results
If there are misleading suburb names in addresses, an alphanumeric match by itself can cause errors.
Intiendo = alphanumeric + spatial adjacency match
More input addresses are matched more accurately
Improves quality of results
Sample test runs: 3% improvement
Real-life example: 24.5% improvement
20
Conclusion
Intiendo address matching = alphanumeric string matching + spatial adjacency match
Improves quality of results
More addresses matched more accurately
This work: specific sample dataset showed improvement
Future: more tests to understand average percentage improvement
21
Acknowledgements
Christopher Ueckermann from AfriGIS for running the geocoding tests with Intiendo