1
2001 Census: the emergence of a new geographical framework
David Martin
Department of Geography, University of Southampton
2
Overview
Background issues
Postcode building blocks
Output areas by automated zone design
  – Zone design experiments
  – Illustrative results
  – Demonstrator project
Application to SAM specification
A new project…
Conclusions
3
Background issues
1991: EDs (enumeration districts) designed for data collection, but used for both data collection and output
2001: separation of collection and output geographies, giving purpose-specific geographies
New output areas built from synthetic unit postcode polygons
Application of automated zone design (after Openshaw, 1977)
4
Postcode building blocks
Approx. 1.7m unit postcodes
Aggregation of these small building blocks into output areas (OAs) ensures best census-postal geography match
No pre-existing polygons (except Scotland)
NISRA to digitize, ONS to generate
OS to create separate new product!
5
Generation of postcode polygons (1) Thiessen polygons around individual ADDRESS-POINTS, clipped to statutory boundaries and topographic features
6
Generation of postcode polygons (2) Boundaries dissolved between adjacent address polygons with common postcode, to form postcode polygons
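A minimal sketch of the two generation steps on a raster grid, assuming hypothetical address coordinates and postcode labels. Each grid cell takes the postcode of its nearest address, which is a discrete stand-in for building Thiessen polygons around addresses and then dissolving the boundaries between cells that share a postcode. Clipping to statutory boundaries and topographic features is not modelled, and this is not the production procedure.

```python
def postcode_grid(addresses, width, height):
    """Label each grid cell with the postcode of its nearest address.

    addresses: list of (x, y, postcode) tuples (hypothetical sample data).
    Returns grid[y][x] = postcode. Cells sharing a label form one
    'postcode polygon', i.e. the Thiessen cells are already dissolved.
    """
    grid = []
    for y in range(height):
        row = []
        for x in range(width):
            # Nearest address by squared Euclidean distance
            # (ties go to the address listed first).
            nearest = min(
                addresses,
                key=lambda a: (a[0] - x) ** 2 + (a[1] - y) ** 2,
            )
            row.append(nearest[2])
        grid.append(row)
    return grid
```

Two addresses with the same postcode produce adjacent cells with the same label, so no boundary survives between them, which mirrors the dissolve step.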
7
OA design methodology
Automated zoning procedures derived from Openshaw (1977)…
Variety of alternative approaches
Computationally intensive, iterative search for ‘best’ solution to the zoning problem, given a set of constraints
Not feasible in previous data and computing environments
8
Output areas by automated zone design
Flow diagram: Initial Random Aggregation of Building Blocks → Iterative Recombination, subject to Design Constraints (Contiguity, Thresholds, Shape, Size, Homogeneity) → 2001 Output Areas
9
OA design (1) Initial random aggregation of postcodes into potential output areas
10
OA design (2) Choose one postcode at random as candidate for swapping into a different output area
11
OA design (3) Make the swap and evaluate the impact on the overall solution
12
OA design (4) If the swap does not result in an improvement, go back to the previous configuration
13
OA design (5) Choose another postcode at random as candidate for swapping into another output area
14
OA design (6) If the swap results in an overall improvement, keep it as part of the solution and examine a new potential swap…
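The six steps above can be sketched as a simple swap-and-keep-if-better loop. This is an illustration under stated assumptions, not the ONS implementation: the only objective is the squared deviation of zone populations from a target, contiguity is the only hard constraint, and the function names and data layout (`assignment`, `pops`, `adjacency`) are hypothetical.

```python
import random


def total_cost(assignment, pops, target):
    """Sum over zones of (zone population - target) squared."""
    totals = {}
    for block, zone in assignment.items():
        totals[zone] = totals.get(zone, 0) + pops[block]
    return sum((p - target) ** 2 for p in totals.values())


def is_contiguous(zone_blocks, adjacency):
    """True if the blocks form one connected, non-empty piece."""
    zone_blocks = set(zone_blocks)
    if not zone_blocks:
        return False
    start = next(iter(zone_blocks))
    seen, stack = {start}, [start]
    while stack:
        for nb in adjacency[stack.pop()]:
            if nb in zone_blocks and nb not in seen:
                seen.add(nb)
                stack.append(nb)
    return seen == zone_blocks


def swap_search(assignment, pops, adjacency, target, iterations=1000, seed=0):
    """Randomly try moving one block to a neighbouring zone; keep improvements."""
    rng = random.Random(seed)
    assignment = dict(assignment)
    best = total_cost(assignment, pops, target)
    blocks = sorted(assignment)
    for _ in range(iterations):
        block = rng.choice(blocks)
        old_zone = assignment[block]
        # Candidate destinations: zones of adjacent blocks, so the
        # receiving zone stays contiguous after the move.
        options = sorted({assignment[nb] for nb in adjacency[block]} - {old_zone})
        if not options:
            continue
        new_zone = rng.choice(options)
        donor_rest = [b for b, z in assignment.items() if z == old_zone and b != block]
        if not is_contiguous(donor_rest, adjacency):
            continue  # move would empty or split the donor zone
        assignment[block] = new_zone
        cost = total_cost(assignment, pops, target)
        if cost < best:
            best = cost  # keep the swap
        else:
            assignment[block] = old_zone  # revert to previous configuration
    return assignment, best
```

Thresholds, shape and homogeneity would enter either as further rejection tests or as extra terms in `total_cost`; how they were actually weighted is not stated on the slides.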
15
Constraints (1)
Contiguity: output areas from adjacent postcodes (NB problem of stacks)
Thresholds: output areas above population thresholds (NB problem of sub-threshold parishes)
Shape: output areas should be as compact as possible
  – minimize perimeter²/area
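As a worked example of the shape measure, the sketch below computes perimeter squared over area for a simple polygon given as a list of (x, y) vertices. The function name and the sample polygons are illustrative assumptions; lower values mean more compact shapes, with a circle giving the theoretical minimum of 4π.

```python
import math


def compactness(vertices):
    """Return perimeter**2 / area for a simple (non-self-intersecting) polygon."""
    n = len(vertices)
    twice_area = 0.0
    perimeter = 0.0
    for i in range(n):
        x1, y1 = vertices[i]
        x2, y2 = vertices[(i + 1) % n]
        twice_area += x1 * y2 - x2 * y1          # shoelace formula term
        perimeter += math.hypot(x2 - x1, y2 - y1)  # edge length
    area = abs(twice_area) / 2.0
    return perimeter ** 2 / area


square = [(0, 0), (1, 0), (1, 1), (0, 1)]
strip = [(0, 0), (4, 0), (4, 1), (0, 1)]
```

The unit square scores 16 and the 4 × 1 strip scores 25, so a zoning procedure minimizing this measure would prefer the square.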
16
Constraints (2)
Size: output areas should be as uniformly sized as possible, avoiding very large and very small populations
  – minimize (OApop − target)²
Homogeneity: output areas should be as socially uniform as possible
  – existing ONS tenure-based measure
  – maximize intra-area correlations
17
Intra-area correlation
Measures similarity of values within any area of interest (Holt et al., 1996; Tranmer and Steel, 1998)
Higher correlation: greater homogeneity (theoretical maximum of 1.0)
Can be computed for a single category (e.g. ‘owner occupied’) or for multi-category variables
Tenure and dwelling type tested in project
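To make the idea concrete, here is a sketch of one standard one-way ANOVA estimator of an intraclass correlation, applied to a single 0/1 category such as 'owner occupied'. It is an assumption that this resembles the measure used in the project; the slides do not give the estimator, and the function name and input layout are hypothetical.

```python
def intra_area_correlation(areas):
    """ANOVA-style intraclass correlation for a 0/1 indicator.

    areas: list of lists, one inner list of 0/1 values per area
    (needs at least two areas and more observations than areas).
    """
    k = len(areas)
    total_n = sum(len(a) for a in areas)
    grand_mean = sum(sum(a) for a in areas) / total_n
    means = [sum(a) / len(a) for a in areas]
    # Between-area and within-area sums of squares.
    ssb = sum(len(a) * (m - grand_mean) ** 2 for a, m in zip(areas, means))
    ssw = sum((v - m) ** 2 for a, m in zip(areas, means) for v in a)
    msb = ssb / (k - 1)
    msw = ssw / (total_n - k)
    # Effective area size, which equals the common size when all areas match.
    n0 = (total_n - sum(len(a) ** 2 for a in areas) / total_n) / (k - 1)
    return (msb - msw) / (msb + (n0 - 1) * msw)
```

Areas that are internally uniform but differ from one another give a value of 1.0, the theoretical maximum noted above; areas that each mirror the overall mix give low or negative values.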
18
Zone design experiments
ONS postcode polygons for test areas
Populated with plausible synthetic populations by iterative sampling of SAR individuals (PCs structured by dwelling type)
Test OAs constructed using alternative combinations of design constraints: OApop only; OApop+shape; OApop+homog; OApop+shape+homog
19
Illustrative results (urban/rural)

          ED       OA        OA(SH)    PC
n         171      317       324       2529
x(s)      308      166(44)   162(54)   21
d(ten)    0.003    0.039     0.040     0.030
d(dwe)    0.003    0.153     0.163     0.388

n         142      394       392       3477
x(s)      427      154(37)   155(42)   17
d(ten)    0.006    0.028     0.033     0.011
d(dwe)    0.007    0.164     0.181     0.416
20
1991-style EDs (results table repeated from slide 19)
21
Unit postcodes (results table repeated from slide 19)
22
2001-style Output Areas (results table repeated from slide 19)
23
Project website http://www.geog.soton.ac.uk/research/oa2001/
24
Demonstrator data…
25
Application to SAM specification
Proposal for small area microdata (SAM): more spatial, less attribute detail than SARs
Use wards as building blocks, target SAM areas 7-10k population
Same procedures as for postcode to OA
Subsequent splitting of ‘superwards’
26
Hampshire wards
n = 235, mean = 5872, min = 996, max = 15684
Map labels: Portsmouth, Basingstoke, Southampton
27
Hampshire SAM areas @ 5k
n = 176, mean = 8230, min = 5035, max = 15684
28
Hampshire SAM areas @ 15k
n = 66, mean = 22835, min = 15170, max = 51368
29
A new project…
Problem of matching two sets of areal units:
  – 1991 ED data for 1981 EDs?
  – 2001 OA data for 1991 EDs?
Various approaches possible:
  – Individual-level data within Census Offices
  – Lookup table approximations
  – Areal interpolation (various)
Which is the best matching configuration?
30
A new project: automated zone matching
More general computational problem: given two boundary sets and some target zone characteristics, find the optimal match
Can be conceptualized as a modified AZP process (iterative, computationally intensive, general-purpose problem)
Automatic tool for cases where no lookup tables etc. are available
31
First boundary set
Take a familiar area: boundary set A, e.g. 1991 EDs
Zones shown: A1, A2, A3, A4
32
Second boundary set
For the same area: boundary set B, e.g. 2001 OAs
Zones shown: B1, B2, B3, B4, B5
33
Full intersection
Intersect A and B
Clean topology
Resulting pieces: A1B1, A2B1, A2B2, A2B3, A1B4, A3B1, A3B4, A4B5
34
Set up automated zone matching
Set up design criteria: equality of population size, area, density, etc.
Adjust weight for ancillary variable
Set one zone as source which must be maintained (e.g. that for which data are available)
Set up initial random aggregation incorporating true matches
Over to (modified) AZP…
35
Alternative solutions…
Solution 1: perfect match maintaining all zones complete, e.g. creation of census tracts
O1 = A1+A2+A3 = B1+B2+B3+B4
O2 = A4 = B5
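For this toy example, Solution 1 can be reproduced without any iterative search: treat every intersection piece as a link between one A zone and one B zone, and collect the connected groups. The sketch below does that with a small union-find. It is a shortcut for illustration, not the modified AZP procedure proposed in the slides, and the function name is hypothetical; the `pieces` list is taken from the full-intersection slide.

```python
def perfect_match_groups(pieces):
    """Group A and B zones into the smallest sets sharing an outer boundary.

    pieces: list of (a_zone, b_zone) pairs, one per intersection piece.
    Returns a sorted list of sorted zone-name lists.
    """
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for a, b in pieces:
        parent[find(a)] = find(b)  # zones that overlap must share a group

    groups = {}
    for zone in list(parent):
        groups.setdefault(find(zone), set()).add(zone)
    return sorted(sorted(g) for g in groups.values())


pieces = [
    ("A1", "B1"), ("A2", "B1"), ("A2", "B2"), ("A2", "B3"),
    ("A1", "B4"), ("A3", "B1"), ("A3", "B4"), ("A4", "B5"),
]
```

Calling `perfect_match_groups(pieces)` yields two groups, one containing A1, A2, A3 with B1 to B4 and one containing A4 with B5, which correspond to O1 and O2 above.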
36
Alternative solutions…
Solution 2: boundary set B unbroken, closest match to A, e.g. creation of lookup tables, local approximations
O1 = B1 ≈ A1
O2 = B2+B3 ≈ A2
O3 = B4 ≈ A3
O4 = A4 = B5
37
Conclusions
Major application of geographical technique developed 20+ years ago
Multiple purpose-specific geographies, generated from existing spatial data
Multiple applications of the same approach
  – Census output areas
  – SAM areas
  – Generic geography matching
38
Demonstrator RSS meeting: Nov 2000