Published by Melvyn O’Brien. Modified over 9 years ago.
1
Creating a geodemographic system and more: OAC and the way forward
Centre for Advanced Spatial Analysis, UCL, 10th December 2008
Dr Dan Vickers
Social and Spatial Inequalities Group, Department of Geography, University of Sheffield
2
What are you in for?
Creating the Output Area Classification (OAC)
Geodemographics over time
Measuring segregation with geodemographics
15/04/2017 © The University of Sheffield
3
Why classify areas?
“All the real knowledge which we possess depends on methods by which we distinguish the similar from the dissimilar. The greater number of natural distinctions this method comprehends the clearer becomes our idea of things. The more numerous the objects which employ our attention the more difficult it becomes to form such a method and the more necessary.” (Linnæus 1737)
5
The myths of geodemographics
The more data the better?
The smaller the areas the more meaningful?
These are distinct groups?
A true depiction of reality?
6
The Objective in Creating OAC
To provide a publicly accessible classification, free at the point of use for researchers, policy makers and the public, avoiding the expense of commercial classifications.
To provide a transparent and replicable methodology, with the input data as well as the classification output.
To stimulate value-added research using either the classification or the raw data.
7
The seven steps of cluster analysis
1. Clustering elements (objects to cluster, also known as “operational taxonomic units”)
2. Clustering variables (attributes of objects to be used)
3. Variable standardisation
4. Measure of association (proximity measure)
5. Clustering method
6. Number of clusters
7. Interpretation, testing and replication
Adapted from Milligan, G. W. (1996) Clustering validation: Results and implications for applied analyses. In Arabie, P., Hubert, L. J. and De Soete, G. (Eds.), Clustering and Classification. Singapore: World Scientific.
8
What went in
Output Areas (OAs) are the smallest areas for general census output. There are 223,060 in the UK.
England & Wales: 175,434 OAs. Minimum size: 40 households, 100 people. Mean size: 124 households, 297 people.
Scotland: 42,604 OAs. Minimum size: 20 households, 50 people. Mean size: 52 households, 119 people.
Northern Ireland: 5,022 OAs. Minimum size: 40 households, 100 people. Mean size: 125 households, 336 people.
9
What went in
41 Census variables covering:
Demographic attributes: including age, ethnicity, country of birth and population density.
Household composition: including living arrangements, family type and family size.
Housing characteristics: including tenure, type and size, and quality/overcrowding.
Socio-economic traits: including education, socio-economic class, car ownership and commuting, and health and care.
Employment attributes: including level of economic activity and employment class type.
10
Standardising the variables
Log transformation: reduces the effect of extreme values.
Range standardisation (0-1).
Problems will occur if there are differing scales or magnitudes among the variables. In general, variables with larger values and greater variation will have more impact on the final similarity measure. It is therefore necessary to make each variable equally represented in the distance measure by standardising the data.
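The two standardisation steps can be sketched as follows. This is a minimal illustration, not the OAC code: the use of log(1 + x), the function names and the sample values are all assumptions made for the example, since the slides do not state the exact log transform used.

```python
import math

def log_transform(values):
    """Apply log(1 + x) to damp extreme values (assumes every x >= 0)."""
    return [math.log1p(v) for v in values]

def range_standardise(values):
    """Rescale values linearly onto the 0-1 range."""
    lo, hi = min(values), max(values)
    if hi == lo:  # a constant variable carries no information
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]

# Hypothetical percentages for one census variable across five areas.
raw = [0.0, 2.0, 5.0, 10.0, 80.0]
standardised = range_standardise(log_transform(raw))
```

Without the log step, the single extreme value (80.0) would squeeze the other four areas into the bottom eighth of the 0-1 range; with it, they are spread more evenly.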
11
Clustering the data: K-means clustering
K-means is an iterative relocation algorithm based on an error sum of squares measure. The basic operation of the algorithm is to move a case from one cluster to another to see if the move would improve the sum of squared deviations within each cluster (Aldenderfer and Blashfield, 1984). The case will then be assigned/re-allocated to the cluster to which it brings the greatest improvement. The next iteration occurs when all the cases have been processed. A stable classification is therefore reached when no moves occur during a complete iteration of the data. After clustering is complete, it is then possible to examine the means of each cluster for each dimension (variable) in order to assess the distinctiveness of the clusters (Everitt et al., 2001).
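A small sketch of the idea, under stated assumptions: this is the common batch form of k-means (reassign every case, then recompute the centres), not the case-by-case relocation variant described above, and the data and starting centres are invented for illustration. It stops, as described, when a complete pass moves no case.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def mean_point(points):
    """Variable-by-variable mean of a list of points."""
    n = len(points)
    return [sum(column) / n for column in zip(*points)]

def kmeans(points, initial_centres, max_iter=100):
    """Batch k-means: reassign all cases, update centres, repeat until stable."""
    centres = [list(c) for c in initial_centres]  # do not mutate the input
    labels = [None] * len(points)
    for _ in range(max_iter):
        new_labels = [
            min(range(len(centres)), key=lambda k: squared_distance(p, centres[k]))
            for p in points
        ]
        if new_labels == labels:  # no case moved during a complete pass
            break
        labels = new_labels
        for k in range(len(centres)):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:  # keep the old centre if a cluster empties
                centres[k] = mean_point(members)
    return labels, centres

# Hypothetical standardised data: four areas, two variables.
points = [[0.0, 0.1], [0.1, 0.0], [0.9, 1.0], [1.0, 0.9]]
labels, centres = kmeans(points, initial_centres=[points[0], points[2]])
```

The returned centres are the cluster means for each variable, which is what would be examined to judge how distinctive each cluster is.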
12
Issues of cluster number selection
Issue 1: Analysis of average distance from cluster centres for each cluster number option. The ideal solution would be the number of clusters which gives the smallest average distance from the cluster centre across all clusters.
Issue 2: Analysis of cluster size homogeneity for each cluster number option. It would be useful, where possible, to have clusters of as similar a size as possible in terms of the number of members within each.
Issue 3: The number of clusters produced should be as close to the perceived ideal as possible. This means that the number of clusters needs to be of a size that is useful for further analysis.
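The first two criteria can be computed for any candidate solution. The sketch below is illustrative only: the function names, the min/max size ratio as a homogeneity measure, and the one-variable data are assumptions, not the measures used for OAC.

```python
import math
from collections import Counter

def average_distance(points, labels, centres):
    """Mean Euclidean distance from each case to its own cluster centre."""
    total = sum(math.dist(p, centres[lab]) for p, lab in zip(points, labels))
    return total / len(points)

def size_ratio(labels):
    """Smallest cluster size divided by the largest; 1.0 means equal sizes."""
    sizes = Counter(labels).values()
    return min(sizes) / max(sizes)

# Hypothetical one-variable data with two candidate solutions.
points = [[0.0], [1.0], [10.0], [11.0]]
two = {"labels": [0, 0, 1, 1], "centres": [[0.5], [10.5]]}
three = {"labels": [0, 0, 1, 2], "centres": [[0.5], [10.0], [11.0]]}

dist_two = average_distance(points, two["labels"], two["centres"])
dist_three = average_distance(points, three["labels"], three["centres"])
ratio_two = size_ratio(two["labels"])
ratio_three = size_ratio(three["labels"])
```

In this toy case the three-cluster option has the smaller average distance but less even cluster sizes. Average distance generally keeps falling as clusters are added, which is one reason to weigh it against the other two criteria rather than use it alone.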
13
Issues of cluster number selection
First level: target 6; 7 selected, based on analysis of average distance from cluster centre and size of each cluster.
Second level: target 20; 21 selected, based on analysis of average distance from cluster centre and size of each cluster.
Third level: target 50; 52 selected, based on size of each cluster. Split into either 2 or 3 groups.
14
Creating a hierarchy
15
1: BLUE COLLAR COMMUNITIES
1a: Terraced Blue Collar
1b: Younger Blue Collar
1c: Older Blue Collar
2: CITY LIVING
2a: Transient Communities
2b: Settled in the City
3: COUNTRYSIDE
3a: Village Life
3b: Agricultural
3c: Accessible Countryside
4: PROSPERING SUBURBS
4a: Prospering Younger Families
4b: Prospering Older Families
4c: Prospering Semis
4d: Thriving Suburbs
5: CONSTRAINED BY CIRCUMSTANCES
5a: Senior Communities
5b: Older Workers
5c: Public Housing
6: TYPICAL TRAITS
6a: Settled Households
6b: Least Divergent
6c: Young Families in Terraced Homes
6d: Aspiring Households
7: MULTICULTURAL
7a: Asian Communities
7b: Afro-Caribbean Communities
16
1: Blue Collar Communities
17
2: City Living
18
3: Countryside
19
4: Prospering Suburbs
20
5: Constrained by Circumstances
21
6: Typical Traits
22
7: Multicultural
23
Alternative Profiles
24
Mapping the Classification
25
Geodemographics over time
26
Variables
V01: Age 0-4
V02: Age 5-14
V03: Age 25-44
V04: Age 45-64
V06: Indian, Pakistani & Bangladeshi
V07: Black African, Black Caribbean & Black Other
V08: Born Outside the UK
V09: Unemployed
V10: Working part-time
V13: Economically inactive looking after family
V14: No central heating
V16: Rent (private)
V17: Rent (public)
V18: 2+ Car Households
V20: Flats
V21: Detached
V22: Terraced
V23: Lone parent household
V24: Single pensioner household
V25: Single person (not pensioner) household
V26: Population Density
27
Methodology
England only
1991 data assigned to 2001 Output Areas (Simpson 2002; Norman et al. 2003)
Clustered based on the 2001 data
1991 data assigned to centroids created from 2001 data
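The final step, assigning 1991 data to the centroids created from the 2001 data, amounts to a nearest-centroid lookup. The sketch below assumes the 1991 variables have been standardised on the same scale as the 2001 data (the slides do not say how this was handled); the centroid values and area identifiers are hypothetical.

```python
def nearest_centroid(case, centroids):
    """Return the index of the centroid closest to the case (squared Euclidean)."""
    distances = [
        sum((x - c) ** 2 for x, c in zip(case, centre))
        for centre in centroids
    ]
    return distances.index(min(distances))

# Hypothetical centroids from clustering the 2001 data (two variables).
centroids_2001 = [[0.2, 0.8], [0.7, 0.1]]

# Hypothetical 1991 values, already re-based onto 2001 Output Areas.
areas_1991 = {"OA_A": [0.25, 0.7], "OA_B": [0.6, 0.2], "OA_C": [0.1, 0.9]}

types_1991 = {
    oa: nearest_centroid(values, centroids_2001)
    for oa, values in areas_1991.items()
}
```

Because both years are classified against the same centroids, an area's 1991 and 2001 types are directly comparable.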
28
Variable Cluster 1 2 3 4 5 6 7
Age 0-4 .17 .24 .18 .31 .19 .26
Age 5-14 .13 .23 .29 .22 .20
Age 25-44 .47 .41 .30 .38 .34 .36 .33
Age 45-64 .50 .28 .43
Age 65+ .15 .10 .21
Indian, Pakistani & Bangladeshi .05 .03 .01 .44 .02 .06
Black African / Caribbean / Other .07 .00 .14 .16
Born Outside the UK .08 .39
Unemployed .12
Working part-time .32
Looking after family .25
No central heating .04
Rent (private)
Rent (public) .11 .67 .52
2+ Car Households .57
Flats .68 .77
Detached
Terraced .66
Lone parent household
Single pensioner household
Single person (not pensioner) household .09
Population Density

The clusters were named as follows:
1. Urban Melting Pot
2. Mixed Communities
3. Out in the Sticks
4. Asian Influence
5. Middle Class Achievers
6. Down and Out
7. Working Class Endeavour
29
Changing prevalence of types

| Type | 1991 Frequency | 1991 Percent | 2001 Frequency | 2001 Percent | Change Freq | Change % |
| 1. Urban Melting Pot | 11,350 | 6.9 | 15,392 | 9.3 | 4,042 | 2.4 |
| 2. Mixed Communities | 21,949 | 13.2 | 25,910 | 15.6 | 3,961 | 2.4 |
| 3. Out in the Sticks | 32,915 | 19.9 | 35,704 | 21.6 | 2,789 | 1.7 |
| 4. Asian Influence | 5,039 | 3.0 | 4,930 | 3.0 | -109 | |
| 5. Middle Class Achievers | 50,578 | 30.5 | 46,791 | 28.2 | -3,787 | -2.3 |
| 6. Down and Out | 8,257 | 5.0 | 10,878 | 6.6 | 2,621 | 1.6 |
| 7. Working Class Endeavour | 35,577 | 21.5 | 26,060 | 15.7 | -9,517 | -5.8 |
| Total | 165,665 | 100.0 | 165,665 | 100.0 | 0 | 0.0 |
30
Cluster results
Rows are 2001 types; columns are 1991 types.

| 2001 \ 1991 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | Total |
| 1: Urban Melting Pot | 9,200 | 1,318 | 195 | 467 | 2,245 | 1,116 | 851 | 15,392 |
| 2: Mixed Communities | 779 | 15,991 | 237 | 432 | 3,445 | 76 | 4,950 | 25,910 |
| 3: Out in the Sticks | 23 | 206 | 26,561 | 6 | 8,444 | 8 | 456 | 35,704 |
| 4: Asian Influence | 153 | 519 | 3 | 3,699 | 117 | 129 | 310 | 4,930 |
| 5: Middle Class Achievers | 482 | 2,628 | 5,658 | 103 | 33,044 | 27 | 4,849 | 46,791 |
| 6: Down and Out | 618 | 126 | 29 | 204 | 505 | 6,401 | 2,995 | 10,878 |
| 7: Working Class Endeavour | 95 | 1,161 | 232 | 128 | 2,778 | 500 | 21,166 | 26,060 |
| Total | 11,350 | 21,949 | 32,915 | 5,039 | 50,578 | 8,257 | 35,577 | 165,665 |
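A cross-tabulation of this kind can be built from paired 1991 and 2001 labels, and its diagonal gives the share of areas whose type did not change. The sketch below is illustrative: the function names and the six-area example are invented, while the diagonal counts and the total of 165,665 are taken from the slide.

```python
from collections import Counter

def cross_tabulate(types_2001, types_1991):
    """Count areas for each (2001 type, 1991 type) pair."""
    return Counter(zip(types_2001, types_1991))

def share_unchanged(table):
    """Proportion of areas with the same type in both years."""
    same = sum(n for (new, old), n in table.items() if new == old)
    return same / sum(table.values())

# Hypothetical labels for six areas.
types_1991 = [1, 1, 2, 3, 3, 7]
types_2001 = [1, 2, 2, 3, 5, 7]
table = cross_tabulate(types_2001, types_1991)
toy_share = share_unchanged(table)

# Diagonal counts (same type in 1991 and 2001) as given on the slide.
diagonal = [9200, 15991, 26561, 3699, 33044, 6401, 21166]
slide_share = sum(diagonal) / 165665
```

Applied to the slide's figures, this gives roughly 70% of Output Areas in England keeping the same type between 1991 and 2001.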
31
London Sheffield
32
Tyneside Surrey
33
Measuring Segregation with geodemographics
34
Segregation Index
Calculates the number of output areas that would need to be moved between Local Authority Districts to have an equal distribution of OAC types across the country.
35
Segregation index
The index of segregation, as used here, is the index of dissimilarity between a group and the whole of the population. This is the index we use to measure the degree to which a group is spatially polarised. The index of dissimilarity between two groups is half the sum, over all areas, of the absolute differences between each group's population divided by its respective total population. Expressed as a percentage, it ranges between 0 and near 100%.

$$D_{ab} = \frac{1}{2} \sum_{i=1}^{N} \left| \frac{P_{ia}}{P_{*a}} - \frac{P_{ib}}{P_{*b}} \right|$$

where $D_{ab}$ is the index between groups a and b (b in our case is the population as a whole), $P_{ia}$ is the population of group a in area i, * represents all areas (so $P_{*a}$ is the total population of group a), and N is the number of areas being considered.
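The verbal definition of the index translates directly into a few lines of code. This is a sketch under assumptions: the function name and the three-district counts are hypothetical, and the result is scaled to a percentage to match the 0 to near 100% range quoted above.

```python
def dissimilarity_index(group_counts, reference_counts):
    """Half the sum of absolute differences in area shares, as a percentage."""
    group_total = sum(group_counts)
    reference_total = sum(reference_counts)
    return 50.0 * sum(
        abs(g / group_total - r / reference_total)
        for g, r in zip(group_counts, reference_counts)
    )

# Hypothetical counts of Output Areas in three districts:
# areas of one OAC type, and all areas regardless of type.
one_type = [30, 10, 0]
all_areas = [40, 40, 20]

index = dissimilarity_index(one_type, all_areas)
evenly_spread = dissimilarity_index([4, 4, 2], all_areas)
```

A type distributed across districts in the same proportions as all areas scores 0; the more it concentrates in a few districts, the closer the index gets to 100.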
36
Most Segregated types
Blue Collar Communities 26.5%
City Living 48.8%
Countryside 53.3%
Prospering Suburbs 21.4%
Constrained by Circumstances 28.8%
Typical Traits 22.7%
Multicultural 67.7%
37
Where is most geodemographically segregated?
38
Segregated Places
Most
1 Newham 89.3
2 City of London 87.0
3 Hackney 82.7
4 Westminster
5 Tower Hamlets 82.6
6 Lambeth 82.5
7 Kensington and Chelsea 82.2
8 Camden 82.0
9 Hammersmith and Fulham
10 Islington 81.7
Least
425 Dacorum 16.3
426 Wellingborough 15.0
427 Welwyn Hatfield 14.8
428 Gravesham 14.5
429 Cardiff 14.3
430 Sheffield 13.8
431 Leeds 11.6
432 Peterborough 11.2
433 North Hertfordshire 10.5
434 Epping Forest 9.5
39
Thank you
D.Vickers@sheffield.ac.uk
Areaclassification.org.uk
Sheffield.ac.uk/SASI