Published by Amber Rodgers. Modified over 9 years ago.
1
THE DEVELOPMENT OF A NEW MIGRATION CLASSIFICATION FOR LOCAL AUTHORITY DISTRICTS IN BRITAIN
Presentation to the Office for National Statistics – 5th May 2009
Adam Dennett
Centre for Interaction Data Estimation and Research, School of Geography, University of Leeds
2
PRESENTATION OUTLINE
– Background: area classifications and their use in migration research
– Introduction to current research. Classification rationale and considerations for a migration classification
– Developing a migration classification - methodology and issues
– Results – a new migration classification for local authority districts
– Conclusions and future research agenda
3
BACKGROUND: Area classifications
Classification of areas according to the individuals residing within them has a long history:
– Charles Booth, 19th-century London
– Burgess and the Chicago School in the 1920s
– CACI (ACORN), Experian (MOSAIC), Eurodirect (CAMEO)
– UK government (ONS) since the 1960s. Recent classifications for Local Authorities, Wards, SOAs, OAs and Health Authorities.
4
BACKGROUND: Area classifications and migration research
– Dennett and Stillwell (2008) Population Trends 134
– Dennett and Stillwell (2009) Population, Space and Place – available online 'Early View' 6th May
– Dennett and Stillwell (forthcoming) in Stillwell, J., Duke-Williams, O. and Dennett, A. (eds) Technologies for Migration and Commuting Analysis: Spatial Interaction Data Applications.
5
[Figure: Net migrants and net migration rates by district classification – all ages, 2000-01]
At a relatively aggregate (Family) level, it can be argued that the process of counterurbanisation is continuing. At Group and Class levels, however, this is less so. There is evidence for a two-tier 'rural' when the Commuter belt is taken into consideration (even outside of city regions). BUT… everything varies by age.
6
A MIGRATION DATA CLASSIFICATION: Rationale – Classification to aid understanding
– Interaction data are inherently more complex than standard count data because each record has both an origin and a destination.
– One of the principal purposes of any classification is to simplify complex data to aid understanding.
– District level classifications such as those developed by ONS and Vickers et al. have been created without the use of migration/commuting variables.
– Underlying populations in areas may or may not be representative of migrant populations; thus classifying areas separately by their migrants may reveal things not shown in other general purpose classifications.
7
A MIGRATION DATA CLASSIFICATION: Rationale – Classification as part of the research process
– A new area classification will be useful as a framework for monitoring migration measured from other data sources post-2001.
– Further research need not be limited to using the classification directly. The development of any classification necessitates the evaluation and selection of variables. Any variables selected are likely to be more helpful in explaining the migration system and thus useful for future migration research – e.g. the development of projection models.
8
A MIGRATION DATA CLASSIFICATION: Research Hypothesis
– Where the processes of internal migration in Britain are complex, our understanding can be enhanced through examining the particular characteristics migrants and migrant flows can give to defined areas.
– Britain can be usefully classified by the types of migrant and the particular flows that they exhibit to, from and within defined geographical areas, and the classified areas will be distinct in the profiles they exhibit.
9
CONSIDERATIONS FOR THE NEW CLASSIFICATION: Scale
– Interaction data are available at a variety of scales in the UK (OA to Region), although more variables are available at coarser scales.
– The Census provides more variables at a wider range of scales than any other data source in the UK.
– Finer grained data are less attribute rich and more prone to small count perturbation.
– A classification of coarser areal units may encourage more false ecological inferences.
– Data at some levels are not available for the same units of measurement across the UK (districts – Britain, Parliamentary constituencies – NI).
10
CONSIDERATIONS FOR THE NEW CLASSIFICATION: Internal migration, International migration, Commuting?
– Internal migration data are available from a variety of sources, disaggregated by numerous variables (especially when census data are used). Internal migration is the largest influence on population change in the UK.
– International migration data are available from fewer sources, although census data are directly comparable with internal migration census data. There is evidence for linkage with internal migration (Stillwell and Duke-Williams, 2005).
– Commuting data are similarly available from census and non-census sources. Commuting is a phenomenon closely related to migration – especially where longer distance, longer stay commutes are concerned.
11
CONSIDERATIONS FOR THE NEW CLASSIFICATION: Classifying flows or individuals?
– It could be argued that the defining element of interaction data is the linkage between origins and destinations and the flow between those points.
– Classification of flows has been carried out before: the 'functional regionalisation' used by Coombes et al. (1986, 2002) to create Travel To Work Areas produces sets of contiguous areas based upon commuting flow data. The same technique has been used with migration data to create Housing Market Areas (Coombes et al., 2004).
– Classifying flows tends to create sets of contiguous areas, although contiguity may be an unnecessary constraint in an interaction data classification.
12
A MIGRATION DATA CLASSIFICATION: Initial decisions for a trial classification
– Scale: District level data used initially due to increased attribute availability, but only for Britain. Scope for trialling a ward level classification (including NI) at a later stage.
– Data: Internal and international migration data used initially.
– Classification type: This classification will concentrate on classifying individuals rather than flows. Contiguity constraints not useful at this stage.
13
A TRIAL CLASSIFICATION: Methodology
A standard clustering methodology similar to the one suggested by Milligan (1996) was used to create the classification. This consists of 6 steps:
1. 408 local authority districts in Britain were chosen as the objects to cluster.
2. 56 variables were selected from an original list of 5559 taken from seven 2001 Census SMS level 1 tables. Domains included:
– In/out/within/international/no usual address migration rates for age, ethnicity, economic activity and long term illness
– Migration efficiency rates (due to no suitable denominator) for socio-economic status, family status and housing tenure
Variables were selected through a combination of correlation analysis, principal components analysis and analysis of standard deviation statistics.
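The slides do not spell out how the migration efficiency rates were computed. A minimal sketch, assuming the conventional definition (net migration as a percentage of gross migration, which is why no population denominator is needed) and using made-up counts:

```python
def migration_efficiency(inflow: float, outflow: float) -> float:
    """Net migration as a percentage of gross migration (turnover).

    Assumed conventional definition: 100 * (in - out) / (in + out).
    Returns 0.0 when there are no moves at all.
    """
    gross = inflow + outflow
    if gross == 0:
        return 0.0
    return 100.0 * (inflow - outflow) / gross


# Hypothetical counts for one district and one migrant group.
example = migration_efficiency(inflow=600, outflow=400)
print(example)  # 20.0
```

A positive value indicates a net gain relative to turnover, a negative value a net loss; values near zero indicate that inflows and outflows largely cancel out.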
14
A TRIAL CLASSIFICATION: Methodology
3. All variables were standardised by z-scores where measurements were on different scales.
4. Euclidean distance was chosen as the measure of proximity used to compute clusters.
5. Ward's hierarchical clustering algorithm was used in the initial partition as it does not require the number of clusters to be defined prior to the clustering procedure – a suitable number of clusters was chosen from the output of Ward's algorithm.
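Steps 3 to 5 can be sketched in a few lines. This is an illustration only, not the SPSS procedure used in the study: the data are hypothetical, the function names are invented for the example, and the requested `k` stands in for cutting the hierarchy after inspecting the output of Ward's algorithm.

```python
import math
from statistics import mean, pstdev


def zscore_columns(rows):
    """Standardise each column to mean 0 and standard deviation 1."""
    columns = list(zip(*rows))
    means = [mean(col) for col in columns]
    sds = [pstdev(col) for col in columns]
    return [
        [(v - m) / s if s > 0 else 0.0 for v, m, s in zip(row, means, sds)]
        for row in rows
    ]


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def ward_merge_cost(a, b):
    """Increase in within-cluster sum of squares if clusters a and b merge."""
    na, nb = len(a["members"]), len(b["members"])
    return na * nb / (na + nb) * euclidean(a["centroid"], b["centroid"]) ** 2


def ward_clusters(points, k):
    """Merge clusters Ward-style until k remain; returns lists of row indices."""
    clusters = [{"members": [i], "centroid": list(p)} for i, p in enumerate(points)]
    while len(clusters) > k:
        # Pick the pair whose merge increases the error sum of squares least.
        i, j = min(
            ((i, j) for i in range(len(clusters)) for j in range(i + 1, len(clusters))),
            key=lambda pair: ward_merge_cost(clusters[pair[0]], clusters[pair[1]]),
        )
        a, b = clusters[i], clusters[j]
        na, nb = len(a["members"]), len(b["members"])
        merged = {
            "members": a["members"] + b["members"],
            "centroid": [
                (na * x + nb * y) / (na + nb)
                for x, y in zip(a["centroid"], b["centroid"])
            ],
        }
        clusters = [c for idx, c in enumerate(clusters) if idx not in (i, j)]
        clusters.append(merged)
    return [sorted(c["members"]) for c in clusters]


# Hypothetical districts: two variables measured on very different scales.
rows = [[2.0, 100.0], [2.2, 110.0], [1.9, 95.0],
        [8.0, 500.0], [8.4, 520.0], [7.9, 510.0]]
z = zscore_columns(rows)
print(sorted(ward_clusters(z, 2)))
```

Without the z-score step the second variable would dominate the Euclidean distances simply because its values are larger.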
15
A TRIAL CLASSIFICATION
Data input to SPSS; algorithm run; cluster solution produced. Job done?
The initial classification has produced a set of areas with distinctive characteristics in line with what might be expected (e.g. London, London periphery, university towns, ex-industrial economically more depressed areas and rural periphery all forming distinct clusters).
BUT… how do we know this is a reliable classification? Are all districts clustered appropriately?
16
REFINING AND CREATING A ROBUST CLASSIFICATION
The robustness of any classification will depend on the quality of the decisions made at each point of the construction process. The initial trial classification can be seen as a draft – some decisions were made hastily in order that some output be produced which could then be improved upon.
In refining the classification, important decisions need to be made in relation to:
– data
– methodology
17
REFINING AND CREATING A ROBUST CLASSIFICATION: Data - badly behaved variables
The trial classification ignored variable distributions. Various authors (Vickers, 2006; Založnik, 2006) advocate transformation of skewed variables on the grounds that they can bias clusters. Others (Openshaw and Wymer, 1995) say this is not necessary… Who is correct? The only solution is to test the data…
18
REFINING AND CREATING A ROBUST CLASSIFICATION: Transforming or removing variables
Transforming the data (log and sqrt transformations) failed to improve variable distributions, so transformation was judged unnecessary. But do skewed variables bias clusters? The next step was to experiment with dropping very skewed variables… eight were dropped first.
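One way to run this kind of check is to compare a skewness statistic before and after each transformation. The sketch below assumes the moment coefficient of skewness as the measure (the slides do not say which statistic was used) and a made-up, strictly positive variable with one extreme value:

```python
import math
from statistics import mean, pstdev


def skewness(values):
    """Moment coefficient of skewness: the mean of the cubed z-scores."""
    m, s = mean(values), pstdev(values)
    if s == 0:
        return 0.0
    return mean([((v - m) / s) ** 3 for v in values])


# Hypothetical migration rates for eight districts, one of them extreme.
rates = [0.5, 0.6, 0.7, 0.8, 0.9, 1.0, 1.2, 9.0]

transforms = {
    "raw": lambda v: v,
    "sqrt": math.sqrt,
    "log": math.log,  # requires strictly positive values
}
for name, fn in transforms.items():
    print(name, round(skewness([fn(v) for v in rates]), 3))
```

In this toy case both transformations reduce the skewness only slightly, because a single extreme value still dominates the distribution. That is the kind of outcome that would lead to considering dropping a variable rather than transforming it.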
19
REFINING AND CREATING A ROBUST CLASSIFICATION: Transforming or removing variables
– A further 4 very skewed variables were dropped (international variables).
– Dropping skewed variables has little effect; however, removing these variables reduces complexity in the final solution.
– Dropping international variables means that this is now an internal migration classification.
20
REFINING AND CREATING A ROBUST CLASSIFICATION: Methodology – An optimising algorithm
Ward's algorithm can produce sub-optimal solutions because cases cannot be reallocated to more suitable clusters at a later stage in the algorithm. The k-means algorithm can reallocate cases. BUT… the implementation of k-means in some software packages can produce different results from the same data by starting with different cluster centroids…
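The sensitivity to starting centroids can be illustrated with a small k-means sketch that is run from several random starts, keeping the run with the lowest total within-cluster squared distance. This is a generic illustration under assumptions (random starts, toy data, invented function names), not a reproduction of how SPSS or any other package chooses its initial centroids.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, rng, max_iter=100):
    """One k-means run starting from k randomly chosen cases."""
    centroids = [list(p) for p in rng.sample(points, k)]
    labels = None
    for _ in range(max_iter):
        # Reallocation step: every case moves to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: sq_dist(p, centroids[c])) for p in points
        ]
        if new_labels == labels:
            break  # no case changed cluster, so the run has converged
        labels = new_labels
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old centroid if a cluster empties
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    cost = sum(sq_dist(p, centroids[lab]) for p, lab in zip(points, labels))
    return labels, cost


def best_of(points, k, n_starts=50, seed=0):
    """Repeat k-means from different starts and keep the lowest-cost run."""
    rng = random.Random(seed)
    runs = [kmeans(points, k, rng) for _ in range(n_starts)]
    return min(runs, key=lambda run: run[1])


# Hypothetical standardised data forming three well-separated groups.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10),
          (20, 0), (20, 1), (21, 0)]
labels, cost = best_of(points, 3)
print(labels, cost)
```

A single run that happens to start with two centroids in the same group can converge to a poorer solution in which one real group is split and two others are merged, which is why repeated starts are compared on their total cost.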
21
CLUSTER OPTIMISATION
[Figure: K-means in SPSS with cases ordered differently]
22
CHOOSING K
MATLAB implements k-means such that a solution is chosen with cases surrounding the optimum cluster centroids. The solution with the minimum average distance to these centroids is selected.
In a single tier classification, the number of clusters is key… Comparison of silhouette plot data enables the choice of the most appropriate number of clusters, k.
[Figure panels: Optimal clustering; Sub-optimal clustering]
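Silhouette values can be computed directly from the data and a set of cluster labels. The sketch below assumes the standard definition, s = (b - a) / max(a, b), where a is a case's mean distance to the other members of its own cluster and b is its mean distance to the nearest other cluster. The data and labellings are hypothetical, and at least two clusters are required.

```python
import math
from statistics import mean


def silhouette_values(points, labels):
    """Silhouette value for each case; needs at least two clusters."""
    clusters = {}
    for idx, lab in enumerate(labels):
        clusters.setdefault(lab, []).append(idx)
    values = []
    for i, lab in enumerate(labels):
        own = [j for j in clusters[lab] if j != i]
        if not own:  # a singleton cluster is given a silhouette of 0
            values.append(0.0)
            continue
        # a: mean distance to the other members of the case's own cluster
        a = mean([math.dist(points[i], points[j]) for j in own])
        # b: mean distance to the members of the nearest other cluster
        b = min(
            mean([math.dist(points[i], points[j]) for j in members])
            for other, members in clusters.items()
            if other != lab
        )
        denom = max(a, b)
        values.append((b - a) / denom if denom > 0 else 0.0)
    return values


# Hypothetical data: two obvious groups of three cases each.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
good = [0, 0, 0, 1, 1, 1]   # labelling that matches the groups
poor = [0, 0, 1, 1, 1, 1]   # one case placed in the wrong cluster

print(mean(silhouette_values(points, good)))
print(mean(silhouette_values(points, poor)))
```

Comparing the mean silhouette across candidate solutions gives a single number per value of k, while the individual values show which cases sit comfortably in their cluster (close to 1) and which are borderline or misplaced (close to 0 or negative).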
23
FINAL SOLUTION
Selection of the most appropriate variables and the best clustering techniques through an exhaustive process of testing and evaluation ensures the most robust final solution. But what does the classification reveal?
24
A 'FUZZY' CAVEAT
– Even though this is the best classification possible, some districts exhibit stronger cluster membership than others.
– Silhouette values give an indication of the strength of cluster membership for each district.
– Cluster profiles will have a greater or lesser association with each district accordingly.
25
CLUSTER PROFILES: Cluster 1 - Coastal and Rural Retirement Migrants
The cluster is characterised by in-migrants and within-area migrants in the older age groups – 45 and above. Younger in-migrants are very much underrepresented. Migrants into these areas are from across the socio-economic spectrum, although the very high socio-economic groups are less common. Migrants preferentially move into owner-occupied accommodation, and tend to be either alone or in couples, far more than parent families.
26
CLUSTER PROFILES: Cluster 3 – Student Towns
The cluster is characterised by high levels of student in-migration, and young person within-area migration. Non-household moving groups into privately rented accommodation are common in this cluster, as are non-family households and individuals moving into communal establishments – all characteristics of a student population. In addition, non-white within-area migration is important, as is in-migration of economically inactive migrants.
27
CLUSTER PROFILES: Cluster 5 - Constrained, Working-Class, Local Britain
Districts in this cluster have very much below average in-migration and out-migration for all age groups, signifying a degree of isolation from the rest of the clusters in Britain. Shorter distance, within-area migration is slightly above average. Moves into these areas come from individuals in the lower socio-economic groups, with moves of one-parent families being above average. Moves of economically inactive individuals are very much below average.
28
CLUSTER PROFILES: Cluster 7 – Dynamic London
The cluster is defined by some of the highest and lowest z-score values, indicating it is the most dynamic cluster in the classification. In-migration rates are very high for migrants under 30, but below average for those over 30. Out-migration rates are very high for all groups, but especially for those between 30 and 45. This cluster features the highest rates of movement of the economically inactive. Across the four highest socio-economic groups there are positive efficiency rates for other moving groups, but negative rates for wholly moving households. Moves into privately rented accommodation are common.
29
CONCLUSIONS AND MOVING FORWARD: Back to the research hypothesis…
– It is possible to classify districts in Britain by their migration characteristics.
– Clusters in the classification exhibit very different profiles which are useful for understanding the complex internal migration processes and patterns going on in Britain.
BUT…
– A classification can only be reliable once care has been taken at each stage of the construction process.
– Even the most robust classification will have a degree of 'fuzziness'.
30
CONCLUSIONS AND MOVING FORWARD
– The district level classification now provides a framework for the analysis of internal migration in Britain post-2001.
– Where annual internal migration data from Patient Registers are attribute poor, an attribute rich classification can add significant value.
– The classification can also provide a framework for estimating and projecting internal migration into the future.
31
CONCLUSIONS AND MOVING FORWARD: Comparison with other classifications
[Figure panels: ONS classification; Vickers et al. classification; ONS IMPS LA clusters]
32
CONCLUSIONS AND MOVING FORWARD
– Producing a sister, ward-level classification – potentially including flow data from a new set of functional regions. This could be used for validation of the district level classification.
– Analysis of patient register data 2001-2007 using the classification as a framework: what are the changes over time?
– Analysis of other migration data – HESA?
– Calibration and testing of spatial interaction models which can be used to estimate migration between classified areas and project migration patterns leading up to the next census in 2011.
33
THANK YOU
ANY QUESTIONS, COMMENTS OR SUGGESTIONS?
Working paper detailing all of this work will be available here: http://www.geog.leeds.ac.uk/wpapers/
Me: http://www.geog.leeds.ac.uk/people/adennett.html