Published by Dylan Wiley. Modified over 10 years ago.
1
Methods of interpolating data to create long-run time series
Ian Gregory (University of Portsmouth) & Paul Ell (Queen's University, Belfast)
2
Administrative Units in England and Wales from 1801
Minor changes:
Registration Districts (1840-1910): 400
Local Govt. Districts (1890s-1972): 4,000
Parishes (1876-1972): 20,000
3
The Newport area, 1911
4
Creating a standard geography
Areal Weighting:
–Assumption: variable y is homogeneously distributed across the source zones
–Using this: [equation not reproduced in this transcript]
–BUT: Very unrealistic assumption.
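The slide's own equation did not survive in this transcript. As a rough illustration of what areal weighting usually means, the sketch below gives each target zone a share of every source zone's total in proportion to the overlapping area. The function name, the dictionary layout and the example figures are assumptions made for illustration; they are not taken from the presentation.

```python
def areal_weighting(source_values, source_areas, intersections):
    """Estimate target-zone totals from source-zone totals by areal weighting.

    source_values: {source_id: total of y in that source zone}
    source_areas:  {source_id: area of that source zone}
    intersections: {(source_id, target_id): area of overlap}
    """
    estimates = {}
    for (source_id, target_id), overlap in intersections.items():
        # Homogeneity assumption: the share of y equals the share of area.
        share = overlap / source_areas[source_id]
        estimates[target_id] = (
            estimates.get(target_id, 0.0) + share * source_values[source_id]
        )
    return estimates


# Made-up example: source A lies wholly inside target T1,
# source B is split 2:6 between T1 and T2.
source_values = {"A": 1000, "B": 400}
source_areas = {"A": 10.0, "B": 8.0}
intersections = {("A", "T1"): 10.0, ("B", "T1"): 2.0, ("B", "T2"): 6.0}

estimates = areal_weighting(source_values, source_areas, intersections)
print(estimates)
```

The estimates always sum to the source total, but any real clustering of y inside a source zone is ignored, which is the weakness the slide points to.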
5
Other sources of information (1)
1. Dasymetric technique:
–There were 15,000 parishes as opposed to 600/1,500 districts
–Total population is available at this scale
–Assumptions:
  –The distribution of y follows the distribution of the total population
  –Parish-level population is homogeneously distributed
–Problem: Most districts in towns and cities consist of only one parish.
–1911: 30% of pop. lived in districts that consisted of only one parish
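A minimal sketch of the two dasymetric assumptions listed above: the district-level value of y is first shared among the district's parishes in proportion to parish population, and each parish's share is then spread evenly over the parish area. The data layout, identifiers and numbers are hypothetical and serve only to illustrate the idea.

```python
def dasymetric(district_y, parishes, intersections):
    """Interpolate district-level y to target zones via parish populations.

    district_y:    {district_id: total of y in the district}
    parishes:      {parish_id: (district_id, population, area)}
    intersections: {(parish_id, target_id): area of overlap}
    """
    # Total population of each district, summed over its parishes.
    district_pop = {}
    for district_id, population, _area in parishes.values():
        district_pop[district_id] = district_pop.get(district_id, 0) + population

    estimates = {}
    for (parish_id, target_id), overlap in intersections.items():
        district_id, population, area = parishes[parish_id]
        # Assumption 1: y follows the distribution of total population.
        parish_y = district_y[district_id] * population / district_pop[district_id]
        # Assumption 2: population is homogeneous within the parish.
        estimates[target_id] = (
            estimates.get(target_id, 0.0) + parish_y * overlap / area
        )
    return estimates


# Made-up example: one district with a populous and a sparse parish.
district_y = {"D": 100.0}
parishes = {"P1": ("D", 900, 5.0), "P2": ("D", 100, 5.0)}
intersections = {("P1", "T1"): 5.0, ("P2", "T1"): 2.5, ("P2", "T2"): 2.5}

estimates = dasymetric(district_y, parishes, intersections)
print(estimates)
```

With these figures, plain areal weighting would split the district 75/25 between T1 and T2, whereas the parish populations pull the estimate to 95/5. When a district consists of a single parish, the method collapses back to areal weighting, which is the problem the slide identifies for towns and cities.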
6
Other sources of information (2)
2. Data from target districts as ancillary information:
–Can provide information on the distribution of source zone data
–EM algorithm is used
–E.g.
  1. Sub-divide target zones into rural and urban
  2. Assume that rural and urban targets have the same population densities
  3. Allocate y to targets using this assumption
  4. Find the average population density of rural and urban target districts
  5. Go back to step 3 using the new population densities and repeat until the algorithm converges
–Can use y for the target districts or total population at parish level as ancillary information
–Relies on having relevant information for target districts
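The five steps above can be sketched as a small iterative routine. This is one possible reading of the slide, not the authors' implementation: each source total is split among its fragments in proportion to fragment area multiplied by the current density of the target's class, and the class densities are then re-estimated from the result. All names, the convergence tolerance and the example data are assumptions.

```python
def em_interpolate(source_values, intersections, target_class,
                   max_iter=100, tol=1e-9):
    """EM-style areal interpolation with a rural/urban target classification.

    source_values: {source_id: total of y}
    intersections: {(source_id, target_id): area of overlap}
    target_class:  {target_id: class label, e.g. "rural" or "urban"}
    """
    # Total intersected area per class (used to average densities in step 4).
    class_area = {}
    for (_s, t), area in intersections.items():
        c = target_class[t]
        class_area[c] = class_area.get(c, 0.0) + area

    # Step 2: start with the same density for every class.
    density = {c: 1.0 for c in class_area}
    estimates = {}

    for _ in range(max_iter):
        # Step 3: allocate each source total to its fragments in
        # proportion to fragment area times current class density.
        weight_sum = {}
        for (s, t), area in intersections.items():
            weight_sum[s] = weight_sum.get(s, 0.0) + area * density[target_class[t]]

        estimates = {}
        class_total = {c: 0.0 for c in class_area}
        for (s, t), area in intersections.items():
            c = target_class[t]
            fragment_y = source_values[s] * area * density[c] / weight_sum[s]
            estimates[t] = estimates.get(t, 0.0) + fragment_y
            class_total[c] += fragment_y

        # Step 4: average density of each class of target zone.
        new_density = {c: class_total[c] / class_area[c] for c in class_area}

        # Step 5: repeat until the densities stop changing.
        change = max(abs(new_density[c] - density[c]) for c in class_area)
        density = new_density
        if change < tol:
            break

    return estimates, density


# Made-up example: S1 is wholly urban, S2 wholly rural, S3 straddles both.
source_values = {"S1": 1000.0, "S2": 100.0, "S3": 500.0}
intersections = {("S1", "U"): 2.0, ("S2", "R1"): 10.0,
                 ("S3", "U"): 1.0, ("S3", "R2"): 9.0}
target_class = {"U": "urban", "R1": "rural", "R2": "rural"}

estimates, density = em_interpolate(source_values, intersections, target_class)
print(estimates, density)
```

In the example, areal weighting alone would give the urban target only a tenth of S3. Once the routine learns from S1 and S2 that urban targets are denser than rural ones, it shifts most of S3 into the urban fragment, while each source total is still fully allocated.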
7
Other sources of information (3)
3. Combined technique
–Brings together the dasymetric technique and the EM algorithm
–Makes use of all available information
–Tests all the assumptions
8
Choice of technique
Based on aggregating 1991 EDs to form pseudo-parishes and districts
Conclusions:
–No one technique for all variables
–Careful choice of technique reduces error significantly
–Using regression techniques can help determine which is most appropriate
–Error will still appear in the interpolated data
9
Predicting error
Possible techniques:
1. Space: where target zones consist of many large fragments of source zones they are error prone
2. Attribute: error is most prevalent when data have been allocated from urban zones to rural ones
3. Time: error will cause unrealistic changes in population
11
Using population change to locate error
Water Orton, parish on the edge of Birmingham
1901-1951, Water Orton (1951: Pop. 1,841, area 2.3 km², pop. den. 796 p/km²)
1861-1891, part of Aston (1891: Pop. 250,000, area 57 km², pop. den. 4,300 p/km²)
1851, Water Orton (1851: Pop. 190, area 2.6 km², pop. den. 73 p/km²)
Pop. change = (y_2 - y_1) / (y_2 + y_1)
1851: Est. Pop: 182; Actual Pop: 190
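The population-change measure on this slide is bounded between -1 and +1, which makes it convenient for scanning a long series for implausible jumps. The sketch below computes it for consecutive dates and flags large values. The threshold of 0.5, the zero-population handling and the example series are illustrative assumptions, not values from the presentation.

```python
def population_change(y1, y2):
    """Normalised change (y2 - y1) / (y2 + y1), bounded by -1 and +1."""
    if y1 + y2 == 0:
        # Assumption: treat two empty zones as showing no change.
        return 0.0
    return (y2 - y1) / (y2 + y1)


def flag_suspect_intervals(series, threshold=0.5):
    """Return (start_year, end_year, change) where |change| exceeds threshold.

    series: list of (year, interpolated population), sorted by year.
    threshold: hypothetical cut-off chosen for this example only.
    """
    flagged = []
    for (year1, y1), (year2, y2) in zip(series, series[1:]):
        change = population_change(y1, y2)
        if abs(change) > threshold:
            flagged.append((year1, year2, change))
    return flagged


# Made-up interpolated series with a spike in the middle.
series = [(1851, 200), (1861, 2000), (1871, 210), (1881, 230)]
flagged = flag_suspect_intervals(series)
for year1, year2, change in flagged:
    print(f"{year1}-{year2}: change {change:+.2f}")
```

A sharp rise followed immediately by an equally sharp fall, as in the made-up series, is the kind of pattern that would point to interpolation error rather than real demographic change.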
12
Using population change to locate error
Birmingham
1951: Pop. 1,100,000, area 210 km², pop. den. 5,235 p/km²
1931: Pop. 1,000,000, area 187 km², pop. den. 5,367 p/km²
1891: Pop. 246,000, area 12.2 km², pop. den. 20,123 p/km²
1851: Pop. 919, area 0.94 km², pop. den. 977 p/km²
13
Using population change to locate error
Castle Bromwich, parish on the edge of Birmingham
1951, Castle Bromwich (1951: Pop. 4,356, area 4.7 km², pop. den. 927 p/km²)
1921-1931, part of Birmingham (1931: Pop. 1,000,000, area 187 km², pop. den. 5,367 p/km²)
1861-1911, part of Aston (1891: Pop. 250,000, area 57 km², pop. den. 4,300 p/km²)
1851, Castle Bromwich (1851: Pop. 6,426, area 18.7 km², pop. den. 344 p/km²)
14
Conclusions
Can interpolate data to create long-run time series
Choice of best technique will depend on nature of the variable
–No one-size-fits-all technique
All techniques will create some error
What to do about error:
–Attempt to smooth it out
–Explicitly incorporate it into an analysis