Programming for Geographical Information Analysis: Advanced Skills Online mini-lecture: Network flows Dr Andy Evans [with thanks to Dr Kirk Harland]

This Lecture
What is a Spatial Interaction Model?
Developments in spatial interaction modelling theory.
Rounding without loss or gain.
Further spatial interaction model developments.
What to use as a distance measure.
Calibration.

Spatial Interaction Models
A mathematical model for simulating or predicting the interaction between two geographical features. Interaction between features can be measured in goods, information, money, or people.

Spatial Interaction Models
A spatial interaction model has three key components:
1) origins or origin masses;
2) destinations or destination masses; and
3) a representation of the relationship between each origin-destination pair’s physical locations in geographical space: distance (network or Euclidean), cost of travel, time, etc.

Spatial Interaction Models
They are aggregate models and thus individuals are poorly represented. Common applications have been:
migration (Stillwell, 1978);
journey to work (Senior, 1979);
retail location planning (Fotheringham, 1983; Fotheringham and Trew, 1993);
commercial retail marketing (Birkin et al., 2004);
and, more recently, education planning problems (Harland, 2008).

Spatial Interaction Models
The pioneering work on spatial interaction models was carried out in the 1850s. It drew on the contemporary scientific theory of interaction between physical bodies in space: Sir Isaac Newton’s Theory of Universal Gravitation. The level of interaction between two bodies is directly proportional to their masses and inversely related to the distance between them.

Gravity Models
Unsurprisingly, it became known as the gravity model:
$$T_{ij} = k \, O_i \, D_j \, f(d_{ij})$$
where
$T_{ij}$ is the predicted flow between i and j
$O_i$ is the mass of origin zone i
$D_j$ is the mass of destination zone j
$f(d_{ij})$ is the distance function
$k$ is a balancing factor or constraint ensuring flows equate to a known value

Gravity Models
Obviously, systems will have multiple origins and destinations. These could be the same set of features: when studying migration, for example, you may be interested in flows between wards in Leeds, so origins = Leeds wards and destinations = Leeds wards.
Location planning may use different features for origins and destinations: origins = Leeds wards, but the destinations could be retail outlets or schools.

Gravity Models
So an example system may look something like this:
[Figure: example system with origins A and B and destinations X, Y and Z]

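Gravity Models
But of course we need all the associated information…
[Figure: the example system annotated with its data. The values used in the calculations that follow are:]
Origin masses: $O_A$ = 10, $O_B$ = 14
Destination masses: $D_X$ = 95.4, $D_Y$ = 58.4, $D_Z$ = 63.9
Distances: A-X = 7.499, A-Y = 6.979, A-Z = 7.099, B-X = 7.708, B-Y = 7.910, B-Z = 8.000
Total observed flow events: T = 24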

Gravity Models
Once we have all the information about a system, it is just a matter of iterating over all the origin-destination pairs and plugging the information into the equation… simple! So for origin-destination pair A-X the gravity model equation would be:
k × 10 × 95.4 × f(7.499)
But what is k? And what is f?

Gravity Models
f is a function used to represent distance decay and is usually a negative exponential.
k is simply the ratio of the observed flow events to the (unscaled) simulated flow events. As an equation this is:
$$k = \frac{T}{\sum_i \sum_j O_i D_j \exp(-d_{ij})}$$
where T is the total number of observed flow events. For our example system this works out as:
$$k = \frac{24}{2.799966} = 8.57$$

Gravity Models
Each flow can now be calculated as $k \times O_i \times D_j \times \exp(-d_{ij})$:
(A-X) 8.57 × 10 × 95.4 × exp(-7.499) = 4.53
(A-Y) 8.57 × 10 × 58.4 × exp(-6.979) = 4.66
(A-Z) 8.57 × 10 × 63.9 × exp(-7.099) = 4.52
(B-X) 8.57 × 14 × 95.4 × exp(-7.708) = 5.14
(B-Y) 8.57 × 14 × 58.4 × exp(-7.910) = 2.57
(B-Z) 8.57 × 14 × 63.9 × exp(-8.000) = 2.57
The sum of all the flows = 24 (ish)
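The arithmetic above can be reproduced with a short script. This is only a sketch of the worked example, not the lecture's own code: the dictionary layout and the names `origins`, `destinations`, `distances` and `unscaled` are assumptions made here, and the destination masses are taken as X = 95.4, Y = 58.4 and Z = 63.9 for both origins.

```python
import math

# Inputs taken from the worked example on the slides.
origins = {"A": 10, "B": 14}                      # origin masses O_i
destinations = {"X": 95.4, "Y": 58.4, "Z": 63.9}  # destination masses D_j
distances = {                                     # d_ij for each pair
    ("A", "X"): 7.499, ("A", "Y"): 6.979, ("A", "Z"): 7.099,
    ("B", "X"): 7.708, ("B", "Y"): 7.910, ("B", "Z"): 8.000,
}
total_observed = 24                               # T, observed flow events


def unscaled(i, j):
    """O_i * D_j * exp(-d_ij): the flow before the balancing factor k."""
    return origins[i] * destinations[j] * math.exp(-distances[(i, j)])


# k is the ratio of observed flows to unscaled simulated flows.
k = total_observed / sum(unscaled(i, j) for (i, j) in distances)

# Predicted flow for every origin-destination pair.
flows = {(i, j): k * unscaled(i, j) for (i, j) in distances}

for pair, flow in flows.items():
    print(pair, round(flow, 2))
print("k =", round(k, 2), "total =", round(sum(flows.values()), 2))
```

Because k is computed from the same unscaled flows it is then applied to, the predicted flows always sum to T; the "24 (ish)" on the slide comes only from rounding each flow to two decimal places.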

Gravity Models
The traditional way to display the result is in the form of a flow matrix; however, for large datasets these can be difficult to read…

                 Destination (j)
Origin (i)       X       Y       Z    Totals
A             4.53    4.66    4.52     13.71
B             5.14    2.57    2.57     10.28
Totals        9.67    7.23    7.09     23.99

Issues
As noted by Wilson (1967), if we double a destination’s attractiveness and an origin’s mass, the flow between the two quadruples… surely that’s not right?
The flows look fine for money, but how can you have a fraction of a person moving between an origin and a destination?
Can different types of people or sectors be represented?
If we’re using zones, what do we use as a distance / cost of travel measure?
And what about calibration: how can we do that?
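The first issue can be checked directly from the gravity model equation. The line below is a sketch that assumes k is held fixed while the two masses are doubled, and writes the predicted flow as $T_{ij}$ (a symbol chosen here for illustration):

```latex
T'_{ij} = k \,(2 O_i)\,(2 D_j)\, f(d_{ij}) = 4\, k \, O_i \, D_j \, f(d_{ij}) = 4\, T_{ij}
```

Doubling the number of people at an origin would be expected to roughly double the trips leaving it, not quadruple them, which is why the unconstrained form is unsatisfactory.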

Developments
Wilson (1971) introduced the idea of constraints into spatial interaction modelling theory. The idea is to retain as much of the information known about a system as possible. Through the application of constraints, the issue of quadrupling interaction by doubling origin and destination masses is resolved.

Developments
Wilson’s (1971) family of spatial interaction models comprises:
Unconstrained (or, more accurately, total constrained) model
Production (or origin) constrained model
Attraction (or destination) constrained model
Production-attraction (or doubly) constrained model
He also tied spatial interaction modelling to an established theory of gas particle movement and provided a sound mathematical derivation. The resulting approach is known as ‘entropy maximisation’.

Developments
First of all, what is a constraint? It is simply a process where some known information is incorporated into the model equation to make the numbers add up. It is a bit like the balancing factor, but a bit more detailed.
We will look at our example system using an origin constrained model to understand the process.

Developments
An origin constrained model equation looks something like this:
$$T_{ij} = A_i \, O_i \, W_j^2 \, \exp(-d_{ij})$$
where k is replaced by $A_i$ and $D_j$ is replaced by $W_j^2$.
It is the notation that makes the equation look complicated; it still only comprises the original terms.

However, we do have to calculate the $A_i$ balancing factor. Remember that our rearrangement was k = total real flows / simulated flows:
$$k = \frac{T}{\sum_i \sum_j O_i D_j \exp(-d_{ij})}$$
However, here we want the total flows to equal only those from one origin, $O_i$, so the sum runs over the destinations j alone. $O_i$ doesn’t change on the bottom, so we can cancel it:
$$\frac{T}{\sum_i \sum_j O_i D_j \exp(-d_{ij})} \rightarrow \frac{O_i}{\sum_j O_i D_j \exp(-d_{ij})} \rightarrow \frac{1}{\sum_j D_j \exp(-d_{ij})}$$

Spatial Interaction Models
Substituting $W_j^2$ for $D_j$ gives us:
$$A_i = \frac{1}{\sum_j W_j^2 \exp(-d_{ij})}$$
But wait, before you run out of the room, this isn’t as bad as it looks… It simply means: for each origin, sum the attractiveness estimate multiplied by the distance decay term and divide the result into 1… that’s not so bad!

Spatial Interaction Models
If it helps understanding, Openshaw (1998) proposes shifting $O_i$, separating the constraint and the model equation into a two-stage process.
Stage 1 produces an initial matrix of relative flows.
Stage 2 converts these relative flows into predicted flows by proportionally fitting the relative flows for each i to the known $O_i$ value.
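The two stages can be sketched as follows. The equations from the original slide did not survive, so this is a reconstruction from the description rather than Openshaw's own notation: $R_{ij}$ is an assumed symbol for the relative flows and $T_{ij}$ for the predicted flows.

```latex
% Stage 1: relative flows, with no origin information yet
R_{ij} = W_j^2 \exp(-d_{ij})

% Stage 2: scale each row i so that it sums to the known O_i
T_{ij} = O_i \, \frac{R_{ij}}{\sum_j R_{ij}}
```

Since $1 / \sum_j R_{ij}$ is exactly the balancing factor $A_i$, the two stages together give the same result as the single origin constrained equation.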

Spatial Interaction Models
Using the balancing factor equation, we can calculate our balancing factors by plugging in the values from our system:
$$A_A = \frac{1}{(95.4 \times \exp(-7.499)) + (58.4 \times \exp(-6.979)) + (63.9 \times \exp(-7.099))} = 6.25$$
$$A_B = \frac{1}{(95.4 \times \exp(-7.708)) + (58.4 \times \exp(-7.910)) + (63.9 \times \exp(-8.000))} = 11.66$$

Spatial Interaction Models
Each flow can now be calculated using the updated equation, $A_i \times O_i \times W_j^2 \times \exp(-d_{ij})$:
(A-X) 6.25 × 10 × 95.4 × exp(-7.499) = 3.30
(A-Y) 6.25 × 10 × 58.4 × exp(-6.979) = 3.40
(A-Z) 6.25 × 10 × 63.9 × exp(-7.099) = 3.30
(B-X) 11.66 × 14 × 95.4 × exp(-7.708) = 7.00
(B-Y) 11.66 × 14 × 58.4 × exp(-7.910) = 3.50
(B-Z) 11.66 × 14 × 63.9 × exp(-8.000) = 3.50
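The origin constrained calculation can be sketched in a few lines. As before, this is an illustration rather than the lecture's own code: the names `attractiveness` and `balancing_factor` are assumptions, and the attractiveness term takes the same numerical values that were used for the destination masses in the gravity model example.

```python
import math

origins = {"A": 10, "B": 14}                        # known origin totals O_i
attractiveness = {"X": 95.4, "Y": 58.4, "Z": 63.9}  # attractiveness term per destination
distances = {
    ("A", "X"): 7.499, ("A", "Y"): 6.979, ("A", "Z"): 7.099,
    ("B", "X"): 7.708, ("B", "Y"): 7.910, ("B", "Z"): 8.000,
}


def balancing_factor(i):
    """A_i = 1 / sum_j (attractiveness_j * exp(-d_ij)) for origin i."""
    return 1.0 / sum(
        w * math.exp(-distances[(i, j)]) for j, w in attractiveness.items()
    )


balancing = {i: balancing_factor(i) for i in origins}

# Flow for each pair: A_i * O_i * attractiveness_j * exp(-d_ij).
flows = {
    (i, j): balancing[i] * origins[i] * attractiveness[j]
    * math.exp(-distances[(i, j)])
    for i in origins
    for j in attractiveness
}

# Each row of the flow matrix should now sum to its origin total.
row_totals = {i: sum(flows[(i, j)] for j in attractiveness) for i in origins}
print({i: round(a, 2) for i, a in balancing.items()})
print({pair: round(f, 2) for pair, f in flows.items()})
print({i: round(t, 2) for i, t in row_totals.items()})
```

The row totals come out at the origin masses by construction, which is the whole point of the constraint.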

Spatial Interaction Models

Origin Constrained
                 Destination (j)
Origin (i)       X       Y       Z    Totals
A             3.30    3.40    3.30     10.00
B             7.00    3.50    3.50     14.00
Totals       10.30    6.90    6.80     24.00

Gravity Model
                 Destination (j)
Origin (i)       X       Y       Z    Totals
A             4.53    4.66    4.52     13.71
B             5.14    2.57    2.57     10.28
Totals        9.67    7.23    7.09     23.99

The new flow matrix is a much better fit than the original one: its origin totals now equal the known origin masses of 10 and 14.

Spatial Interaction Models
In actual fact, the distance decay is usually parameterised and calibrated (Wilson 1971):
$$T_{ij} = A_i \, O_i \, W_j^2 \, \exp(-\beta d_{ij})$$
where β is a calibrated distance decay parameter.

Rounding without loss/gain
It is true: having fractions of persons or discrete goods flowing between areas makes no sense. But applying conventional rounding routines can cause problems. To illustrate this, let’s return to our example system, which we left like this:

Origin Constrained
                 Destination (j)
Origin (i)       X       Y       Z    Totals
A             3.30    3.40    3.30     10.00
B             7.00    3.50    3.50     14.00
Totals       10.30    6.90    6.80     24.00

Rounding without loss/gain
So let’s just apply a conventional rounding routine to it.
We have whole numbers as flows and in the destination totals.
The overall total still adds up, but…
The origin totals are not the same as we started with! Using this sort of routine it is very common to end up with fewer people in the resulting matrix than we started with.

Origin Constrained
                 Destination (j)
Origin (i)           X           Y           Z    Totals
A             3.30 = 3    3.40 = 3    3.30 = 3         9
B             7.00 = 7    3.50 = 4    3.50 = 4        15
Totals              10           7           7        24

Rounding without loss/gain
Now let’s look at a ‘lossless’ rounding routine:
1. Order the values in ascending order.
2. Initialise a store variable to 0.
3. Add the store variable to the current number.
4. Take the fractional part of the result and place it in the store variable; the whole-number part becomes the rounded value.
5. If not at the end of the values, move to the next value and go to stage 3.
6. Place the values back into their original order.
Because we are working towards a whole number, there shouldn’t be anything left in the store.

Rounding without loss/gain
So for origin A:

Stage      Action                                 Store
Stage 1    X = 3.3, Z = 3.3, Y = 3.4
Stage 2                                           Store = 0
Stage 3    3.3 + 0 (X + Store) = 3.3              Store = 0
Stage 4    X = 3                                  Store = 0.3
Stage 5    Move to Z and go to stage 3            Store = 0.3
Stage 3    3.3 + 0.3 (Z + Store) = 3.6            Store = 0.3
Stage 4    Z = 3                                  Store = 0.6
Stage 5    Move to Y and go to stage 3            Store = 0.6
Stage 3    3.4 + 0.6 (Y + Store) = 4              Store = 0.6
Stage 4    Y = 4                                  Store = 0
Stage 5    Reached final value so continue        Store = 0
Stage 6    X = 3, Y = 4, Z = 3                    Store = 0
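The six stages can be sketched as a small function. This is one possible reading of the routine, not the lecture's own implementation: the function name `lossless_round` and the small tolerance `tol` are assumptions added here, the latter to stop floating-point error (for example 3.4 + 0.6 evaluating to just under 4) from dropping a unit.

```python
import math


def lossless_round(values, tol=1e-9):
    """Round one row (or column) of flows to whole numbers, carrying each
    fractional remainder forward so that the total is preserved."""
    # Stage 1: visit positions in ascending order of value (sorted is stable,
    # so ties keep their original left-to-right order).
    order = sorted(range(len(values)), key=lambda idx: values[idx])
    rounded = [0] * len(values)
    store = 0.0                                   # Stage 2
    for idx in order:                             # Stage 5: move to next value
        current = values[idx] + store             # Stage 3: add the store
        whole = math.floor(current + tol)         # Stage 4: whole part is kept,
        store = current - whole                   #          fraction is stored
        rounded[idx] = whole                      # Stage 6: original position
    return rounded


row_a = lossless_round([3.3, 3.4, 3.3])   # flows A-X, A-Y, A-Z
row_b = lossless_round([7.0, 3.5, 3.5])   # flows B-X, B-Y, B-Z
print(row_a, sum(row_a))                  # [3, 4, 3] 10
print(row_b, sum(row_b))                  # [7, 3, 4] 14
```

The routine only preserves the total if the input values already sum to a whole number, as each row of an origin constrained matrix does; if they do not, whatever is left in the store at the end is lost.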

Rounding without loss/gain
After applying the lossless rounding routine:
We have whole numbers as flows and in the destination totals.
The overall total still adds up, AND
The origin totals ARE the same as we started with!
So we now have whole people or goods moving between areas.

Origin Constrained
                 Destination (j)
Origin (i)       X       Y       Z    Totals
A                3       4       3        10
B                7       3       4        14
Totals          10       7       7        24

Rounding without loss/gain
Applying the rounding routine must be done in the same direction as the constraint application.
For origin constraints we go across the rows of the flow matrix (one origin i at a time).
For destination constraints we go down the columns (one destination j at a time).

Developments
A great deal of effort has been put into representing different types of individuals within spatial interaction models. One of the first methods was demonstrated by Wilson (1971), who used different spatial interaction model configurations to represent different modes of transport in his transportation planning model. It can be thought of as a three-dimensional spatial interaction model with k being the third dimension.

Developments
Fotheringham and Trew (1993) experimented with representing different consumer choices using statistical approaches.
Other approaches work at the calibration stage, using parameters specific to an origin-destination pair. This approach has been most widely used in migration modelling (Stillwell 1978), but when it is applied to spatial interaction models used for planning purposes, difficulties can arise if the user wants to add a destination in a scenario.

Developments
Other sector-specific developments include:
Incorporating elastic demand, using the example of cinemas in the leisure service industry (Birkin et al., 2004).
Examining the competing and agglomeration effects of stores in the retail sector (Fotheringham, 1983).
Applying flexible capacity constraints in the education sector (Harland, 2008).
Developing a spatial interaction / agent-based hybrid model for petrol price modelling (Heppenstall et al., 2005).
Although progress has been made, consumers are still generally represented as groups, and this can be problematic.

Distances
This is most definitely a problem with aggregate models. We can use network distances, Euclidean distances, cost of travel or time… but where does the journey start and end?
If we are using a point as a destination then we can use the X and Y coordinates of the destination as the end point, but the origin is generally a zone or area, so where does the journey start? Should the closest point be used (1)? The centroid of the zone (2)? Maybe the furthest point (3)?
[Figure: an origin zone with three candidate start points, labelled 1, 2 and 3]

Distances
Perhaps the safest thing is to use the population weighted centroid…
But then what about a barrier like a river or a major road? Would the people on the right of the zone be as likely to travel to the destination shown as those on the left?
[Figure: the origin zone with points 1, 2 and 3 and the destination]
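A population weighted centroid is simply the average of the locations in a zone, with each location weighted by the number of people there. The sketch below assumes the zone's population is available as a list of (x, y, population) points, for example postcode or census output-area centroids; the function name and the sample coordinates are hypothetical and only there to show the calculation.

```python
def population_weighted_centroid(points):
    """Return the (x, y) centroid of a list of (x, y, population) tuples,
    weighting each point by its population."""
    total = sum(pop for _, _, pop in points)
    if total == 0:
        raise ValueError("total population must be greater than zero")
    x = sum(px * pop for px, _, pop in points) / total
    y = sum(py * pop for _, py, pop in points) / total
    return x, y


# Hypothetical zone: most people live near (10, 0), so the weighted centroid
# is pulled towards that point and away from the geometric centre.
sample_zone = [(0.0, 0.0, 100), (10.0, 0.0, 300), (10.0, 10.0, 100)]
centroid = population_weighted_centroid(sample_zone)
print(centroid)  # (8.0, 2.0)
```

This still gives a single start point per zone, so it does nothing about barriers such as rivers or major roads that split a zone.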

Distances
Using network distances can help to bring a little more realism to our model… but they also bring their own issues:
Network distances are computationally intensive to calculate.
Spatial interaction models require all possible origin-destination distance combinations to be calculated.
For six years of school data in Leeds, over 42,000,000 network distance calculations were required.
If we add a destination within a ‘what if’ scenario, new network distances have to be calculated ‘on the fly’.

Distances
Currently, there is no one definitive answer.
Assess the model utility and choose the most appropriate spatial representation.
You may end up with several different spatial interaction models in a three-dimensional layered structure similar to that of Wilson (1971), with a different spatial representation in each layer.

Calibration
The process of calibration is to estimate parameters in the model equation. Remember this:
$$T_{ij} = A_i \, O_i \, W_j^2 \, \exp(-\beta d_{ij})$$
where β (beta) is a calibrated distance decay parameter.
Another common parameter is the attractiveness parameter, alpha (α), which the destination attractiveness is generally raised to the power of… but we’ll just concentrate on beta today.

Calibration
By substituting different values for β we will get different resulting flow matrices.
Calibration is the process of adjusting the parameter(s) to produce the flow matrix that best fits the observed data. We can use different statistics, such as the standardised root mean square error (SRMSE) or $R^2$.
The aim is either to minimise the statistic (if a better fit between simulated and observed data is indicated by a lower value, as with SRMSE) or to maximise it (if a better fit is indicated by a higher value, as with $R^2$).
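A very simple way to see this in action is a grid search over β using SRMSE. The sketch below is illustrative only and makes several assumptions: the function names are invented here, SRMSE is taken in its usual form (root mean square error divided by the mean observed flow), the candidate β values are an arbitrary grid from 0.5 to 1.5, and the "observed" matrix is just the origin constrained matrix from the worked example standing in for real observed flows, so the search should simply recover β = 1.

```python
import math

origins = {"A": 10, "B": 14}
attractiveness = {"X": 95.4, "Y": 58.4, "Z": 63.9}
distances = {
    ("A", "X"): 7.499, ("A", "Y"): 6.979, ("A", "Z"): 7.099,
    ("B", "X"): 7.708, ("B", "Y"): 7.910, ("B", "Z"): 8.000,
}


def origin_constrained(beta):
    """Origin constrained flows using exp(-beta * d_ij) as distance decay."""
    flows = {}
    for i, o_i in origins.items():
        weights = {
            j: w * math.exp(-beta * distances[(i, j)])
            for j, w in attractiveness.items()
        }
        a_i = 1.0 / sum(weights.values())         # balancing factor A_i
        for j, weight in weights.items():
            flows[(i, j)] = a_i * o_i * weight
    return flows


def srmse(observed, predicted):
    """Root mean square error divided by the mean observed flow."""
    n = len(observed)
    mean_observed = sum(observed.values()) / n
    mse = sum((observed[key] - predicted[key]) ** 2 for key in observed) / n
    return math.sqrt(mse) / mean_observed


# Stand-in for observed data (NOT real observations): the slide's matrix.
observed = {
    ("A", "X"): 3.30, ("A", "Y"): 3.40, ("A", "Z"): 3.30,
    ("B", "X"): 7.00, ("B", "Y"): 3.50, ("B", "Z"): 3.50,
}

candidates = [round(0.5 + 0.1 * step, 1) for step in range(11)]  # 0.5 to 1.5
best_beta = min(candidates, key=lambda b: srmse(observed, origin_constrained(b)))
print(best_beta, srmse(observed, origin_constrained(best_beta)))
```

A grid search is the bluntest possible calibration method; it is used here only because it makes the "try a value, measure the fit, keep the best" loop explicit.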

Calibration
Wilson (1971) used the entropy statistic to calibrate his famous entropy maximising model.
Entropy can be described as a measure of uncertainty within the resulting matrix.
By reducing entropy you reduce the uncertainty… so why did Wilson maximise it?

Calibration
The answer is that the user or developer of a model is uncertain of the micro-level events within a simulation model, therefore entropy (uncertainty) needs to be maximised.
But in an unconstrained model environment, this type of calibration would distribute interaction evenly across the system. To borrow an American phrase, it would ‘cover all the bases’.

Calibration
The genius in the Wilson model was that he enforced constraints on the result (origin, destination and distance) to ensure that the final matrix reflected reality.
The final results maximised the uncertainty at the micro level while enforcing macro-level constraints, producing a result where all the totals added up, travelling distances were within the expected ranges, and realistic interactions between zones were simulated.
Genius… but complicated. For a really good guide to the process (although the equations are quite scary) see Senior (1979). We’ll look at calibration in Lecture 7.

Calibration
The spatial interaction model building process can be summarised in four stages, two of which are performed iteratively until we have a satisfactory model fit.
Planning how to calibrate and evaluate your model is crucial, and you need to think about where your data is going to come from.
It is possible to overfit your model, and that is why we have the evaluation stage.

Summary
Spatial interaction models are aggregate models.
They have been successfully applied in many research and commercial areas.
A few notable companies that apply variations on spatial interaction modelling theory:
ExxonMobil
Sainsbury
Tesco
Ford

Summary
Several developments in spatial interaction applications have helped to bring more realism to the spatial interaction modelling process.
The application of lossless rounding routines facilitates the modelling of individuals or discrete goods items.
However, there are still issues surrounding the selection of distance / cost of travel measures, and these must be considered when developing or applying a model.
