Aksel Thomsen Erik Sommer

1 Aksel Thomsen Erik Sommer
Clustered data - grids Aksel Thomsen Erik Sommer

3 Outline
- Grid data
- Our method
- Result examples
- Potential expansions
- Commercial aspects

4 Grid data
Either 100x100 m or 1x1 km grid cells
Vast majority consists of few households → clustering is needed
Cell size    No. of cells   <100 households   Percentage
100x100 m    423,755        421,655           99.5%
1x1 km       38,908         34,951            89.9%

5 Bornholm – Case study

6 Population in Bornholm I

7 Population in Bornholm II

8 Method - Principles
- Each grid is assigned to a unique municipality
- Time consistent
  - No. of households in a grid is defined as the minimum over e.g. two years
- All cells with min. K households are their own cluster
- The remaining cells are clustered by an algorithm
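The principles above can be sketched in a few lines of Python. This is only an illustration, not the presenters' implementation: the function name `split_cells`, the input layout (a dict from cell id to yearly household counts) and the example values are assumptions made here.

```python
def split_cells(counts_by_year, k):
    """Split cells into stand-alone clusters and cells that still need clustering.

    counts_by_year: dict mapping a cell id to a list of household counts,
                    one per year (hypothetical input layout).
    k:              minimum number of households per cluster.
    """
    own_cluster = {}  # cells that reach K on their own
    remaining = {}    # cells to be handed to the clustering algorithm
    for cell, counts in counts_by_year.items():
        # Time consistency: a cell counts only as many households
        # as it has in its weakest year.
        households = min(counts)
        if households >= k:
            own_cluster[cell] = households
        else:
            remaining[cell] = households
    return own_cluster, remaining


# Hypothetical example with two years of data and K = 100.
own, rest = split_cells({"A": [120, 105], "B": [130, 90], "C": [4, 6]}, k=100)
```

Cell "B" shows why the minimum matters: it exceeds K in one year but not in the other, so it cannot stand alone.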

9 Method - Algorithm
1. Start in the south-western corner
2. The starting cell is combined with the nearest remaining cell
3. A new center is calculated
4. The nearest still-remaining cell is added to the cluster
5. Steps 3 and 4 are repeated until the cluster consists of min. K households
6. If fewer than K households remain, they are added to the last cluster
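A minimal Python sketch of this greedy procedure follows. The slide leaves several details open, so the following are assumptions made for illustration only: the "south-western corner" is taken as the cell with the lowest northing (ties broken by lowest easting), the center is the unweighted mean of the member cells' coordinates, each new cluster starts again from the south-westernmost remaining cell, and the repeated part is read as "recalculate the center, then add the nearest remaining cell". The name `cluster_cells` and the input layout are hypothetical.

```python
from math import hypot


def cluster_cells(cells, k):
    """Greedily group cells into clusters of at least k households.

    cells: dict mapping a cell id to (x, y, households); x is the easting
           and y the northing of the cell (hypothetical input layout).
    Returns a list of clusters, each a list of cell ids.
    """
    remaining = dict(cells)
    clusters = []
    while remaining:
        # Fewer than K households left: add them to the last cluster.
        if clusters and sum(h for _, _, h in remaining.values()) < k:
            clusters[-1].extend(sorted(remaining))
            break
        # Start in the south-western corner (assumed: lowest y, then lowest x).
        start = min(remaining, key=lambda c: (remaining[c][1], remaining[c][0]))
        cx, cy, total = remaining.pop(start)
        members = [start]
        while total < k and remaining:
            # Add the remaining cell nearest to the current center.
            nearest = min(
                remaining,
                key=lambda c: (hypot(remaining[c][0] - cx, remaining[c][1] - cy), c),
            )
            total += remaining.pop(nearest)[2]
            members.append(nearest)
            # Recalculate the center as the unweighted mean (an assumption).
            cx = sum(cells[m][0] for m in members) / len(members)
            cy = sum(cells[m][1] for m in members) / len(members)
        clusters.append(members)
    return clusters


# Hypothetical example: five cells on a line, K = 5.
example = {
    "a": (0, 0, 3), "b": (1, 0, 3), "c": (2, 0, 3), "d": (3, 0, 3), "e": (4, 0, 1),
}
result = cluster_cells(example, k=5)
```

In the example, cell "e" is left over with a single household and is therefore merged into the last cluster rather than forming one of its own.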

10 Method – Example

11 Method – Example

12 Method – Example

13 Method – Example

14 Method – Example

15 Method – Example

16 Bornholm – Result

17 Bornholm – Average income

18 Potential expansions
Modify the distance parameter
- Now: Only geographical distance
- Potential: Prioritize similar grids nearby
  - Same household types
  - Same income
  - Same demographics
- Avoid mixing very different households in the same cluster
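One way such a modified distance could look is sketched below. The slide does not specify a formula, so this additive form, the function name `combined_distance`, and the tuning parameters `income_weight` and `income_scale` are all assumptions made for illustration; only income is used here as the similarity variable.

```python
from math import hypot


def combined_distance(a, b, income_weight=0.0, income_scale=1.0):
    """Geographic distance plus a penalty for dissimilar income.

    a, b:          dicts with keys "x", "y" and "income" (hypothetical layout).
    income_weight: how strongly dissimilarity counts; 0 gives the current,
                   purely geographical distance.
    income_scale:  income difference treated as "one unit" of dissimilarity.
    """
    geographic = hypot(a["x"] - b["x"], a["y"] - b["y"])
    dissimilarity = abs(a["income"] - b["income"]) / income_scale
    return geographic + income_weight * dissimilarity


# Hypothetical cells: a close neighbour with very different income
# and a slightly more distant neighbour with similar income.
origin = {"x": 0.0, "y": 0.0, "income": 300.0}
near_different = {"x": 1.0, "y": 0.0, "income": 900.0}
far_similar = {"x": 2.0, "y": 0.0, "income": 310.0}
```

With `income_weight=0` the closer cell wins, as it does today; with a positive weight the similar cell can be preferred even though it lies further away.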

19 Commercial aspects (1 of 4) – action done by customers
Many customers handle the clustering themselves. Clustering done by customers/users has to meet our requirement for the minimum number of households for at least two years. It can be very complex and may already include a number of the potential expansions listed by Statistics Denmark.

20 Commercial aspects (2 of 4) – role of Statistics Denmark
The primary role of Statistics Denmark with regard to clustering of grids is to be an alternative supplier. The primary demand has been for us to supply simple clusters that are easy to understand and easy to use: "keeping it simple". It often seems that creators of clusters forget the important task of explaining and illustrating the methods used, so this is an important factor for us as a supplier.

21 Commercial aspects (3 of 4) – two approaches
Clusters can be made either simply, using the nearest-cell approach (as shown by Aksel), or in a more complex way, including various factors in the algorithm to create more optimized clusters (listed as potential expansions for Statistics Denmark and already used by existing customers). Clusters can either be created first and fixed as static (non-dynamic) clusters to which variables are then added, or be created by sorting on the selected variable, giving dynamic clusters that change with each variable used.

22 Commercial aspects (4 of 4) – two approaches
Static (non-dynamic) clusters use a minimum of 20, 50, 100 or 150 households. Micro clusters with a minimum of 5 households are used to create dynamic macro clusters with a total minimum of 300 households within a municipality, where the first cluster has the best value of the selected variable, the second cluster the next best value, and so on, for example sorted by decreasing average household income.
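The dynamic approach can be sketched as follows for a single municipality. This is an illustration under stated assumptions, not the production method: the function name `macro_clusters` and the tuple layout are hypothetical, and merging a leftover group of fewer than 300 households into the last macro cluster is an assumption borrowed from the grid algorithm.

```python
def macro_clusters(micro, min_households=300):
    """Build dynamic macro clusters from micro clusters of one municipality.

    micro: list of (cluster_id, households, value) tuples, where value is
           the selected variable, e.g. average household income
           (hypothetical input layout).
    Returns a list of macro clusters, best value first, each a list of ids.
    """
    # Sort by the selected variable, best (highest) value first.
    ordered = sorted(micro, key=lambda m: m[2], reverse=True)
    result, current, total = [], [], 0
    for cluster_id, households, _ in ordered:
        current.append(cluster_id)
        total += households
        if total >= min_households:
            result.append(current)
            current, total = [], 0
    if current:
        # Leftover below the minimum: merge into the last macro cluster
        # (an assumption; the slide does not say how this is handled).
        if result:
            result[-1].extend(current)
        else:
            result.append(current)
    return result


# Hypothetical micro clusters: (id, households, average income).
micro = [("m1", 200, 500), ("m2", 150, 400), ("m3", 100, 600),
         ("m4", 250, 300), ("m5", 50, 100)]
macros = macro_clusters(micro)
```

Because the sort key is the selected variable, running the same function with a different variable yields a different grouping, which is what makes these clusters dynamic.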

