Visualising Geographic Variations
Dr Orian Brook
University of Stirling

Firstly, we need to think about what we mean by big data. It’s a popular term, widely used in the media, but it can mean different things to different people. There’s no one agreed-upon definition, so I’m going to talk about some of the characteristics of big data and what they mean.
Inescapable problems of mapping data
Maps simplify a complex reality:
- “to present a useful and truthful picture, an accurate map must tell white lies”
- “A single map is but one of an indefinitely large number of maps that might be produced for the same situation or from the same data”
From pp. 1-2 of Monmonier, M (1996) How to Lie With Maps
See also: Dorling, D (2012) The Visualisation of Spatial Social Structure
Issues and Tips for Visualising Geographic Variations
Problems and challenges presented by:
- Variations in population density
- Choice of data ranges
- Differences that aggregation level/choice can make
- Colours/symbology
There isn’t necessarily one right answer, but be aware of the pitfalls and possible alternatives.
Population Density
Urban-Rural Indicator 2013-2014
- Rural areas dominate the map, but only 18% of Scotland’s population live in them
- 70% live in urban areas
- Can give a misleading impression
- Difficult to see detail in other areas
Adjust areas according to population? - Cartogram
- In theory, displays two dimensions: population size and rurality
- In practice, has disadvantages
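To make "adjusting areas according to population" concrete, here is a minimal sketch of one simple approach: resize each region so that its area is proportional to its population. The region names and figures are made up for illustration, and this is not necessarily the method behind the cartograms shown in these slides.

```python
import math

# Hypothetical regions: name -> (land area in km2, population). Made-up figures.
regions = {"Rural": (9000.0, 20000), "Urban": (1000.0, 80000)}

def linear_scale_factors(regions):
    """Factor by which to stretch each region's width and height so that
    its area becomes proportional to its share of the total population."""
    total_area = sum(area for area, _ in regions.values())
    total_pop = sum(pop for _, pop in regions.values())
    factors = {}
    for name, (area, pop) in regions.items():
        target_area = total_area * pop / total_pop
        # Area scales with the square of the linear factor, hence the square root.
        factors[name] = math.sqrt(target_area / area)
    return factors

factors = linear_scale_factors(regions)
for name, factor in factors.items():
    print(f"{name}: scale linear dimensions by {factor:.2f}")
```

With these made-up figures the rural region shrinks to about 0.47 of its linear size and the urban region grows to about 2.83 times, which is why heavily rescaled maps become hard to relate to known geographies.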
Cartograms
- Can make maps very difficult to relate to known geographies
- There are options as to the degree of rescaling
- But then it is hard to know what you’re looking at?
- Lots more examples at www.worldmapper.org, e.g. infant mortality
Other options
Local Council Election Results (www.improving-visualisation.org)
- Still hard to identify places
- Requires specialist tools
Solutions?
- Think carefully about how you interpret a map!
- Think about zooming in on areas of interest
Data Ranges
Options for Summarising Data
Three common approaches:
- Quantiles: a fixed proportion of areas falls into each group
  (quartiles: 4 groups; quintiles: 5 groups; deciles: 10 groups)
- Equal interval: e.g. 11-20%, 21-30%, 31-40%
- Manual: choose your own on the basis of what fits the data/tells the story
The best approach will depend on the data and what you want to say.
Which provides the most useful discrimination and comparison?
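The first two approaches can be sketched in a few lines of Python with NumPy. The ten values are made-up percentages, and the function names are only illustrative; GIS packages offer their own classification tools.

```python
import numpy as np

def quantile_breaks(values, n_classes):
    """Class boundaries chosen so each class holds roughly the same number of areas."""
    probs = np.linspace(0, 1, n_classes + 1)
    return np.quantile(values, probs)

def equal_interval_breaks(values, n_classes):
    """Class boundaries that split the range from min to max into equal-width bands."""
    return np.linspace(np.min(values), np.max(values), n_classes + 1)

# Made-up percentages for ten areas
values = np.array([2, 3, 4, 5, 6, 8, 10, 15, 30, 60], dtype=float)

q = quantile_breaks(values, 4)        # quartiles
e = equal_interval_breaks(values, 4)  # four equal-width bands

print("Quartile breaks:      ", q)
print("Equal interval breaks:", e)
```

A manual scheme would simply replace either set of boundaries with ones you choose yourself.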
Quantiles
Advantages
- Relatable: the 25% most deprived areas
- Always get a map with a visual range
- Consistency of approach: good for comparing different data, e.g. 25% highest crime areas, 25% lowest income areas
Cons
- Not good for narrow data ranges or clusters, e.g. % Chinese, male/female ratio
Equal Interval
Advantages
- Relatable: 10%, 20%
- Good for showing contrasts of the same measure, e.g. areas with 30% or more graduates vs areas with 30% or more with no qualifications
Cons
- No consistency of approach for comparing different data, e.g. mapping counts of burglaries vs arson
- Dependent on the variability of the data
- Doesn’t work well when the data is skewed
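A small sketch of why skewed data is a problem for equal intervals: with one extreme value, almost every area lands in the lowest class, whereas quartiles still spread the areas evenly. The twelve values are made up, and `class_counts` is a hypothetical helper written for this illustration.

```python
import numpy as np

def class_counts(values, breaks):
    """Count areas per class; a value sitting exactly on an interior
    boundary is placed in the upper class."""
    idx = np.digitize(values, breaks[1:-1])
    return np.bincount(idx, minlength=len(breaks) - 1)

# Made-up, heavily skewed values for twelve areas
values = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 12, 100], dtype=float)
n_classes = 4

equal = np.linspace(values.min(), values.max(), n_classes + 1)
quant = np.quantile(values, np.linspace(0, 1, n_classes + 1))

equal_counts = class_counts(values, equal)
quant_counts = class_counts(values, quant)

print("Areas per class, equal interval:", equal_counts)
print("Areas per class, quartiles:     ", quant_counts)
```

Here the equal-interval map would show eleven areas in one colour and a single outlier in another, with two classes unused.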
Examples from www.petitionmap.unboxedconsulting.com:
- Stop spending 0.7% GNP on Foreign Aid
- Parliament to sit on Saturdays
SIMD Crime Domain: Quartiles vs Equal Interval
Areal Units
Different Areas = Different Results
- Technically known as the modifiable areal unit problem
- As data is aggregated to different (especially larger) units, proportions will change (variances are smoothed out)
- Different areas in England lay claim to “most deprived”, depending on whether the unit is datazone, ward, local authority, constituency …
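The effect is easy to reproduce with a toy example. The six zones and both ward groupings below are entirely made up; the point is only that the same underlying data gives a different "most deprived" area depending on where the boundaries are drawn.

```python
# Hypothetical small zones in a row: (population, number of deprived residents)
zones = [(100, 60), (100, 10), (100, 10), (100, 50), (100, 45), (100, 5)]

def aggregate(zones, groups):
    """Deprivation rate for each larger unit, given as tuples of zone indices."""
    rates = []
    for group in groups:
        pop = sum(zones[i][0] for i in group)
        deprived = sum(zones[i][1] for i in group)
        rates.append(deprived / pop)
    return rates

zone_rates = [deprived / pop for pop, deprived in zones]

# Two equally plausible ways of drawing larger units over the same zones
wards_a = aggregate(zones, [(0, 1), (2, 3), (4, 5)])
wards_b = aggregate(zones, [(0, 1, 2), (3, 4, 5)])

print("Zone rates:", zone_rates)
print("Grouping A:", wards_a)
print("Grouping B:", wards_b)
```

At zone level the first zone is the most deprived at 60%. Under grouping A the unit containing it is still highest (35%), but under grouping B the other unit comes out highest (about 33%), and in both cases the spread of rates is much narrower than at zone level.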
John Snow’s Cholera Map
Colours
- Greater colour contrast isn’t always better: it can be confusing
- Won’t print well in black and white or photocopy
- May look different depending on screen/projector
- Remember colour blindness (8% of men)
- A range of similar colours may work better
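One rough way to check whether a palette will survive black-and-white printing is to convert each colour to an approximate grey level and see whether the classes still step steadily from light to dark. This is only a sketch: the hex colours are illustrative, the 20-point threshold is an arbitrary assumption, and it says nothing about colour blindness, which needs a separate check.

```python
def luma(hex_colour):
    """Approximate brightness (0-255) of a colour, using Rec. 601 luma weights."""
    h = hex_colour.lstrip("#")
    r, g, b = (int(h[i:i + 2], 16) for i in (0, 2, 4))
    return 0.299 * r + 0.587 * g + 0.114 * b

def survives_greyscale(palette, min_step=20):
    """True if each colour is darker than the one before by at least min_step."""
    lumas = [luma(c) for c in palette]
    return all(a - b >= min_step for a, b in zip(lumas, lumas[1:]))

# Illustrative palettes: a range of similar colours vs strongly contrasting hues
similar = ["#eff3ff", "#bdd7e7", "#6baed6", "#3182bd", "#08519c"]
contrasting = ["#ff0000", "#00cc00", "#0000ff", "#ff9900", "#9900cc"]

print("Similar colours survive greyscale:    ", survives_greyscale(similar))
print("Contrasting colours survive greyscale:", survives_greyscale(contrasting))
```

The light-to-dark blues keep their order when reduced to grey, while the contrasting hues do not.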
Make one yourself? Try indiemapper.com