An ecological analysis of crime and antisocial behaviour in English Output Areas, 2011/12
Regression modelling of spatially hierarchical count data
Overview
1. Background
2. Data
3. Count data models
4. Extension to multilevel models
5. Extension to spatial models
6. Results and conclusions
Purpose of research
Ecological factors affecting crime incidence: Demographic Physical Environment Opportunity Cost Social Economic
Purpose of research
- Unit of analysis: Output Areas
- Modern techniques:
  - Count data models
  - Hierarchical data models
  - Spatial models
- Contextualise raw statistics often quoted
- Coverage: full population
- Interrogate ‘newly’ available data
- Illustrate the use of open data
Context
- Increasing divergence between police recorded crime and the Crime Survey for England and Wales
  → Crime statistics de-designation – January 2014
  → House of Commons PASC report – April 2014
  → HMIC report – November 2014
- August 2011 riots
Crime Data
- Source: data.police.uk
- Period: 2011/12
- Given:
- And the coefficient of variation is given by:
- An appropriate sample size is therefore determined based on Cochran’s formula
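The sample-size expressions on this slide did not survive extraction. As a hedged illustration only, one common textbook form of Cochran's formula for estimating a mean to within a relative error r uses the coefficient of variation CV = S / Ȳ, giving n0 = (z · CV / r)², followed by a finite population correction n = n0 / (1 + n0 / N). The function name, argument names and all numbers below are assumptions made for the example, not values from the study.

```python
import math


def cochran_sample_size(cv, rel_error, population, z=1.96):
    """Sample size for estimating a mean to within a relative error.

    cv         -- coefficient of variation (standard deviation / mean)
    rel_error  -- tolerated relative error, e.g. 0.05 for 5%
    population -- number of units in the population (N)
    z          -- normal quantile for the confidence level (1.96 ~ 95%)
    """
    # Infinite-population sample size.
    n0 = (z * cv / rel_error) ** 2
    # Finite population correction shrinks n0 when N is not huge.
    n = n0 / (1.0 + n0 / population)
    return math.ceil(n)


# Hypothetical inputs: CV of 1.0, 5% relative error.
n_large_pop = cochran_sample_size(1.0, 0.05, population=10**9)
n_small_pop = cochran_sample_size(1.0, 0.05, population=2000)
print(n_large_pop, n_small_pop)
```

The finite population correction matters only when the uncorrected n0 is a noticeable fraction of N, which is why the two hypothetical populations give different answers.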
Covariate data
- 2011 Census variables, e.g. young adult population, sex ratios, race, divorce rates, household structure, qualifications, method of travel to work, employment, population density
- ONS Neighbourhood Statistics variables, e.g. benefit claimants, small area income estimates
- DCLG variables: indices of deprivation and land use
- Summary classifications: Output Area Classifications and Rural Urban Classifications
Count data models – Poisson regression
- Poisson probability mass function (PMF):
- Model form:
- Rate parameterisation:
- Variance = mean = μ
- Goodness of fit tests:
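The formulas that followed the labels on this slide were lost in extraction. The standard textbook forms those labels usually refer to are sketched below. The notation is an assumption rather than something recovered from the slide: y_i is the count in area i, μ_i its mean, x_i the covariate vector, β the coefficients, and t_i an exposure such as the resident population.

```latex
\begin{align*}
\text{Poisson probability function:}\quad
  f(y_i;\mu_i) &= \frac{e^{-\mu_i}\,\mu_i^{y_i}}{y_i!}, \qquad y_i = 0, 1, 2, \dots \\
\text{Model form (log link):}\quad
  \ln \mu_i &= \mathbf{x}_i^{\top}\boldsymbol{\beta} \\
\text{Rate parameterisation (offset):}\quad
  \ln \mu_i &= \ln t_i + \mathbf{x}_i^{\top}\boldsymbol{\beta} \\
\text{Equidispersion:}\quad
  \operatorname{Var}(y_i) &= \operatorname{E}(y_i) = \mu_i
\end{align*}
```

In the rate parameterisation the coefficient on ln t_i is fixed at 1, so the model describes counts per unit of exposure rather than raw counts.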
Violation of equidispersion
Causes of apparent overdispersion:
- Omitted explanatory variables
- Outliers
- Omitted interaction terms
- Omitted variable transformations
- Mis-specified link function
Tests of equidispersion:
- Pearson/Deviance dispersion statistics ≠ 1
- Boundary likelihood ratio test:
- Score test: H0: α = 0; H1: α ≠ 0
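The Pearson dispersion statistic is the Pearson chi-square divided by the residual degrees of freedom; under a correctly specified Poisson model it should be close to 1. The following is a minimal sketch of that calculation. The counts are made up, and the "fit" is an intercept-only model whose fitted mean is simply the sample mean; neither comes from the study.

```python
def pearson_dispersion(counts, fitted_means, n_params):
    """Pearson chi-square / residual degrees of freedom for a Poisson fit."""
    if len(counts) != len(fitted_means):
        raise ValueError("counts and fitted_means must have the same length")
    resid_df = len(counts) - n_params
    if resid_df <= 0:
        raise ValueError("need more observations than parameters")
    # Under the Poisson assumption the variance of each count equals its mean.
    chi2 = sum((y - mu) ** 2 / mu for y, mu in zip(counts, fitted_means))
    return chi2 / resid_df


# Hypothetical counts for eight areas (illustrative only).
counts = [0, 2, 1, 9, 0, 14, 3, 1]
# Intercept-only Poisson fit: every fitted mean equals the sample mean.
sample_mean = sum(counts) / len(counts)
fitted = [sample_mean] * len(counts)

dispersion = pearson_dispersion(counts, fitted, n_params=1)
print(round(dispersion, 3))
```

A value well above 1, as with these invented counts, signals overdispersion relative to the Poisson model; whether it is real or only apparent depends on the causes listed on the slide.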
Count data models – Negative binomial model (NB2)
- Originates from the binomial PMF
- Wide range of formulations, e.g. NB-C, NB1, NB2, NB-P, geometric negative binomial
- The traditional formulation is the NB2 model
- Derivation of the NB2 model as a GLM:
  - Poisson PMF with gamma-distributed heterogeneity
  - Derive the NB-C model
  - Convert to log-linked form
- Variance: μ + μ²/ν = μ + αμ², where α = 1/ν
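The NB2 variance function can be checked numerically. The sketch below writes out the usual NB2 probability function in terms of the mean μ and the dispersion parameter α, then confirms on a truncated support that the mean is μ and the variance is μ + αμ². The parameter values are arbitrary and the function name is an assumption for the example.

```python
import math


def nb2_pmf(y, mu, alpha):
    """NB2 probability of count y with mean mu and dispersion alpha > 0."""
    inv_alpha = 1.0 / alpha
    # Work on the log scale to avoid overflow in the gamma functions.
    log_p = (
        math.lgamma(y + inv_alpha)
        - math.lgamma(y + 1)
        - math.lgamma(inv_alpha)
        + inv_alpha * math.log(1.0 / (1.0 + alpha * mu))
        + y * math.log(alpha * mu / (1.0 + alpha * mu))
    )
    return math.exp(log_p)


mu, alpha = 4.0, 0.5           # arbitrary illustrative values
support = range(500)           # tail mass beyond 500 is negligible here
probs = [nb2_pmf(y, mu, alpha) for y in support]

total = sum(probs)
mean = sum(y * p for y, p in zip(support, probs))
variance = sum((y - mean) ** 2 * p for y, p in zip(support, probs))
print(total, mean, variance)
```

As α approaches 0 the extra term αμ² vanishes and the distribution approaches the Poisson, which is why tests of equidispersion are framed in terms of α.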
Hierarchical count data models
- Property crime rates per 1,000 fixed assets, by police force area
- Total crime rate per usual resident, by police force, for Output Area populations in the sample
Multilevel NB2 model
The level 1 variance is:
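The model equations on this slide were not preserved. A generic two-level random-intercept NB2 specification, consistent with the NB2 variance function given on the negative binomial slide, is sketched below. It is an assumption about the general form, not the fitted model: i indexes Output Areas, j indexes police force areas, t_ij is an exposure term, and u_j is the police-force-level random intercept.

```latex
\begin{align*}
y_{ij} \mid \mu_{ij} &\sim \text{NB2}(\mu_{ij}, \alpha) \\
\ln \mu_{ij} &= \ln t_{ij} + \mathbf{x}_{ij}^{\top}\boldsymbol{\beta} + u_j,
  \qquad u_j \sim N(0, \sigma_u^2) \\
\operatorname{Var}(y_{ij} \mid \mu_{ij}) &= \mu_{ij} + \alpha\,\mu_{ij}^{2}
\end{align*}
```

Here σ_u² captures between-police-force variability, while α captures the remaining overdispersion among Output Areas within a force.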
Controlling for unobserved spatial dependencies
Moran’s I measures the linear association between the value in an area and the spatially weighted average of the values in its neighbouring areas
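Moran's I can be computed directly from its definition, I = (n / S0) · Σ_i Σ_j w_ij (x_i − x̄)(x_j − x̄) / Σ_i (x_i − x̄)², where S0 is the sum of all weights. The sketch below is a minimal pure-Python version; the four-area chain, its binary contiguity weights and the values are invented for illustration.

```python
def morans_i(values, weights):
    """Global Moran's I for a list of values and a square weights matrix."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    s0 = sum(sum(row) for row in weights)  # sum of all spatial weights
    cross = sum(
        weights[i][j] * dev[i] * dev[j]
        for i in range(n)
        for j in range(n)
    )
    total_sq = sum(d * d for d in dev)
    return (n / s0) * cross / total_sq


# Four hypothetical areas in a row; each area neighbours the adjacent ones.
w = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]

i_trend = morans_i([1, 2, 3, 4], w)        # similar values cluster together
i_alternating = morans_i([1, 4, 1, 4], w)  # neighbours are dissimilar
print(i_trend, i_alternating)
```

Positive values indicate that similar values cluster in space and negative values that neighbours tend to differ. With row-standardised weights S0 equals n, so the statistic reduces to the slope of the spatially lagged value on the value itself.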
Final model
Pearson dispersion statistic < 1.148
Results
- Variance:
- Between-police-force variability is significant:
Results
Columns: Parameter | Estimate | Standard Error | Exponentiated Parameter Estimate
Fixed Part (truncated):
- fixed intercept
- perc_age16_29_meancentred
- sex_ratio_perc_meancentred
- divorced_percent_meancentred
- perc_leaders_meancentred
- spatial lag
Random Part:
- random intercept
- ancillary parameter
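In a log-link count model the exponentiated parameter estimate is an incidence rate ratio: the multiplicative change in the expected count for a one-unit increase in the covariate, holding the others fixed. The table's numeric values did not survive, so the sketch below uses a hypothetical estimate and standard error. The Wald-type interval is a common convention added here for illustration, not something stated on the slide.

```python
import math


def rate_ratio(estimate, std_error, z=1.96):
    """Exponentiate a log-scale coefficient and its Wald interval."""
    irr = math.exp(estimate)
    lower = math.exp(estimate - z * std_error)
    upper = math.exp(estimate + z * std_error)
    return irr, lower, upper


# Hypothetical coefficient for a mean-centred percentage covariate.
irr, lower, upper = rate_ratio(estimate=0.02, std_error=0.005)
print(f"rate ratio {irr:.4f} (95% interval {lower:.4f} to {upper:.4f})")
```

With these invented numbers, a one-percentage-point increase above the mean would multiply the expected count by about 1.02, i.e. roughly a 2% increase.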
Conclusions and policy implications
- There are significant differences in crime rates across police force areas
- Urbanisation has the strongest influence on the relative risk of crime in Output Areas
- An area's relative affluence, rather than its absolute affluence, has an impact on crime
- Racial composition and immigrant populations have no significant impact on crime in England
Questions?
Contact details
Chuka Ilochi
Abbey 2, Floor 5
BIS
1 Victoria Street
London SW1H 0ET
Tel: