Bayesian space-time models for surveillance and policy evaluation using small area data Nicky Best Department of Epidemiology and Biostatistics Imperial.

Bayesian space-time models for surveillance and policy evaluation using small area data Nicky Best Department of Epidemiology and Biostatistics Imperial College, London Joint work with Guangquan (Philip) Li, Sylvia Richardson, Bob Haining, Anna Hansell, Mireille Toledano, Lea Fortunato

Outline Introduction Policy Evaluation: Evaluating Cambridgeshire Constabulary’s ‘no cold calling’ initiative Surveillance: Detecting unusual trends in chronic disease rates

Introduction Bayesian space-time modelling of small-area data is now common in many application areas  disease mapping  small area estimation (official statistics)  mapping crime rates  modelling population change ..... Key feature is that data are sparse Bayesian hierarchical model allows smoothing over space and time → improved inference

Introduction Many different inferential goals  description  prediction  surveillance  estimation of change / policy impact ..... Many different ways of formulating the space-time model  space + time (separable effects)  space + time + interaction  space-time mixture models .....

Our set-up
Inferential goals: detection of areas with ‘unusual’ time trends
- Goal 1: Policy evaluation. A policy or intervention has been implemented in a known subset of areas, and we wish to evaluate whether this has had a measurable impact on the event rate in those areas
- Goal 2: Surveillance. There is no a priori subset of areas of interest; we just wish to identify any areas whose event rate differs markedly from the general time trend
General modelling framework
- Assume most areas exhibit a common temporal trend (separable space and time effects): the ‘common trend’ model
- For a small subset of areas, assume the time trend is unusual (space-time interaction): the ‘local trend’ model

Goal 1: Policy Evaluation Evaluating Cambridgeshire Constabulary’s ‘No Cold Calling’ initiative In collaboration with Guangquan Li *, Robert Haining +, Sylvia Richardson + University of Cambridge * Imperial College, London

Definition of a “cold call” A visit or a telephone call to a consumer by a trader, whether or not the trader supplies goods or services, which takes place without the consumer expressly requesting the contact. Not illegal but often associated with forms of burglary and “rogue trading”. To discourage cold calling police have targeted specific neighbourhoods as “no cold calling” (NCC) areas: street and house signage; information packs for residents; informal follow-up meetings. Cambridgeshire Constabulary initiated NCC scheme in parts of Peterborough in 2005 and extended it in 2006.

Locations of the NCC areas in Peterborough

Summary of NCC-targeted areas

Data for evaluation
- All reported “burglary in a dwelling” events (Home Office classification code 18, sub-codes 0-10, and code 29) used as outcome
  - Surrogate for rogue trading and distraction burglary (very small number of recorded events)
- Data aggregated to annual counts by Census Output Area (COA) in Peterborough
- Time period: 2001-2008
- Total of 9388 recorded burglaries
  - Median burglaries per area per year = 2
  - 5th and 95th percentiles: 0 to 8

Raw data: individual and aggregated time trends
Positive impact of policy?
Poisson test: RR (2001-04) = 1.06, p = 0.56; RR (2005-08) = 0.85, p = 0.19

Strategy for evaluation Compare burglary rates before and after implementation of NCC scheme  difference between 2 time periods is indicative of impact of policy Comparison is done after adjustment for systematic changes in burglary rate in other non-NCC areas  use of ‘control’ areas helps to differentiate how much of the change is due to the policy and how much to other external factors Deal with sparsity of the data (i.e. small number of burglary events) by  Data aggregation → assessing overall impact  Hierarchical modelling of local impacts → assessing both overall and local impacts → Separate signal from noise

Constructing the control group
- Control areas are selected to have similar local characteristics (e.g. burglary rates; deprivation scores) to those in the NCC-targeted group
- Control areas are chosen to be Lower Super Output Areas (LSOA) to obtain reliable control data (results are similar with COA-level controls)

Control  Criterion description                                                            No. of LSOAs
1        All LSOAs in Peterborough                                                        88
2        ±10% burglary rate of the NCC group in 2005                                      9
3        ±20% burglary rate of the NCC group in 2005                                      20
4        ±30% burglary rate of the NCC group in both 2004 and 2005                        7
5        LSOAs containing the NCC-targeted COAs (but excluding the NCC-targeted COAs)     10
6        LSOAs that had “similar” multiple deprivation scores (MDS) as those for the
         NCC LSOAs in 2004                                                                46

Evaluation procedure

The impact function
- We consider various functional forms for the impact function (Box and Tiao, 1975)
- The impact of the policy is quantified through the estimation of the function parameter(s)
- Model selection via DIC
Candidate forms (the functional-form column is not preserved in this transcript): no change; step change; a linear function of time; a generalization function
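To make the idea of an impact function concrete, here is a minimal Python sketch of three of the four candidate forms. Only the linear form, f(t, b_k) = b_k (t - t_0 + 1), is stated elsewhere in the talk. The step form (a constant shift b from t_0 onwards) is an assumed textbook version, and the generalization function is left out because its form is not given. The function names, the value of b and the reading of f as an additive shift on the log-rate scale are all illustrative assumptions.

```python
def no_change(t, t0, b):
    """No impact: the area keeps following the common trend."""
    return 0.0


def step_change(t, t0, b):
    """Assumed step form: a constant shift b from year t0 onwards."""
    return b if t >= t0 else 0.0


def linear_change(t, t0, b):
    """Linear form from the talk: b * (t - t0 + 1) for t >= t0."""
    return b * (t - t0 + 1) if t >= t0 else 0.0


years = range(2001, 2009)  # study period 2001-2008
t0 = 2005                  # first year of the NCC scheme
b = -0.1                   # hypothetical impact parameter

for name, f in [("no change", no_change),
                ("step", step_change),
                ("linear", linear_change)]:
    print(name, [round(f(t, t0, b), 2) for t in years])
```

With a negative b, the step form drops once and stays there, while the linear form keeps moving away from the common trend each year after the scheme starts.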

Full model specification
One specification applies to control areas plus NCC areas pre-scheme; a second applies to NCC areas post-scheme ($t \ge t_0$)

Implementation
- Model fitted in WinBUGS
- Common trend model fitted to control areas (all years) plus NCC areas (years before scheme only)
- Local trend model (impact function), for $t \ge t_0$, fitted to NCC areas (years after scheme)
- ‘Cut’ function used to prevent NCC area (post-scheme) data influencing estimation of common trend model parameters: across a ‘cut’ link, the common trend quantities enter the local trend model as distributional constants (no learning)
[Diagram: graphical model connecting the common trend model (data $y_{it}$) to the local trend model (data $y_{kt}$, impact parameters $b_k$) through ‘cut’ links, for $k = i$]

Results: choice of impact function
Linear impact function has smallest DIC

        No change   Step    Linear   Generalization function
Dbar    15.27       14.32   9.77     11.75
pD      1.21        2.29    2.25     2.57
DIC     16.49       16.61   12.02    14.33
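As a quick check on how the DIC row relates to the other two, the sketch below uses the standard definition DIC = Dbar + pD with the Dbar and pD values reported on this slide, then picks the form with the smallest DIC. The dictionary and variable names are illustrative. Sums computed from the rounded inputs can differ from the reported DIC by 0.01.

```python
# (Dbar, pD) pairs as reported on the slide
fits = {
    "No change": (15.27, 1.21),
    "Step": (14.32, 2.29),
    "Linear": (9.77, 2.25),
    "Generalization function": (11.75, 2.57),
}

# DIC = posterior mean deviance + effective number of parameters
dic_values = {name: dbar + p_d for name, (dbar, p_d) in fits.items()}

# Smaller DIC indicates the preferred model
best = min(dic_values, key=dic_values.get)

for name, value in dic_values.items():
    print(f"{name:25s} DIC = {value:.2f}")
print("Preferred:", best)
```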

Posterior probability of “success”, i.e. $\Pr(b_k < 0)$
[Figure not preserved; it includes a ‘No change’ reference label]
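With MCMC output, a posterior probability such as Pr(b_k < 0) is usually estimated as the share of posterior draws of b_k that fall below zero. The sketch below shows that calculation in Python. The draws are simulated stand-ins, not output from the model in the talk, and the function name is hypothetical.

```python
import random


def prob_success(draws):
    """Fraction of posterior draws of b_k that are below zero."""
    if not draws:
        raise ValueError("need at least one posterior draw")
    return sum(1 for b in draws if b < 0) / len(draws)


# Hypothetical stand-in for MCMC output: 5000 draws centred on -0.15
rng = random.Random(1)
fake_draws = [rng.gauss(-0.15, 0.10) for _ in range(5000)]

print(round(prob_success(fake_draws), 3))
```

A value close to 1 means most of the posterior mass for b_k lies below zero, i.e. strong evidence of a reduction in that area.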

Heterogeneity of local impacts
$f(t, b_k) = b_k \cdot (t - t_0 + 1)$; $b_k = \mu + \beta x_k + \varepsilon_k$; $\varepsilon_k \sim N(0, \sigma^2)$
- Some of the variability in local NCC impacts may be due to coverage
- The larger the proportion of properties that were visited in a COA, the greater the impact of the NCC scheme
- $\beta = -1.1$, 95% CI $(-2.6, 0.2)$

Heterogeneity of local impacts Two possible explanations for coverage effect A “threshold” effect  NCC scheme does not have a measurable impact (in terms of reducing burglary rates) unless a sufficient number of households in the local area are visited A “dilution” effect  Because the COA is the unit of analysis, the NCC scheme impact could be diluted when the households that are visited are only a small proportion of the total households in the COA  Neither of these explanations for the coverage effect undermines our overall assessment of the policy’s success

Conclusions: NCC scheme NCC scheme led to overall “success”  Overall, NCC-targeted areas experienced a 16% (95% CI: -2% to 34%) reduction in burglary rate per year This suggests a positive impact of the NCC policy which had the effect of stabilizing burglary rate in the targeted areas while overall burglary rates were going up Linear impact function is better at describing the data than the other 3, suggesting a gradual and persistent change There exist different impacts between targeted COAs, perhaps due to local differences in implementing the schemes

Assessing NCC impact for whole of Cambridgeshire The NCC scheme was extended to the whole of Cambridgeshire for the period 2005-08 We applied our evaluation model to assess impact of NCC scheme separately for urban and rural areas Overall, schemes in urban areas were more successful than those in rural areas.

[Figure: % change in burglary rates after 1st year of NCC scheme; panels: Urban, Rural; overall values labelled (0.96) and (0.38) in panel order; ‘No change’ reference label]

Conclusions: Model Hierarchical model allows borrowing of strength across NCC areas  enables evaluation of local impacts even when data are sparse Joint estimation of common trend and local trend models enables full propagation of uncertainty  Parameters of common trend model treated as ‘distributional constants’ in local trend model  Facilitated using ‘cut’ function in WinBUGS More complex impact functions could be implemented, but need sufficient time points post-policy for reliable estimation

Goal 2: Surveillance Detecting unusual trends in chronic disease rates In collaboration with Guangquan Li, Sylvia Richardson, Anna Hansell, Mireille Toledano, Lea Fortunato Imperial College, London

Surveillance of small area data
- For many areas of application, such as small area estimates of income, unemployment, crime rates and rates of chronic diseases, smooth time changes are expected in most areas
- However, policy makers and researchers are often interested in identifying areas that ‘buck’ the national trend and exhibit unusual temporal patterns
- These abrupt changes may be due to emergence of localised predictors or risk factor(s), or the impact of a new policy or intervention
- Detection of areas with “unusual” temporal patterns is therefore important as a screening tool for further investigations

Motivating example 1: COPD mortality Chronic Obstructive Pulmonary Disease (COPD) is a common chronic condition characterized by slowly progressive and irreversible decline in lung function  responsible for approximately 5% of deaths in the UK Main risk factors include  Smoking  Occupational exposure to high levels of dusts and fumes  Outdoor air pollution “Umbrella” term for broad range of disease phenotypes Time trends may reflect variation in risk factors and also variation in diagnostic practice/definitions

Motivating example 1: COPD mortality Objective 1: Retrospective surveillance  to highlight areas with a potential need for further investigation and/or intervention (e.g. additional resource allocation) Objective 2: Policy assessment  Industrial Injuries Disablement Benefit was made available for miners developing COPD from 1992 onwards in the UK  As miners with other respiratory problems with similar symptoms (e.g., asthma) could potentially have benefited from this scheme, there was debate on whether this policy may have differentially increased the likelihood of a COPD diagnosis in mining areas

Data
- Observed and age-standardized expected annual counts of COPD deaths in males aged 45+ years
  - 374 local authority districts in England & Wales
  - 8 years (1990-1997)
- Difficult to assess departures of the local temporal patterns by eye
- Need methods to
  - quantify the difference between the common trend pattern and the local trend patterns
  - express uncertainty about the detection outcomes

Bayesian Space-Time Detection: BaySTDetect
- BaySTDetect (Li et al 2011) is a novel detection method for short time series of small area data using Bayesian model choice between two competing space-time models
  - Model 1 assumes space-time separability for all areas → one common temporal pattern across the whole study region
  - Model 2 provides local time trend estimates for each spatial unit individually
- For each area, a model indicator is introduced to decide whether Model 1 or Model 2 is supported by the data → quantifying the difference
- A Bayesian procedure for controlling the false discovery rate is employed → expressing uncertainty about detected areas

BaySTDetect: modelling framework
- Model 1: the temporal trend pattern is the same for all areas
- Model 2: temporal trends are independently estimated for each area
- Model selection: a model indicator $z_i$ indicates for each area whether Model 1 ($z_i = 1$) or Model 2 ($z_i = 0$) is supported by the data

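Implementation
[Diagram: graphical model with three parts. Model 1 (common trend): data $y_{it}$ with a mean labelled [C], built from an area effect, a time effect and the expected count $E_{it}$. Model 2 (local trend): data $y_{it}$ with a mean labelled [L], built from an area effect $u_i$, an area-specific time effect and $E_{it}$. Selection model: data $y_{it}$ and $E_{it}$, with the indicator $z_i$ choosing between the two means.]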

Prior on model indicator
The model indicator $z_i$ is given a prior [formula not preserved in this transcript]. This prior on $z_i$
- reflects the surveillance nature of the analysis, where we expect to find only a small number of unusual areas a priori
- ensures that a common trend can be meaningfully defined and estimated

Classifying areas as “unusual”
- Classification of areas as “unusual” is based on the posterior model probabilities $p_i = \Pr(z_i = 1 \mid \text{data})$
- Small values of $p_i$ indicate low probability that area $i$ fits the common trend → high probability of being “unusual”
- Need a rule for calibrating the $p_i$ that acknowledges the multiple testing setting
  - How low does $p_i$ need to be in order to declare area $i$ as unusual?
- False Discovery Rate (FDR) is the proportion of detected areas that are false discoveries (i.e. not truly unusual) (Benjamini & Hochberg, 1995)
- Various methods to estimate or control FDR
- Here we control the posterior expected FDR (Newton et al 2004)

Detection rule based on FDR control
- First rank the areas according to increasing values of $p_i$
- At a nominal FDR level of $\alpha$, the first $k$ ranked areas are declared unusual, where $k$ is the maximum integer satisfying $\sum_{j=1}^{k} p_{(j)} \le k\alpha$, and $p_{(j)}$ is the $j$th ranked posterior common-trend model probability
- This procedure ensures that the (posterior) expected number of false positives is no more than $k \times \alpha$ of the $k$ declared unusual areas
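The ranking rule can be sketched in a few lines of Python. The inequality itself is not shown in the transcript, so the sketch assumes the rule is: take the largest k such that the sum of the k smallest common-trend probabilities is at most k times the nominal FDR level. That reading matches the stated bound on the expected number of false positives. The function name and the example probabilities are hypothetical.

```python
def detect_unusual(p, alpha):
    """Return indices of areas declared unusual at nominal FDR level alpha.

    p[i] is the posterior probability that area i follows the common trend.
    Areas are ranked by increasing p, and the largest k is kept for which
    the sum of the k smallest probabilities is at most k * alpha.
    """
    order = sorted(range(len(p)), key=lambda i: p[i])
    running_sum = 0.0
    k = 0
    for j, i in enumerate(order, start=1):
        running_sum += p[i]          # expected false positives among first j
        if running_sum <= j * alpha:
            k = j
    return order[:k]


# Hypothetical posterior common-trend probabilities for six areas
p = [0.01, 0.9, 0.02, 0.3, 0.99, 0.04]
print(detect_unusual(p, alpha=0.05))  # [0, 2, 5]
print(detect_unusual(p, alpha=0.20))  # [0, 2, 5, 3]
```

In the first call, the three smallest probabilities sum to 0.07, which is below 3 × 0.05 = 0.15, while adding the fourth (0.3) pushes the sum to 0.37, above 4 × 0.05 = 0.20, so only three areas are flagged.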

Simulation study to evaluate operating characteristics of BaySTDetect
- Simulated data were based on the observed COPD mortality data
- Three departure patterns were considered
- When simulating the data, either the original set of expected counts from the COPD data or a reduced set (multiplying the original by 1/5) was used
- 15 areas (approx. 4%) were chosen to have the unusual trend patterns
  - areas were chosen to cover a wide range of expected count values and overall spatial risks
- Results were compared to those from the popular SaTScan space-time scan statistic

Simulation Study: Departure patterns
- Common trend: $\exp(\gamma_t)$
- Departure pattern: $\exp(\gamma_t \cdot \theta)$
- 2 different departure magnitudes: $\theta = 1.5$ and $\theta = 2.0$

Simulation Study: expected counts Table: Summary of the original set of age-adjusted expected counts used in the simulation

Simulation Study: FDR control
Empirical FDR vs corresponding pre-defined level: Pattern 2
SaTScan: empirical FDR = 0.19 (0.00 to 0.78) for the scenario with original expected counts and departure magnitude 2.0
[Figure: three panels (original expected counts, magnitude 1.5; original expected counts, magnitude 2.0; reduced expected counts, magnitude 2.0); x-axis: pre-set FDR level (0.05 to 0.20); y-axis: empirical FDR (0 to 1); mean and 95% sampling interval shown]

Sensitivity of detecting the 15 truly unusual areas
[Figure: Pattern 2; four panels comparing BaySTDetect (FDR=0.1) with SaTScan (p=0.05) at true departure magnitudes 2.0 and 1.5; x-axis: expected count quantiles (E=24, 33, 42, 52, 80); y-axis: sensitivity (0 to 1)]

Sensitivity of detecting the 15 truly unusual areas: reduced expected counts
[Figure: Pattern 2, true departure magnitude 2.0; two panels, BaySTDetect (FDR=0.1) and SaTScan (p=0.05); x-axis: expected count quantiles (E=5, 6, 8, 11, 16); y-axis: sensitivity (0 to 1)]
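The two operating characteristics reported in these simulation slides can be computed for a single simulated data set as sketched below in Python: empirical FDR is the share of detected areas that are not truly unusual, and sensitivity is the share of truly unusual areas that are detected. The function names, the area indices and the convention of returning 0 when nothing is detected are assumptions made for illustration.

```python
def empirical_fdr(detected, truly_unusual):
    """Share of detected areas that are not truly unusual."""
    detected, truth = set(detected), set(truly_unusual)
    if not detected:
        return 0.0  # assumed convention: no detections, no false discoveries
    return len(detected - truth) / len(detected)


def sensitivity(detected, truly_unusual):
    """Share of truly unusual areas that were detected."""
    detected, truth = set(detected), set(truly_unusual)
    if not truth:
        raise ValueError("need at least one truly unusual area")
    return len(detected & truth) / len(truth)


# Hypothetical run: 15 truly unusual areas out of 374;
# 12 of them are detected, along with 2 false alarms.
truth = list(range(15))
detected = list(range(12)) + [100, 200]

print(round(empirical_fdr(detected, truth), 3))  # 2 / 14
print(round(sensitivity(detected, truth), 3))    # 12 / 15
```

Averaging these two quantities over many simulated data sets gives the kind of summary plotted against the pre-set FDR level and the expected count quantiles.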

COPD application: Detected areas (FDR=0.05)

COPD application: Interpretation Results provide little support for hypothesis regarding the industrial injuries policy  only 3 out of 40 ‘mining’ districts detected (Barnsley, Carmarthenshire and Rotherham)  unusual trend patterns in these areas are not consistent Two unusual districts (Lewisham and Tower Hamlets) with an increasing trend (against a national decreasing trend) were identified in inner London  These areas are very deprived, with high in-migration and ethnic minorities → might expect different trends to rest of country  In fact, Tower Hamlets has been commissioning various local enhanced services to tackle high rates of COPD mortality since 2008.  This rising trend could potentially have been recognised earlier in the 1990s through using BaySTDetect as a surveillance tool.

COPD application: SaTScan Primary cluster: North (46 districts) – excess risk of 1.05 during 1990-92 Secondary cluster: Wales (19 districts) – excess risk of 1.12 during 1995-96

Example 2: Data mining of cancer registries
- The Thames Cancer Registry (TCR) collects data on newly diagnosed cases of cancer in the population of London and South East England
- It is one of the largest cancer registries in Europe, covering a population of over 12 million, and holds nearly 3 million cancer registration records
- We perform a retrospective surveillance of time trends for several cancer types using BaySTDetect
  - aim to provide a screening tool to detect areas with “unusual” temporal patterns
  - automatically flag up areas warranting further investigation

Cancer data Cancer incidence for population aged 30+ years  Breast (female only)  Colon (males and females combined)  Lung (males and females, separately) South East England, ward level (1899 areas) Period 1981-2008  Data were aggregated by 4-year intervals  7 time periods for the detection analysis

Cancer data summary

Cancer type   Count   Min   Q1     Median   Mean   Q3     Max
Breast        OBS     0.0   10.0   16.0     17.6   24.0   69.0
              EXP     0.0   11.3   16.5     17.6   23.0   56.5
Colon         OBS     0.0   5.0    8.0      9.1    12.0   42.0
              EXP     0.0   5.7    8.5      9.1    11.8   34.6
Female lung   OBS     0.0   3.0    5.0      6.4    9.0    34.0
              EXP     0.0   4.0    5.9      6.4    8.3    24.5
Male lung     OBS     0.0   6.0    10.0     11.8   16.0   66.0
              EXP     0.0   7.6    11.2     11.8   15.2   39.5

Comparable to reduced expected count scenario in simulation study

Results: Number of detected areas (out of 1899)

Cancer type     FDR=0.05   FDR=0.1   FDR=0.15   FDR=0.2
Breast          9          19        35         54
Colon           0          3         5          8
Lung (female)   0          1         2          4
Lung (male)     6          14        24         39

Detected areas: breast cancer

Summarising the unusual trends
- With a relatively large number of detected areas (e.g., breast and male lung cancer), examination of the individual trends becomes difficult
- For the detected areas, the estimated RR trends from the local trend model are fed into a standard hierarchical clustering method (hclust in R)
- The cluster-specific trends are then compared to the overall RR trend
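The talk used hclust in R for this step. As a rough, self-contained stand-in, the Python sketch below does agglomerative clustering with complete linkage and Euclidean distance (the hclust default method), then averages the trends within each cluster. The linkage and distance actually used in the analysis are not stated, so both are assumptions, and the four example trends are made up.

```python
import math


def euclid(a, b):
    """Euclidean distance between two trend vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def complete_linkage(trends, n_clusters):
    """Merge clusters until n_clusters remain (complete linkage)."""
    clusters = [[i] for i in range(len(trends))]
    while len(clusters) > n_clusters:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                # Complete linkage: distance between the farthest members
                d = max(euclid(trends[i], trends[j])
                        for i in clusters[a] for j in clusters[b])
                if best is None or d < best[0]:
                    best = (d, a, b)
        _, a, b = best
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return clusters


def mean_trend(trends, members):
    """Average trend across the areas in one cluster."""
    n_periods = len(trends[0])
    return [sum(trends[i][t] for i in members) / len(members)
            for t in range(n_periods)]


# Hypothetical log-RR trends over 7 periods for 4 detected areas
trends = [
    [0.00, 0.10, 0.20, 0.30, 0.40, 0.50, 0.60],
    [0.00, 0.12, 0.22, 0.28, 0.41, 0.50, 0.62],
    [0.00, -0.10, -0.20, -0.30, -0.40, -0.50, -0.60],
    [0.02, -0.08, -0.20, -0.31, -0.40, -0.52, -0.60],
]

for members in complete_linkage(trends, n_clusters=2):
    print(sorted(members), [round(v, 2) for v in mean_trend(trends, members)])
```

Each printed cluster average plays the role of one coloured line, to be compared against the common trend.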

Breast cancer, FDR=0.2
[Figure: five panels showing solutions with 1, 2, 3, 4 and 5 clusters; black line = common trend; coloured lines = average local trend in each cluster]

BaySTDetect: Conclusions and Extensions We have proposed a Bayesian space-time model for retrospective detection of unusual time trends Simulation study has shown good performance of the model in detecting various realistic departures with relatively modest sample sizes Possible extensions include:  Spatial prior on z i to allow for clusters of areas with unusual trends  Time-specific model choice indicator z it, to allow longer time series to be analysed  Alternative approaches to calibrating posterior model probabilities, e.g. decision theoretic approach (Wakefield, 2007; Muller et al., 2007)

References
- G. Li, R. Haining, S. Richardson and N. Best. Evaluating Neighbourhood Policing using Bayesian Hierarchical Models: No Cold Calling in Peterborough, England. Submitted.
- G. Li, N. Best, A. Hansell, I. Ahmed and S. Richardson. BaySTDetect: detecting unusual temporal patterns in small area data via Bayesian model choice. Submitted.
- G. Li, S. Richardson, L. Fortunato, I. Ahmed, A. Hansell and N. Best. Data mining cancer registries: retrospective surveillance of small area time trends in cancer incidence using BaySTDetect. Proceedings of the International Workshop on Spatial and Spatiotemporal Data Mining, 2011.
www.bias-project.org.uk
Funded by ESRC National Centre for Research Methods
