Disease Prevalence Estimates for Neighbourhoods: Combining Spatial Interpolation and Spatial Factor Models Peter Congdon, Queen Mary University of London.

Presentation transcript:

Disease Prevalence Estimates for Neighbourhoods: Combining Spatial Interpolation and Spatial Factor Models Peter Congdon, Queen Mary University of London 1

Data on disease prevalence 2
• Health data may be collected across one spatial framework (e.g. health providers), but policy interest may lie in health contrasts over another spatial framework (e.g. neighbourhoods).
• Seek to use data from one framework to provide spatially interpolated estimates of disease prevalence for the other.
• Also incorporate neighbourhood morbidity indicators that may provide further information on prevalence.

Data Framework 3
• Focusing on England: prevalence totals for chronic diseases are maintained by 8200 general practices for their populations (subject to measurement error, i.e. excesses or deficits in "case-finding"). See Prevalence data tables at
• These data are not provided for small area populations, e.g. neighbourhoods across England (Lower Super Output Areas, or LSOAs).
• Study focus: GP populations and LSOAs in Outer NE London (970K population), and estimating neighbourhood psychosis prevalence.

London Borough Map 4

Discrete Process Convolution 5
• Use principles of discrete process convolution to estimate neighbourhood prevalence.
• Geostatistical techniques (multivariate Gaussian process) are computationally demanding for the large number of units involved.
• Base framework: prevalence for GP populations.
• Target framework: prevalence for neighbourhoods.
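The model equations on the following slides are not reproduced in this transcript, so the snippet below is only a generic sketch of a discrete process convolution, not the presenter's specification. It assumes a Gaussian kernel, a regular 5 x 5 grid, and made-up coordinates for the base (GP) and target (neighbourhood) sites. All names (`convolve_process`, `gp_sites`, `lsoa_sites`, `scale`) are hypothetical. The point it illustrates is that one set of latent grid effects w_j yields a smooth spatial effect at any location in either framework.

```python
import numpy as np

def gaussian_kernel(dist, scale):
    """Unnormalised Gaussian kernel; equals 1 at zero distance."""
    return np.exp(-0.5 * (dist / scale) ** 2)

def convolve_process(sites, grid, w, scale):
    """Spatial effect at each site: kernel-weighted sum of the grid effects w."""
    diff = sites[:, None, :] - grid[None, :, :]   # (n_sites, n_grid, 2)
    dist = np.sqrt((diff ** 2).sum(axis=2))       # pairwise distances
    return gaussian_kernel(dist, scale) @ w       # (n_sites,)

rng = np.random.default_rng(0)

# Hypothetical 5 x 5 grid of knots covering a 10 x 10 study region.
gx, gy = np.meshgrid(np.linspace(0, 10, 5), np.linspace(0, 10, 5))
grid = np.column_stack([gx.ravel(), gy.ravel()])

# Latent process effects at the knots (normal here; Student t is an alternative).
w = rng.normal(0.0, 1.0, size=grid.shape[0])

# Made-up locations for the two frameworks.
gp_sites = rng.uniform(0, 10, size=(8, 2))      # base framework
lsoa_sites = rng.uniform(0, 10, size=(12, 2))   # target framework

# The same w drives both frameworks, which is what permits interpolation.
s_gp = convolve_process(gp_sites, grid, w, scale=2.5)
s_lsoa = convolve_process(lsoa_sites, grid, w, scale=2.5)
```

In a full model the grid effects would be estimated from the GP prevalence data and then pushed through the same kernel to the neighbourhood locations. The cost grows with the number of grid points rather than with a full covariance matrix over all units.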

Discrete Process Model 6

Model for Base Framework, Study Data 7

Model for Target Framework 8

INCORPORATING OBSERVED INDICATORS of NEIGHBOURHOOD PREVALENCE 9

SCHEMATIC REPRESENTATION 10

LIKELIHOOD: REFLEXIVE INDICATORS 11

PARAMETER IDENTIFICATION 12

POTENTIAL SENSITIVITY IN INFERENCES & FIT 13
• Sensitivity to kernel density choice
• Sensitivity to constraint adopted (kernel scale set or unknown; process variance set or unknown)
• Sensitivity to form of process effects, e.g. w_j normal vs Student t
• Sensitivity to density of discrete grid
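As a rough illustration of why kernel choice can matter, the sketch below compares normalised weights from a Gaussian and an exponential kernel at the same distances and scale. The two kernel forms, the distances, and the function name `kernel_weights` are assumptions chosen for illustration; the transcript does not say which kernels were compared.

```python
import numpy as np

def kernel_weights(dist, scale, kind="gaussian"):
    """Kernel weights over a set of distances, normalised to sum to one."""
    dist = np.asarray(dist, dtype=float)
    if kind == "gaussian":
        k = np.exp(-0.5 * (dist / scale) ** 2)
    elif kind == "exponential":
        k = np.exp(-dist / scale)
    else:
        raise ValueError(f"unknown kernel kind: {kind}")
    return k / k.sum()

# Hypothetical distances from one neighbourhood to five grid points.
dist = np.array([0.5, 1.0, 2.0, 4.0, 8.0])

w_gauss = kernel_weights(dist, scale=2.0, kind="gaussian")
w_exp = kernel_weights(dist, scale=2.0, kind="exponential")
```

With these inputs the exponential kernel gives more weight to the most distant grid point than the Gaussian kernel does. Heavier-tailed kernels therefore borrow information from further away, which is one way interpolated prevalence can differ between specifications.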

SPATIAL SENSITIVITY IN INTERPOLATED NEIGHBOURHOOD PREVALENCE 14
• Can compare models in terms of localised hot spot probabilities of high psychosis risk: Pr(ρ_k > 1 | y, h) > 0.9, with ρ_k the relative psychosis risk in neighbourhood k.
• Or compare clustering of excess psychosis risk. Define binary indicators J_k = I(ρ_k > 1).
• Over MCMC iterations, monitor excess risk in both neighbourhood k and its adjacent neighbourhoods l = 1, ..., L_k.
• C_k is the probability indicator of a high risk cluster centred on neighbourhood k.
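The hot spot and cluster summaries above can be sketched from posterior draws as follows. The exact formula for C_k is not given in the transcript. This sketch assumes that a cluster is present at an iteration only when neighbourhood k and all of its adjacent neighbourhoods exceed the threshold; other definitions (for example, averaging over neighbours) are equally possible. The function name, the toy draws, and the adjacency structure are all hypothetical.

```python
import numpy as np

def hotspot_and_cluster_probs(samples, neighbours, threshold=1.0):
    """
    samples    : (T, K) array of MCMC draws of neighbourhood risk
    neighbours : dict mapping k to a list of adjacent neighbourhood indices
    Returns (hotspot, cluster), each of length K:
      hotspot[k] = share of draws with risk_k > threshold
      cluster[k] = share of draws where k AND all its neighbours exceed it
                   (an assumed definition of the cluster indicator)
    """
    J = samples > threshold            # binary excess-risk indicators per draw
    hotspot = J.mean(axis=0)
    K = samples.shape[1]
    cluster = np.empty(K)
    for k in range(K):
        joint = J[:, k].copy()
        for l in neighbours[k]:
            joint &= J[:, l]
        cluster[k] = joint.mean()
    return hotspot, cluster

# Toy example: 4 draws for 3 neighbourhoods arranged in a line (0 - 1 - 2).
samples = np.array([
    [1.2, 1.1, 0.8],
    [1.3, 0.9, 0.7],
    [1.5, 1.4, 1.2],
    [0.9, 1.2, 0.6],
])
neighbours = {0: [1], 1: [0, 2], 2: [1]}

hotspot, cluster = hotspot_and_cluster_probs(samples, neighbours)
# Neighbourhoods flagged as hot spots under the 0.9 rule from the slide.
flagged = np.flatnonzero(hotspot > 0.9)
```

Comparing `hotspot` and `cluster` across model specifications shows whether the same neighbourhoods are flagged regardless of kernel, constraint, or grid density.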

Study Specifications 15

Fit Comparisons 16

Comparing Neighbourhood Spatial Risk Patterns 17

OVERLAP AT NEIGHBOURHOOD LEVEL (K=562) 18

Density plot (M4), prevalence rate 19

Map of Interpolated Neighbourhood Prevalence under M4 20

Map of Clustering Probabilities under M4 (posterior means of C_k) 21

Future Research 22
• Modify interpolation to include "formative" influences on prevalence (e.g. area deprivation).
• How does the model work with other chronic diseases, or with jointly dependent disease outcomes (e.g. diabetes, obesity)?
• Space-time prevalence models, etc.
