Published by Milton Skinner. Modified over 9 years ago.
1
Developing Geographical Information Systems in a Cohort Study
Andy Boyd
ALSPAC, Social Medicine, University of Bristol
2
Geographical Data Matching: the ALSPAC resource
Overview of our data, the issues involved and our plan for the future
Time for questions
Time for discussion on how other studies have developed their GIS data resource
3
Defining GIS
GIS combine mapping and a record of location with database technology. This can be used in the storage, analysis, management or presentation of data.
Figure: E. W. Gilbert’s 1955 version of John Snow’s 1855 Soho cholera outbreak map
4
Scope of this presentation
Not about GIS tools
Not about GIS analysis or techniques
It is about the capture and storage of data in an accessible manner to allow future GIS analysis
Uses ALSPAC as an example
5
The ALSPAC GIS dataset
Geographic identifiers collected directly from the cohort
Data collected via external data sources
Geographical data linkage
Precision of geographic variables: accuracy
Precision of geographic variables: ethics
Providing the data as an integral part of the resource
Current data availability
6
ALSPAC administered data collection
Residential address (~50,000 address points), updated from:
– cohort (self-reported)
– team who track lost cases
– email
– second contacts
– database searches (osis, electoral roll)
School the young person attends / wishes to attend:
– via questionnaire (ALSPAC questionnaires/assessments administered in schools, primary to secondary transition questionnaire)
– clinic attendance interview
– collected from the school
7
Linkage to external data sources
Validation / Cleaning
– Validation and cleaning of self-reported data using data collected via record linkage (NSTS: NHS tracing; NPD: National Pupil Database; Royal Mail/OS products)
Missing Data
– Enhancing the resource through record linkage
Data collection via geographical identifiers
– Accessing existing data organised around geographical IDs (census data, neighbourhood data)
– Primary data collection (distance to overhead power lines, air quality, commuting, school selection)
8
Data Collection through Record Linkage
Office for National Statistics (ONS)
– Tracing
– Health Authority
– Embarkation
NSTS (NHS Strategic Tracing Service)
– Address registered with GP
National Pupil Database (DCSF, DIUS*, UCAS*)
– School address
– Pupil residential address
DWP*
Home Office*
* Linkage currently being investigated
9
G.I.S – ALSPAC Resource
~50,000 ALSPAC residential address points, each associated with a date range which can then be linked to ALSPAC data collection
School attendance data from NPD: ~17,000
School attendance data from ALSPAC collection: ~10,000
Figure: The geographic relation between household income and polluting factories (FoE 1999)
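Linking an address to a data collection point means finding the address whose date range covers the collection date. A minimal sketch of that lookup follows, assuming each address is stored as a "spell" with a start date and an optional end date. The class name, field names, IDs and dates are all hypothetical and are not the ALSPAC schema.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class AddressSpell:
    """One residential address for one case (hypothetical structure)."""
    case_id: str
    address_id: str
    start: date
    end: Optional[date]  # None means the address is still current


def address_on(spells, case_id, when):
    """Return the address_id whose date range covers `when`, or None.

    The range is treated as half-open: start <= when < end.
    """
    for spell in spells:
        if spell.case_id != case_id:
            continue
        if spell.start <= when and (spell.end is None or when < spell.end):
            return spell.address_id
    return None


# Invented example history for a single case
spells = [
    AddressSpell("case-001", "addr-A", date(1991, 4, 1), date(1996, 9, 1)),
    AddressSpell("case-001", "addr-B", date(1996, 9, 1), None),
]

clinic_date = date(1995, 1, 1)
print(address_on(spells, "case-001", clinic_date))
```

Treating the range as half-open means a move date belongs to the new address only, so two consecutive spells never both match the same day.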
10
G.I.S Precision
Spatial data held at many geographic levels
Geographies range in scale from 0.1 metres to regional/national data
Tied together via address, postcode or grid reference as central ID
Key resources include:
– NSPD (formerly the All Fields Postcode Directory): geo-linking database
– Deprivation & socio-economic indices (IMD, Townsend, Acorn)
– Census data
11
G.I.S – How we link cases to data
Master file of postcodes (NSPD)
Postcodes linked to grid reference
Grid references of various scales
Postcodes/grid references mapped to:
– Electoral geographies
– Census geographies
Ethics:
– We don’t generally identify residence at postcode or equivalent level
Figure: Ordnance Survey, The National Grid
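The linkage chain on this slide (postcode, then directory row, then a broader geography) can be sketched as a simple keyed lookup. This is an illustration only: the directory rows, postcodes, column names and area codes below are invented and do not reflect the real NSPD layout.

```python
def normalise_postcode(raw):
    """Upper-case and remove all whitespace so 'zz1 1aa' matches 'ZZ11AA'."""
    return "".join(raw.split()).upper()


# Hypothetical stand-in for a postcode directory; every value is invented.
DIRECTORY = {
    "ZZ11AA": {"easting": 358000, "northing": 173100,
               "ward": "WARD_X", "census_area": "AREA_1"},
    "ZZ11AB": {"easting": 358400, "northing": 173500,
               "ward": "WARD_X", "census_area": "AREA_2"},
}


def link_cases(postcode_by_case, level):
    """Map each case ID to the geography code at `level`.

    The postcode itself is not carried into the result, in line with not
    identifying residence at postcode level. Unmatched postcodes give None.
    """
    linked = {}
    for case_id, postcode in postcode_by_case.items():
        row = DIRECTORY.get(normalise_postcode(postcode))
        linked[case_id] = row[level] if row is not None else None
    return linked


cases = {"case-001": "zz1 1aa", "case-002": "ZZ11AB", "case-003": "ZZ9 9ZZ"}
linked = link_cases(cases, "census_area")
print(linked)
```

Normalising the postcode before the lookup matters in practice because self-reported postcodes vary in case and spacing.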
12
G.I.S – How we link geographies
Current Situation
– Use postcode / postcode centroid grid reference as our highest-precision variable
– Link geographies using the NSPD/AFPD appropriate to the measure required
Proposed Method
– Use unique property reference number (UPRN) / property centroid grid reference as highest-precision variable
13
G.I.S Problems
Shifting geographies across time points
Royal Mail changes postcode areas (and therefore postcode centroids)
Postcodes are ‘recycled’
Postcode not precise enough in some cases
Postcode boundaries are not coterminous with other geographic boundaries
14
Accuracy issues with analysis at postcode level
Figure panels: Address level, Postcode level
15
Accuracy issues with analysis at postcode level
Figure panels: Address level, Postcode level
16
Accuracy issues with analysis at postcode level
Figure panels: Address level, Postcode level
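A small worked example may help show why the level of precision matters. The sketch below assumes locations are held as National Grid eastings and northings in metres, so straight-line distance is a Pythagorean calculation. All coordinates, and the 100 m threshold, are invented for illustration.

```python
import math


def grid_distance_m(a, b):
    """Straight-line distance in metres between two (easting, northing) points."""
    return math.hypot(a[0] - b[0], a[1] - b[1])


def within(point, target, radius_m):
    """True if `point` lies within `radius_m` metres of `target`."""
    return grid_distance_m(point, target) <= radius_m


# Invented coordinates: one property, its postcode centroid, and a feature
# of interest (for example a point on an overhead power line).
address_point = (358120, 173260)
postcode_centroid = (358000, 173100)
feature = (358150, 173300)

# How far the centroid sits from the actual property
centroid_offset = grid_distance_m(address_point, postcode_centroid)

# The same exposure question answered at two levels of precision
exposed_by_address = within(address_point, feature, 100)
exposed_by_centroid = within(postcode_centroid, feature, 100)

print(centroid_offset, exposed_by_address, exposed_by_centroid)
```

In this invented case the property is 50 m from the feature but its postcode centroid is 250 m away, so a 100 m exposure rule classifies the same household differently depending on which coordinate is used.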
17
Linkage problems with the cohort data
Missing data
– Especially problematic for the cases who didn’t enrol in the original recruitment
– Gaps in the address data
– Recorded move date is often the date we were informed, not the actual move date
However…
– ONS matched 99.7% of mothers, so we have their old & new NHS numbers and cleaned data (original recruitment cases only)
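Gaps in an address history can be found mechanically once each address has a start and end date. The following is a rough sketch under that assumption; the function name and the example dates are hypothetical, and it does not address the separate problem of move dates that were recorded late.

```python
from datetime import date


def find_gaps(spells):
    """Return the uncovered periods in one case's address history.

    `spells` is a list of (start, end) date pairs. The result is a list of
    (gap_start, gap_end) pairs, one for each period with no recorded address.
    """
    if not spells:
        return []
    ordered = sorted(spells)
    gaps = []
    covered_until = ordered[0][1]
    for start, end in ordered[1:]:
        if start > covered_until:
            gaps.append((covered_until, start))
        # Overlapping spells must not shrink the covered period
        covered_until = max(covered_until, end)
    return gaps


# Invented history: nothing recorded between June 1994 and February 1995
history = [
    (date(1995, 2, 1), date(2000, 1, 1)),
    (date(1991, 1, 1), date(1994, 6, 1)),
    (date(2000, 1, 1), date(2005, 1, 1)),
]

gaps = find_gaps(history)
print(gaps)
```

A list of gaps like this could then be used to target cases for tracing through linked sources, which is one way to read the validation initiatives mentioned elsewhere in the talk.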
18
GIS Data Availability
Collected as an administrative resource
Not yet cleaned, documented and presented to usual ALSPAC standards
Initiatives under way to validate and fill gaps in the record
Schools GIS data in the main not processed
Aim to build into the standard ALSPAC resource
19
GIS Ethics
Data at postcode level or greater precision are treated as a personal identifier
Research proposals to use these data need ALSPAC Law & Ethics approval
Broader geographical data can be released in the normal manner
A two-stage process is used to collect and process precise data
Data collected via linkage are not available for all cases due to ethical decisions
20
GIS Data Access
Step 1: Postcodes (or full address) provided to researcher with a unique collection ID and no other data attached
Step 2: Researcher attaches their data and returns the file to ALSPAC
Step 3: ID converted to the appropriate collaborator ID, postcode data removed
Step 4: Requested ALSPAC data added to the file and data sent to the researcher
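The four steps can be read as a small ID-swapping pipeline in which the researcher never holds postcodes and study data together. The sketch below is an illustration under that reading, not ALSPAC's actual procedure. Every function name, field name and value is hypothetical, and the random collection IDs come from Python's standard `secrets` module.

```python
import secrets


def step1_export(postcode_by_case):
    """Step 1: issue a one-off collection ID per case alongside its postcode.

    Returns the rows sent to the researcher and the private key table that
    stays with the data holder.
    """
    key, rows = {}, []
    for case_id, postcode in postcode_by_case.items():
        collection_id = secrets.token_hex(8)
        key[collection_id] = case_id
        rows.append({"collection_id": collection_id, "postcode": postcode})
    return rows, key


def step3_and_4(returned_rows, key, collaborator_ids, study_data):
    """Steps 3 and 4: swap IDs, drop the postcode, attach requested data."""
    released = []
    for row in returned_rows:
        case_id = key[row["collection_id"]]
        # Keep only what the researcher attached, not the identifiers
        record = {k: v for k, v in row.items()
                  if k not in ("collection_id", "postcode")}
        record["collaborator_id"] = collaborator_ids[case_id]
        record.update(study_data[case_id])
        released.append(record)
    return released


# Invented inputs
postcodes = {"case-001": "ZZ1 1AA", "case-002": "ZZ1 1AB"}
collaborator_ids = {"case-001": "B1234-0001", "case-002": "B1234-0002"}
study_data = {"case-001": {"outcome": 1}, "case-002": {"outcome": 0}}

rows, key = step1_export(postcodes)

# Step 2 (researcher side): attach an exposure measure to each row
returned = [dict(row, exposure=len(row["postcode"])) for row in rows]

released = step3_and_4(returned, key, collaborator_ids, study_data)
print(released)
```

The design point is that the collection ID exists only for the round trip: it links nothing outside the key table, so the released file carries the researcher's derived measure and the requested study data but no postcode.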
21
Andy Boyd A.W.Boyd@Bristol.ac.uk