Dracones: Web-Based Mapping and Spatial Analysis for Public Health Surveillance Christian Jauvin David Buckeridge McGill University.


1 Dracones: Web-Based Mapping and Spatial Analysis for Public Health Surveillance Christian Jauvin David Buckeridge McGill University

2 Summary
• Dracones:
  • Built with MapServer/PostGIS
• We'll be covering:
  • Public Health context
  • Software architecture
  • Some specific problems

3 Public Health - Two Perspectives
• Case management
  • Individual cases of notifiable diseases
  • Relationship networks
• Population surveillance
  • Larger risk patterns

4 Case Management
• Questions/problems:
  • Is a case due to recent transmission?
  • If so, does the case share any feature with other, recent cases?
• Ways it's being done:
  • Investigations/interviews
  • Meeting with other investigators

5 Population Surveillance
• Questions/problems:
  • Are more cases happening than expected?
  • Does an excess suggest ongoing transmission in a specific region?
• Way it's being done:
  • Semi-automated routine temporal and space-time statistical analysis (SaTScan)

6 Montreal DSP
• Département de santé publique de Montréal (Public Health Agency)
• Need: incorporate spatial data + analysis capabilities within workflow
• One reason: research shows that spatial information helps
• Answer: Dracones project
• Funded in part by GeoConnections
• Led by David Buckeridge, MD, PhD
• 15-month contract

7 Case Management at the DSP
• Current Situation
  • Information on paper entered into system (Oracle DB + Forms)
  • System contains sensitive data (names, addresses)
  • Limited tools for analyzing case data
• Project Goal
  • Capture spatial data
  • Visualize and analyze spatial distribution of cases

8 Population Surveillance at the DSP
• Current Situation
  • Routine temporal and space-time statistical analysis
  • Capacity to visualize time-series but not maps
• Project Goal
  • Add mapping capacity
  • Extend range of analytic methods

9 Why Location Matters - Case Management
• If you are studying a case of a certain disease that was just declared
• It is harder to picture the situation by looking at something like this..

10 Why Location Matters - Case Management

11
• Than by looking at this..

12 Why Location Matters - Case Management

13 Why Location Matters - Population Surveillance
• If you are studying the spatial distribution of a set of disease clusters
• This would seem more difficult..

14 Why Location Matters - Population Surveillance

15
• Than this..

16 Why Location Matters - Population Surveillance

17 Development Process
• Management Team
  • Led by public health MD with informatics training
  • Members from each area of DSP involved
• User Involvement
  • Users on management team
  • Input throughout requirements, design, development

18 Software Required and Our Choices
Software Type Required          Our Choice
~GIS                            MapServer
General + Spatial DB            PostgreSQL + PostGIS
Cartography-enabled client      HTML/Javascript
Analytical / statistical tools  SaTScan, R, Python

19 Web Architecture Benefits
• Usually lighter/simpler technologies
• Cross-platform
• Ease of deployment and integration
• Builds on existing set of conventions and behaviours

20 System Architecture
[Diagram components: Current Case Management System (Oracle DB, Oracle Forms); Bridge; Web client; Python, R, SaTScan; Dracones (Apache + PHP, MapServer + MapScript, PostgreSQL/PostGIS DB)]

21 Client Side - UI
• UI is 100% Javascript (ExtJS library)
• Future project: extract the map-manipulation parts:
  • Tile-based panning
  • Zooming
  • Layer activation
• And release them under an open-source license

22 Client Side - Functions
• From the results of a query performed in the Oracle client, launch the application to visualize the results
• Inspect those results by varying certain parameters
• Launch external analysis tools

23 Server Side - MapServer
• MapServer: open-source tool that adds geospatial content to web applications
• Can be used as a CGI
• Interfaces with many programming languages
• Works very closely with PostGIS

24 Server Side - MapServer
• MapServer with Apache 2.2, using PHP5
• Linux and Windows
• Since it's stateless, each interaction has to:
  • Build a map object from a base mapfile
  • Modify the map object (according to client parameters)
  • Return the rendered map as a file to the client (which will display it)

25 MapServer - Layers
• A map object is made of layers
• A layer can be loaded from a shapefile (ESRI open format), which specifies its geometry
• Or it can be loaded directly from a PostGIS table

26 PostGIS
• PostGIS: spatial extension for PostgreSQL
• Adds geometry types (points, lines, polygons, etc)
• Spatial functions and operators (distance, convex hull, intersection, etc)
• Spatial indexes

27 PostGIS
• Queries that mix spatial and non-spatial aspects of the data
• If you have a case table:
case_id  condition  region_id
1        TB         10
2        Gastro     20

28 PostGIS
And a region table:
region_id  name        geom
10         Centre-Sud  POLYGON(…)
20         Hochelaga   POLYGON(…)

29 PostGIS
You can then build a query like this:
SELECT *
FROM "case", region
WHERE "case".condition = 'TB'
  AND "case".region_id = region.region_id
  AND ST_Within(region.geom, ST_GeomFromText('POLYGON(…)'))

30 PostGIS
• A MapServer layer can be built simply by adding a connection attribute pointing to the PostgreSQL table (two lines really!)
• Shapefile and table sources can be mixed

31 Analysis Tools - SaTScan
• Requirement: interfacing with analysis tools
• SaTScan: detection of space-time clusters
• Scan for areas where the probability of being a case is significantly higher than being a non-case

32 Analysis Tools
• Since it's a command-line tool without an open API, we use Python to run it, parse the results and plot them using MapServer
• We do the same for some external R routines
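
The run-and-parse pattern described on this slide can be sketched in Python as follows. This is only an illustration under stated assumptions: the command being run and the tab-separated "region_id, p-value" output layout are hypothetical stand-ins, not SaTScan's actual command line or result format, and the function names are invented for the example.

```python
import subprocess
import sys


def run_tool(command):
    """Run a command-line tool and return its standard output as text."""
    completed = subprocess.run(command, capture_output=True, text=True, check=True)
    return completed.stdout


def parse_clusters(text):
    """Parse hypothetical 'region_id<TAB>p_value' lines into a list of dicts."""
    clusters = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        region_id, p_value = line.split("\t")
        clusters.append({"region_id": int(region_id), "p_value": float(p_value)})
    return clusters


def significant_regions(clusters, alpha=0.05):
    """Keep the region ids whose p-value is at or below the threshold."""
    return [c["region_id"] for c in clusters if c["p_value"] <= alpha]


# Stand-in for the real tool (assumption): a Python one-liner printing two rows.
FAKE_TOOL = [sys.executable, "-c", "print('10\\t0.01'); print('20\\t0.40')"]

if __name__ == "__main__":
    found = parse_clusters(run_tool(FAKE_TOOL))
    print(significant_regions(found))
```

The list of significant region ids is what would then be handed to the mapping layer for display; that step is not shown here.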

33 System Data Sources
• Health data
  • Reportable disease database
  • Ancillary data on contacts
• Geographical data
  • Street networks and postal code file
  • Health regions, census, postal boundaries

34 Using Address Data from a Public Health Database
• Problem: addresses are stored as character fields:
  • No validation at the entry point
  • Data quality is compromised
Address: 1500-a Sherbroooke St. Ouest

35 Two Problems with Address Processing
• The addresses need to be parsed, and the possible (and numerous) transcription errors and ambiguities must be resolved
• Addresses that refer to the same place must be identified and treated as a single object

36 Possible Solutions
• These could be solved in a more SQL-integrated manner: edit distance module for PG (?)
• We decided however to go the procedural way (using Python)

37 Address Validation Algorithm - Requirements
• A database with:
  • (1) the street network geometry
  • (2) the street segment address ranges
  • (3) the postal code geometry and street range association

38 Address Validation Algorithm
So you will know for instance that:
[Figure: Sherbrooke Street, labelled with street numbers 1001, 2001, 3001 and 998, 1998, 2998, and postal codes H2X2T1 and H2X2T2]

39 Address Validation Algorithm - Steps
• Parse the text addresses into 3 tokens: {S#, SN, PC} (street number, street name, postal code)
• For each triplet:
  • Try to find an exact match, while being tolerant on SN (maximum coverage, edit distance..)
  • Still tolerant on SN, try to vary PC
  • Likewise tolerant on SN, fix PC and vary S#
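
A minimal Python sketch of these steps, under several assumptions: the reference segments below are made-up ranges (only the street name and postal codes come from the slides), the postal code appended to the sample address is added for illustration, `difflib` similarity stands in for whatever edit-distance measure was actually used, and "vary S#" is interpreted here as snapping the number into the range of the matching postal code. None of this is claimed to be the project's actual implementation.

```python
import difflib
import re

# Hypothetical reference data: (street name, lowest number, highest number, postal code).
SEGMENTS = [
    ("SHERBROOKE", 1000, 1999, "H2X2T1"),
    ("SHERBROOKE", 2000, 2999, "H2X2T2"),
]

# Tokens dropped from street names before matching (assumed list).
NOISE = {"ST", "STREET", "RUE", "OUEST", "EST", "WEST", "EAST"}

ADDRESS_RE = re.compile(
    r"^\s*(\d+)\S*\s+(.*?)\s*([A-Z]\d[A-Z]\s?\d[A-Z]\d)?\s*$", re.IGNORECASE
)


def parse_address(text):
    """Split a free-text address into (S#, SN, PC); PC may be None."""
    match = ADDRESS_RE.match(text)
    if match is None:
        return None
    number, street, postal = match.groups()
    tokens = [t.strip(".").upper() for t in street.split()]
    name = " ".join(t for t in tokens if t and t not in NOISE)
    postal = postal.replace(" ", "").upper() if postal else None
    return int(number), name, postal


def closest_street(name, cutoff=0.8):
    """Tolerant street-name lookup using difflib's similarity ratio."""
    names = sorted({seg[0] for seg in SEGMENTS})
    hits = difflib.get_close_matches(name, names, n=1, cutoff=cutoff)
    return hits[0] if hits else None


def validate(text):
    """Return (status, (S#, SN, PC)) following the three tolerant steps."""
    parsed = parse_address(text)
    if parsed is None:
        return ("unrecoverable", None)
    number, name, postal = parsed
    street = closest_street(name)
    if street is None:
        return ("unrecoverable", None)
    on_street = [seg for seg in SEGMENTS if seg[0] == street]
    # Step 1: number and postal code both agree with one segment.
    for _, low, high, code in on_street:
        if low <= number <= high and code == postal:
            status = "exact" if street == name else "recovered"
            return (status, (number, street, code))
    # Step 2: trust the number, let the postal code vary.
    for _, low, high, code in on_street:
        if low <= number <= high:
            return ("recovered", (number, street, code))
    # Step 3: trust the postal code, let the number vary (snap into range).
    for _, low, high, code in on_street:
        if code == postal:
            return ("recovered", (min(max(number, low), high), street, code))
    return ("unrecoverable", None)
```

For example, `validate("1500-a Sherbroooke St. Ouest H2X 2T1")` recovers the misspelled street name, while an address on an unknown street is reported as unrecoverable.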

40 Address Validation Algorithm - Batch Results
• By doing a batch analysis of the DSP data (105K records), we found that:
  • 84% of the address records were "exact"
  • 14.5% were recoverable errors
  • 1.5% were non-recoverable errors

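41 Last Address Processing Step: Geocoding
Geocoding by interpolation:
[Figure: Sherbrooke Street, labelled with street numbers 1001, 2001, 3001 and 998, 1998, 2998, and postal codes H2X2T1 and H2X2T2, with the address "1500 Sherbrooke" placed along the street]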
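
Geocoding by interpolation can be illustrated with a short Python sketch. It assumes a straight street segment whose two end points and address range are known; the function name and the coordinates in the usage note are hypothetical, and real street geometries are usually multi-vertex lines rather than a single straight segment.

```python
def interpolate_address(number, low, high, start, end):
    """Place a street number along a straight segment by linear interpolation.

    low/high are the address range of the segment; start/end are the (x, y)
    coordinates of the segment ends matching low and high respectively.
    """
    if high == low:
        fraction = 0.5  # single-number segment: use the midpoint
    else:
        fraction = (number - low) / (high - low)
    # Clamp so that out-of-range numbers stay on the segment.
    fraction = min(max(fraction, 0.0), 1.0)
    x = start[0] + fraction * (end[0] - start[0])
    y = start[1] + fraction * (end[1] - start[1])
    return (x, y)
```

With the 1001 to 2001 range from the figure and made-up end points (0, 0) and (100, 0), `interpolate_address(1500, 1001, 2001, (0.0, 0.0), (100.0, 0.0))` lands just short of the middle of the segment.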

42 A Last Problem
• DSP management system is read-only (for us)
• Not spatially enabled
• Must not affect performance

43 And its Solution
• Create a mirror of the DSP data model, using PG
• Augmented with spatial aspects (and more adapted address handling)
• Refreshed periodically
  • Reprocessing of the content that has changed
  • Extraction of the new content
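
One way to decide what a periodic refresh has to touch is sketched below in Python. The slides do not say how changed records were detected; comparing per-record fingerprints is an assumption made for this example, and the record layout and function names are hypothetical.

```python
import hashlib


def fingerprint(record):
    """Stable digest of a record's fields, used to detect changes."""
    payload = "|".join(f"{key}={record[key]}" for key in sorted(record))
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def plan_refresh(source_rows, mirror_fingerprints):
    """Split source rows into new ids and changed ids relative to the mirror.

    source_rows: {case_id: record dict} read from the read-only source system.
    mirror_fingerprints: {case_id: digest} stored at the previous refresh.
    """
    new, changed = [], []
    for case_id, record in source_rows.items():
        digest = fingerprint(record)
        if case_id not in mirror_fingerprints:
            new.append(case_id)        # to be extracted into the mirror
        elif mirror_fingerprints[case_id] != digest:
            changed.append(case_id)    # to be reprocessed (e.g. re-geocoded)
    return new, changed
```

Only the ids returned here would need address validation and geocoding again, which keeps the load on the source system small.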

44 A Challenge
• Interface and extend existing:
  • System
  • Environment (including an important community of users and developers)

45 Lessons Learned
• Very strong interest in using spatial information at the DSP but infrastructure, skills and data quality are limiting
  • Large effort to validate and correct all addresses
• The science of spatial analysis in public health often lags the technology
  • How to analyze multiple locations for each individual?
  • How important is spatial location in an urban area?
• Open-source, web-based mapping software and spatial databases (MapServer, PostGIS) are robust and easy to work with for skilled developers

46 Acknowledgements
• GeoConnections, CIHR
• McGill University
  • Aman Verma, Sherry Olsen, Andrew Carter
• Montreal DSP
  • Louise Marcotte
  • Robert Allard, Lucie Bedard, André Bilodeau
• Montreal Chest Institute
  • Kevin Schwartzman, Jonathan Richard
  • Alice Zwerling, Marie-Josee Dion

