Dracones: Web-Based Mapping and Spatial Analysis for Public Health Surveillance

Christian Jauvin

David Buckeridge

McGill University

Summary
  • Dracones:
    • Built with MapServer/PostGIS
  • We'll be covering:
    • Public Health context
    • Software architecture
    • Some specific problems
Public Health - Two Perspectives
  • Case management
    • Individual cases of notifiable diseases
    • Relationship networks
  • Population surveillance
    • Larger risk patterns
Case Management
  • Questions/problems:
    • Is a case due to recent transmission?
    • If so, does the case share any feature with other, recent cases?
  • Ways it's being done:
    • Investigations/interviews
    • Meeting with other investigators
Population Surveillance
  • Questions/problems:
    • Are more cases happening than expected?
    • Does an excess suggest ongoing transmission in a specific region?
  • Way it's being done:
    • Semi-automated routine temporal and space-time statistical analysis (SaTScan)
Montreal DSP
  • Département de santé publique de Montréal (Public Health Agency)
  • Need: incorporate spatial data + analysis capabilities within workflow
  • One reason: research shows that spatial information helps
  • Answer: Dracones project
    • Funded in part by GeoConnections
    • Led by David Buckeridge, MD, PhD
    • 15 month contract
Case Management at the DSP
  • Current Situation
    • Information on paper entered into system (Oracle DB + Forms)
    • System contains sensitive data (names, addresses)
    • Limited tools for analyzing case data
  • Project Goal
    • Capture spatial data
    • Visualize and analyze spatial distribution of cases
Population Surveillance at the DSP
  • Current Situation
    • Routine temporal and space-time statistical analysis
    • Capacity to visualize time-series but not maps
  • Project Goal
    • Add mapping capacity
    • Extend range of analytic methods
Why Location Matters - Case Management
  • If you are studying a case of a certain disease that was just declared
  • It is harder to picture the situation by looking at something like this...
Why Location Matters - Population Surveillance
  • If you are studying the spatial distribution of a set of disease clusters
  • This would seem more difficult...
Development Process
  • Management Team
    • Led by public health MD with informatics training
    • Members from each area of DSP involved
  • User Involvement
    • Users on management team
    • Input throughout requirements, design, development
Web Architecture Benefits
  • Usually lighter/simpler technologies
  • Cross-platform
  • Ease of deployment and integration
  • Builds on existing set of conventions and behaviours
System Architecture
  • Dracones
    • Web client
    • Apache + PHP
    • MapServer + MapScript
    • Python, R, SaTScan
    • PostgreSQL/PostGIS DB
  • Bridge
  • Current Case Management System
    • Oracle Forms
    • Oracle DB

Client Side - UI
  • UI is 100% Javascript (ExtJS library)
  • Future project: extract the map-manipulation parts and release them under an open-source license:
    • Tile-based panning
    • Zooming
    • Layer activation

Client Side - Functions
  • From the results of a query performed in the Oracle client, launch the application to visualize the results
  • Inspect those results by varying certain parameters
  • Launch external analysis tools
Server Side - MapServer
  • MapServer: open-source tool that adds geospatial content to web applications
  • Can be used as a CGI program
  • Interfaces with many programming languages
  • Works very closely with PostGIS
Server Side - MapServer
  • MapServer with Apache 2.2, using PHP5
  • Linux and Windows
  • Since it's stateless, each interaction:
    • Build a map object from a base mapfile
    • Modify the map object (according to client parameters)
    • Return rendered map as a file to the client (that will display it)
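
The three steps above can be sketched in Python without any mapping library. This is only an illustration of the stateless pattern: `BASE_MAP`, `handle_request` and the parameter names are hypothetical, the "map object" is a plain dictionary rather than a MapScript object, and rendering is reduced to listing what would be drawn.

```python
import copy

# Hypothetical stand-in for a parsed base mapfile (not the MapScript API).
BASE_MAP = {
    "extent": (0.0, 0.0, 100.0, 100.0),
    "layers": {"streets": True, "regions": True, "cases": False},
}


def handle_request(params):
    """Build, modify and 'render' a map for one request, keeping no state."""
    # 1. Build a fresh map object from the base definition.
    map_obj = copy.deepcopy(BASE_MAP)

    # 2. Modify the map object according to the client parameters.
    if "extent" in params:
        map_obj["extent"] = tuple(params["extent"])
    for name, active in params.get("layers", {}).items():
        if name in map_obj["layers"]:
            map_obj["layers"][name] = bool(active)

    # 3. "Render": here we only describe what would be drawn.
    visible = sorted(n for n, on in map_obj["layers"].items() if on)
    return {"extent": map_obj["extent"], "visible_layers": visible}
```

Because every call starts from a deep copy of the base definition, two requests never influence each other, which is the property the slide attributes to the stateless design.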
MapServer - Layers
  • A map object is made of layers
  • A layer can be loaded from a shapefile (ESRI open format), that specifies its geometry
  • Or it can be loaded directly from a PostGIS table
PostGIS
  • PostGIS: spatial extension for PostgreSQL
  • Adds geometry types (points, lines, polygons, etc)
  • Spatial functions and operators (distance, convex hull, intersection, etc)
  • Spatial indexes
PostGIS
  • Queries that mix spatial and non-spatial aspects of the data
  • If you have a case table:
PostGIS

And a region table:

PostGIS

You can then build a query like this:

SELECT * FROM "case", region
WHERE "case".condition = 'TB'
AND "case".region_id = region.id
AND within(region.geom, GeomFromText('POLYGON(…)'));
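
To make the mix of spatial and non-spatial filters concrete, here is a small in-memory analogue of the query in Python. It is a sketch under simplifying assumptions: the rows, the `area` polygon and the function name are hypothetical, and each region is reduced to a single representative point, whereas the query above tests the region's full geometry.

```python
def point_in_polygon(x, y, polygon):
    """Ray-casting test: True if (x, y) lies inside the polygon.

    polygon is a list of (x, y) vertices, in order.
    """
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Does a horizontal ray going right from (x, y) cross this edge?
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


# Hypothetical rows; each region is represented by one point.
regions = {1: (2.0, 2.0), 2: (8.0, 8.0)}
cases = [
    {"id": 10, "condition": "TB", "region_id": 1},
    {"id": 11, "condition": "TB", "region_id": 2},
    {"id": 12, "condition": "flu", "region_id": 1},
]
area = [(0, 0), (5, 0), (5, 5), (0, 5)]

# Non-spatial filter (condition), join (region_id), spatial filter (inside area).
selected = [
    c for c in cases
    if c["condition"] == "TB"
    and point_in_polygon(*regions[c["region_id"]], area)
]
```

In the database the same three conditions are evaluated in one statement, and the spatial test can use a spatial index instead of a Python loop.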

PostGIS
  • A MapServer layer can be built simply by adding a connection attribute pointing to the PostgreSQL table (two lines, really!)
  • Shapefile and table sources can be mixed
Analysis Tools - SaTScan
  • Requirement: interfacing with analysis tools
  • SaTScan: detection of space-time clusters
  • Scans for areas where the probability of being a case is significantly higher than that of being a non-case
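
As a rough illustration of what "scanning" involves, the sketch below scores a single candidate area with the log likelihood ratio commonly used for Poisson-model scan statistics. This is an assumption for illustration only: the slides do not say which probability model was used, and the function and parameter names are hypothetical. A scan evaluates many candidate areas, keeps the highest score, and assesses its significance separately.

```python
import math


def poisson_llr(cases_in, expected_in, total_cases):
    """Log likelihood ratio for one candidate area under a Poisson model.

    cases_in:    observed cases inside the area
    expected_in: cases expected inside the area under the null hypothesis
    total_cases: observed cases in the whole study region
    """
    if cases_in <= expected_in:
        return 0.0  # only areas with an excess of cases are of interest
    cases_out = total_cases - cases_in
    expected_out = total_cases - expected_in
    llr = cases_in * math.log(cases_in / expected_in)
    if cases_out > 0:
        llr += cases_out * math.log(cases_out / expected_out)
    return llr
```

An area with no excess scores zero, and the score grows as the observed count inside the area moves further above its expectation.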
Analysis Tools
  • Since it's a command-line tool without an open API, we use Python to run it, parse the results and plot them using MapServer
  • We do the same for some external R routines
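
The run-and-parse pattern can be sketched as follows. Everything specific here is an assumption: the real tool's command line and output format are not shown in the slides, so a Python one-liner stands in for the external program and prints made-up `id,x,y,p_value` lines.

```python
import subprocess
import sys


def run_tool(command):
    """Run a command-line tool and return its standard output as text."""
    result = subprocess.run(command, capture_output=True, text=True, check=True)
    return result.stdout


def parse_clusters(output):
    """Parse hypothetical 'id,x,y,p_value' lines into dictionaries."""
    clusters = []
    for line in output.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        cid, x, y, p = line.split(",")
        clusters.append(
            {"id": int(cid), "x": float(x), "y": float(y), "p_value": float(p)}
        )
    return clusters


# Stand-in for the real tool: prints a header and two made-up clusters.
fake_tool = [
    sys.executable,
    "-c",
    "print('# id,x,y,p'); print('1,10.5,20.0,0.01'); print('2,3.0,4.5,0.2')",
]

clusters = parse_clusters(run_tool(fake_tool))
significant = [c for c in clusters if c["p_value"] < 0.05]
```

The parsed clusters are plain dictionaries, so a later step could write them to a table or a layer for display on the map.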
System Data Sources
  • Health data
    • Reportable disease database
    • Ancillary data on contacts
  • Geographical data
    • Street networks and postal code file
    • Health regions, census, postal boundaries
Using Address Data from a Public Health Database
  • Problem: addresses are stored as character fields
    • No validation at the entry point
    • Data quality is compromised
  • Example address as entered: 1500-a Sherbroooke St. Ouest
Two Problems with Address Processing
  • The addresses need to be parsed, and the numerous possible transcription errors and ambiguities must be resolved
  • Addresses that refer to the same place must be identified and treated as a single object
Possible Solutions
  • These could be solved in a more SQL-integrated manner, for example with an edit-distance module for PostgreSQL (?)
  • However, we decided to go the procedural way (using Python)
Address Validation Algorithm - Requirements
  • A database with (1) the street network geometry
  • (2) the street segment address ranges
  • And (3) the postal code geometry and street range association
Address Validation Algorithm

So you will know, for instance, that:

[Diagram: segments of Sherbrooke Street labelled with address range endpoints (998, 1001, 1998, 2001, 2998, 3001) and postal codes (H2X2T1, H2X2T2)]

Address Validation Algorithm - Steps
  • Parse each text address into 3 tokens:
    • {S#, SN, PC} (street number, street name, postal code)
  • For each triplet:
    • Try to find an exact match, while being tolerant on SN (maximum coverage, edit distance...)
    • Still being tolerant on SN, try to vary PC
    • Likewise tolerant on SN, fix PC and vary S#
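
The steps above can be sketched in Python as follows. This is a minimal illustration, not the project's actual code: the reference table, its ranges and postal codes are made up, the parser keeps only the first long word as the street name, street-name tolerance uses `difflib` similarity rather than a true edit distance, and "varying S#" is implemented as clamping the number into the known range.

```python
import difflib
import re

# Hypothetical reference data: (street name, postal code) -> number range.
REFERENCE = {
    ("SHERBROOKE", "A1A1A1"): (1001, 1998),
    ("SHERBROOKE", "A1A1A2"): (2001, 2998),
}

POSTAL_RE = re.compile(r"\b([A-Z]\d[A-Z])\s?(\d[A-Z]\d)\b")


def parse_address(text):
    """Split a free-text address into (S#, SN, PC)."""
    text = text.upper()
    postal = None
    m = POSTAL_RE.search(text)
    if m:
        postal = m.group(1) + m.group(2)
        text = text[:m.start()] + text[m.end():]
    m = re.match(r"\s*(\d+)", text)
    number = int(m.group(1)) if m else None
    rest = text[m.end():] if m else text
    # Simplification: the first word longer than 2 letters is the street name.
    words = [w for w in re.findall(r"[A-Z]+", rest) if len(w) > 2]
    return number, (words[0] if words else None), postal


def validate(number, name, postal):
    """Return (status, corrected triplet or None)."""
    streets = sorted({s for s, _ in REFERENCE})
    # Tolerance on SN: accept the closest known street name, if close enough.
    close = difflib.get_close_matches(name or "", streets, n=1, cutoff=0.8)
    if not close or number is None:
        return "non-recoverable", None
    street = close[0]
    # Step 1: the number falls in the range for this street and postal code.
    rng = REFERENCE.get((street, postal))
    if rng and rng[0] <= number <= rng[1]:
        status = "exact" if street == name else "recovered"
        return status, (number, street, postal)
    # Step 2: keep the number, vary the postal code.
    for (s, pc), (low, high) in REFERENCE.items():
        if s == street and low <= number <= high:
            return "recovered", (number, street, pc)
    # Step 3: keep the postal code, vary the number (clamp into the range).
    if rng:
        return "recovered", (min(max(number, rng[0]), rng[1]), street, postal)
    return "non-recoverable", None
```

For example, `parse_address("1500-a Sherbroooke St. Ouest A1A 1A1")` yields the misspelled street name, and `validate` then maps it back to the known street while keeping the number and postal code.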
Address Validation Algorithm - Batch Results
  • By doing a batch analysis of the DSP data (105K records), we found that:
    • 84% of the address records were "exact"
    • 14.5% were recoverable errors
    • 1.5% were non-recoverable errors
Last Address Processing Step: Geocoding

Geocoding by interpolation:

[Diagram: the Sherbrooke Street segments with their address range endpoints (998, 1001, 1998, 2001, 2998, 3001) and postal codes (H2X2T1, H2X2T2), with the address 1500 Sherbrooke located along the street]
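
A minimal sketch of interpolation along one street segment, assuming the segment is a straight line between two known end points and that street numbers are evenly spaced along it. The function name, the coordinates and the address range in the usage note are hypothetical.

```python
def interpolate_address(number, range_start, range_end, start_xy, end_xy):
    """Place a street number along a straight segment by linear interpolation.

    range_start / range_end: street numbers at the two ends of the segment
    start_xy / end_xy:       coordinates of those two ends
    """
    if range_end == range_start:
        fraction = 0.0
    else:
        fraction = (number - range_start) / (range_end - range_start)
    fraction = min(max(fraction, 0.0), 1.0)  # stay on the segment
    x = start_xy[0] + fraction * (end_xy[0] - start_xy[0])
    y = start_xy[1] + fraction * (end_xy[1] - start_xy[1])
    return x, y
```

With a made-up segment running from number 1000 at (0.0, 0.0) to number 2000 at (100.0, 0.0), `interpolate_address(1500, 1000, 2000, (0.0, 0.0), (100.0, 0.0))` places the address halfway along, at (50.0, 0.0).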

A Last Problem
  • DSP management system is read-only (for us)
  • Not spatially enabled
  • Must not affect performance
And its Solution
  • Create a mirror of the DSP data model, using PostgreSQL
  • Augmented with spatial aspects (and better-suited address handling)
  • Refreshed periodically
    • Reprocessing of the content that has changed
    • Extraction of new content
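
One possible way to decide what a periodic refresh has to do is to compare a fingerprint of each source record with the fingerprint stored in the mirror. The sketch below assumes this digest-based approach, which the slides do not describe; `row_digest`, `plan_refresh` and the record layout are hypothetical. It only reads from the source side, consistent with the read-only constraint.

```python
import hashlib


def row_digest(row):
    """Stable fingerprint of a record's content (a dict of column -> value)."""
    payload = "|".join(f"{key}={row[key]}" for key in sorted(row))
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def plan_refresh(source_rows, mirror_digests):
    """Compare source rows (id -> row) with mirror digests (id -> digest).

    Returns (ids to extract because they are new,
             ids to reprocess because their content changed).
    """
    new_ids, changed_ids = [], []
    for row_id, row in source_rows.items():
        if row_id not in mirror_digests:
            new_ids.append(row_id)
        elif mirror_digests[row_id] != row_digest(row):
            changed_ids.append(row_id)
    return sorted(new_ids), sorted(changed_ids)
```

Only the records returned by `plan_refresh` would need to go through address validation and geocoding again, which keeps each refresh proportional to what actually changed.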
A Challenge
  • Interface with and extend the existing:
    • System
    • Environment (including an important community of users and developers)
Lessons Learned
  • Very strong interest in using spatial information at the DSP but infrastructure, skills and data quality are limiting
    • Large effort to validate and correct all addresses
  • The science of spatial analysis in public health often lags the technology
    • How to analyze multiple locations for each individual?
    • How important is spatial location in an urban area?
  • Open-source, web-based mapping software and spatial databases (MapServer, PostGIS) are robust and easy to work with for skilled developers
Acknowledgements
  • GeoConnections, CIHR
  • McGill University
    • Aman Verma, Sherry Olsen, Andrew Carter
  • Montreal DSP
    • Louise Marcotte
    • Robert Allard, Lucie Bedard, André Bilodeau
  • Montreal Chest Institute
    • Kevin Schwartzman, Jonathan Richard
    • Alice Zwerling, Marie-Josee Dion