West Nile Virus: DYCAST spatial-temporal model
Why spatial is special Modifiable areal unit problem (MAUP) –Results of statistical analysis are sensitive to the zoning system used to report aggregated data –Results of statistical analysis are sensitive to the scale at which the analysis is performed –Examine sensitivity of results to MAUP Boundary problem –Study areas are bounded, and conditions just outside the study area can affect results –Size and shape can affect results Migration –Rhode Island (xs) –Tennessee (xl) –Ohio (jr)
Why spatial is special (cont.) Spatial sampling –Space can be used as a means of stratification Spatial autocorrelation –Refers to the fact that values of phenomena close in space are related Problem: Implication for sampling is that samples close in space may not be independent –Spatial autocorrelation can be calculated and variances can be adjusted accordingly Prospects: Spatial autocorrelation can be used to estimate values at unknown locations based on surrounding known points (interpolation).
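As a rough illustration of how spatial autocorrelation can be calculated, the sketch below computes Moran's I, one common measure that the slide does not name. The binary distance-threshold weights and the function name `morans_i` are assumptions made for this example only.

```python
from math import hypot


def morans_i(points, values, threshold):
    """Moran's I with binary weights: w_ij = 1 when points i and j lie
    within `threshold` of each other (i != j), else 0.

    Assumes at least one neighbouring pair and non-constant values;
    otherwise the divisions below would fail.
    """
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]

    cross = 0.0   # sum of w_ij * dev_i * dev_j
    w_sum = 0.0   # sum of all weights
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            dist = hypot(points[i][0] - points[j][0],
                         points[i][1] - points[j][1])
            if dist <= threshold:
                cross += dev[i] * dev[j]
                w_sum += 1.0

    return (n / w_sum) * (cross / sum(d * d for d in dev))


line = [(0, 0), (1, 0), (2, 0), (3, 0)]
clustered = morans_i(line, [1, 1, 5, 5], threshold=1)    # similar values adjacent
alternating = morans_i(line, [1, 5, 1, 5], threshold=1)  # dissimilar values adjacent
```

Positive values indicate that nearby observations are alike, and negative values indicate that they differ; here `clustered` is positive and `alternating` is negative.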
Why spatial is special (cont.) Data management –Editing Editing of spatial data is a long transaction –User needs to “check out” a region for extended periods of time –Other users need access Spatial databases are version managed to permit multiple long-transaction editing –Access Indexes are spatially based –Quad-tree recursive algorithm Addition of temporal dimension requires a second index. Optimization of spatial-temporal searching is still a topic under research
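The quad-tree idea mentioned above can be sketched as a recursive point index. This is a minimal teaching version, not the index of any particular spatial database; the class name, the `capacity` and `max_depth` parameters, and the rectangular query are all assumptions for illustration.

```python
class QuadTree:
    """Minimal point quad-tree: a node splits into four quadrants
    once it holds more than `capacity` points."""

    def __init__(self, x0, y0, x1, y1, capacity=4, depth=0, max_depth=12):
        self.bounds = (x0, y0, x1, y1)
        self.capacity = capacity
        self.depth = depth
        self.max_depth = max_depth
        self.points = []
        self.children = None

    def insert(self, x, y):
        x0, y0, x1, y1 = self.bounds
        if not (x0 <= x <= x1 and y0 <= y <= y1):
            return False
        if self.children is None:
            # Leaf node: store here if there is room (or we cannot split further).
            if len(self.points) < self.capacity or self.depth >= self.max_depth:
                self.points.append((x, y))
                return True
            self._split()
        # any() stops at the first child that accepts the point.
        return any(child.insert(x, y) for child in self.children)

    def _split(self):
        x0, y0, x1, y1 = self.bounds
        mx, my = (x0 + x1) / 2, (y0 + y1) / 2
        args = (self.capacity, self.depth + 1, self.max_depth)
        self.children = [
            QuadTree(x0, y0, mx, my, *args), QuadTree(mx, y0, x1, my, *args),
            QuadTree(x0, my, mx, y1, *args), QuadTree(mx, my, x1, y1, *args),
        ]
        old, self.points = self.points, []
        for px, py in old:
            any(child.insert(px, py) for child in self.children)

    def query(self, qx0, qy0, qx1, qy1):
        """Return all stored points inside the query rectangle."""
        x0, y0, x1, y1 = self.bounds
        if qx1 < x0 or qx0 > x1 or qy1 < y0 or qy0 > y1:
            return []  # query does not touch this node: prune the subtree
        found = [(x, y) for x, y in self.points
                 if qx0 <= x <= qx1 and qy0 <= y <= qy1]
        if self.children:
            for child in self.children:
                found.extend(child.query(qx0, qy0, qx1, qy1))
        return found


tree = QuadTree(0, 0, 10, 10)
for gx in range(10):
    for gy in range(10):
        tree.insert(gx, gy)
hits = sorted(tree.query(2, 2, 4, 4))
```

The pruning step in `query` is what makes the index useful: whole quadrants that do not intersect the search rectangle are skipped without inspecting their points.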
Map to Geographic Information Systems (GIS) Maps as layers of geographic information Desire to ‘automate’ map Evolution of GIS –Create automated mapping systems –Analyze geographic relationships –Model real-world phenomena
What is GIS? Component definition: set of subsystems for the input, storage, transformation and retrieval of geographic data. Tool definition: measuring and analyzing aspects of geographic phenomena and processes. Model definition: a model of the real world.
GIS: It’s about Modeling and analyzing relationships and processes that occur across space, time and different scales. New tools for modeling –Geo-statistical procedures (Dead Crows) –Object-based GIS (Tiger model) –Seamless geographic databases (Big Apple)
Global issues and motivation Hundreds Dead Thousands Infected and Sick. Sickness can last for months and result in long term neurological problems. Threatening the blood supply. One of the most common pathogens. Kills wildlife and threatens ecological balance. Remediation can cause problems.
Diffusion of West Nile Virus in Birds, USA Jan 1, 1999 to Dec 31, 1999
Diffusion of West Nile Virus in Birds, USA Jan 1, 2000 to Dec 31, 2000
Diffusion of West Nile Virus in Birds, USA Jan 1, 2001 to Dec 31, 2001
Diffusion of West Nile Virus in Birds, USA Jan 1, 2002 to Dec 31, 2002
Diffusion of West Nile Virus in Birds, USA Jan 1, 2003 to Dec 31, 2003
Diffusion of West Nile Virus in Birds, USA Jan 1, 2004 to Dec 31, 2004
Diffusion of West Nile Virus in Birds, USA Jan 1, 2005 to Dec 31, 2005
Diffusion of West Nile Virus in Birds, USA Jan 1, 2006 to Dec 31, 2006
Diffusion of West Nile Virus in Birds, USA Jan 1, 2007 to Sept. 25, 2007
Confronting the problem at hand Newly introduced infectious agent arrives in New York City Observations –Wildlife are killed especially birds –Individuals become sick in close geographic proximity –Seasonal effect
Synthesizing a hypothesis: literature review What do we know about this disease from other parts of the world? –Outbreaks have been observed for decades in the Middle East, Africa and Europe –Mosquitoes are the vectors These mosquitoes tend to be ornithophilic –Birds play a primary role as the reservoir host Amplification cycle and spillover
Synthesizing a hypothesis: local observations and experience Many birds die prior to human onset Most are resident Passerines particularly Corvids Patterns of bird deaths tend to be highly localized and dynamic Human infections tend to follow these patterns of bird deaths
Source: The Centers for Disease Control and Prevention; Spillover effect hypothesized by some researchers
Birds Resident, wild passerine birds act as the principal amplifying hosts of West Nile virus. Data from Komar (2003) Crows suffer highest casualties. 82% dead in Illinois, by The nature of the bird as a reservoir for WNV transmission is still under investigation. Photo Source: Ornithology and Mammalogy Department, Cornell University
Birds continued Data Source: Komar, N. unpublished. Used with permission
Mosquitoes Culex pipiens : –The most common pest mosquito in urban and suburban settings. –An indicator of polluted water in the immediate vicinity. –Recognized as the primary vector of St. Louis encephalitis (SLE). –Is normally considered to be a bird feeder but some urban strains have a predilection for mammalian hosts and feed readily on humans. (American Hybrids?). –Extrinsic incubation period of 4-12 days. Species identified in transmission in NYC include: Culex pipiens, Culex restuans, Culex salinarius and Aedes vexans. Photo source: Iowa State University online image gallery
Hypotheses Primary Hypothesis: Dead birds are an integral part of the process that results in human infection. Sub goals –How do we quantify dead bird activity? –How can we establish the relationship between dead birds and human infection? –Is there a statistical procedure that mirrors the process governing this relationship? –Are the statistical measures adequate?
Quantifying WNV dead bird activity. Point Indicators of WNV –Laboratory Confirmation in Birds-Mosquitoes Temporal lag between laboratory detection of positives and actual presence of virus in the wild. Does not allow for early identification of amplification cycle. Point data, no continuity in space.
Area estimates of WNV infection –Density of Dead Crows and Blue Jays Arbitrary thresholds. Surveillance bias. Modifiable Areal Unit Problem (MAUP). Data regarding the ecology of the disease ignored.
Quantifying WNV dead bird activity. DYCAST Analysis (Dynamic Continuous Area Space Time Analysis) Assumptions: –Good surveillance design and adequate public participation in reporting. –Persons are infected at place of residence. –Non-random space-time interaction of bird deaths attributed to WNV. –WNV is continuous across space.
Quantifying WNV dead bird activity. DYCAST Analysis (contd.) Model Components –Space-time correspondence of the death of birds as amplification measure. Knox method (statistical) –Run Knox as an interpolation function to estimate a surface of WNV activity. –Calibrate the model using ecological information and statistical analysis. –Dynamic: Use a moving window for the temporal domain.
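The moving temporal window can be pictured as a filter that keeps only the bird reports falling inside a space-time cylinder around a location on a given analysis day. The sketch below is an illustration under assumptions: the function name, the `(x, y, day)` event layout, planar coordinates in miles, and the choice to count the analysis day itself as day 0 of the window are not taken from the slides. The 1.5 mile and 21 day defaults are the domains quoted in the calibration slide.

```python
from math import hypot


def events_in_cylinder(events, cx, cy, day, radius=1.5, window=21):
    """Return the (x, y, report_day) events inside the cylinder centred
    on (cx, cy) that ends on `day` and reaches `window` days back."""
    selected = []
    for x, y, report_day in events:
        age = day - report_day          # days before the analysis date
        if 0 <= age < window and hypot(x - cx, y - cy) <= radius:
            selected.append((x, y, report_day))
    return selected


reports = [(0.0, 0.0, 10), (1.0, 1.0, 5), (3.0, 0.0, 10), (0.0, 0.0, -15)]
inside = events_in_cylinder(reports, cx=0.0, cy=0.0, day=10)
```

Re-running the filter with `day` advanced by one gives the next daily window, which is what makes the analysis dynamic.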
Quantifying WNV dead bird activity. Statistical Approach.
MEASURES OF SPACE TIME INTERACTION: THE KNOX TEST (1963)
$$T = \sum_{i=1}^{n} \sum_{j=1}^{i-1} t_{ij}\, s_{ij}$$
Where:
$T$: the test statistic
$t_{ij}$: the distance between points $i$ and $j$: 0 if greater than the critical distance, 1 otherwise
$s_{ij}$: the time between points $i$ and $j$: 0 if greater than the critical time, 1 otherwise
$$N = \frac{n(n-1)}{2}$$
Where:
$N$: the total number of pairs that can be formed from $n$ data points
Knox matrix:
                     Close in space   Not close in space
Close in time        o_11             o_12
Not close in time    o_21             o_22
Where:
cell o_11 is T, the pairs close in space and time
cell o_21 are the pairs close in space only (not in space and time)
cell o_12 are the pairs close in time only (not in space and time)
cell o_22 are the pairs close in neither space nor time
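A minimal sketch of the Knox tabulation follows. It visits each of the n(n-1)/2 pairs once and sorts it into one of the four cells described above. The function name, the `(x, y, day)` event layout and the use of planar Euclidean distance are assumptions for illustration.

```python
from math import hypot


def knox_table(events, crit_dist, crit_time):
    """Classify every pair of (x, y, day) events and return
    (o11, o12, o21, o22), where o11 is the Knox statistic T."""
    o11 = o12 = o21 = o22 = 0
    for i in range(len(events)):
        xi, yi, di = events[i]
        for j in range(i):              # each unordered pair exactly once
            xj, yj, dj = events[j]
            close_space = hypot(xi - xj, yi - yj) <= crit_dist
            close_time = abs(di - dj) <= crit_time
            if close_space and close_time:
                o11 += 1                # close in space and time
            elif close_time:
                o12 += 1                # close in time only
            elif close_space:
                o21 += 1                # close in space only
            else:
                o22 += 1                # close in neither
    return o11, o12, o21, o22


birds = [(0.0, 0.0, 0), (0.1, 0.0, 1), (0.2, 0.0, 30), (5.0, 5.0, 2)]
table = knox_table(birds, crit_dist=0.25, crit_time=3)
```

For these four hypothetical reports the six possible pairs split into one close in both dimensions, two close in time only, two close in space only, and one close in neither.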
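Quantifying WNV dead bird activity. Significance Testing
Poisson:
$$P(X \ge T) = 1 - \sum_{x=0}^{T-1} \frac{e^{-\lambda}\lambda^{x}}{x!}$$
where $\lambda$ is the expected number of pairs close in both space and time.
Chi-Square:
$$\chi^{2} = \sum_{i}\sum_{j} \frac{(O_{ij} - E_{ij})^{2}}{E_{ij}}$$
where: $O_{ij}$ = $O_{11}$, $O_{12}$, $O_{21}$, $O_{22}$ of the Knox matrix, and $E_{ij}$ are the corresponding expected counts.
Monte Carlo: Space-time label switching.
Monte Carlo: Completely random seeding in space and time.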
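The two closed-form significance checks can be sketched from the four Knox cell counts. This is an illustration under assumptions, not the authors' code: the expected counts are taken as (row total x column total) / N, which is the usual independence assumption for a 2 x 2 table, and the Poisson mean is taken to be the expected count of the close-in-both cell. Function names are hypothetical.

```python
from math import exp


def knox_expected(o11, o12, o21, o22):
    """Expected cell counts if closeness in space and in time were independent."""
    n = o11 + o12 + o21 + o22
    row1, row2 = o11 + o12, o21 + o22
    col1, col2 = o11 + o21, o12 + o22
    return (row1 * col1 / n, row1 * col2 / n,
            row2 * col1 / n, row2 * col2 / n)


def poisson_upper_tail(t, lam):
    """P(X >= t) for X ~ Poisson(lam), via 1 minus the cumulative sum."""
    term = exp(-lam)                    # P(X = 0)
    cdf = 0.0
    for k in range(t):
        cdf += term                     # add P(X = k)
        term *= lam / (k + 1)           # advance to P(X = k + 1)
    return 1.0 - cdf


def chi_square_stat(observed):
    """Pearson chi-square statistic over the four Knox cells.
    Assumes no expected count is zero."""
    expected = knox_expected(*observed)
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


cells = (20, 10, 10, 20)                # hypothetical Knox matrix
lam = knox_expected(*cells)[0]          # expected close-in-both pairs
p_poisson = poisson_upper_tail(cells[0], lam)
chi2 = chi_square_stat(cells)
```

With one degree of freedom, `chi2` would be compared against the chi-square critical value for the chosen significance level.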
Monte Carlo Simulations: Space-Time Swindles
–Randomly swap the time labels, keeping the locations fixed.
–Sweep the cylinder (1.5 mi, 21 days) with a smaller cylinder of closeness (0.25 mi, 3 days) in search of close pairs.
–Count the number of pairs that can be formed from the points that fall in the smaller cylinder of closeness.
–Repeat 5000 times and rank the counts of close pairs.
Problem: In case of heavy clustering in either dimension, swapping labels that are already close results in variance underestimation.
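The label-switching procedure can be sketched as a permutation test: hold the locations fixed, shuffle the time labels, recount the close pairs, and see where the observed count ranks. Function names, the fixed random seed and the `(1 + count) / (runs + 1)` p-value convention are assumptions for illustration; the slide uses 5000 repetitions, and a smaller default is used here only to keep the example quick.

```python
import random
from math import hypot


def close_pairs(xy, days, crit_dist, crit_time):
    """Number of pairs close in both space and time."""
    count = 0
    for i in range(len(xy)):
        for j in range(i):
            near = hypot(xy[i][0] - xy[j][0], xy[i][1] - xy[j][1]) <= crit_dist
            if near and abs(days[i] - days[j]) <= crit_time:
                count += 1
    return count


def permutation_p_value(xy, days, crit_dist, crit_time, runs=999, seed=0):
    """Rank the observed close-pair count among counts obtained after
    randomly reassigning the time labels to the fixed locations."""
    rng = random.Random(seed)
    observed = close_pairs(xy, days, crit_dist, crit_time)
    shuffled = list(days)
    at_least = 0
    for _ in range(runs):
        rng.shuffle(shuffled)           # swap time labels, keep locations
        if close_pairs(xy, shuffled, crit_dist, crit_time) >= observed:
            at_least += 1
    return observed, (at_least + 1) / (runs + 1)


# Two hypothetical clusters, each tight in both space and time.
locations = [(0, 0), (0.1, 0), (0, 0.1), (10, 10), (10.1, 10), (10, 10.1)]
report_days = [0, 1, 2, 50, 51, 52]
observed, p_value = permutation_p_value(locations, report_days, 0.25, 3)
```

Because the shuffle only rearranges the existing labels, heavily clustered times stay clustered in every replicate, which is the variance underestimation the slide warns about.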
Random Monte Carlo Simulations
–Randomly seed the cylinder (1.5 mi, 21 days) with X number of points.
–Sweep the cylinder with a smaller cylinder of closeness in search of close pairs.
–Count the number of pairs that can be formed from the points that fall in the smaller cylinder of closeness. Also keep track of close-space, close-time pairs.
–Repeat 5000 times.
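The random seeding variant can be sketched by drawing points uniformly inside the space-time cylinder instead of shuffling existing labels. The sketch is simplified and rests on assumptions: it only counts pairs close in both dimensions, it uses the 0.25 mile and 3 day closeness values quoted on the label-switching slide as defaults, and the function names and p-value convention are hypothetical.

```python
import random
from math import cos, sin, pi, sqrt, hypot


def seed_cylinder(n, radius, duration, rng):
    """Draw n points uniformly in a disc of `radius` and in [0, duration]."""
    points = []
    for _ in range(n):
        r = radius * sqrt(rng.random())     # sqrt keeps the density uniform in area
        angle = 2 * pi * rng.random()
        points.append((r * cos(angle), r * sin(angle), rng.uniform(0, duration)))
    return points


def count_close_pairs(points, crit_dist, crit_time):
    count = 0
    for i in range(len(points)):
        for j in range(i):
            near = hypot(points[i][0] - points[j][0],
                         points[i][1] - points[j][1]) <= crit_dist
            if near and abs(points[i][2] - points[j][2]) <= crit_time:
                count += 1
    return count


def random_seeding_p_value(observed, n, radius=1.5, duration=21,
                           crit_dist=0.25, crit_time=3, runs=5000, seed=0):
    """Share of randomly seeded cylinders with at least `observed` close pairs."""
    rng = random.Random(seed)
    at_least = 0
    for _ in range(runs):
        simulated = seed_cylinder(n, radius, duration, rng)
        if count_close_pairs(simulated, crit_dist, crit_time) >= observed:
            at_least += 1
    return (at_least + 1) / (runs + 1)


sample = seed_cylinder(50, 1.5, 21, random.Random(1))
p_none = random_seeding_p_value(observed=0, n=10, runs=200)
p_all = random_seeding_p_value(observed=45, n=10, runs=200)
```

Since the null distribution depends only on the number of points and the cylinder dimensions, it could in principle be simulated once per point count and reused.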
Methodology Calibration Methodology –Home address of humans testing positive considered the most definitive location of WNV existence. –Calibration date assumed to be 7 days before symptom onset for each case. –Spatial and temporal domains of 1.5 miles and 21 days were chosen based on ecological factors. –Close space/time values were chosen from an ecologically relevant range ( miles/3-7 days).
Calibration Results 2000 retrospective analysis in New York City
Methodology Spatial Design-Prospective Surveillance Overlay a grid (0.5 x 0.5 miles) across NYC and Chicago and run the Knox test at the centroid of each grid cell (each treated as a potential human case) on a daily basis for the 2001 season, using all birds except pigeons.
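Generating the grid-cell centroids at which the test is run can be sketched as below. The function name and the assumption of a rectangular extent in planar coordinates measured in miles are illustrative; a real study area would be clipped to the city boundary.

```python
from math import ceil


def grid_centroids(min_x, min_y, max_x, max_y, cell=0.5):
    """Centroids of a regular grid of `cell`-sized squares covering the
    bounding box, listed row by row from the lower-left corner."""
    cols = ceil((max_x - min_x) / cell)
    rows = ceil((max_y - min_y) / cell)
    return [(min_x + (c + 0.5) * cell, min_y + (r + 0.5) * cell)
            for r in range(rows) for c in range(cols)]


centroids = grid_centroids(0.0, 0.0, 2.0, 1.0)   # 2 x 1 mile test extent
```

Each centroid would then serve as the centre of the space-time cylinder for that day's test.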
Result evaluation NYC, 2001: the number of human cases was not sufficient to quantify performance. Chicago: 215 human cases. Rate of success. Kappa index of agreement. Chi-Squared test.
Publication
CHICAGO 2002 Unconditional Monte Carlo
Days before area was identified as at-risk
Number of days area was lit.
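Kappa
where: N is the total number of areas considered, and $x_{ii}$, $x_{i+}$, $x_{+i}$ are the elements (diagonal cells, row totals and column totals) of the following matrix, the sum of which amounts to N:
                     Rater 2
                     Class 1   Class 2
Rater 1   Class 1    x_11      x_12
          Class 2    x_21      x_22
Measures inter-rater agreement excluding chance:
$$\kappa = \frac{N \sum_{i} x_{ii} - \sum_{i} x_{i+}\, x_{+i}}{N^{2} - \sum_{i} x_{i+}\, x_{+i}}$$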
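Cohen's kappa can be computed directly from the agreement matrix described above, using the diagonal cells and the row and column totals. The function name and the example counts are hypothetical.

```python
def cohens_kappa(matrix):
    """Cohen's kappa for a square agreement matrix (rows: rater 1,
    columns: rater 2). Assumes agreement is not forced by the marginals,
    so the denominator is non-zero."""
    k = len(matrix)
    n = sum(sum(row) for row in matrix)                    # N: total areas
    diagonal = sum(matrix[i][i] for i in range(k))         # sum of x_ii
    row_totals = [sum(row) for row in matrix]              # x_i+
    col_totals = [sum(matrix[i][j] for i in range(k)) for j in range(k)]  # x_+i
    chance = sum(row_totals[i] * col_totals[i] for i in range(k))
    return (n * diagonal - chance) / (n * n - chance)


# Hypothetical counts: rows = model (risk, no risk), columns = cases (yes, no).
kappa = cohens_kappa([[20, 5], [10, 15]])
perfect = cohens_kappa([[10, 0], [0, 10]])
```

A value of 1 means complete agreement and 0 means agreement no better than chance; the first example gives 0.4.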
Space-Time Application of kappa: Run for a selected combination of windows and days prior
Monte Carlo kappa table
Interpreting the results The maximum kappa value is for a 2 day window for 12 days prior –With a 1 day reporting lag and lag for maximum viremia 1-2 days prior to death we have maximum viremia occurring on days 15 and 16 prior to onset of human illness. –Given that extrinsic incubation period in mosquitoes averages 9 days and intrinsic incubation in humans averages 7 days, the above results are consistent with this pathology.
Comparison of statistical analysis and epidemiology Figure 1. Illustration of temporal windows and days prior to onset and model prediction: most likely time that maximum viremia exists in the environment. Figure 2. Time: mosquito infection to onset date of human infection.
Interpreting the results Maximum kappa is followed by a gradual drop of 30% by 7 days prior to infection. –This can be explained by a reduction in avian hosts, which may be causing mosquitoes to search for other sources of blood meals, perhaps humans –This coincides with the likely infection of humans by mosquitoes and may explain the so-called “spillover effect”. Maximum kappa occurred for window sizes 2, 3 and 1, respectively –Maximum viremia in birds occurs between 1-3 days
Monte Carlo-Chi-Square comparison Cross-tabulation of Monte Carlo classification (Risk, No Risk) against Chi-Square classification (Risk, No Risk). Significant at < level.
Broader implications of results Proved the role of dead birds in human infections. Important for control. Supported hypothesis concerning the amplification cycle and spillover effect in WNV. Identified a weakness of the Knox statistic and proposed a way of resolving it. First space-time implementation of the Kappa statistic.
Publication
DYCAST Implementation in California
Implementation
DYCAST Implementation in California
For 2006 and 2007, the entire state of California (every ½ by ½ mile grid cell) is being run every day, beginning May 1 and ending October 1 of each year
Alert to Mosquito control boards in California

Dave,

Here is an update on the DYCAST risk in Sacramento and Yolo counties, in case you may find it useful in advance of the aerial spraying scheduled for next week. The risk has continued to rise sharply in Sacramento County, with a new, large cluster appearing in the Citrus Heights/Foothill Farms/North Highlands area (Attachment A). As you can see from Attachment B, the level of DYCAST risk in Sacramento County is at the exact same level as it was on this date last year (199 lit tiles, square miles). Sacramento County also has the highest level of risk (i.e., the largest combined square mileage of high risk areas) of any county in California at this time (Attachment C).

A: current DYCAST risk map
B: comparative DYCAST risk profiles from
C: comparative DYCAST risk profiles (top 6 high risk counties), 2007
D: animation of the DYCAST high risk areas from June 16 to July 26, 2007

DYCAST high risk areas in 2007 (table columns: date*; Sacramento # tiles, sq. mi.; Yolo # tiles, sq. mi.)

Ryan M. Carney
Coordinator, West Nile Virus Dead Bird Surveillance Program
Associate Public Health Biologist
California Department of Public Health
Vector-Borne Disease Section
850 Marina Bay Parkway
Richmond, CA
Phone: (510) Fax: (510)
Deriving Cellular Automata Rules for Areas at Risk of West Nile Virus Infection
G. Green, PhD student, CARSI, Hunter College – City University of New York; S. Ahearn, CARSI, Hunter College – CUNY; R. Carney, California Department of Health Services; and A. McConchie, CARSI, Hunter College – CUNY
Panels: 24-Bit Encoding Schemes (Master Templates); ArcEngine Model with Daily Sacramento Area DYCAST Output Raster; 2005 Sacramento Season; Sacramento CA Accuracy
Selection of master template and sub-templates via mutual information and genetic algorithm based on accuracy of CA output.
Data: California Department of Health Services