1
Network Dynamics and Simulation Science Laboratory
A Data-driven Epidemiological Model
Stephen Eubank, Christopher Barrett, Madhav V. Marathe
GIACS Conference on "Data in Complex Systems", Palermo, Italy, April 7-9, 2008
5
Data-driven epidemiological models
I. Complex system
II. Data-driven, individual-based simulation
III. Privacy and accuracy issues
6
What’s so complex about epidemiology?
Consider an “outbreak” among 4 people
[Figure legend: susceptible, infectious, removed]
7
Outbreaks can be represented as Markov processes
A given configuration of the system probabilistically transitions into any of several other configurations.
Even a small system has many possible configurations.
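To make "many possible configurations" concrete, here is a minimal sketch that enumerates every configuration of a small outbreak. It assumes each person is in exactly one of the three states from the preceding slide (susceptible, infectious, removed); the names `STATES` and `configurations` are illustrative and not from the presentation.

```python
from itertools import product

# Assumed state labels: susceptible, infectious, removed.
STATES = ("S", "I", "R")

def configurations(n_people):
    """Return every assignment of one health state to each of n_people."""
    return list(product(STATES, repeat=n_people))

configs = configurations(4)
print(len(configs))  # 3**4 = 81 configurations for 4 people
```

The count grows as 3^n, so each additional person triples the number of nodes in the Markov chain.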
8
Very little data is available to estimate this process
Historically, we (partially) observe 1 or 2 Markov chains.
We want to estimate transition probabilities on every edge.
9
Aggregation simplifies the model …
… at the cost of reduced information content.
p(C'_{t+1} | C'_t) is less informative than p(C_{t+1} | C_t) when C' is an aggregation of C.
[Figure: aggregated state space with axes #S and #I, each running from 0 to 4]
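The information lost by aggregation can be counted directly. The sketch below assumes the aggregated configuration keeps only the counts (#S, #I), as the figure axes suggest, and groups the individual-level configurations of 4 people by those counts. The function names are hypothetical.

```python
from collections import Counter
from itertools import product

STATES = ("S", "I", "R")  # assumed labels: susceptible, infectious, removed

def aggregate(config):
    """Map an individual-level configuration to its (#S, #I) counts."""
    counts = Counter(config)
    return counts["S"], counts["I"]

def aggregated_classes(n_people):
    """Count how many detailed configurations collapse into each aggregate."""
    return Counter(aggregate(c) for c in product(STATES, repeat=n_people))

classes = aggregated_classes(4)
print(len(classes))     # 15 aggregated configurations instead of 81
print(classes[(2, 1)])  # 12 detailed configurations share #S=2, #I=1
```

Every detailed configuration inside one aggregate class becomes indistinguishable, which is the reduced information content the slide refers to.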
10
Other assumptions further simplify the model …
… but are unwarranted in social systems, where components are
1. Heterogeneous (distinguishable)
2. Intentional (behavior not determined by physical laws)
11
Aggregation naturally makes contact with observations
Observations of outbreaks often ignore heterogeneity and intention, and provide only point estimates.
12
“An approximate answer to the right problem is worth a good deal more than an exact answer to an approximate problem” - J. Tukey
“All models are wrong, but some are useful” - G.E.P. Box
A system is complex “if its behavior crucially depends on the details of its parts.” - G. Parisi
13
Interaction approach simplifies process itself
Interactions among system components completely determine transition probabilities among configurations.
[Figure: one diagram replaced with another; images not preserved]
14
Calibrating with unexpectedly rich data
For aerosol-borne pathogens, the probability of transmission is related to physical proximity, duration, etc.
The interaction approach reduces to estimating a social network.
There is much more data available for this than for outbreaks.
But it is not directly observable.
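The slide says only that transmission probability is "related to" proximity and duration; it does not give a formula. As one illustrative possibility, the sketch below assumes an independent, constant chance of transmission in each minute of contact, which gives p = 1 - (1 - r)^t. Both the functional form and the parameter names are assumptions, not the authors' stated model.

```python
def transmission_probability(duration_minutes, rate_per_minute):
    """Chance of at least one transmission during a contact.

    Assumes (hypothetically) an independent per-minute transmission
    chance `rate_per_minute` over `duration_minutes` of contact.
    """
    if not 0.0 <= rate_per_minute <= 1.0:
        raise ValueError("rate_per_minute must be a probability")
    if duration_minutes < 0:
        raise ValueError("duration_minutes must be non-negative")
    return 1.0 - (1.0 - rate_per_minute) ** duration_minutes

# Longer contacts are riskier under this assumed form.
print(transmission_probability(45, 0.001))
print(transmission_probability(320, 0.001))
```

Under this form, a whole outbreak model is parameterized by one rate per disease plus the contact durations from the social network.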
15
How can we estimate a social network?
16
A possible approach we didn’t use
Consider a subset of random networks subject to certain constraints.
Constraints should be relevant to the global dynamics, i.e. epidemics.
But what are those? A “chicken or the egg” problem:
It would seem offhand that a taxonomy of “nets” … would arise naturally from the consideration of the statistical parameters... But the statistical parameters themselves are singled out on the basis of taxonomic considerations, which have yet to be clarified.
- Anatol Rapoport and William Horvath, Behav Sci. 1961, 6, 279–291
17
Questions to drive model development
1. What is the optimal targeted allocation of antivirals used prophylactically or therapeutically to mitigate an influenza pandemic?
2. What combination of targeted antivirals and feasible, community-based, non-pharmaceutical interventions (e.g. closing schools, allowing liberal leave from work) can best delay an outbreak from becoming epidemic for several months?
1 & 2: Models must compare changes in social network with changes in transmissibility.
This is an example of policy informatics for complex systems.
18
Interventions specified naturally by effect on network
No single “knob” reduces overall transmission by 50%
19
Step 1. Create a synthetic population
Census data
–Individual demographics: age and gender
–Household characteristics: size and income
20
Successive refinement of synthetic data
Start from a proto-population, e.g. a list of ids.
Add observed data.
Capture correlations in data using statistical models (iterative proportional fitting from Public Use Microdata).

ID         Gender   Household
1          M        1
...        ...      ...
3 x 10^8   F        1.2 x 10^8
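Iterative proportional fitting can be sketched in a few lines. The example below rescales a seed cross-tabulation, standing in for a microdata sample, until its row and column sums match target marginals, standing in for census totals. All numbers and the function name `ipf` are made up for illustration; this is a generic two-way version, not the authors' implementation.

```python
def ipf(seed, row_targets, col_targets, iterations=100, tol=1e-9):
    """Fit a 2-D table to row/column targets by alternating rescaling."""
    table = [row[:] for row in seed]
    for _ in range(iterations):
        # Scale each row to match its target total.
        for i, target in enumerate(row_targets):
            total = sum(table[i])
            if total > 0:
                table[i] = [v * target / total for v in table[i]]
        # Scale each column to match its target total.
        for j, target in enumerate(col_targets):
            total = sum(row[j] for row in table)
            if total > 0:
                for row in table:
                    row[j] *= target / total
        # Columns now match; stop once rows also match.
        row_error = max(abs(sum(table[i]) - t)
                        for i, t in enumerate(row_targets))
        if row_error < tol:
            break
    return table

# Hypothetical sample: rows = age group, columns = household size class.
seed = [[10.0, 20.0], [30.0, 40.0]]
fitted = ipf(seed, row_targets=[400.0, 600.0], col_targets=[450.0, 550.0])
for row in fitted:
    print([round(v, 2) for v in row])
```

The rescaling preserves the odds ratio of the seed table, which is how the correlation structure of the sample carries over to the full synthetic population.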
21
Step 2. Assign activities, locations & times
Locations
–Dun & Bradstreet data
Activity surveys
–Matched to households by demographics
–Matched to locations by activity type & travel time
22
Successive refinement of synthetic data
Surveys are very different kinds of data sources than census.
This step depends on data fusion capability.
Some values may be outcomes of very large games, not statistical models.

ID         Gender   Household    Activities   Activity Locations   Activity Times
1          M        1            School       27                   8:00
                                 shop         43                   3:00
...        ...      ...          ...          ...                  ...
3 x 10^8   F        1.2 x 10^8   Work         98734                9:00
                                 social       723947               7:30
23
So far: a typical family’s day
[Figure: timeline of each family member’s day. Locations: Home, Work, Lunch, Shopping, Daycare, School. Travel modes: Carpool, Bus, Car. Horizontal axis: time]
24
Overlapping families’ days create a social network
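A rough sketch of how overlapping schedules become a contact network: two people are linked whenever they are at the same location at the same time, and the edge weight is the total overlap. The visit records, location names, and the rule that everyone co-present at a location is in contact are all simplifying assumptions made for illustration.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical visit records: (person, location, start_minute, end_minute).
visits = [
    (1, "school_27", 480, 900),
    (2, "school_27", 480, 900),
    (3, "work_98734", 540, 1020),
    (1, "shop_43", 900, 945),
    (3, "shop_43", 930, 960),
]

def contact_network(visits):
    """Return {(person_a, person_b): minutes of co-location}."""
    by_location = defaultdict(list)
    for person, location, start, end in visits:
        by_location[location].append((person, start, end))
    contacts = defaultdict(int)
    for people in by_location.values():
        for (p, s1, e1), (q, s2, e2) in combinations(people, 2):
            overlap = min(e1, e2) - max(s1, s2)  # shared minutes, if any
            if p != q and overlap > 0:
                contacts[tuple(sorted((p, q)))] += overlap
    return dict(contacts)

print(contact_network(visits))  # {(1, 2): 420, (1, 3): 15}
```

Changing a schedule (for example, closing the school) changes the visit list and therefore the network, with no separate network model to maintain.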
25
Successive refinement of synthetic data
Gives us a generative model for contacts
More powerful than traditional encapsulated agents
Note: each byte of data / person adds ~300 MB to the database

ID         Gender   Household    Activities   Activity Locations   Activity Times   Contacts   Contact Duration
1          M        1            School       27                   8:00             2,3,4      5:20
                                 shop         43                   3:00             836, 289   0:45
...        ...      ...          ...          ...                  ...              ...        ...
3 x 10^8   F        1.2 x 10^8   Work         98734                9:00
                                 social       723947               7:30
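The ~300 MB figure follows from the population size in the table, about 3 x 10^8 people. A quick check, assuming decimal megabytes (10^6 bytes) and no storage overhead:

```python
POPULATION = 3 * 10**8  # synthetic individuals, from the table above

def added_storage_mb(bytes_per_person, population=POPULATION):
    """Raw storage added by a new per-person field, in decimal MB."""
    return bytes_per_person * population / 10**6

print(added_storage_mb(1))  # 300.0
print(added_storage_mb(8))  # 2400.0, e.g. one 64-bit value per person
```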
26
Using data for purposes other than intended
Possibly the only epidemiological model that has been calibrated using automobile traffic counts!
(Because the same activity model generates both transportation demand and contact networks)
28
Activities adapt to situation & generate network changes
[Figure label: Home]
29
Derive disease interaction from social network
Interactions only need to get a few things right:
–Susceptibility
–Infectivity as a function of time since exposure
30
Modeling pandemic influenza
Nobody knows what pandemic flu will look like
Assume something like seasonal flu, but with less immunity
Create several “flu” bugs in silico
–Moderate (10% attack rate)
–Strong (20-25% attack rate)
–Catastrophic (> 50% attack rate)
For each, fix other characteristics:
–Incubation period: 2-3 days
–Infectious period: 2-5 days
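One way to picture an in silico "bug" with these fixed characteristics is as a per-person disease course plus an infectivity curve. The sketch below draws the incubation and infectious periods uniformly from the stated ranges and uses a simple on/off infectivity. The uniform draw and the step-shaped curve are assumptions for illustration; the slides give only the ranges.

```python
import random

def sample_course(rng):
    """Draw (incubation_days, infectious_days) from the stated ranges."""
    incubation = rng.randint(2, 3)  # 2-3 days, assumed uniform
    infectious = rng.randint(2, 5)  # 2-5 days, assumed uniform
    return incubation, infectious

def infectivity(days_since_exposure, incubation, infectious, peak=1.0):
    """Assumed step function: `peak` while infectious, zero otherwise."""
    if incubation <= days_since_exposure < incubation + infectious:
        return peak
    return 0.0

rng = random.Random(0)
incubation, infectious = sample_course(rng)
curve = [infectivity(day, incubation, infectious) for day in range(10)]
print(incubation, infectious, curve)
```

The three strengths (moderate, strong, catastrophic) would then differ only in a transmissibility scale such as `peak`, tuned until the simulated attack rate hits its target.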
31
Resolution, fidelity, and accuracy are different
Resolution describes level of aggregation, e.g. individuals vs populations.
Fidelity describes the completeness of the representation’s features, e.g. age vs (age, gender, income, household size, education).
Accuracy describes the correctness of features and correlations, e.g. is mixing by age derived from social network correct?
“Validity” (always for a particular question) depends on all 3.
32
Effect of changes in social networks (above) on disease dynamics (below)
33
Characterizing the resulting network
34
Degree Distribution, location-location
35
Degree Distribution, people-people
36
Sensitivity to parameters
38
Assortative Mixing
Static people-people projection is assortative
–by degree (~0.25)
–but not as strongly by age, income, household size, …
This is like other social networks, and unlike
–technological networks
–Erdős-Rényi random graphs
–Barabási-Albert networks
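Assortativity by degree is usually measured as the Pearson correlation between the degrees at the two ends of each edge. The sketch below computes that coefficient for a simple undirected graph given as an edge list with no self-loops or repeated edges. It is a generic textbook calculation on toy graphs, not the code or data behind the ~0.25 figure.

```python
from collections import Counter

def degree_assortativity(edges):
    """Pearson correlation of end-point degrees over all edges."""
    degree = Counter()
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    # Count each undirected edge in both directions so the measure is symmetric.
    xs, ys = [], []
    for u, v in edges:
        xs += [degree[u], degree[v]]
        ys += [degree[v], degree[u]]
    n = len(xs)
    mean = sum(xs) / n  # xs and ys share the same mean by symmetry
    cov = sum((x - mean) * (y - mean) for x, y in zip(xs, ys)) / n
    var = sum((x - mean) ** 2 for x in xs) / n
    if var == 0:
        raise ValueError("undefined when every edge end has the same degree")
    return cov / var

star = [(0, 1), (0, 2), (0, 3)]                 # hub linked to leaves
clustered = [(0, 1), (1, 2), (0, 2), (3, 4)]    # triangle plus a separate pair
print(degree_assortativity(star))       # -1.0, fully disassortative
print(degree_assortativity(clustered))  # 1.0, fully assortative
```

A positive value means high-degree people tend to be in contact with other high-degree people.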
39
Removing high degree people useless
40
Removing high degree locations better
41
Summary
Complex systems models are hungry for detail (= data)
Privacy & extrapolation require “synthetic” data, combining observations (declarative), statistical models, and simulation results (procedural)
Validity of synthetic data depends on resolution, fidelity, accuracy, and the question it is intended to answer
43
When is this model simpler?
Notation: x and y are states of a component at time t and t+1
1. Components’ states are updated independently: # parameters
2. Interactions are pairwise independent: # parameters
44
When is this model simpler?
3. Most components do not interact directly: # parameters
4. Only one state transition, S → I, is affected by interactions: # parameters
45
Architecture
46
Computational Resources
Demonstration experiment
–8 experiments (exp ids: 1083 to 1090)
–24 cells with 200 days and 25 reps
Computations performed
–291 million contacts * 200 days * 25 reps * 24 cells = 34.92 trillion transmission evaluations
Time Requirements
–Single processor: 2 years 340 days
–Small cluster (10 nodes, 4 cores): 26 days 18 hours
–Current IDAC cluster: > 3 hours
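The workload and timing figures can be checked with plain arithmetic. The sketch below multiplies the listed factors, which gives about 3.492 x 10^13 evaluations, and divides the single-processor time by the small cluster's core count. It assumes a 365-day year and perfect linear scaling across 10 nodes of 4 cores each.

```python
contacts = 291_000_000
days = 200
reps = 25
cells = 24

evaluations = contacts * days * reps * cells
print(f"{evaluations:.4e}")  # 3.4920e+13 transmission evaluations

single_processor_days = 2 * 365 + 340  # 2 years 340 days
cores = 10 * 4                         # 10 nodes with 4 cores each
cluster_days = single_processor_days / cores
whole_days = int(cluster_days)
hours = (cluster_days - whole_days) * 24
print(whole_days, "days", hours, "hours")  # 26 days 18.0 hours
```

The small-cluster figure matches ideal 40-way scaling of the single-processor time.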
47
Example Located Synthetic Population
48
Example Route Plans
[Figure panels: first person in household; second person in household]
49
Time Slice of a Typical Family’s Day
50
Original research contributions are solicited covering a variety of topics including but not limited to:
–data gathering in complex system research
–data mining of large data sets: methods and tools
–dealing with personal data in complex system research
–data sharing of proprietary data for research purposes
–legal aspects of the use of proprietary data for research purposes
–characteristics of data used in complex system research in the fields of biology, biomedicine, consumer preferences, finance, social networks, sociology, traffic problems and telecommunication
51
How much does detail matter?
Interaction picture:
–Dynamics of outbreak depend on topology
–How and how much?
–What differences in network topology are relevant to prevention/mitigation?
What statistics capture difference?
Answer staring us in the face (see above):
–Overall attack rate is a function of the topology of the network
Other measures for other questions
–Attack rate by transmissibility as function of edges retained
–Vulnerability of a subset as function of edges retained
–Distribution of vulnerabilities as function of edges retained
52
Edge deletion in a graph
RTI synthesized poultry farm network
In collaboration with UPenn, studying outbreaks
National network, essentially complete graph
–Distribution of weights
Attack rate as function of edges retained
Attack rate by transmissibility as function of edges retained
Vulnerability of a subset as function of edges retained
Distribution of vulnerabilities as function of edges retained
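An edge-retention experiment can be sketched on a toy weighted complete graph. The example keeps only the heaviest fraction of edges and reports the share of nodes reachable from a seed node. That share is the attack rate only in the limiting case where every retained edge transmits, so it is a deterministic stand-in for the simulated attack rates named above, not the presentation's method. The graph, weights, and function names are hypothetical.

```python
from itertools import combinations

def component_size(n_nodes, edges, seed_node):
    """Number of nodes connected to seed_node (union-find)."""
    parent = list(range(n_nodes))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for u, v in edges:
        parent[find(u)] = find(v)
    root = find(seed_node)
    return sum(1 for node in range(n_nodes) if find(node) == root)

def reachable_fraction_by_retention(n_nodes, weighted_edges, seed_node, fractions):
    """Map each retained-edge fraction to the reachable share of nodes."""
    ranked = sorted(weighted_edges, key=lambda e: e[2], reverse=True)
    result = {}
    for fraction in fractions:
        keep = round(fraction * len(ranked))  # heaviest edges first
        kept = [(u, v) for u, v, _ in ranked[:keep]]
        result[fraction] = component_size(n_nodes, kept, seed_node) / n_nodes
    return result

# Toy complete graph on 5 nodes; weight decays with index distance.
toy_edges = [(u, v, 1.0 / (v - u)) for u, v in combinations(range(5), 2)]
print(reachable_fraction_by_retention(5, toy_edges, 0, [0.0, 0.2, 0.4, 1.0]))
# {0.0: 0.2, 0.2: 0.6, 0.4: 1.0, 1.0: 1.0}
```

In this toy case the heaviest 40% of edges already connect the whole graph.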
53
Model comparison
Compare outcomes of same scenarios
–Compare distributions of outcomes of similar scenarios
–Compare distributions of summary statistics of outcomes of similar scenarios
–Compare distributions of answers to questions about similar scenarios
54
Adds up to a serious informatics challenge
Managing the refinement process
Integrating various data sources & simulations
Curating the database
Providing HPC services
Providing analysis support