Simulation Science Laboratory
Modeling Disease Transmission Across Social Networks
DIMACS seminar, February 7, 2005
Stephen Eubank
Virginia Bioinformatics Institute, Virginia Tech

Variations on a Theme
I. Estimating a Social Network
II. Varieties of Social Networks
III. Characterizing Networks for Epidemiology

Translation
Compute structural properties of very large graphs
  – Which ones? Are local properties enough?
Structural properties should be robust
  – How? Need efficient algorithms
Generate constrained random graphs
  – for experiment: Chung-Lu, Reed-Molloy, MCMC
  – for analysis: preserve independence as much as possible

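
The slide names Chung-Lu as one way to generate constrained random graphs. The following is a minimal sketch of the standard Chung-Lu construction, in which each pair of vertices i, j is joined independently with probability w_i * w_j / sum(w). The function name `chung_lu_graph` and the quadratic pair loop are illustrative assumptions, not the talk's implementation, and a loop over all pairs would not scale to the very large graphs discussed here.

```python
import random
from itertools import combinations


def chung_lu_graph(expected_degrees, seed=None):
    """Sample a simple undirected graph (no self-loops) in which vertex i
    has expected degree close to expected_degrees[i]."""
    rng = random.Random(seed)
    total = sum(expected_degrees)
    edges = set()
    if total == 0:
        return edges
    for i, j in combinations(range(len(expected_degrees)), 2):
        # Join each pair independently with probability w_i * w_j / sum(w),
        # capped at 1 so that it stays a valid probability.
        p = min(1.0, expected_degrees[i] * expected_degrees[j] / total)
        if rng.random() < p:
            edges.add((i, j))
    return edges
```

Because edges are sampled independently, this kind of generator keeps the analysis tractable, which is the property the slide asks for under "preserve independence as much as possible".
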
If not uniform mixing, what?
[Figure labels: ODE model; network model; homogeneous; isotropic; ?...; ~$2^{N^2}$ alternative networks]

Do Local Constraints Fix Global Properties?
N vertices: ~$2^{N^2}$ graphs (non-identical vertices, few symmetries)
E edges: ~$N^{2E}$ graphs
Degree distribution: ?? graphs
Clustering coefficient: ?? graphs
What additional constraints: ?? graphs equivalent w.r.t. epidemics?

Estimating a social network
Synthetic population
Survey (diary) based activity templates
Iterative solution to a large game
  – Assigning locations for activities (depends on travel times)
  – Planning routes
  – Estimating travel times (depends on activity locations)

Example Synthetic Household
Example Route Plans [Figure panels: first person in household; second person in household]
Estimating Travel Times by Microsimulation
[Figure labels: 7.5 meter, 1 lane cellular automaton grid cells; intersection with multiple turn buffers (not internally divided into grid cells); single-cell vehicle; multiple-cell vehicle]

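
To illustrate how a cellular automaton can produce travel-time estimates, here is a minimal sketch of a single-lane update rule in the style of the Nagel-Schreckenberg model. The slide does not name the rule it uses, so the rule itself, the ring-road layout, and the parameters `v_max` and `p_slow` are assumptions made for illustration; only the 7.5 meter cell length comes from the slide. Intersections, turn buffers, and multiple-cell vehicles are not modeled.

```python
import random

CELL_LENGTH_M = 7.5  # cell length taken from the slide


def step(road, v_max=5, p_slow=0.3, rng=random):
    """Advance a single-lane ring road by one time step.

    road[c] is None for an empty cell, or the speed (cells per step)
    of the single-cell vehicle occupying cell c.
    """
    n = len(road)
    new_road = [None] * n
    for cell, speed in enumerate(road):
        if speed is None:
            continue
        # Count empty cells ahead, looking no farther than v_max.
        gap = 0
        while gap < v_max and road[(cell + gap + 1) % n] is None:
            gap += 1
        # Accelerate by one, but never exceed v_max or the free gap.
        speed = min(speed + 1, v_max, gap)
        # Random slowdown models imperfect driving.
        if speed > 0 and rng.random() < p_slow:
            speed -= 1
        new_road[(cell + speed) % n] = speed
    return new_road


def mean_speed_m_per_step(road):
    """Average vehicle speed in meters per time step."""
    speeds = [s for s in road if s is not None]
    return CELL_LENGTH_M * sum(speeds) / len(speeds) if speeds else 0.0
```

Repeating `step` and dividing a link's length by the observed mean speed gives a rough travel-time estimate for that link, which is the quantity the iterative location and route assignment needs.
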
Typical Family’s Day [Figure: timeline of activities (Home, Work, Lunch, Work, Shopping, Daycare, School) joined by travel modes (Carpool, Bus, Car); axis: time]
Others Use the Same Locations [Figure; axis: time]
Time Slice of a Social Network
Home Activities Adapt to Situation
Example: Smallpox Response Efficacy # deaths per initial infected by day 100
Part II: Varieties of Social Networks
Definition of vertex
  – People
  – Concepts (location, role in society, group)
Definition of edge
  – Effective contact
  – Proximity
Weights
  – Edges: interaction strength / probability of transmission
  – Vertices: “importance”
Time dependence
Directionality

A Social Network: multipartite labeled graph
People (8.8 million)
Vertex attributes: age, household size, gender, income, …

A Social Network: bipartite labeled graph
Locations (1 million)
Vertex attributes: (x,y,z), land use, …

A Social Network: bipartite labeled graph
Edge attributes:
  – activity type: shop, work, school
  – (start time 1, end time 1)
  – probability of transmitting

A Social Network: projection onto people
A Social Network: projection onto people [Figure labels: time intervals [t1,t2], [t2,t3], [t3,t4], [t4,t5]]
A Social Network: projection over time
Dendrogram: actual path disease takes
A Social Network: bipartite labeled graph
A Social Network: projection onto locations
A Social Network: projection onto locations [Figure labels: t2, t3, t4]
A Social Network: projection over time
Disease Dynamics & Scenario Determine Relevant Projections
People projection: edge if people co-located
  – communicable disease + vaccination/isolation
Location projection: directed edge if travel between locations
  – contamination, quarantine
Time dependence: almost periodic
  – Important time scales set by disease dynamics:
      Infectious period
      Duration of contact for transmission

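
The two projections can be sketched directly from the bipartite person-location graph. The code below assumes each bipartite edge is stored as a hypothetical `(person, location, start, end)` tuple; this record layout and the function names are illustrative, not the EpiSims data format. People are linked if they occupy the same location during overlapping intervals, and locations are linked by a directed edge if some person visits one and then the other.

```python
from collections import defaultdict
from itertools import combinations


def people_projection(visits):
    """Undirected edges between people who are co-located in time."""
    by_location = defaultdict(list)
    for person, location, start, end in visits:
        by_location[location].append((person, start, end))
    edges = set()
    for occupants in by_location.values():
        for (p, s1, e1), (q, s2, e2) in combinations(occupants, 2):
            # Two visits overlap if each starts before the other ends.
            if p != q and s1 < e2 and s2 < e1:
                edges.add(frozenset((p, q)))
    return edges


def location_projection(visits):
    """Directed edges (a, b) where someone travels from a to b."""
    by_person = defaultdict(list)
    for person, location, start, end in visits:
        by_person[person].append((start, location))
    edges = set()
    for stops in by_person.values():
        stops.sort()  # order each person's visits by start time
        for (_, a), (_, b) in zip(stops, stops[1:]):
            if a != b:
                edges.add((a, b))
    return edges


visits = [
    ("alice", "home", 0, 8), ("alice", "work", 9, 17),
    ("bob", "home", 0, 7), ("bob", "school", 8, 15),
    ("carol", "work", 12, 18),
]
print(people_projection(visits))
print(location_projection(visits))
```

In this toy data, alice and bob are linked through the home, alice and carol through the workplace, and the location projection contains home→work and home→school.
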
Example: Person-person graph
Person-person graph (~ dendrogram with $p_{\text{transmission}} = 1$)
Dendrogram with $p_{\text{transmission}} \ll 1$
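
The relation between the person-person graph and the dendrogram can be illustrated with a small sketch. It assumes a static contact graph, given as a dict from each person to a set of neighbors, and a much simpler transmission rule than a full simulation: each newly infected person gets exactly one chance to infect each neighbor, with a hypothetical per-contact probability `p_transmission`. The returned mapping records who infected whom, which is the dendrogram.

```python
import random


def transmission_tree(neighbors, index_case, p_transmission, seed=None):
    """Return {person: infector} for everyone reached from index_case.

    The index case maps to None. Spread proceeds generation by
    generation over a static contact graph.
    """
    rng = random.Random(seed)
    infected_by = {index_case: None}
    frontier = [index_case]
    while frontier:
        next_frontier = []
        for person in frontier:
            for contact in neighbors[person]:
                # Only susceptible contacts can be infected.
                if contact not in infected_by and rng.random() < p_transmission:
                    infected_by[contact] = person
                    next_frontier.append(contact)
        frontier = next_frontier
    return infected_by
```

With `p_transmission = 1` the tree spans the whole connected component of the index case, so it follows the person-person graph; with `p_transmission` much less than 1 it covers only a sparse, run-dependent part of it.
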
Geographic spread
Characterizing EpiSims Networks
Degree distributions
Pointwise clustering: ratio of # triangles to # possible
Assortative mixing by degree, age, …
Shortest path length distribution
Expansion

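
Two of these statistics are simple enough to sketch in a few lines. The code assumes an undirected graph stored as a dict from each vertex to a set of neighbors; the function names are illustrative. Pointwise clustering follows the definition on the slide: the number of triangles through a vertex divided by the number of possible triangles, that is, the number of pairs of its neighbors.

```python
from collections import Counter
from itertools import combinations


def degree_distribution(neighbors):
    """Map each degree to the number of vertices having that degree."""
    return Counter(len(nbrs) for nbrs in neighbors.values())


def pointwise_clustering(neighbors, v):
    """Ratio of triangles through v to pairs of neighbors of v."""
    nbrs = neighbors[v]
    possible = len(nbrs) * (len(nbrs) - 1) // 2
    if possible == 0:
        return 0.0  # fewer than two neighbors: no triangle is possible
    triangles = sum(1 for a, b in combinations(nbrs, 2) if b in neighbors[a])
    return triangles / possible


# A triangle (0, 1, 2) with one extra vertex 3 attached to vertex 2.
graph = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2}}
print(degree_distribution(graph))
print(pointwise_clustering(graph, 2))
```

Vertex 2 has three neighbors and therefore three possible triangles, of which only one (with vertices 0 and 1) exists, so its pointwise clustering is 1/3.
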
Degree Distribution, location-location
Degree Distribution, people-people
Sensitivity to parameters
Assortative Mixing in EpiSims Graphs
Static people-people projection is assortative
  – by degree (~0.25)
  – but not as strongly by age, income, household size, …
This is like other social networks, and unlike
  – technological networks
  – Erdos-Renyi random graphs
  – Barabasi-Albert networks

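
Assortativity by degree is commonly measured as the Pearson correlation between the degrees found at the two ends of an edge. The sketch below assumes that definition and an edge list in which each undirected edge appears once; the slide does not state which estimator produced the ~0.25 figure, so this is an illustration rather than the talk's computation.

```python
def degree_assortativity(edges):
    """Pearson correlation of end-point degrees over all edges.

    Returns NaN when it is undefined (no edges, or every vertex has
    the same degree).
    """
    degree = {}
    for u, v in edges:
        degree[u] = degree.get(u, 0) + 1
        degree[v] = degree.get(v, 0) + 1
    # Count each undirected edge in both orientations so the
    # measure is symmetric in its two ends.
    xs, ys = [], []
    for u, v in edges:
        xs += [degree[u], degree[v]]
        ys += [degree[v], degree[u]]
    n = len(xs)
    if n == 0:
        return float("nan")
    mean = sum(xs) / n  # xs and ys have the same mean by symmetry
    var = sum((x - mean) ** 2 for x in xs) / n
    if var == 0:
        return float("nan")
    cov = sum((x - mean) * (y - mean) for x, y in zip(xs, ys)) / n
    return cov / var


star = [(0, 1), (0, 2), (0, 3)]
print(degree_assortativity(star))
```

A star is the extreme disassortative case, since every edge joins the high-degree hub to a degree-1 leaf. A positive value, as reported for the people-people projection, means that high-degree people tend to be in contact with other high-degree people.
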
Removing high degree people useless
Removing high degree locations better
Clustering coefficient vs degree
Characterizing Networks for Epidemiology
Question: how to change a network to reduce [casualties]?
Constraints:
  – Don’t know ahead of time where outbreak begins
  – Minimize impact on other social functions of network
  – Don’t know true network, only estimated one
  – Incorporate dependence on pathogen properties
Optimization:
  – Propose edge/vertex removal based on measurable (local) properties
  – Quickly estimate effect of new structure
How does propagation depend on structure?

Suggested Metric
$N_k(i)$ = number of distinct people connected to person i by a (shortest) path of length k
“k-betweenness”, “pointwise k-expansion”
Important k values are related to ratio of incubation to response times
Shortest path vs any path: depends on probability of transmission
  – Given $N_1(i), \ldots, N_k(i)$, can construct analog for non-shortest path of length k
✗ Assumes static graph, but expect graph to change
Simple cases incorporate intuitively important properties
  – For k=1, $N_1(i) = d(i)$
  – For k=2, includes degree distribution, clustering, assortativity by degree

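
For the shortest-path version of the metric, N_k(i) is the number of vertices at graph distance exactly k from i, which a breadth-first search computes directly. The sketch below assumes a static undirected graph stored as a dict from each person to a set of neighbors; the function name `n_k` is illustrative.

```python
from collections import deque


def n_k(neighbors, i, k_max):
    """Return [N_1(i), ..., N_k_max(i)] using shortest-path distances."""
    distance = {i: 0}
    counts = [0] * (k_max + 1)
    queue = deque([i])
    while queue:
        u = queue.popleft()
        if distance[u] == k_max:
            continue  # vertices farther than k_max are not needed
        for v in neighbors[u]:
            if v not in distance:
                distance[v] = distance[u] + 1
                counts[distance[v]] += 1
                queue.append(v)
    return counts[1:]


# A triangle (0, 1, 2) with one extra vertex 3 attached to vertex 2.
graph = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2}}
print(n_k(graph, 0, 3))
```

For vertex 0 the first entry is its degree, matching the k=1 case on the slide, and the triangle lowers the second entry because vertices 1 and 2 share neighbors instead of reaching new people.
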
Comparison to “usual suspects”
✗ Harder to measure in real networks
✗ Difficult to work with analytically
  – Perturbative expansions (say, around tree-like structure) lack a small parameter to expand in
Describes how clustering should be combined with degree
Degree alone determines neither vulnerability nor criticality
Betweenness is global, sensitive to small changes
Usual statistics don’t incorporate time scales naturally

Degree alone determines neither vulnerability nor criticality
Same degree distribution
Different assortative mixing by degree
Introduce index case uniformly at random: what color (degree) is vulnerable?
  – Top graph: degree 1, 80% of the time
  – Bottom graph: degree 4, 80% of the time
[Figure label: Critical vertex]

Use depends on how disease is introduced
Introduction uniformly distributed: consider distribution over all people (mean, variance, …)
Introduction concentrated on specific part of graph: consider distribution over k-neighborhood
Introduction by malicious agent: consider worst case or tail

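
The three introduction scenarios correspond to three ways of summarizing a per-person metric. The sketch below assumes the metric has already been computed and stored in a dict from person to value; the function name `summarize` and the optional `people` argument, which restricts the summary to a chosen part of the graph such as a k-neighborhood, are illustrative.

```python
import statistics


def summarize(values_by_person, people=None):
    """Summarize a per-person metric over everyone, or over a subset."""
    if people is None:
        values = list(values_by_person.values())
    else:
        values = [values_by_person[p] for p in people]
    return {
        "mean": statistics.fmean(values),          # uniform introduction
        "variance": statistics.pvariance(values),  # spread around the mean
        "worst_case": max(values),                 # malicious introduction
    }


metric = {"a": 1, "b": 2, "c": 3}
print(summarize(metric))
print(summarize(metric, people=["a", "b"]))
```

Here "worst case" is taken to be the largest metric value, on the assumption that a larger value means a larger reachable population; a tail quantile could be substituted in the same place.
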
Conclusion
Progress on many fronts, but plenty more to be done:
  – Estimating large social networks
  – Building efficient, scalable simulations
  – Understanding structure of social networks
  – Determining how structure affects disease spread