
DAVA: Distributing Vaccines over Networks under Prior Information

Yao Zhang, B. Aditya Prakash. Department of Computer Science, Virginia Tech. SDM, Philadelphia, April 24, 2014.


Presentation Transcript


  1. DAVA: Distributing Vaccines over Networks under Prior Information
     Yao Zhang, B. Aditya Prakash
     Department of Computer Science, Virginia Tech
     SDM, Philadelphia, April 24, 2014

  2. Motivation: Epidemiology
     • Virus spreads over contact networks
     • SIR model [Anderson+ 1991]
       • Susceptible-Infectious-Recovered
       • Weights pij: propagation probability from i to j
       • Recovery probability δ for each node
       • (models mumps-like infections)
     Zhang and Prakash, SDM2014

  3. Motivation: Social Media
     • Meme/rumor spreads over friendship networks
       • E.g.: Twitter following network
     • Independent cascade (IC) model [Kempe+ KDD2003]
       • Each node has only one chance to infect its neighbors
       • Special case of the SIR model
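The "one chance to infect" rule of the IC model can be sketched as a breadth-first simulation. This is a minimal illustration, not the authors' code: the graph encoding (a dict of `{neighbor: p_ij}` per node) and the names `simulate_ic` and `expected_infected` are assumptions made for this example.

```python
import random
from collections import deque

def simulate_ic(graph, seeds, rng):
    """Run one independent cascade and return the set of infected nodes.

    graph: dict mapping node -> {neighbor: propagation probability p_ij}
    seeds: iterable of initially infected nodes
    """
    infected = set(seeds)
    frontier = deque(infected)
    while frontier:
        i = frontier.popleft()
        # Node i is processed exactly once, so it gets a single chance
        # to infect each neighbor that is still healthy.
        for j, p_ij in graph.get(i, {}).items():
            if j not in infected and rng.random() < p_ij:
                infected.add(j)
                frontier.append(j)
    return infected

def expected_infected(graph, seeds, runs=1000, seed=0):
    """Monte Carlo estimate of the expected number of infected nodes."""
    rng = random.Random(seed)
    return sum(len(simulate_ic(graph, seeds, rng)) for _ in range(runs)) / runs
```

Averaging many runs, as `expected_infected` does, gives an estimate of the expected final infection size that the vaccination problem later tries to minimize.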

  4. Immunization
     • Centers for Disease Control (CDC) cares about containing epidemic diseases
       • E.g.: ~400 million dollars used for vaccines for children in 2013
     • Twitter tries to stop rumor spread
       • E.g.: rumors of victims after the Boston Marathon bombings in 2013
     How to choose the best nodes to vaccinate (remove)?

  5. Immunization
     Pre-emptive immunization (choose nodes before the epidemic starts)
     • Acquaintance strategy [Cohen+ 2003]
       • Pick a random person, immunize one of its neighbors at random
     • Netshield [Tong+ 2010]
       • Minimize the epidemic threshold (the point at which the virus takes off)
     Good as baseline strategies
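The acquaintance strategy described above is simple enough to sketch directly. The function name, the adjacency-set graph encoding, and the choice to keep sampling until k distinct nodes are found are assumptions of this sketch, not details from the cited work.

```python
import random

def acquaintance_immunization(graph, k, rng):
    """Pick k distinct nodes by immunizing random neighbors of random people.

    graph: dict mapping node -> set of neighbors
    """
    people = sorted(graph)
    # Only nodes that are somebody's neighbor can ever be selected.
    candidates = {v for u in people for v in graph[u]}
    k = min(k, len(candidates))
    chosen = set()
    while len(chosen) < k:
        person = rng.choice(people)          # pick a random person
        neighbors = sorted(graph[person])
        if neighbors:
            chosen.add(rng.choice(neighbors))  # immunize a random neighbor
    return chosen
```

Because a random neighbor is more likely to be a high-degree node than a random person is, this procedure tends to reach hubs without needing global knowledge of the graph.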

  6. In reality
     Pre-emptive immunization (choose nodes before the epidemic starts)
     • Acquaintance strategy [Cohen+ 2003]
     • Netshield [Tong+ 2010]
     Typically the epidemic has already started!
     • More realistic intervention
     • Which nodes to vaccinate now?
     • We call it Data-Aware Immunization (this paper)

  7. Outline
     • Motivation
     • Problem Definition
     • Complexity
     • Our Proposed Methods
     • Experiments
     • Conclusion

  8. Data-Aware Vaccination Problem
     Problem: Given a set of infected nodes and a contact graph, how should we distribute k vaccines (node removals) to minimize the expected number of infected nodes at the end of the epidemic?
     [Figure: example graph with nodes A-F, shown once with the question "1 vaccine?" and once with the best solution; pij = 1 for all edges]
     • Remove A, save {A, D}
     • Remove B, save {B}
     • Remove C, save {C}
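When every edge has pij = 1 the epidemic is deterministic, so the objective can be checked by brute force on a toy graph. The graph below is hypothetical (it is not the slide's figure), with infected node `X` chosen so that the outcomes agree with the three removals listed above; the function names are likewise assumptions of this sketch.

```python
from collections import deque
from itertools import combinations

def infected_set(graph, infected, removed=frozenset()):
    """Nodes that end up infected when every edge transmits with probability 1."""
    seen = set(infected) - set(removed)
    queue = deque(seen)
    while queue:
        u = queue.popleft()
        for v in graph[u]:
            if v not in seen and v not in removed:
                seen.add(v)
                queue.append(v)
    return seen

def saved_nodes(graph, infected, removed):
    """Nodes that escape infection because `removed` was vaccinated."""
    return infected_set(graph, infected) - infected_set(graph, infected, removed)

def best_vaccination(graph, infected, k):
    """Exhaustive search over all k-subsets of healthy nodes (exponential in k)."""
    healthy = sorted(set(graph) - set(infected))
    return max(combinations(healthy, k),
               key=lambda s: len(saved_nodes(graph, infected, set(s))))

# Hypothetical contact graph: X is infected, A-F are healthy.
toy = {
    "X": {"A", "B", "C"},
    "A": {"X", "D"},
    "B": {"X", "E", "F"},
    "C": {"X", "E", "F"},
    "D": {"A"},
    "E": {"B", "C"},
    "F": {"B", "C"},
}
```

Exhaustive search is only feasible for tiny inputs, which is consistent with the hardness results on the complexity slide.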

  9. Outline
     • Motivation
     • Problem Definition
     • Complexity
     • Our Proposed Methods
     • Experiments
     • Conclusion

  10. Complexity of DAV (see paper for details)
      • NP-hard
        • Reduction from the Maximum K-Intersection problem (MaxKI: maximizing the intersection of k subsets)
        • MaxKI is NP-complete [Vinterbo 2004]
      • Approximation algorithm?
        • Not submodular
        • Actually, DAV is hard to approximate within an absolute error!

  11. Outline
      • Motivation
      • Problem Definition
      • Complexity
      • Our Proposed Methods
        • Assume IC model and undirected graph
      • Experiments
      • Conclusion

  12. 1: Simplify - Merging infected nodes
      • Idea: merge all the infected nodes into a single ‘super infected’ node I
      [Figure: original graph and the equivalent merged graph with super node I; nodes A, B, C with edge probabilities pA, pB, pC, pX, pY]
      • Logical-OR: pB = 1 - (1 - pX)(1 - pY)
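The merge step and its logical-OR rule can be sketched as follows. This assumes an undirected graph stored with both directions of every edge, and a super-node label `"I"` that does not collide with an existing node; `merge_infected` is a name invented for this illustration.

```python
def merge_infected(graph, infected, super_node="I"):
    """Collapse all infected nodes into a single super node.

    graph: dict mapping node -> {neighbor: p_ij}, both directions stored
    Returns a new graph; edges from several infected nodes to the same
    healthy node are combined as 1 - prod(1 - p).
    """
    infected = set(infected)
    miss = {}  # healthy node -> probability that every infected neighbor fails
    merged = {super_node: {}}
    for u, nbrs in graph.items():
        if u in infected:
            for v, p in nbrs.items():
                if v not in infected:
                    miss[v] = miss.get(v, 1.0) * (1.0 - p)
        else:
            # Keep edges between healthy nodes unchanged.
            merged[u] = {v: p for v, p in nbrs.items() if v not in infected}
    for v, q in miss.items():
        p = 1.0 - q  # logical OR of the independent infection attempts
        merged[super_node][v] = p
        merged.setdefault(v, {})[super_node] = p
    return merged
```

For a node B adjacent to two infected nodes with pX = pY = 0.5, the merged edge gets probability 1 - 0.5 * 0.5 = 0.75.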

  13. 2: DAVA-Tree Algorithm: Idea
      • Select nodes with the largest “benefit”
        • Saved nodes: the expected number of nodes saved after removing set S from graph G
        • Benefit of adding an additional node j to S: the additional number of saved nodes when adding j to S
      [Figure: tree rooted at the merged infected node, pij = 1 for all edges, with nodes labeled Benefit: 5, Benefit: 4, Benefit: 2]

  14. DAVA-Tree Alg.: Optimal on Trees
      For any set S:
      • Fact 1: the chosen nodes in the optimal set must be neighbors of the merged infected node I
      • Fact 2: the benefit of each such node is independent of the rest of the set S
      [Figure: tree rooted at the merged infected node, pij = 1 for all edges, with nodes labeled Benefit: 2, Benefit: 5, Benefit: 4]
      DAVA-tree algorithm (linear time): select the top-k nodes with the maximum benefit from I's neighbors
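A small sketch of the selection rule above, under one stated assumption: on a tree under the IC model, a node is infected exactly when every edge on its unique path from I succeeds, so the benefit of removing a neighbor of I is taken here to be the sum of those path probabilities over the neighbor's subtree. The data layout (`children`, `prob`) and the function name are illustrative, and the final sort makes this sketch slightly more than linear time.

```python
def dava_tree(children, prob, root, k):
    """Pick up to k children of `root` with the largest benefit.

    children: dict mapping node -> list of child nodes (tree rooted at root)
    prob: dict mapping (parent, child) -> propagation probability
    Returns (selected nodes, benefit of every child of root).
    """
    def subtree_benefit(u, reach):
        # `reach` is the probability that the infection travels from root to u.
        # Recursive for brevity; very deep trees would need an explicit stack.
        return reach + sum(
            subtree_benefit(c, reach * prob[(u, c)]) for c in children.get(u, [])
        )

    benefit = {
        c: subtree_benefit(c, prob[(root, c)]) for c in children.get(root, [])
    }
    ranked = sorted(benefit, key=benefit.get, reverse=True)
    return ranked[:k], benefit

# Hypothetical tree with all edge probabilities equal to 1.
tree = {
    "I": ["a", "b", "c"],
    "a": ["a1", "a2", "a3", "a4"],
    "b": ["b1", "b2", "b3"],
    "c": ["c1"],
}
edge_prob = {(u, c): 1.0 for u, cs in tree.items() for c in cs}
```

With all probabilities equal to 1, each benefit reduces to the subtree size, so the three neighbors in this hypothetical tree have benefits 5, 4, and 2.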

  15. 3: General Case – Arbitrary Graphs
      • Idea
        • We have the optimal algorithm for a tree
        • Extract a spanning tree, then run DAVA-tree
      • What kind of tree?
        • Minimum spanning tree
      [Figure: graph and its MST, pij = 1 for all edges; the solution that DAVA-tree finds as optimal on the MST is shown alongside the optimal solution]

  16. 3: General Case – Arbitrary Graphs
      • Idea
        • We have the optimal algorithm for a tree
        • Build a spanning tree first
      • What kind of tree?
        • Minimum spanning tree
        • We propose to use the dominator tree (a concept from software engineering)
      • u dominates v: every path from I to v contains u
      [Figure: example graph, pij = 1 for all edges; node 4 dominates nodes 8, 9, 10, 11]

  17. Dominator Tree
      • u is the immediate dominator of v: u dominates v AND every other dominator of v dominates u
      • Dominator tree: add an edge between every such u and v
      • Linear time [Buchsbaum, Tarjan 1998]
      • Fact 1: the optimal solution should be among the children of root I in the dominator tree, for any arbitrary graph
      • Fact 2 (special case k = 1, p = 1): running DAVA-tree on the dominator tree gives the optimal solution
      [Figure: merged graph and its dominator tree, pij = 1 for all edges, with labels "Optimal from DAVA-tree" and "Optimal solution"]
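The dominator definitions above can be turned into a direct, if slow, computation: u dominates v exactly when deleting u disconnects v from I. The sketch below applies that test node by node, which costs O(|V|(|V| + |E|)) and is deliberately not the linear-time algorithm cited on the slide; the function names and the example graph are assumptions made for illustration.

```python
from collections import deque

def reachable(graph, root, removed=None):
    """Nodes reachable from root, optionally with one node deleted."""
    seen = {root}
    queue = deque([root])
    while queue:
        u = queue.popleft()
        for v in graph[u]:
            if v != removed and v not in seen:
                seen.add(v)
                queue.append(v)
    return seen

def immediate_dominators(graph, root):
    """Return {v: immediate dominator of v} for every reachable v != root."""
    nodes = reachable(graph, root)
    # strict[v] = all nodes other than v that dominate v (root always does).
    strict = {v: {root} for v in nodes if v != root}
    for u in nodes - {root}:
        cut_off = nodes - reachable(graph, root, removed=u) - {u}
        for v in cut_off:
            strict[v].add(u)
    idom = {}
    for v, doms in strict.items():
        # Strict dominators of v form a chain starting at root; the immediate
        # dominator is the deepest one, i.e. the one with most dominators.
        idom[v] = max(doms, key=lambda u: 0 if u == root else len(strict[u]))
    return idom

# Hypothetical undirected graph stored as adjacency lists.
example = {
    "I": ["a", "b"],
    "a": ["I", "c"],
    "b": ["I", "c"],
    "c": ["a", "b", "d"],
    "d": ["c", "e", "f"],
    "e": ["d", "f"],
    "f": ["d", "e"],
}
```

In this example c is reachable through both a and b, so its immediate dominator is I, while d, e, and f all sit below c in the dominator tree.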

  18. Weighting the dominator tree
      • Weighting the dominator tree: #P-complete
      • Our solution: the maximum propagation path probability between nodes I and v (using Dijkstra's algorithm)
      [Figure: merged graph with edge probabilities p1, p3, p6 and dominator tree with weights w1, w3, w6]
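One standard way to get the maximum-probability path with Dijkstra's algorithm is to minimize the sum of -log(p) along a path, since that is equivalent to maximizing the product of the probabilities. The sketch below takes that route; it is an illustration of the idea, and the function name and graph encoding are assumptions rather than the authors' implementation.

```python
import heapq
import math

def max_path_probability(graph, source):
    """Highest product of edge probabilities on any path from source.

    graph: dict mapping node -> {neighbor: probability in [0, 1]}
    Returns {node: best path probability} for every node reachable
    through edges with positive probability.
    """
    cost = {source: 0.0}     # cost = -log(path probability)
    heap = [(0.0, source)]
    while heap:
        c, u = heapq.heappop(heap)
        if c > cost[u]:
            continue  # stale heap entry
        for v, p in graph[u].items():
            if p <= 0.0:
                continue  # this edge can never transmit
            new_c = c - math.log(p)
            if new_c < cost.get(v, math.inf):
                cost[v] = new_c
                heapq.heappush(heap, (new_c, v))
    return {v: math.exp(-c) for v, c in cost.items()}
```

The transformation works because -log(p) is non-negative for probabilities in (0, 1], which is the condition Dijkstra's algorithm requires of edge weights.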

  19. DAVA algorithm
      Steps:
      1. T = Build a dominator tree
      2. v = Run DAVA-tree on T with budget = 1
      3. Remove v from G
      4. Go to Step 1 until |S| = k
      [Figure: merged graph (pij = 1 for all edges) and its dominator tree; |S| = 2, iteration 1]

  20. DAVA algorithm
      Steps:
      1. T = Build a dominator tree
      2. v = Run DAVA-tree on T with budget = 1
      3. Remove v from G
      4. Go to Step 1 until |S| = k
      • Running time: O(k(|E| + |V| log |V|))
      • Too slow for large networks!
      [Figure: merged graph and dominator tree; the selected node is removed between iteration 1 and iteration 2; |S| = 2]

  21. DAVA-fast: a faster algorithm
      Steps:
      1. T = Build a dominator tree
      2. S = Run DAVA-tree on T with budget = k
      • In practice, the performance of DAVA-fast is very close to DAVA
      • Time complexity: subquadratic!
        • DAVA-fast: O(|V| log |V| + |E|)
      [Figure: merged graph and dominator tree; |S| = 2]
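The difference between the two procedures can be sketched for the special case pij = 1 on all edges, where the benefit of a child of I in the dominator tree is simply the number of nodes that become unreachable from I once it is removed. The sketch finds dominator subtrees by repeated reachability checks, so it does not achieve the running times stated on the slides; all function names and the example graph are assumptions for illustration.

```python
from collections import deque

def reachable(graph, root, removed):
    """Nodes reachable from root after deleting the nodes in `removed`."""
    seen = {root}
    queue = deque([root])
    while queue:
        u = queue.popleft()
        for v in graph[u]:
            if v not in removed and v not in seen:
                seen.add(v)
                queue.append(v)
    return seen

def root_children_benefit(graph, root, removed):
    """Benefit of each child of root in the dominator tree (p = 1 everywhere)."""
    base = reachable(graph, root, removed)
    # subtree[u] = u plus every node cut off from root when u is deleted.
    subtree = {u: base - reachable(graph, root, removed | {u})
               for u in base - {root}}
    # u is a child of root exactly when no other node dominates it.
    return {u: len(nodes) for u, nodes in subtree.items()
            if not any(u in subtree[w] for w in subtree if w != u)}

def dava(graph, root, k):
    """Rebuild the dominator information after every single selection."""
    removed = set()
    for _ in range(k):
        benefit = root_children_benefit(graph, root, removed)
        if not benefit:
            break
        removed.add(max(sorted(benefit), key=benefit.get))
    return removed

def dava_fast(graph, root, k):
    """Build the dominator information once and take the top k children."""
    benefit = root_children_benefit(graph, root, set())
    return set(sorted(sorted(benefit), key=benefit.get, reverse=True)[:k])

# Hypothetical merged graph with super node I.
merged = {
    "I": ["a", "b"], "a": ["I", "c"], "b": ["I", "c"],
    "c": ["a", "b", "d"], "d": ["c", "e", "f"],
    "e": ["d", "f"], "f": ["d", "e"],
}
```

On this small graph both variants select the same pair of nodes; in general they can differ, because removing a node changes which nodes dominate the remainder of the graph.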

  22. Extending to SIR model
      • See the paper

  23. Outline
      • Motivation
      • Problem Definition
      • Complexity
      • Our Proposed Methods
      • Experiments
      • Conclusion

  24. Experiments
      • Virus propagation model
        • IC and SIR
      • Settings (see more settings in the paper)
        • Initial infected nodes chosen uniformly at random
      • Baseline algorithms
        • RANDOM: healthy nodes chosen uniformly at random
        • DEGREE: choose nodes with the top weighted degrees
        • PAGERANK: choose nodes with the top PageRank scores
        • NETSHIELD: state-of-the-art pre-emptive immunization algorithm that minimizes the epidemic threshold of the graph [Tong+ ICDM 2010]
          • Assumes no data is given before the epidemic starts

  25. Experiments: datasets
      Datasets are chosen from different domains
      • Social media (IC model)
        • OREGON: AS router graph
        • STANFORD: hyperlink network
        • GNUTELLA: peer-to-peer network
        • BRIGHTKITE: friendship network
      • Epidemiology (SIR model)
        • PORTLAND and MIAMI: large urban social-contact graphs used in national smallpox modeling studies [Eubank+, 2004]

  26. Experiments: Quality
      [Figure: two panels, PORTLAND (SIR model) and GNUTELLA (IC model); higher is better]
      DAVA consistently outperforms the baseline algorithms. Furthermore, DAVA-fast performs almost as well as DAVA. (See more results in the paper)

  27. Experiments: Scalability
      [Figure: running time (sec.), lower is better; annotation: "did not finish within 10 hours"]

  28. Outline
      • Motivation
      • Problem Definition
      • Complexity
      • Our Proposed Methods
      • Experiments
      • Conclusion

  29. Conclusion
      Data-Aware Vaccination problem
      • Given: graph and infected nodes
      • Find: ‘best’ nodes for immunization
      • Complexity
        • NP-hard
        • Hard to approximate within an absolute error
      • DAVA-tree
        • Optimal solution on the tree
      • DAVA and DAVA-fast
        • Merging infected nodes
        • Build a dominator tree, and run DAVA-tree
        • Running time: subquadratic
          • DAVA: O(k(|E| + |V| log |V|))
          • DAVA-fast: O(|E| + |V| log |V|)
      [Figure: graph with infected nodes, merged graph, dominator tree]

  30. Any Questions?
      Code at: http://people.cs.vt.edu/~yaozhang
      Yao Zhang, B. Aditya Prakash
      Thanks for the support of NSF (Grant No. IIS-1353346).
      [Figure: graph with infected nodes, merged graph, dominator tree]
