spotting culprits in epidemics how many and which ones n.
Download
Skip this Video
Loading SlideShow in 5 Seconds..
Spotting Culprits in Epidemics: How many and Which ones? PowerPoint Presentation
Download Presentation
Spotting Culprits in Epidemics: How many and Which ones?

Loading in 2 Seconds...

play fullscreen
1 / 27

Spotting Culprits in Epidemics: How many and Which ones? - PowerPoint PPT Presentation


  • 98 Views
  • Uploaded on

Spotting Culprits in Epidemics: How many and Which ones?. B. Aditya Prakash Virginia Tech Jilles Vreeken University of Antwerp Christos Faloutsos Carnegie Mellon University. IEEE ICDM Brussels December 11, 2012. Contagions. Social collaboration Information Diffusion Viral Marketing

loader
I am the owner, or an agent authorized to act on behalf of the owner, of the copyrighted work described.
capcha
Download Presentation

PowerPoint Slideshow about 'Spotting Culprits in Epidemics: How many and Which ones?' - neka


An Image/Link below is provided (as is) to download presentation

Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author.While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server.


- - - - - - - - - - - - - - - - - - - - - - - - - - E N D - - - - - - - - - - - - - - - - - - - - - - - - - -
Presentation Transcript
spotting culprits in epidemics how many and which ones

Spotting Culprits in Epidemics: How many and Which ones?

B. Aditya Prakash Virginia Tech

JillesVreekenUniversity of Antwerp

Christos FaloutsosCarnegie Mellon University

IEEE ICDM Brussels

December 11, 2012

contagions
Contagions
  • Social collaboration
  • Information Diffusion
  • Viral Marketing
  • Epidemiology and Public Health
  • Cyber Security
  • Human mobility
  • Games and Virtual Worlds
  • Ecology
  • Localized effects: riots…
virus propagation
Virus Propagation
  • Susceptible-Infected (SI) Model

[AJPH 2007]

β

CDC data: Visualization of the first 35 tuberculosis (TB) patients and their 1039 contacts

Diseases over contact networks

Prakash, Vreeken, Faloutsos 2012

outline
Outline
  • Motivation---Introduction
  • Problem Definition
  • Intuition
  • MDL
  • Experiments
  • Conclusion

Prakash, Vreeken, Faloutsos 2012

culprits problem definition
Culprits: Problem definition

2-d grid

Q: Who started it?

Prakash, Vreeken, Faloutsos 2012

culprits problem definition1
Culprits: Problem definition

2-d grid

Q: Who started it?

Prior work: [Lappas et al. 2010, Shah et al. 2011]

Prakash, Vreeken, Faloutsos 2012

outline1
Outline
  • Motivation---Introduction
  • Problem Definition
  • Intuition
  • MDL
  • Experiments
  • Conclusion

Prakash, Vreeken, Faloutsos 2012

culprits exoneration
Culprits: Exoneration

Prakash, Vreeken, Faloutsos 2012

culprits exoneration1
Culprits: Exoneration

Prakash, Vreeken, Faloutsos 2012

who are the culprits
Who are the culprits
  • Two-part solution
    • use MDL for number of seeds
    • for a given number:
      • exoneration = centrality + penalty
  • Running time =
    • linear! (in edges and nodes)

NetSleuth

Prakash, Vreeken, Faloutsos 2012

outline2
Outline
  • Motivation---Introduction
  • Problem Definition
  • Intuition
  • MDL
    • Construction
    • Opitimization
  • Experiments
  • Conclusion

Prakash, Vreeken, Faloutsos 2012

modeling using mdl
Modeling using MDL
  • Minimum Description Length Principle == Induction by compression
  • Related to Bayesian approaches
  • MDL = Model + Data
  • Model
    • Scoring the seed-set

Number of possible |S|-sized sets

En-coding integer |S|

Prakash, Vreeken, Faloutsos 2012

modeling using mdl1
Modeling using MDL
  • Data: Propagation Ripples

Infected Snapshot

Original Graph

Ripple R1

Ripple R2

Prakash, Vreeken, Faloutsos 2012

modeling using mdl2
Modeling using MDL
  • Ripple cost
  • Total MDL cost

Ripple R

How the ‘frontier’ advances

How long is the ripple

Prakash, Vreeken, Faloutsos 2012

outline3
Outline
  • Motivation---Introduction
  • Problem Definition
  • Intuition
  • MDL
    • Construction
    • Opitimization
  • Experiments
  • Conclusion

Prakash, Vreeken, Faloutsos 2012

how to optimize the score
How to optimize the score?
  • Two-step process
    • Given k, quickly identify high-quality set
    • Given these nodes, optimize the ripple R

Prakash, Vreeken, Faloutsos 2012

optimizing the score
Optimizing the score
  • High-quality k-seed-set
    • Exoneration
  • Best single seed:
    • Smallest eigenvector of Laplaciansub-matrix
    • Analyze a Constrained SI epidemic
  • Exonerate neighbors
  • Repeat

Prakash, Vreeken, Faloutsos 2012

optimizing the score1
Optimizing the score
  • Optimizing R
    • Get the MLE ripple!
  • Finally use MDL score to tell us the best set
  • NetSleuth: Linear running time in nodes and edges

Ripple R

Prakash, Vreeken, Faloutsos 2012

outline4
Outline
  • Motivation---Introduction
  • Problem Definition
  • Intuition
  • MDL
  • Experiments
  • Conclusion

Prakash, Vreeken, Faloutsos 2012

experiments
Experiments
  • Evaluation functions:
    • MDL based
    • Overlap based

(JD == Jaccard distance)

How far are they?

Closer to 1 the better

Prakash, Vreeken, Faloutsos 2012

experiments of seeds
Experiments: # of Seeds

One Seed

Two Seeds

Three Seeds

experiments quality mdl and jd
Experiments: Quality (MDL and JD)

One Seed

Two Seeds

Ideal = 1

Three Seeds

Prakash, Vreeken, Faloutsos 2012

experiments quality jaccard scores
Experiments: Quality (Jaccard Scores)

One Seed

Two Seeds

NetSleuth

Closer to diagonal, the better

True

Three Seeds

Prakash, Vreeken, Faloutsos 2012

experiments scalability
Experiments: Scalability

Prakash, Vreeken, Faloutsos 2012

outline5
Outline
  • Motivation---Introduction
  • Problem Definition
  • Intuition
  • MDL
  • Experiments
  • Conclusion

Prakash, Vreeken, Faloutsos 2012

conclusion
Conclusion
  • Given:Graph and Infections
  • Find: Best ‘Culprits’
  • Two-part solution
    • use MDL for number of seeds
    • for a given number:

exoneration = centrality + penalty

  • NetSleuth:
    • Linear running time

in nodes and edges

Prakash, Vreeken, Faloutsos 2012

any questions
Any Questions?

B. AdityaPrakash

http://www.cs.vt.edu/~badityap

Prakash, Vreeken, Faloutsos 2012