Discovering Important Nodes through Graph Entropy - PowerPoint PPT Presentation

Presentation Transcript

  1. Discovering Important Nodes through Graph Entropy Jitesh Shetty, Jafar Adibi [KDD’ 05] Advisor: Dr. Koh Jia-Ling Reporter: Che-Wei, Liang Date: 2008/09/18

  2. Outline • Introduction • Order In Networks • Graph Entropy • Experimental Result • Conclusions

  3. Introduction • A new challenge in the area of Link Discovery and Social Network Analysis • Exploit communication pattern information and text information within knowledge discovery processes • Examples: discovering hidden organizational structure and selecting interesting, prominent members

  4. Introduction • Email logs • Of prime importance and relevance in the study of information flow in an organization • An evidence database for law enforcement and intelligence organizations to detect hidden groups in an organization that are engaged in illegal activities • Graph entropy • Used to determine the most prominent and interesting people

  5. Order In Networks • A graph model might not be the best representation of some organizations • Such as drug dealers, terrorist organizations, threat groups • Graph models usually ignore the hierarchy of these organizations • They are composed of leaders and followers

  6. Order In Networks • Example

  7. Graph Entropy (1/6) • Goal: find prominent people in a network • Need to aggregate the links between them and discover which node has the most effect on the network • An entropy model can identify the entity that has the most effect on the graph entropy • Transform the problem space into a multigraph • Each node represents an entity, each link represents an action between entities
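
The idea on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' formulation: the multigraph is a plain list of (sender, receiver) events, the probability of a link is assumed to be its share of all events, and a node's effect is assumed to be the drop in Shannon entropy when every event touching that node is removed. The names `graph_entropy`, `entropy_effect` and `rank_nodes` are hypothetical.

```python
from collections import Counter
from math import log2


def graph_entropy(events):
    """Shannon entropy (in bits) of the empirical distribution over links.

    Assumption: P(link) = (number of events on that link) / (total events).
    """
    counts = Counter(events)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return -sum((c / total) * log2(c / total) for c in counts.values())


def entropy_effect(events):
    """Map each node to the entropy drop caused by removing its events."""
    base = graph_entropy(events)
    nodes = {node for event in events for node in event}
    effect = {}
    for node in nodes:
        # Keep only the events in which this node takes no part.
        remaining = [event for event in events if node not in event]
        effect[node] = base - graph_entropy(remaining)
    return effect


def rank_nodes(events):
    """Nodes ordered by decreasing entropy drop (ties broken by name)."""
    effect = entropy_effect(events)
    return sorted(effect, key=lambda node: (-effect[node], node))


# Toy multigraph: each tuple is one email from sender to receiver.
events = [("A", "B"), ("A", "C"), ("A", "D"), ("B", "C")]
ranking = rank_nodes(events)
```

In this toy example node A takes part in three of the four events, so removing it changes the entropy the most and it is ranked first.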

  8. Graph Entropy (2/6)

  9. Graph Entropy (3/6) • Let G = (V, E) be a graph. P is the probability distribution on the vertex set V(G) • P(A emails B) =

  10. Graph Entropy (4/6) • A great concern in the LD (Link Discovery) domain is that elements of data are not independent • Ex: the links "A sends email to B" and "B sends email to C" depend on each other, since B may forward A’s email to C • Three approaches to discover dependency • Examine the similarity of emails • check

  11. Graph Entropy (5/6) 3. Exploitation of a Markov blanket type of model • Assume an event (link) between two nodes depends only on the events of those two nodes
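
One way to read the locality assumption above is that, for a given link, only the other events that touch one of its two endpoints need to be considered. The sketch below illustrates that reading; it is an assumption about how the restriction could be applied, not the model from the paper, and `local_events` is a hypothetical helper name.

```python
def local_events(events, link):
    """Return the events that share at least one endpoint with `link`.

    Occurrences of `link` itself are excluded, so the result holds only
    the neighbouring events that the link is assumed to depend on.
    """
    endpoints = set(link)
    return [e for e in events if e != link and endpoints & set(e)]


# Toy event list: each tuple is one email from sender to receiver.
events = [("A", "B"), ("B", "C"), ("C", "D"), ("D", "A")]
neighbours = local_events(events, ("A", "B"))
```

Here the event ("C", "D") is ignored for the link ("A", "B") because it involves neither A nor B.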

  12. Graph Entropy (6/6)

  13. Experiment • Enron Email Dataset • 151 users, mostly senior management of Enron • Contains 252,759 email messages • Almost all users use folders to organize their emails

  14. Experiment

  15. Experiment • Created an Enron dictionary • Normalized all emails using the Porter stemming algorithm • Compared the email vectors using the Jaccard similarity measure • Ordered emails by time stamp
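
The comparison step above can be sketched as follows. This is a simplified illustration, not the experimental pipeline: Porter stemming and the Enron dictionary are left out and replaced by plain lowercase tokenization, emails are assumed to be (timestamp, sender, receiver, body) tuples, and the 0.8 threshold and the names `tokens`, `jaccard` and `likely_forwards` are hypothetical.

```python
import re


def tokens(text):
    """Lowercase word set of an email body (stemming is omitted here)."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def jaccard(a, b):
    """Jaccard similarity |A ∩ B| / |A ∪ B| of two email bodies."""
    ta, tb = tokens(a), tokens(b)
    if not ta and not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)


def likely_forwards(emails, threshold=0.8):
    """Pair each email with later, similar emails sent by its receiver.

    emails: iterable of (timestamp, sender, receiver, body) tuples.
    """
    ordered = sorted(emails, key=lambda m: m[0])  # order by time stamp
    pairs = []
    for i, first in enumerate(ordered):
        for later in ordered[i + 1:]:
            same_person = later[1] == first[2]  # receiver sends it on
            if same_person and jaccard(first[3], later[3]) >= threshold:
                pairs.append((first, later))
    return pairs


emails = [
    (2, "B", "C", "Q3 budget draft attached"),
    (1, "A", "B", "Q3 budget draft attached"),
    (3, "C", "D", "lunch tomorrow"),
]
forwards = likely_forwards(emails)
```

In this toy data the only pair returned is A's email to B followed by B's identical email to C, which is the kind of dependent link pair described on the Graph Entropy (4/6) slide.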

  16. Experiment

  17. Conclusions • Defined and addressed the problem of finding important nodes and the closed groups around them • Used event-based entropy to find influential nodes in a graph and showed that the entropy model is a good means of detecting influential nodes