
UBLF: An Upper Bound Based Approach to Discover Influential Nodes in Social Networks

IEEE ICDM 2013. UBLF: An Upper Bound Based Approach to Discover Influential Nodes in Social Networks. Authors: C. Zhou, P. Zhang, J. Guo, X. Zhu, L. Guo. Presenter: Peng Zhang, Chinese Academy of Sciences. December 7-10, 2013, Dallas, Texas.


Presentation Transcript


  1. IEEE ICDM 2013 • UBLF: An Upper Bound Based Approach to Discover Influential Nodes in Social Networks • Authors: C. Zhou, P. Zhang, J. Guo, X. Zhu, L. Guo • Presenter: Peng Zhang, Chinese Academy of Sciences • December 7-10, 2013, Dallas, Texas

  2. Content • Background • Problem Formulation • Related work • Our solution • Experiments • Conclusion

  3. Background • Social networks are widely used • Viral marketing • Information dissemination • Technology/idea transfer • Influence propagation • Influence maximization • Community detection • Influence inference • Early warning of public opinion • Link prediction / friend recommendation • Partner recommendation / social cooperation / team formation

  4. Problem Formulation • Given a directed social graph G=(V,E), a budget k, and a stochastic propagation model M, find a set S of k nodes that maximizes the expected influence spread M(S) [Kempe et al., KDD’03] • Challenges: • How to measure the objective function M(S)? • How to find the optimal solution, i.e., the subset of the k most influential nodes?

  5. Problem Formulation • How to measure the influence M(S)? • Stochastic propagation models • Independent Cascade (IC) model • Linear Threshold (LT) model • Other propagation models, e.g., continuous-time IC or LT models • Monte Carlo (MC) simulation • Exact calculation under IC and LT is #P-hard (Chen et al., KDD’10) • Figure: IC propagation model on an example graph with nodes a to i and edge probabilities between 0.1 and 0.4
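Since exact computation of M(S) is #P-hard, the slide points to Monte Carlo simulation. The following is a minimal sketch of that idea under the IC model, not the authors' code. The function names `simulate_ic` and `estimate_spread`, the adjacency-list layout, and the toy graph are assumptions made for illustration.

```python
import random

def simulate_ic(graph, seeds, rng):
    """Run one Independent Cascade and return the number of activated nodes.

    graph: dict mapping node -> list of (neighbor, activation probability).
    """
    active = set(seeds)
    frontier = list(active)
    while frontier:
        next_frontier = []
        for u in frontier:
            for v, p in graph.get(u, []):
                # A newly activated node gets exactly one chance to
                # activate each of its still-inactive out-neighbors.
                if v not in active and rng.random() < p:
                    active.add(v)
                    next_frontier.append(v)
        frontier = next_frontier
    return len(active)

def estimate_spread(graph, seeds, runs=1000, seed=0):
    """Estimate M(S) as the average cascade size over `runs` simulations."""
    rng = random.Random(seed)
    return sum(simulate_ic(graph, seeds, rng) for _ in range(runs)) / runs

# Hypothetical toy graph: a always activates b, activates c half the time,
# and b never activates d, so the expected spread from {a} is 2.5.
toy = {"a": [("b", 1.0), ("c", 0.5)], "b": [("d", 0.0)], "c": [], "d": []}
print(estimate_spread(toy, ["a"]))
```

The estimate only approaches the true expectation as `runs` grows, which is why every call to M(S) is expensive in the algorithms that follow.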

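  6. Greedy Algorithm • How to find a subset of k nodes containing the most influential nodes? • Influence maximization under both the IC and LT models is NP-hard (Kempe et al., KDD’03) • Property 1: M(S) is monotone: M(S) ≤ M(T) for all S ⊆ T ⊆ V • Property 2: M(S) is submodular: M(S ∪ {v}) − M(S) ≥ M(T ∪ {v}) − M(T) for all S ⊆ T ⊆ V and v ∉ T • Figure: the set cover problem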

  7. Greedy Algorithm • Advantage: performance guarantee of 1 − 1/e ≈ 63% of the optimal spread • Disadvantage: heavy computation cost • Inner loop: estimating M(S) needs many Monte Carlo simulations • Outer loop: time complexity of O(Nk), where N is the network size
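The two loops named on this slide can be sketched as follows. This is an illustrative version only: `greedy`, `coverage`, and the `reach` table are hypothetical names, and a deterministic coverage function stands in for the Monte Carlo estimate of M(S) so that the example is reproducible. It assumes k does not exceed the number of nodes.

```python
def greedy(nodes, k, spread):
    """Pick k seeds; each round adds the node with the largest marginal gain.

    spread: callable mapping a set of seeds to an influence value
            (in practice a Monte Carlo estimate of M(S)).
    Returns the seeds in selection order and the number of marginal-gain
    evaluations performed.
    """
    seeds, calls = [], 0
    for _ in range(k):                      # outer loop: k rounds
        base = spread(set(seeds))
        best, best_gain = None, float("-inf")
        for v in nodes:                     # inner loop: up to N candidates
            if v in seeds:
                continue
            gain = spread(set(seeds) | {v}) - base
            calls += 1
            if gain > best_gain:
                best, best_gain = v, gain
        seeds.append(best)
    return seeds, calls

# Stand-in objective (an assumption): each node "influences" a fixed set of
# items, and the spread of S is the number of distinct items covered.
reach = {"a": {1, 2, 3}, "b": {3, 4}, "c": {4, 5, 6, 7}, "d": {1}}

def coverage(s):
    return len(set().union(*(reach[v] for v in s)))

print(greedy(list(reach), 2, coverage))
```

With N candidates re-evaluated in each of k rounds, the number of spread evaluations grows as O(Nk), matching the outer-loop cost stated on the slide.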

  8. Improvement direction (I): Heuristic algorithms • Heuristic algorithms • ShortestPath: Kimura and Saito (PKDD’06) “Tractable models for information diffusion in social networks” • DegreeDiscount: Chen et al. (KDD'09) “Efficient influence maximization in social networks” • MIA: Chen et al. (KDD'10) “Scalable influence maximization for prevalent viral marketing in large-scale social networks” • DAG: Chen et al. (ICDM’10) “Scalable influence maximization in social networks under the linear threshold model” • SIMPATH: Goyal et al. (ICDM’11) “SIMPATH: An Efficient Algorithm for Influence Maximization under the Linear Threshold Model” • Advantage: faster than the Greedy algorithm • Disadvantage: no performance guarantee • Figures: shortest path from a to c (ShortestPath); node 2’s degree will shrink (DegreeDiscount)

  9. Improvement direction (II): Advanced greedy • Advanced greedy algorithms • CELF: Leskovec et al. (KDD'07) “Cost-effective outbreak detection in networks” • CELF++: Goyal et al. (WWW’11) “CELF++: optimizing the greedy algorithm for influence maximization in social networks” • Figure: greedy algorithm, nodes a to e ranked by reward

  10-12. Improvement direction (II): Advanced greedy (animation steps repeating the CELF and CELF++ references) • Figure: greedy algorithm and CELF algorithm side by side, nodes a to e ranked by reward

  13. Improvement direction (II): Advanced greedy • Advanced greedy algorithms • CELF: Leskovec et al. (KDD'07) “Cost-effective outbreak detection in networks” • CELF++: Goyal et al. (WWW’11) “CELF++: optimizing the greedy algorithm for influence maximization in social networks” • Advantage: by maintaining an upper bound on each node's marginal gain, CELF reduces the number of Monte Carlo calls and speeds up the greedy algorithm by up to 700 times • Disadvantage: needs N Monte Carlo simulations to initialize the upper bounds, where N is the network size • Figure: greedy algorithm and CELF algorithm side by side, nodes a to e ranked by reward
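The lazy evaluation behind CELF can be sketched with a priority queue of possibly stale marginal gains. This is a simplified illustration, not the code of Leskovec et al.: `celf`, `coverage`, and `reach` are hypothetical names, the spread of the empty set is assumed to be 0, and a deterministic coverage function again stands in for Monte Carlo estimation.

```python
import heapq

def celf(nodes, k, spread):
    """Lazy greedy selection: re-evaluate a node only when it reaches the top."""
    seeds, calls, current = [], 0, 0.0
    heap = []
    # Initialization: one spread evaluation per node (the N simulations
    # named as CELF's disadvantage).
    for v in nodes:
        gain = spread({v})
        calls += 1
        # Entry: (negated gain, node, number of seeds when gain was computed)
        heap.append((-gain, v, 0))
    heapq.heapify(heap)
    while len(seeds) < k and heap:
        neg_gain, v, computed_at = heapq.heappop(heap)
        if computed_at == len(seeds):
            # The gain is up to date, and by submodularity every other
            # entry is an upper bound on its true gain, so v is the best.
            seeds.append(v)
            current += -neg_gain
        else:
            # Stale entry: recompute its marginal gain and push it back.
            gain = spread(set(seeds) | {v}) - current
            calls += 1
            heapq.heappush(heap, (-gain, v, len(seeds)))
    return seeds, calls

# Stand-in objective (an assumption), identical in spirit to a spread oracle.
reach = {"a": {1, 2, 3}, "b": {3, 4}, "c": {4, 5, 6, 7}, "d": {1}}

def coverage(s):
    return len(set().union(*(reach[v] for v in s)))

print(celf(list(reach), 2, coverage))
```

On this toy input the lazy version selects the same two seeds as plain greedy with 5 evaluations, 4 of which are spent on initialization alone.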

  14. Our work • Motivation • Can we initialize the upper bounds without actually running the MC simulations? • Figure: CELF algorithm compared with UBLF algorithm

  15. The upper bound of M(S) Local view Global view How many heads? Proposition 2 establishes a relationship among the activation probabilties in time t and t+1.

  16. The upper bound of M(S) M(S) is bounded by a sum of series. In what condition the series convergent? and what is the limit? Too hard! Its aera? But we know its upper bound!

  17. The upper bound of M(S) Convergent condition:the total influence to or from any node is less than 1. Under condition (14), we get a tractable upper bound. +……=

  18. Our UBLF algorithm • CELF: the first round is time-consuming because it needs full MC simulations. • UBLF: the first round is calculated analytically.
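The contrast on this slide can be sketched by seeding the lazy-greedy queue with precomputed upper bounds instead of per-node evaluations. This is an illustration assuming UBLF differs from CELF only in that initialization, as the slide states; the paper's exact procedure may differ. The names `ublf`, `coverage`, `reach`, and the values in `bounds` are hypothetical, and each bound is chosen to be at least the node's true individual spread.

```python
import heapq

def ublf(bounds, k, spread):
    """Lazy greedy selection initialized from analytic upper bounds.

    bounds: dict mapping node -> upper bound on its individual spread.
    spread: callable mapping a set of seeds to an influence value, assumed
            to return 0 for the empty set.
    """
    seeds, calls, current = [], 0, 0.0
    # Round -1 marks an entry that is only a bound and was never evaluated,
    # so no spread evaluation is spent during initialization.
    heap = [(-b, v, -1) for v, b in bounds.items()]
    heapq.heapify(heap)
    while len(seeds) < k and heap:
        neg_gain, v, computed_at = heapq.heappop(heap)
        if computed_at == len(seeds):
            # Up-to-date gain that still beats every other upper bound.
            seeds.append(v)
            current += -neg_gain
        else:
            # Evaluate only the candidates that actually reach the top.
            gain = spread(set(seeds) | {v}) - current
            calls += 1
            heapq.heappush(heap, (-gain, v, len(seeds)))
    return seeds, calls

# Stand-in objective and bounds (assumptions for illustration only).
reach = {"a": {1, 2, 3}, "b": {3, 4}, "c": {4, 5, 6, 7}, "d": {1}}

def coverage(s):
    return len(set().union(*(reach[v] for v in s)))

bounds = {"a": 3.5, "b": 2.5, "c": 4.5, "d": 1.0}
print(ublf(bounds, 2, coverage))
```

Here only the candidates that rise to the top of the queue are ever evaluated, so nodes b and d are never passed to `spread`; the selected seeds are unchanged because the bounds never underestimate a true gain.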

  19. Our work: An example for UBLF • Monte Carlo simulation • Node 1 is selected! (only one MC simulation)

  20. Experiments • Datasets • Ca-GrQc • Digger • Ca-HepPh • Email-Enron • Benchmark methods • CELF • Degree • DegreeDiscount • PageRank • Random • Table: statistics of the datasets

  21. Experiments • Comparison results (number of MC simulations) • Observation: the total number of MC calls in UBLF is significantly lower than in CELF.

  22. Experiments • Comparison results (influence spread) • Observation: the spreads of UBLF and CELF are identical, which confirms that UBLF and CELF share the same logic in selecting nodes.

  23. Experiments • Comparison results (time cost) • Observation: UBLF is 2-5 times faster than CELF.

  24. Conclusions • Background • Problem Formulation • Greedy Algorithm • Heuristic algorithms: DegreeDiscount, PageRank, etc. • Advanced greedy algorithms: CELF, CELF++ • UBLF • Comparisons

  25. Questions? • Email: zhangpeng@iie.ac.cn
