Lecture on Reverse Sampling: Models of Influence Diffusion and Greedy Algorithm

Lecture 2-7 Reverse Sampling Ding-Zhu Du University of Texas at Dallas

Outline • Greedy • Reverse Sampling

Models of Influence Diffusion • Two basic classes of probabilistic diffusion models: • threshold and cascade • General operational view: • A social network is represented as a directed graph, with each person (customer) as a node. • Nodes start either active or inactive. • An active node may trigger activation of neighboring nodes. • Monotonicity assumption: active nodes never deactivate.
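The operational view above can be sketched for the cascade class. This is a minimal illustration of one run of the independent cascade (IC) model, assuming the graph is stored as a dictionary of out-neighbors with activation probabilities; the function name `simulate_ic` and the toy graph are hypothetical, not taken from the lecture.

```python
import random
from collections import deque

def simulate_ic(graph, seeds, rng):
    """Run one independent-cascade diffusion and return the final active set.

    graph maps each node to a list of (out_neighbor, probability) pairs.
    """
    active = set(seeds)
    frontier = deque(seeds)
    while frontier:
        u = frontier.popleft()
        # A newly active node gets one chance to activate each inactive out-neighbor.
        for v, p in graph.get(u, []):
            if v not in active and rng.random() < p:
                active.add(v)       # monotonicity: nodes are only ever added
                frontier.append(v)
    return active

# Toy graph (hypothetical): probabilities 1.0 and 0.0 make the run deterministic.
toy = {"a": [("b", 1.0), ("c", 0.0)], "b": [("d", 1.0)], "c": [("e", 1.0)]}
final = simulate_ic(toy, ["a"], random.Random(0))
print(sorted(final))  # → ['a', 'b', 'd']
```

The influence spread σ(S) is the expectation of `len(simulate_ic(graph, S, rng))` over the random coin flips.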

Influence Maximization Problem • Influence spread of node set S: σ(S) • expected number of active nodes at the end of the diffusion process, if set S is the initial active set. • Problem Definition (Kempe et al., 2003): (Influence Maximization). Given a directed and edge-weighted social graph G = (V, E, p), a diffusion model m, and an integer k ≤ |V|, find a set S ⊆ V, |S| = k, such that the expected influence spread σm(S) is maximum.

Known Results • Bad news: NP-hard optimization problem for both IC and LT models. • Good news: • σm(S) is monotone and submodular. • We can use the greedy algorithm! • Theorem: The resulting set S activates at least a (1 − 1/e) fraction (>63%) of the expected number of nodes that any size-k set could activate.

Disadvantage • Lack of efficiency. • Computing σm(S) is #P-hard under both IC and LT models. • Selecting a new vertex u requires finding the largest marginal gain σm(S+u) − σm(S), which can only be approximated by Monte Carlo simulation (10,000 trials). • Assumes a weighted social graph as input. • How to learn influence probabilities from history?
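The greedy selection with Monte Carlo estimation of marginal gains can be sketched as follows. This is an illustration under stated assumptions, not the lecture's implementation: the IC model is assumed, edges are given as hypothetical `(u, v, p)` triples, each sample draws a random "live" graph and counts the nodes reachable from the seed set, and all function names are made up for the example.

```python
import random

def sample_live_graph(edges, rng):
    """Keep each edge (u, v, p) independently with probability p."""
    live = {}
    for u, v, p in edges:
        if rng.random() < p:
            live.setdefault(u, []).append(v)
    return live

def reachable(live, seeds):
    """Nodes reachable from the seeds in one sampled graph."""
    seen = set(seeds)
    stack = list(seeds)
    while stack:
        u = stack.pop()
        for v in live.get(u, []):
            if v not in seen:
                seen.add(v)
                stack.append(v)
    return seen

def estimate_spread(edges, seeds, r, rng):
    """Average spread of the seed set over r sampled graphs."""
    total = 0
    for _ in range(r):
        total += len(reachable(sample_live_graph(edges, rng), seeds))
    return total / r

def greedy_seeds(nodes, edges, k, r, rng):
    """Pick k seeds, each time adding the node with the largest estimated gain."""
    chosen = []
    for _ in range(k):
        base = estimate_spread(edges, chosen, r, rng)
        best, best_gain = None, float("-inf")
        for u in nodes:
            if u in chosen:
                continue
            gain = estimate_spread(edges, chosen + [u], r, rng) - base
            if gain > best_gain:
                best, best_gain = u, gain
        chosen.append(best)
    return chosen

# Toy input (hypothetical); probability 1.0 makes every estimate exact.
nodes = ["a", "b", "c", "d", "e"]
edges = [("a", "b", 1.0), ("b", "c", 1.0), ("d", "e", 1.0)]
print(greedy_seeds(nodes, edges, k=2, r=5, rng=random.Random(0)))  # → ['a', 'd']
```

Note that every sampled graph is thrown away after a single spread evaluation, which is where most of the work goes.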

What’s the running time? • Let r be the number of samples used to estimate σm(S+u) − σm(S). • The algorithm runs k iterations. • Each iteration requires estimating the expected spread of O(n) node sets S+u, where n = |V|. • Each estimate of the expected spread takes measurements on r sampled graphs, and each measurement needs O(m) time, where m = |E|. • Total running time: O(kmnr).
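The bound is the product of the four factors listed above. The numeric instance below uses hypothetical sizes chosen only to show the order of magnitude; they are not figures from the lecture.

```latex
\[
T \;=\; \underbrace{k}_{\text{iterations}} \cdot
        \underbrace{O(n)}_{\text{candidate sets } S+u} \cdot
        \underbrace{r}_{\text{sampled graphs}} \cdot
        \underbrace{O(m)}_{\text{time per graph}}
  \;=\; O(kmnr).
\]
% Hypothetical sizes, for illustration only:
\[
k = 50,\quad n = 10^{4},\quad m = 10^{5},\quad r = 10^{4}
\;\Longrightarrow\;
kmnr = 50 \cdot 10^{5} \cdot 10^{4} \cdot 10^{4} = 5 \times 10^{14}.
\]
```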

Theorem

Comments • Sampling time is wasted: every randomly generated graph is used only once, for a single evaluation of the objective function.

Outline • Greedy • Reverse Sampling Analysis: part 1 (sampling), part 2 (submodular max), part 3 (parameter estimation)

Smart Way • Step 1. Randomly generate θ RR sets. • Step 2. Find k nodes that hit the maximum number of RR sets.

What Are RR Sets?
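The body of this slide is not preserved in the transcript. For orientation, the block below gives the definition of a reverse reachable (RR) set as it is commonly stated in the reverse influence sampling literature for the IC model; it is an assumption that the lecture uses this same form, and the symbols g, R, and F are introduced here for illustration.

```latex
% Assumed standard definitions, not recovered from the slide.
Let $g$ be a random graph obtained from $G$ by keeping each edge $e$
independently with probability $p(e)$. For a node $v$, the RR set of $v$ in $g$ is
\[
R_g(v) \;=\; \{\, u \in V : u \text{ can reach } v \text{ in } g \,\}.
\]
A random RR set $R$ is $R_g(v)$ for $v$ drawn uniformly at random from $V$.
For every seed set $S \subseteq V$, with $n = |V|$,
\[
\sigma(S) \;=\; n \cdot \Pr\bigl[\, S \cap R \neq \emptyset \,\bigr],
\]
so with $\theta$ independent RR sets $R_1, \dots, R_\theta$,
\[
n \cdot F(S), \qquad
F(S) = \frac{1}{\theta}\,\bigl|\{\, i : S \cap R_i \neq \emptyset \,\}\bigr|,
\]
is an unbiased estimate of $\sigma(S)$.
```

Unlike the forward simulation, the same θ RR sets are reused to evaluate every candidate seed set.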

Lemma 1

Lemma 2

Multiplicative Chernoff bound
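The statement on this slide is not preserved in the transcript. A commonly used form of the multiplicative Chernoff bound is given below; it is an assumption that the lecture uses this variant, and the constants differ slightly between textbooks.

```latex
% A standard form; the exact variant used on the slide is not recoverable.
Let $X_1, \dots, X_\theta$ be independent random variables with values in $[0,1]$,
let $X = \sum_{i=1}^{\theta} X_i$ and $\mu = \mathbb{E}[X]$. Then
\[
\Pr\bigl[X \ge (1+\delta)\mu\bigr] \;\le\; \exp\!\left(-\frac{\delta^{2}}{2+\delta}\,\mu\right)
\quad \text{for all } \delta > 0,
\]
\[
\Pr\bigl[X \le (1-\delta)\mu\bigr] \;\le\; \exp\!\left(-\frac{\delta^{2}}{2}\,\mu\right)
\quad \text{for all } 0 < \delta < 1.
\]
```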

Proof of Lemma 2

Step 2. Max Coverage Given a collection C of subsets of a set E, find a subset S of E, with |S| ≤ k, to maximize the number of subsets in C hit (covered) by S. Here the subsets in C are the RR sets, and S is the seed set.
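Greedy selection for this coverage step can be sketched as below. This is a minimal illustration, assuming the RR sets are given as Python sets; the function name `greedy_max_coverage` and the sample data are hypothetical.

```python
def greedy_max_coverage(rr_sets, k):
    """Pick at most k nodes, each time taking the node that hits the most
    RR sets not yet covered. Returns (seeds, number of RR sets covered)."""
    # Inverted index: node -> ids of the RR sets that contain it.
    index = {}
    for i, rr in enumerate(rr_sets):
        for v in rr:
            index.setdefault(v, []).append(i)

    covered = [False] * len(rr_sets)
    seeds = []
    for _ in range(k):
        best, best_gain = None, 0
        for v, ids in index.items():
            if v in seeds:
                continue
            gain = sum(1 for i in ids if not covered[i])
            if gain > best_gain:
                best, best_gain = v, gain
        if best is None:        # nothing left to cover
            break
        seeds.append(best)
        for i in index[best]:
            covered[i] = True
    return seeds, sum(covered)

# Sample collection (hypothetical): six RR sets over nodes a..e.
rr_sets = [{"a", "b"}, {"a"}, {"a", "c"}, {"d"}, {"d", "e"}, {"b"}]
seeds, hit = greedy_max_coverage(rr_sets, k=2)
print(seeds, hit)  # → ['a', 'd'] 5
```

The sketch recomputes gains from scratch in every round for clarity; keeping per-node counters up to date would avoid the repeated scans.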


Submodular Function Max

Greedy Algorithm

Performance Ratio Theorem (Nemhauser et al. 1978)

Theorem Proof

Outline • Greedy • Reverse Sampling Analysis: part 1 (sampling), part 2 (submodular max), part 3 (parameter estimation)

Lemma 2

Breadth-first search (BFS) • To generate an RR set, a randomized BFS is employed.
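One way to realize the randomized BFS is sketched here, assuming the IC model: start from a uniformly random root, walk edges backwards, and flip the coin for an edge only when the search reaches it. The function names, the optional `root` argument, and the edge-triple format are assumptions made for this example.

```python
import random
from collections import deque

def build_reverse_adjacency(edges):
    """Map each node v to the list of (u, p) for every edge (u, v, p)."""
    rev = {}
    for u, v, p in edges:
        rev.setdefault(v, []).append((u, p))
    return rev

def random_rr_set(nodes, rev, rng, root=None):
    """Generate one RR set by a randomized BFS along reversed edges."""
    if root is None:
        root = rng.choice(nodes)    # uniformly random target node
    rr = {root}
    queue = deque([root])
    while queue:
        v = queue.popleft()
        for u, p in rev.get(v, []):
            # Edge (u, v) is live with probability p; its coin is flipped
            # lazily, only when the search arrives at v.
            if u not in rr and rng.random() < p:
                rr.add(u)
                queue.append(u)
    return rr

# Toy chain a -> b -> c (hypothetical); probability 1.0 keeps it deterministic.
nodes = ["a", "b", "c"]
rev = build_reverse_adjacency([("a", "b", 1.0), ("b", "c", 1.0)])
rng = random.Random(0)
print(sorted(random_rr_set(nodes, rev, rng, root="c")))  # → ['a', 'b', 'c']

# Step 1 of the method: draw theta RR sets (theta = 4 here is arbitrary).
rr_sets = [random_rr_set(nodes, rev, rng) for _ in range(4)]
```

Only the edges pointing into visited nodes are examined, so the cost of one RR set is proportional to the part of the graph it actually touches rather than to the whole edge set.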

Lemma 3

Improvement

References

A New Springer Journal: Computational Social Networks. Editor-in-Chief: Ding-Zhu Du, My T. Thai. Welcome to Submit Papers

THANK YOU!

Markov's inequality Proof.

Theorem

Multiplicative Chernoff bound
