Random walk on graphs and its algorithmic applications
1 / 67

Random Walk on Graphs and its Algorithmic Applications - PowerPoint PPT Presentation

  • Uploaded on
  • Presentation posted in: General

Random Walk on Graphs and its Algorithmic Applications. Shengyu Zhang Winter School, ITCSC@CUHK, 2009. Random walk on graphs. On an undirected graph G: Starting from vertex v 0 Repeat for a number of steps: Go to a random neighbor. Simple but powerful. Road map: Random walk. Parameters

I am the owner, or an agent authorized to act on behalf of the owner, of the copyrighted work described.

Download Presentation

Random Walk on Graphs and its Algorithmic Applications

An Image/Link below is provided (as is) to download presentation

Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author.While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server.

- - - - - - - - - - - - - - - - - - - - - - - - - - E N D - - - - - - - - - - - - - - - - - - - - - - - - - -

Presentation Transcript

Random Walk on Graphs and its Algorithmic Applications

Shengyu Zhang

Winter School, ITCSC@CUHK, 2009

Random walk on graphs

On an undirected graph G:

  • Starting from vertex v0

  • Repeat for a number of steps:

    • Go to a random neighbor.

  • Simple but powerful.

Road map: Random walk




  • k-SAT

  • st-connectivity

Hitting time

  • PageRank

  • Approximate counting

  • Error-reduction

Mixing time

Road map: Quantum walk

Short intro to math model of quantum mechanics



Element Distinctness

Discrete QW

Formula Evaluation

Continuous QW

PART I. Random Walk

Key parameter 1: Hitting time

Hitting time

  • Recall the process of random walk on a graph G.

    • Starting vertex v0

    • Repeat for a number of steps:

      • Go to a random neighbor.

  • Hitting time: H(i,j) = expected time to visit j (for the first time), starting at i










(n/2)-complete graph

Undirected graphs

  • Complete graph

    • H(i,j) = n-1 (i≠j)

  • Line:

    • H(i,j) = j2-i2 (i<j)

    • In particular, H(0,n-1) = (n-1)2.

  • General graph:

    • H(i,j) = O(n3).

Algorithm 1: 2-SAT

k-SAT: satisfiability of k-CNF formula

  • n variables x1, …, xn∊{0,1}

  • m clauses, each being OR of k literals

    • Literal: xi or ¬xi

    • e.g. (k=3): (¬x1)٧x5 ٧x7

  • 3-CNF formula: AND of these m clauses

    • e.g. ((¬x1)٧x5 ٧x7)٨(x2 ٧(¬x5)٧ (¬x7))٨(x1 ٧x7 ٧x8)

  • 3-SAT Problem: Given a 3CNF formula, decide whether there is an assignment of variables s.t. the formula evaluates to 1.

    • For the above example, Yes: x5=1, x7=0, x1=1

P vs. NP

  • P: problems that can be easily solved

    • “easily”: in polynomial time

  • NP: problems that can be easily verified.

    • Formally: ∃ a polynomial time verifier V, s.t. for any input x,

      • If the answer is YES, then ∃y s.t. V(x,y) = 1

      • If the answer is NO, then ∀y, V(x,y) = 1

  • The question of TCS: Is P = NP?

    • Intuitively, no. NP should be much larger.

      • It’s much easier to verify (with help) than to solve (by yourself)

      • mathematical proof, appreciation of good music/food, …

    • Formal proof? We don’t know yet.

      • One of the 7 Millennium Problems by CMI.①

① http://www.claymath.org/millennium/P_vs_NP/


  • k-SAT is NP-complete, for any k ≥ 3.

  • NP-complete:

    • In NP

    • All other problems in NP can be reduced to it in poly. time

  • NP-complete problems are the hardest ones in NP.

  • 3-SAT is in NP:

    • witness --- satisfying assignment A

    • Verification: evaluate formula with variables assigned by A

  • [Cook-Levin] 3-SAT is NP-complete.

How about 2-SAT?

  • While 3-SAT is the hardest in NP, 2-SAT is solvable in polynomial time.

  • Here we present a very simple randomized algorithm, which has polynomial expected running time.

Algorithm for 2-SAT

  • 2SAT: each clause has two variables/negations

  • Alg [Papadimitriou]:

    • Pick any assignment

    • Repeat O(n2) time

      • If all satisfied, done

      • Else

        • Pick any unsatisfied clause

        • Pick one of the two literals each with ½ probability, and flip the assignment on that variable

(x1∨x2)∧(x2∨¬x3) ∧(¬x4∨x3) ∧(x5∨x1)

x1, x2, x3, x4, x5

0, 1, 0, 1, 0



  • (x1∨x2)∧(x2∨¬x3) ∧(¬x4∨x3) ∧(x5∨x1)

    • x1, x2, x3, x4, x5

    • 0, 1, 0, 1, 0

  • If unsatisfiable: never find an satisfying assignment

  • If satisfiable, there exists a satisfying assignment x

    • If our initially picked assignment x’ is satisfying, then done.

    • Otherwise, for any unsatisfied clause, at least one of the two variables is assigned a value different than that in x

    • Randomly picking one of the two variables and flipping its value increases # correct assignments by 1 w.p. ≥ ½

Analysis (continued)

  • Consider a line of n+1 points, where k represents “we’ve assigned k variables correctly”

    • “correctly”: the same way as x

  • Last slide: Randomly picking one of the two variables and flipping its value increases # correct assignments by 1 w.p. ≥ ½

  • Thus the algorithm is actually a random walk on the line of n+1 points, with Pr[going right] ≥ ½.

    • Hitting time (i → n): O(n2)

  • So by repeating this flipping process O(n2) steps, we’ll reach x with high probability.

Algorithm 2: st-connectivity


  • Problem: Given a graph G and two vertices s and t on it, decide whether they are connected.

  • BFS/DFS (starting at s) solves the problem in linear time.

  • But uses linear space as well.

  • Question: Can we use much less space?

Simple random walk algorithm

  • A randomized algorithm takes O(log n) space.

    • Starting at s, perform the random walk for O(n3) steps. If ever see t, output YES and stop.

    • output NO.

  • Why it works? H(s,t) = O(n3).

    • If s can reach t, then we should see it within O(n3) time.

Key parameter 2: Mixing time


  • Now let’s study the probability distribution of the particle after a long time.

  • What do we observe?

    • The distribution converges touniform…

    • wherever it starts.

  • Q: Does this hold in general?
















  • Well, Yes and No.

  • Consider the following cases

    • on undirected graphs.

  • Case 1: The graph is unconnected.

  • Case 2: The graph is bipartite.

  • [Thm] For any connectednon-bipartite graph, and any starting point, the random walk converges.




Converges to what?

  • In the previous triangle example: uniform.

  • In general?

  • As a result of the convergence, the distribution doesn’t change by the matrix

    • If the particle is on the graph according to the distribution, then further random walk will result in the same distribution.

      • We call it the stationary distribution.

    • [Fact] It’s unique.

stationary distribution

  • [Fact] It’s the following distribution: π(v) = d(v)/2m

    • d(v) = degree of v, i.e. # of neighbors.

    • m: |E|, i.e. # of edges.

  • [Proof] Consider one step of walk:

    π’(v) = ∑u: (u,v)∊E p(u)∙[1/d(u)]

    = ∑u: (u,v)∊E [d(u)/2m]∙[1/d(u)]

    = ∑u: (u,v)∊E 1/2m

    = d(v)/2m

    = π(v)

    So π is the stationary distribution.

  • For regular graphs, π is the uniform distribution.

Proportional to degree

  • Note: the stationary distribution π(v) = d(v)/2mis proportional to the degree of v.

  • What’s the intuition? The more neighbors you have, the more chance you’ll be reached.

  • We will see another natural interpretation shortly.

Speed of the convergence

  • We’ve seen that random walk converges to the stationary distribution.

  • Next question: how fast is the convergence?

  • Let’s define the mixing time as min {T: ∥pi(T) - π∥ ≤ε}where pi(T): the distribution in time T, starting at i.∥∙∥: some norm.

  • We’ll see the reason of mixing later. Now let’s first see an example you run into everyday.

Algorithm 3: PageRank


  • Google gives each webpage a number for its “importance”.

    • [IT] Google: 10, Microsoft: 9, Apple: 9

    • [Media] NYTimes: 9, CNN: 10, sohu: 8, newsmth: 7

    • [Sports] NBA: 7, CBA: 7, CFA: 7

    • [University] MIT: 9, CUHK: 8, …

      … Tsinghua, Pku, Fudan, IIT(B): 9

  • When you search for something by making a query, a large number of related webpages are retrieved.

    • What webpages to retrieve? Information Retrieval. That’s an orthogonal issue.


  • How to give this big corpus to you?

  • Search engines rank them based on the “importance”, and give them in descending order.

    • Thus presumably the first page contains the 10 most important webpages related to your query.

  • Question: How to rank?

  • PageRank: Use the vast link structure as in indicator of an individual page’s value.

Reference system

  • Webpage A has a link to webpage BA thinks B is useful.

    • Think of it as A writing a recommendation letter for B.

  • So a webpage with a lot of other pages pointing to it is probably important.

    • A guy getting a lot of letters is strong.

  • Further, pointers from pages that are themselves important bear more weight.

    • Letters by Noga, László, Sasha, Avi, Andy, … mean a lot.

  • But the importance of those pages also need to be calculated… we have a recurrence equation.


  • If page A has a lot of links, then each link means less.

    • Ok, you get Einstein’s letter, but you know what, last year everyone on the market got his letter.

  • So let’s assume that page A’s reference weight is divided evenly to all pages B that A links to.

  • Recurrence equation: R(A) = ∑B: B→A R(B)/d(B)where d(B) = # pages C that B links to

Sink issue

  • R(A) = ∑B: B→A R(B)/d(B) has a problem.

  • There are some “sink” pages that contain no links to other pages.

    • Sinks accumulate weights without giving out.

    • The recurrence equation only has solutions with weight on sinks, losing the original intension of indicating the importance of all pages.

  • To handle this, we modify the recursion:R(A) = (1-α)/N + α∑B: B→A R(B)/d(B)

    • Force each page to have a (1- α)-fraction of weights (evenly) going to all pages.

    • So each A also receives a weight of (1- α)/N from all pages

    • α: around 0.85

Random Walk view

  • Note that the recursion

    R(A) = (1-α)/N + α∑B: B→A R(B)/d(B)

    is exactly the random walk on the graph, where at each point A, we

    • w/ prob. α, follow a random link;

    • w/ prob. (1-α), go to a random page.

  • Question: How to solve this recurrence equation?

    • # webpages: ~ 30 billion (and counting…)


  • Recall: the random walk converges to the stationary distribution!

    • It’s a bit different since it’s random walk on directed graphs, but this PageRank matrix has all good properties we need so that the random walk also converges to the solution.

  • Algorithm: start from any distribution, run a few iterations of random walk, and output the result.

    • Google: 50-100 iterations, need a few days.

  • That should be close to the stationary distribution, which serves as indictor of the importance of pages.


  • We’ll see a bit math behind the mixing story.

Mathematics behind the mixing

  • [Eigenvalue decomposition] A symmetric matrix MNN can be written as M = ∑i=1,…,nλiξiξiT

    • λi: eigenvalue. Order them: |λ1| ≥ |λ2| ≥ |λ3| ≥ … ≥ |λn|

    • ξi: (column) eigenvector, i.e. Mξi = λiξi

  • The eigenvectors are orthonormal.

    • ∥ξi∥2=1

    • ξi, ξj = 0, for all i≠j

  • M2 = (∑i=1,…,nλiξiξiT)(∑j=1,…,nλjξjξjT)

    = ∑i,jλiξiξiT λjξjξjT = ∑i,jλi λj ξi,ξi ξjξjT

    = ∑iλi2ξjξiT

  • Mt = ∑iλitξjξiT

Random walk in matrix form

  • A: adjacency matrix. (Aij = 1 if (i,j) ∊E; 0 o.w.)

  • P: probability transition matrix.

    • Pij = 1/di if (i,j) ∊E; 0 o.w.

    • P = D-1A, where D = diag(d1, …, dn). (di: deg(i))

    • For any distribution p, PTp is the distribution after one step of random walk

  • N = D-1/2AD-1/2 = D1/2PD-1/2

    • Nij = 1/(didj)1/2 if (i,j) ∊E; 0 o.w.

    • N is symmetric, so N can be written as N = ∑i=1,…,nλiξiξiT

  • Random Walk for t steps:

    • Pt = (D-1/2ND1/2)t = D-1/2NtD1/2 = D-1/2 (∑iλitξiξiT)D1/2= ∑iλitD-1/2ξiξiTD1/2

Why mixing? Why to π?

  • [Thm] For connected non-bipartite graph, N has

    • λ1 = 1 and ξ1 = π1/2 = ((d1/2m)1/2, …, (dn/2m)1/2) is the square root version of the stationary distribution (so that the l2 norm is 1).

    • |λ2| < 1. (So is all other λi’s.)

  • Random Walk for t steps: Pt = ∑i λitD-1/2ξiξiTD1/2

  • First item: λ1tD-1/2ξ1ξ1TD1/2 = D-1/2(1/2m)[(didj)1/2]ijD1/2= (1/2m)[dj]ij

    • For any starting distribution p, (1/2m)[dj]ijTp = π.

Speed of the convergence

  • All other items: since |λi|<1, λitD-1/2ξiξiTD1/2→0!

    • Speed of convergence depends on how close |λ2| is to 1.

  • PageRank matrix: |λ2| = α (≈ 0.85)

  • Expander: 1-|λ2| = Ω(1)

The rest of the talk

  • Only ideas are given.

  • Many details are omitted.

  • I may cheat a bit to illustrate the main steps.

Algorithm 4: Approximately counting

Algorithm 4: Approximately counting

  • Task: estimate the size of an exponentially large set V

    • Example: Given G, count # of perfect matchings.

  • Approach: find a chain V0⊆V1⊆…⊆Vm=V

    • |V0| is easy to compute;

    • m = poly(n) layers;

    • Each ratio |Vi+1|/|Vi| = poly(n) is easy to estimate.

  • Then |V| = |V0| (|V1|/|V0|) (|V2|/|V1|)… (|Vn|/|Vn-1|) can be estimated.

  • Question: How to estimate |Vi+1|/|Vi|?

Estimate the ratio

  • Estimate by random sampling:

    • Generate a random element uniformly distributed in Vi+1, see how often it hits Vi.

    • Question: How to uniformly sample Vi+1?

  • By random walk

    • We construct a regular graph with vertex set Vi+1

    • The algorithm runs efficiently (i.e. in poly(n) time) if the walk converges to uniform rapidly (i.e. in poly(n) time).

    • It’s the case in perfect matching counting problem.

Algorithm 5: Error Reduction with efficient randomness

Error reduction

  • Task: Reduce the error of a randomized algorithm A.

    • Error: Prr∈{0,1}^m[r is bad for A] = 1/3→ ε?

  • Naïve approach:

    • Draw k random strings r1, …, rk,

    • Run algorithm A using r1, …, rk and get k answers

    • Output the majority of the answers

  • [Fact] the error drops to 2-Θ(k) . --- Chernoff’s bound

  • Randomness complexity: mk.

  • Question: Can we reduce the error prob w/o using too many additional random bits?



Expander graph

  • Expander:Eigenvalue gap is large --- Ω(1)

    • Random walk converges very fast --- O(log n).

    • Highly connected --- diameter O(log n)

    • Large boundary --- any subset has lots of edges going out

    • Could be sparse --- ∃ constant degree expanders.

  • We know how to construct them explicitly.

Algorithm for error reduction

  • New Algorithm A’:

    • Construct an expander with V = {0,1}m

    • Start from a random vertex r1

    • Perform a random walk (r1, r2,…, rk) of length k

    • Random algorithm A using r1, r2,…, rk and get k answers

    • Output the majority of the answers

  • Thm: the error prob of A’ is also 2-Θ(k)

  • How many random bits used? m + O(k)

    • The expander is of constant degree.

  • Highly dependent! Why it works?



Expander: large boundary

  • For simplicity, consider the one-side error case:

    • Algorithm A’ wrong ↔ Whole walk in B.

  • Expander: Every set has a large boundary.

  • At every step, it walks outside B w.p. Ω(1).

  • Pr[k steps never out] = 2-Θ(k)


{0,1}m: all

random strings



PART II. Quantum Walk









Quantum mechanics in one slide



| 





Physical System

Unit Vector

α,β: amplitudes

| 




Unitary Matrix

A classical bit



A quantum bit (qubit)


Tensor Product

Measure by |0and|1:

- get 0 w.p. |α|2; system →|0;

- get 1 w.p. |β|2; system →|1.

State space for 2 bits:

all combinations {00, 01, 10, 11}


State space for 2 qubits:

the space span{|00,|01,|10,|11}


Quantum walk

  • Many things become quite tricky.

  • Even the definition of quantum walk.

  • We’ll ignore the formal definitions here, but only present some results.

Type 1: Discrete Quantum Walk










Coin register

  • The most natural quantum counterpart of random walk is the following: each vertex v gives its neighbors (equally) some “mass” so that the l2 norm of the total mass is still 1.

  • However, simply accumulating all mass from neighbors does not work

    • The operator is not unitary.

  • To overcome this, we need a register to remember where the mass is from.

  • It amounts to add a coin register.

    • Flip the quantum coin: |v|c →|vH|c, where H =

    • Walk accordingly: |v|c →|nc(v)|c, where nc(v) is the cth neighbor of v.



General graph

  • Given any random walk on an undirected graph G, Szegedy proposed a method to transform it into a quantum walk

  • Set hitting time:

    • A a target set T⊆ V is hidden in the graph

    • Starting from the uniform distribution, what’s the expected time to hit some point in T?

  • Classical set hitting time = O((ε∆)-1)

    • ε: density of the target set, i.e. |T|/|V|

    • ∆: eigenvalue gap of matrix P

  • [Szegedy] Quantum set hitting time = O(√Classical set hitting time) = O((ε∆)-1/2)

Quantum algorithm for Element Distinctness

  • Element Distinctness: To decide whether all input integers are distinct.

  • [Ambainis] Discrete quantum walk on the following graph: (r = n2/3)

    V = {S⊆[n]: |S| = r}; E = {(S,T): |S∩T| = r-2}

    • Eigenvalue gap: ∆ = 1/r

    • Target set density: ε = r2/n2

  • Quantum running time = setup time + hitting time = O(r)+ O((ε∆)-1/2) = O(r + n/√r) = O(n2/3).

Type 2: Continuous Quantum Walk

Continuous quantum walk on a graph

  • Given an undirected graph G, and a Hermitian matrix H that respects the graph structure

    • Hermitian: (HT)* = H.

    • Respect G: Hij = 0 if (i,j)∉E.

  • The dynamics ψ(t) = ∑jαi(t)|ifor H is governed by the Schrodinger equation i(d/dt) αi(t) = ∑j:(i,j)∊EHijαj(t),

    which has solution |ψ(t) = eiHt |ψ(0).


Formula Evaluation

  • Grover Search: find a marked point in an n-point set in time O(n1/2).

    • Equivalently: evaluate OR in time O(n1/2).

  • Generalizations:

General AND-OR-NOT Formula

(General Game Tree)

(Balanced) AND-OR Tree

Formula evaluation

  • Motivation:

    • a natural generalization of OR --- Grover search

    • well studied subject in TCS

    • matching lower bound known long ago

    • game: min-max tree.

      • Same up to log(n)





  • [FGG, 07] An O(n1/2) algorithm for Balanced AND­OR Tree.

    • Method: scattering theory…

    • Not really understandable for computer scientists…

  • [CRSZ, A, FOCS’07] An O(n1/2+o(1)) algorithm for General AND­OR­NOT Formula.

    • Method: phase estimation + quantum walk.

    • Simpler algorithm, simpler proof.

    • For special cases like Balanced AND­OR Tree: O(n1/2)

Classical implications

  • [OS, STOC’03] Any formula f of size n has polynomial threshold function thr(f) = O(n1/2)?

  • Our result:

    Q( f ) ≤n1/2+o(1)

  • [KlivansServedio, STOC’01; KOS, FOCS’02]

    • Class C of Boolean functions has thr(f) ≤ r for all fC, ⇒C can be learned in time nO(r)

      (in both PAC or adversarially generated examples)

  • [Implication] Formulas are learnable in time 2

  • This is very interesting because studies of quantum algorithms solve a purely classical open problem!

  • This is not uncommon in quantum computing!


thr( f ) ≤

deg( f ) ≤

Definition of thr



x1 x2

x3 x4 x5

Sketch of the algorithm


  • For the formula f and an input x, construct a graph Gx

  • Let A be the adjacency matrix of Gx; consider A’s spectrum.

:0 :1

:0 :1 :1

Key observation

  • Red: function value at a vertex

  • green: amplitude of 0-eigenvector

  • Function values and amplitudes of 0-eigenvector propagate in the same way.
















If a function evaluates to 0,

then  an 0-eigenvector

If a function evaluates to 1,

then 0-eigenvector has no support on the root.

Phase Estimation

  • Thus qualitatively, it’s enough to know whether there is a 0-eigenvector with support on root.

  • Phase estimation: Given an unitary operator U (to use) and an eigenvector |ψ, find the phase θ in the corresponding eigenvalue eiθ.

    • Unitary matrix: all eigenvalues have l2-norm 1

  • For eiHt, the phase is nothing but the corresponding eigenvalue of H.

  • Thus use phase estimation, we can find an eigenvalue of H.

    • Depending on whether it’s 0, output the answer to f(x).

The quantum walk is here!

A highlevel structure of algorithm

  • We need to carefully choose weight hij on edges (i,j) s.t. the above qualitative connection also works quantitatively.

    • For f(x) = 0,

      •  an 0-eigenvector with Ω(1) support on root.

    • For f(x) = 1,

      • eigenvector with support on root has eigenvalue Ω(n-1/2)

  • Algorithm: Starting from the root, use quantum walk to carry out the Phase Estimation, output the answer depending on whether the phase is 0.


  • Random walk is a simple but powerful tool.

  • There will sure be more important algorithmic applications to be found.

  • Theories of quantum walk have been rapidly developed in the past couple of years.

  • A lot of fundamental issues yet to be better understood.

  • Significant breakthrough algorithms using quantum walk are ahead!


  • For random walk on graphs:

    • Lovász, Random Walks on Graphs: A Survey, Combinatorics, 353-398, 1996.

    • Chung, Spectral Graph Theory, American Mathematical Society, 1997.

  • For quantum walk:

    • Quantum computing in general:

      • Textbook by Nielsen and Chuang

      • Lecture notes by Vazirani, by Ambainis, and by Childs

    • Quantum walk:

      • Surveys by Kempe and by Santha.

      • All above lectures have nice chapters for quantum walk.


  • Login