1 / 67

# Random Walk on Graphs and its Algorithmic Applications - PowerPoint PPT Presentation

Random Walk on Graphs and its Algorithmic Applications. Shengyu Zhang Winter School, ITCSC@CUHK, 2009. Random walk on graphs. On an undirected graph G: Starting from vertex v 0 Repeat for a number of steps: Go to a random neighbor. Simple but powerful. Road map: Random walk. Parameters

I am the owner, or an agent authorized to act on behalf of the owner, of the copyrighted work described.

Random Walk on Graphs and its Algorithmic Applications

Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author.While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server.

- - - - - - - - - - - - - - - - - - - - - - - - - - E N D - - - - - - - - - - - - - - - - - - - - - - - - - -

## Random Walk on Graphs and its Algorithmic Applications

Shengyu Zhang

Winter School, ITCSC@CUHK, 2009

### Random walk on graphs

On an undirected graph G:

• Starting from vertex v0

• Repeat for a number of steps:

• Go to a random neighbor.

• Simple but powerful.

Parameters

/Properties

Algorithms

• k-SAT

• st-connectivity

Hitting time

• PageRank

• Approximate counting

• Error-reduction

Mixing time

Short intro to math model of quantum mechanics

Types

Algorithms

Element Distinctness

Discrete QW

Formula Evaluation

Continuous QW

## PART I. Random Walk

Key parameter 1: Hitting time

### Hitting time

• Recall the process of random walk on a graph G.

• Starting vertex v0

• Repeat for a number of steps:

• Go to a random neighbor.

• Hitting time: H(i,j) = expected time to visit j (for the first time), starting at i

j

i

0

1

2

n-1

i

j

(n/2)-line

(n/2)-complete graph

### Undirected graphs

• Complete graph

• H(i,j) = n-1 (i≠j)

• Line:

• H(i,j) = j2-i2 (i<j)

• In particular, H(0,n-1) = (n-1)2.

• General graph:

• H(i,j) = O(n3).

## Algorithm 1: 2-SAT

### k-SAT: satisfiability of k-CNF formula

• n variables x1, …, xn∊{0,1}

• m clauses, each being OR of k literals

• Literal: xi or ¬xi

• e.g. (k=3): (¬x1)٧x5 ٧x7

• 3-CNF formula: AND of these m clauses

• e.g. ((¬x1)٧x5 ٧x7)٨(x2 ٧(¬x5)٧ (¬x7))٨(x1 ٧x7 ٧x8)

• 3-SAT Problem: Given a 3CNF formula, decide whether there is an assignment of variables s.t. the formula evaluates to 1.

• For the above example, Yes: x5=1, x7=0, x1=1

### P vs. NP

• P: problems that can be easily solved

• “easily”: in polynomial time

• NP: problems that can be easily verified.

• Formally: ∃ a polynomial time verifier V, s.t. for any input x,

• If the answer is YES, then ∃y s.t. V(x,y) = 1

• If the answer is NO, then ∀y, V(x,y) = 1

• The question of TCS: Is P = NP?

• Intuitively, no. NP should be much larger.

• It’s much easier to verify (with help) than to solve (by yourself)

• mathematical proof, appreciation of good music/food, …

• Formal proof? We don’t know yet.

• One of the 7 Millennium Problems by CMI.①

① http://www.claymath.org/millennium/P_vs_NP/

### NP-completeness

• k-SAT is NP-complete, for any k ≥ 3.

• NP-complete:

• In NP

• All other problems in NP can be reduced to it in poly. time

• NP-complete problems are the hardest ones in NP.

• 3-SAT is in NP:

• witness --- satisfying assignment A

• Verification: evaluate formula with variables assigned by A

• [Cook-Levin] 3-SAT is NP-complete.

• While 3-SAT is the hardest in NP, 2-SAT is solvable in polynomial time.

• Here we present a very simple randomized algorithm, which has polynomial expected running time.

### Algorithm for 2-SAT

• 2SAT: each clause has two variables/negations

• Pick any assignment

• Repeat O(n2) time

• If all satisfied, done

• Else

• Pick any unsatisfied clause

• Pick one of the two literals each with ½ probability, and flip the assignment on that variable

(x1∨x2)∧(x2∨¬x3) ∧(¬x4∨x3) ∧(x5∨x1)

x1, x2, x3, x4, x5

0, 1, 0, 1, 0

1

### Analysis

• (x1∨x2)∧(x2∨¬x3) ∧(¬x4∨x3) ∧(x5∨x1)

• x1, x2, x3, x4, x5

• 0, 1, 0, 1, 0

• If unsatisfiable: never find an satisfying assignment

• If satisfiable, there exists a satisfying assignment x

• If our initially picked assignment x’ is satisfying, then done.

• Otherwise, for any unsatisfied clause, at least one of the two variables is assigned a value different than that in x

• Randomly picking one of the two variables and flipping its value increases # correct assignments by 1 w.p. ≥ ½

### Analysis (continued)

• Consider a line of n+1 points, where k represents “we’ve assigned k variables correctly”

• “correctly”: the same way as x

• Last slide: Randomly picking one of the two variables and flipping its value increases # correct assignments by 1 w.p. ≥ ½

• Thus the algorithm is actually a random walk on the line of n+1 points, with Pr[going right] ≥ ½.

• Hitting time (i → n): O(n2)

• So by repeating this flipping process O(n2) steps, we’ll reach x with high probability.

## Algorithm 2: st-connectivity

### st-Connectivity

• Problem: Given a graph G and two vertices s and t on it, decide whether they are connected.

• BFS/DFS (starting at s) solves the problem in linear time.

• But uses linear space as well.

• Question: Can we use much less space?

### Simple random walk algorithm

• A randomized algorithm takes O(log n) space.

• Starting at s, perform the random walk for O(n3) steps. If ever see t, output YES and stop.

• output NO.

• Why it works? H(s,t) = O(n3).

• If s can reach t, then we should see it within O(n3) time.

Key parameter 2: Mixing time

### Convergence

• Now let’s study the probability distribution of the particle after a long time.

• What do we observe?

• The distribution converges touniform…

• wherever it starts.

• Q: Does this hold in general?

v

u

w

v1

v2

v3

v4

v5

v1

v2

v3

v4

v5

v6

### Convergence

• Well, Yes and No.

• Consider the following cases

• on undirected graphs.

• Case 1: The graph is unconnected.

• Case 2: The graph is bipartite.

• [Thm] For any connectednon-bipartite graph, and any starting point, the random walk converges.

v

u

w

### Converges to what?

• In the previous triangle example: uniform.

• In general?

• As a result of the convergence, the distribution doesn’t change by the matrix

• If the particle is on the graph according to the distribution, then further random walk will result in the same distribution.

• We call it the stationary distribution.

• [Fact] It’s unique.

### stationary distribution

• [Fact] It’s the following distribution: π(v) = d(v)/2m

• d(v) = degree of v, i.e. # of neighbors.

• m: |E|, i.e. # of edges.

• [Proof] Consider one step of walk:

π’(v) = ∑u: (u,v)∊E p(u)∙[1/d(u)]

= ∑u: (u,v)∊E [d(u)/2m]∙[1/d(u)]

= ∑u: (u,v)∊E 1/2m

= d(v)/2m

= π(v)

So π is the stationary distribution.

• For regular graphs, π is the uniform distribution.

### Proportional to degree

• Note: the stationary distribution π(v) = d(v)/2mis proportional to the degree of v.

• What’s the intuition? The more neighbors you have, the more chance you’ll be reached.

• We will see another natural interpretation shortly.

### Speed of the convergence

• We’ve seen that random walk converges to the stationary distribution.

• Next question: how fast is the convergence?

• Let’s define the mixing time as min {T: ∥pi(T) - π∥ ≤ε}where pi(T): the distribution in time T, starting at i.∥∙∥: some norm.

• We’ll see the reason of mixing later. Now let’s first see an example you run into everyday.

## Algorithm 3: PageRank

### PageRank

• Google gives each webpage a number for its “importance”.

• [IT] Google: 10, Microsoft: 9, Apple: 9

• [Media] NYTimes: 9, CNN: 10, sohu: 8, newsmth: 7

• [Sports] NBA: 7, CBA: 7, CFA: 7

• [University] MIT: 9, CUHK: 8, …

… Tsinghua, Pku, Fudan, IIT(B): 9

• When you search for something by making a query, a large number of related webpages are retrieved.

• What webpages to retrieve? Information Retrieval. That’s an orthogonal issue.

### Ranking

• How to give this big corpus to you?

• Search engines rank them based on the “importance”, and give them in descending order.

• Thus presumably the first page contains the 10 most important webpages related to your query.

• Question: How to rank?

• PageRank: Use the vast link structure as in indicator of an individual page’s value.

### Reference system

• Webpage A has a link to webpage BA thinks B is useful.

• Think of it as A writing a recommendation letter for B.

• So a webpage with a lot of other pages pointing to it is probably important.

• A guy getting a lot of letters is strong.

• Further, pointers from pages that are themselves important bear more weight.

• Letters by Noga, László, Sasha, Avi, Andy, … mean a lot.

• But the importance of those pages also need to be calculated… we have a recurrence equation.

### Furthermore

• If page A has a lot of links, then each link means less.

• Ok, you get Einstein’s letter, but you know what, last year everyone on the market got his letter.

• So let’s assume that page A’s reference weight is divided evenly to all pages B that A links to.

• Recurrence equation: R(A) = ∑B: B→A R(B)/d(B)where d(B) = # pages C that B links to

### Sink issue

• R(A) = ∑B: B→A R(B)/d(B) has a problem.

• There are some “sink” pages that contain no links to other pages.

• Sinks accumulate weights without giving out.

• The recurrence equation only has solutions with weight on sinks, losing the original intension of indicating the importance of all pages.

• To handle this, we modify the recursion:R(A) = (1-α)/N + α∑B: B→A R(B)/d(B)

• Force each page to have a (1- α)-fraction of weights (evenly) going to all pages.

• So each A also receives a weight of (1- α)/N from all pages

• α: around 0.85

### Random Walk view

• Note that the recursion

R(A) = (1-α)/N + α∑B: B→A R(B)/d(B)

is exactly the random walk on the graph, where at each point A, we

• w/ prob. (1-α), go to a random page.

• Question: How to solve this recurrence equation?

• # webpages: ~ 30 billion (and counting…)

### Algorithm

• Recall: the random walk converges to the stationary distribution!

• It’s a bit different since it’s random walk on directed graphs, but this PageRank matrix has all good properties we need so that the random walk also converges to the solution.

• Algorithm: start from any distribution, run a few iterations of random walk, and output the result.

• Google: 50-100 iterations, need a few days.

• That should be close to the stationary distribution, which serves as indictor of the importance of pages.

### Next

• We’ll see a bit math behind the mixing story.

### Mathematics behind the mixing

• [Eigenvalue decomposition] A symmetric matrix MNN can be written as M = ∑i=1,…,nλiξiξiT

• λi: eigenvalue. Order them: |λ1| ≥ |λ2| ≥ |λ3| ≥ … ≥ |λn|

• ξi: (column) eigenvector, i.e. Mξi = λiξi

• The eigenvectors are orthonormal.

• ∥ξi∥2=1

• ξi, ξj = 0, for all i≠j

• M2 = (∑i=1,…,nλiξiξiT)(∑j=1,…,nλjξjξjT)

= ∑i,jλiξiξiT λjξjξjT = ∑i,jλi λj ξi,ξi ξjξjT

= ∑iλi2ξjξiT

• Mt = ∑iλitξjξiT

### Random walk in matrix form

• A: adjacency matrix. (Aij = 1 if (i,j) ∊E; 0 o.w.)

• P: probability transition matrix.

• Pij = 1/di if (i,j) ∊E; 0 o.w.

• P = D-1A, where D = diag(d1, …, dn). (di: deg(i))

• For any distribution p, PTp is the distribution after one step of random walk

• N = D-1/2AD-1/2 = D1/2PD-1/2

• Nij = 1/(didj)1/2 if (i,j) ∊E; 0 o.w.

• N is symmetric, so N can be written as N = ∑i=1,…,nλiξiξiT

• Random Walk for t steps:

• Pt = (D-1/2ND1/2)t = D-1/2NtD1/2 = D-1/2 (∑iλitξiξiT)D1/2= ∑iλitD-1/2ξiξiTD1/2

### Why mixing? Why to π?

• [Thm] For connected non-bipartite graph, N has

• λ1 = 1 and ξ1 = π1/2 = ((d1/2m)1/2, …, (dn/2m)1/2) is the square root version of the stationary distribution (so that the l2 norm is 1).

• |λ2| < 1. (So is all other λi’s.)

• Random Walk for t steps: Pt = ∑i λitD-1/2ξiξiTD1/2

• First item: λ1tD-1/2ξ1ξ1TD1/2 = D-1/2(1/2m)[(didj)1/2]ijD1/2= (1/2m)[dj]ij

• For any starting distribution p, (1/2m)[dj]ijTp = π.

### Speed of the convergence

• All other items: since |λi|<1, λitD-1/2ξiξiTD1/2→0!

• Speed of convergence depends on how close |λ2| is to 1.

• PageRank matrix: |λ2| = α (≈ 0.85)

• Expander: 1-|λ2| = Ω(1)

### The rest of the talk

• Only ideas are given.

• Many details are omitted.

• I may cheat a bit to illustrate the main steps.

## Algorithm 4: Approximately counting

### Algorithm 4: Approximately counting

• Task: estimate the size of an exponentially large set V

• Example: Given G, count # of perfect matchings.

• Approach: find a chain V0⊆V1⊆…⊆Vm=V

• |V0| is easy to compute;

• m = poly(n) layers;

• Each ratio |Vi+1|/|Vi| = poly(n) is easy to estimate.

• Then |V| = |V0| (|V1|/|V0|) (|V2|/|V1|)… (|Vn|/|Vn-1|) can be estimated.

• Question: How to estimate |Vi+1|/|Vi|?

### Estimate the ratio

• Estimate by random sampling:

• Generate a random element uniformly distributed in Vi+1, see how often it hits Vi.

• Question: How to uniformly sample Vi+1?

• By random walk

• We construct a regular graph with vertex set Vi+1

• The algorithm runs efficiently (i.e. in poly(n) time) if the walk converges to uniform rapidly (i.e. in poly(n) time).

• It’s the case in perfect matching counting problem.

## Algorithm 5: Error Reduction with efficient randomness

### Error reduction

• Task: Reduce the error of a randomized algorithm A.

• Error: Prr∈{0,1}^m[r is bad for A] = 1/3→ ε?

• Naïve approach:

• Draw k random strings r1, …, rk,

• Run algorithm A using r1, …, rk and get k answers

• Output the majority of the answers

• [Fact] the error drops to 2-Θ(k) . --- Chernoff’s bound

• Randomness complexity: mk.

• Question: Can we reduce the error prob w/o using too many additional random bits?

B

{0,1}m

### Expander graph

• Expander:Eigenvalue gap is large --- Ω(1)

• Random walk converges very fast --- O(log n).

• Highly connected --- diameter O(log n)

• Large boundary --- any subset has lots of edges going out

• Could be sparse --- ∃ constant degree expanders.

• We know how to construct them explicitly.

### Algorithm for error reduction

• New Algorithm A’:

• Construct an expander with V = {0,1}m

• Start from a random vertex r1

• Perform a random walk (r1, r2,…, rk) of length k

• Random algorithm A using r1, r2,…, rk and get k answers

• Output the majority of the answers

• Thm: the error prob of A’ is also 2-Θ(k)

• How many random bits used? m + O(k)

• The expander is of constant degree.

• Highly dependent! Why it works?

B

{0,1}m

### Expander: large boundary

• For simplicity, consider the one-side error case:

• Algorithm A’ wrong ↔ Whole walk in B.

• Expander: Every set has a large boundary.

• At every step, it walks outside B w.p. Ω(1).

• Pr[k steps never out] = 2-Θ(k)

Non-expander

{0,1}m: all

random strings

B

Boundary

## PART II. Quantum Walk

1

1

|1

|1

0

0

|0

|0

### Quantum mechanics in one slide

Physics

Math

| 

1

α|0+β|1

(|α|2+|β|2=1)

β

Physical System

Unit Vector

α,β: amplitudes

| 

0

α

Evolution

Unitary Matrix

A classical bit

Measurement

Projection

A quantum bit (qubit)

Composition

Tensor Product

Measure by |0and|1:

- get 0 w.p. |α|2; system →|0;

- get 1 w.p. |β|2; system →|1.

State space for 2 bits:

all combinations {00, 01, 10, 11}

Classical:

State space for 2 qubits:

the space span{|00,|01,|10,|11}

Quantum:

### Quantum walk

• Many things become quite tricky.

• Even the definition of quantum walk.

• We’ll ignore the formal definitions here, but only present some results.

Type 1: Discrete Quantum Walk

µ

1

1

1

p

1

1

2

¡

### Coin register

• The most natural quantum counterpart of random walk is the following: each vertex v gives its neighbors (equally) some “mass” so that the l2 norm of the total mass is still 1.

• However, simply accumulating all mass from neighbors does not work

• The operator is not unitary.

• To overcome this, we need a register to remember where the mass is from.

• It amounts to add a coin register.

• Flip the quantum coin: |v|c →|vH|c, where H =

• Walk accordingly: |v|c →|nc(v)|c, where nc(v) is the cth neighbor of v.

V

T

### General graph

• Given any random walk on an undirected graph G, Szegedy proposed a method to transform it into a quantum walk

• Set hitting time:

• A a target set T⊆ V is hidden in the graph

• Starting from the uniform distribution, what’s the expected time to hit some point in T?

• Classical set hitting time = O((ε∆)-1)

• ε: density of the target set, i.e. |T|/|V|

• ∆: eigenvalue gap of matrix P

• [Szegedy] Quantum set hitting time = O(√Classical set hitting time) = O((ε∆)-1/2)

### Quantum algorithm for Element Distinctness

• Element Distinctness: To decide whether all input integers are distinct.

• [Ambainis] Discrete quantum walk on the following graph: (r = n2/3)

V = {S⊆[n]: |S| = r}; E = {(S,T): |S∩T| = r-2}

• Eigenvalue gap: ∆ = 1/r

• Target set density: ε = r2/n2

• Quantum running time = setup time + hitting time = O(r)+ O((ε∆)-1/2) = O(r + n/√r) = O(n2/3).

Type 2: Continuous Quantum Walk

### Continuous quantum walk on a graph

• Given an undirected graph G, and a Hermitian matrix H that respects the graph structure

• Hermitian: (HT)* = H.

• Respect G: Hij = 0 if (i,j)∉E.

• The dynamics ψ(t) = ∑jαi(t)|ifor H is governed by the Schrodinger equation i(d/dt) αi(t) = ∑j:(i,j)∊EHijαj(t),

which has solution |ψ(t) = eiHt |ψ(0).

¬

### Formula Evaluation

• Grover Search: find a marked point in an n-point set in time O(n1/2).

• Equivalently: evaluate OR in time O(n1/2).

• Generalizations:

General AND-OR-NOT Formula

(General Game Tree)

(Balanced) AND-OR Tree

### Formula evaluation

• Motivation:

• a natural generalization of OR --- Grover search

• well studied subject in TCS

• matching lower bound known long ago

• game: min-max tree.

• Same up to log(n)

max

min

min

### Breakthroughs

• [FGG, 07] An O(n1/2) algorithm for Balanced AND­OR Tree.

• Method: scattering theory…

• Not really understandable for computer scientists…

• [CRSZ, A, FOCS’07] An O(n1/2+o(1)) algorithm for General AND­OR­NOT Formula.

• Method: phase estimation + quantum walk.

• Simpler algorithm, simpler proof.

• For special cases like Balanced AND­OR Tree: O(n1/2)

### Classical implications

• [OS, STOC’03] Any formula f of size n has polynomial threshold function thr(f) = O(n1/2)?

• Our result:

Q( f ) ≤n1/2+o(1)

• [KlivansServedio, STOC’01; KOS, FOCS’02]

• Class C of Boolean functions has thr(f) ≤ r for all fC, ⇒C can be learned in time nO(r)

(in both PAC or adversarially generated examples)

• [Implication] Formulas are learnable in time 2

• This is very interesting because studies of quantum algorithms solve a purely classical open problem!

• This is not uncommon in quantum computing!

~

thr( f ) ≤

deg( f ) ≤

Definition of thr

[BBCMdW, JACM’01]

n1/2+o(1)

x1 x2

x3 x4 x5

### Sketch of the algorithm

• {AND,OR,NOT}⇔{NAND}

• For the formula f and an input x, construct a graph Gx

• Let A be the adjacency matrix of Gx; consider A’s spectrum.

:0 :1

:0 :1 :1

### Key observation

• Red: function value at a vertex

• green: amplitude of 0-eigenvector

• Function values and amplitudes of 0-eigenvector propagate in the same way.

0

≠0

1

=0

1

1

0

0

≠0

0

≠0

1

=0

1

=0

If a function evaluates to 0,

then  an 0-eigenvector

If a function evaluates to 1,

then 0-eigenvector has no support on the root.

### Phase Estimation

• Thus qualitatively, it’s enough to know whether there is a 0-eigenvector with support on root.

• Phase estimation: Given an unitary operator U (to use) and an eigenvector |ψ, find the phase θ in the corresponding eigenvalue eiθ.

• Unitary matrix: all eigenvalues have l2-norm 1

• For eiHt, the phase is nothing but the corresponding eigenvalue of H.

• Thus use phase estimation, we can find an eigenvalue of H.

• Depending on whether it’s 0, output the answer to f(x).

The quantum walk is here!

### A highlevel structure of algorithm

• We need to carefully choose weight hij on edges (i,j) s.t. the above qualitative connection also works quantitatively.

• For f(x) = 0,

•  an 0-eigenvector with Ω(1) support on root.

• For f(x) = 1,

• eigenvector with support on root has eigenvalue Ω(n-1/2)

• Algorithm: Starting from the root, use quantum walk to carry out the Phase Estimation, output the answer depending on whether the phase is 0.

### Summary

• Random walk is a simple but powerful tool.

• There will sure be more important algorithmic applications to be found.

• Theories of quantum walk have been rapidly developed in the past couple of years.

• A lot of fundamental issues yet to be better understood.

• Significant breakthrough algorithms using quantum walk are ahead!

### Reference

• For random walk on graphs:

• Lovász, Random Walks on Graphs: A Survey, Combinatorics, 353-398, 1996.

• Chung, Spectral Graph Theory, American Mathematical Society, 1997.

• For quantum walk:

• Quantum computing in general:

• Textbook by Nielsen and Chuang

• Lecture notes by Vazirani, by Ambainis, and by Childs

• Quantum walk:

• Surveys by Kempe and by Santha.

• All above lectures have nice chapters for quantum walk.