Graph Algorithms

Graph Algorithms

圖形G是由兩個集合V和E所構成的，可以寫成G=(V,E)。其中V是非空的由有限個頂點(Vertex)所構成的集合，E則是由頂點對所構成的集合，這些頂點對叫做邊(edge)。V(G)和E(G)各表示組成G的頂點集合和邊集合。依據E中邊之型態，所組成之圖形又可分成下列兩種：圖形G是由兩個集合V和E所構成的，可以寫成G=(V,E)。其中V是非空的由有限個頂點(Vertex)所構成的集合，E則是由頂點對所構成的集合，這些頂點對叫做邊(edge)。V(G)和E(G)各表示組成G的頂點集合和邊集合。依據E中邊之型態，所組成之圖形又可分成下列兩種： • E中之邊沒有方向性，亦即（Ｖ1,Ｖ2）和（Ｖ2,Ｖ1）表示相同的邊，如此構成之圖形稱作無向圖形（undirected graph）。 • E之邊沒有方向性，頂點對（邊）以＜Ｖ1,Ｖ2＞表示，其中Ｖ２表頭（head），Ｖ1表尾（tail）；很自然地，＜Ｖ2,Ｖ1＞和，＜Ｖ1,Ｖ2＞是完全不同的兩個邊。以此種邊集合構成之圖形稱作有向圖形（directed graph，又稱digraph）。

complete graph：各種頂點的組合均存在之圖形稱作complete graph。無向圖形若有n個頂點，則最大可能之邊組合有（）＝　　　　種，而有向圖形將有兩倍於此為n(n-1)種組合；含有n個頂點之complete graph將有如上數目之邊。 • subgraph：若Ĝ為Ｇ之subgraph，則Ĝ為一graph，且Ｖ(Ĝ)＜V(V)，E(G)＜E(G)。 • path：由圖形中頂點所構成的序列Ｖp,Ｖ1,Ｖ2,…ＶN,Ｖq，若其中(Ｖp,Ｖ1), (Ｖ1,Ｖ2),…，(ＶN,Ｖq)均為圖形中之邊，則此序列稱做一path；一path中edge之數目稱做該path之長度；在有向圖形之情況下，則要求＜Ｖp,Ｖ1＞, ＜Ｖ1,Ｖ2＞,…，＜ＶN,Ｖq＞均為有向edge，而此path又稱做directed path。 • Simple path：在上面之定義中，其除了Ｖp和Ｖq之外，其他vertex均不重複出現的path稱之。在simple path中，若Ｖp=Ｖq，則稱之為cycle或circuit。一graph若有cycle則稱cyclic，反之則可稱做acyclic。 • 一path若不為simple，則其必有相交之情況。

Ｅ1 Ｅ2 Ｖ1 Ｖ2 Ｖ1 Ｖ2 (有向) (無向) • 頂點和邊之關係,以“adjacent”和“incident”描述之：Ｖ1和Ｖ2為adjacent Ｖ1 adjacent fromＶ2 Ｅ1 incident onＶ1(Ｖ2) Ｖ2 adjacent toＶ1 Ｅ2 incident toＶ1(Ｖ2)

相連（Connected） • connected of two vertices：若一圖形中之兩頂點間存在任何path，則稱他們為connected的。在有向圖形中，若兩點頂間存在自其中某頂點到另一頂點和回來之有向path，則稱此兩頂點為strongly connected。 • connected of a graph：若一圖形其所有頂點對均為6 connected，則此graph為connected：相似地，若一有向圖形所有頂點對間均為strongly connected，該圖形亦為strongly connected。 • degree (of a vertex)：表示一個vertex所adjacent之其他vertices之個數。在有向圖形中，degree又分成in-degree和out-degree，其中前者表示該vertex所adjacent from之vertex個數，而後者表示所adjacent to之vertex個數。

1 1 ▼ ▲ 2 3 2 4 5 6 7 3 G2 G3 G1 H2 5 H1 1 6 7 3 2 8 4 G4 1 2 3 4 • connected component：對無向圖形而言，其connected component（或component）表示其孤立的connected的子圖，這可能有好幾個。在有向圖形中，strongly connected component表示一盡量伸展而仍然保持strongly connected 的子圖（不一定要孤立）。以下四個圖形G1到G4前三個均只有一個component（G3沒有strongly connected component），而G4有兩個component。

1 4 2 3 路徑長度為n之鄰接矩陣

Warshall’s algorithm 在找出路徑矩陣p時，首先要找到a2，然後a3，一直到an，最後再將a1，a2，…，an加起來得到Sn，再由Sn得到p。 Warshall’s algorithm: 1. p←a（把鄰接矩陣a拷貝至p） 2. for（k=1；k<=n；k++） for（i=1；i<=n；i++） for（j=1；j<=n；j++） p[i][j]=p[i][j] | p[i][k]&p[k][j]；

1 2 3 5 4 edge vertex • Representations of graphs：undirected graph • An undirected graph G have five vertices and seven edges • An adjacency-list representation of G • The adjacency-matrix representation of G 1 2 5 / 2 1 5 3 4 / 3 2 4 / 4 2 5 3 / 5 4 1 1 / 1 2 3 4 5 1 0 1 0 0 1 2 1 0 1 1 1 3 0 1 0 1 0 4 0 1 1 0 1 5 1 1 0 0 0

1 2 4 / 2 5 / 3 6 5 / 4 2 / 5 4 / 6 6 / 1 2 3 • Representations of graphs：directed graph • An directed graph G have six vertices and eight edges • An adjacency-list representation of G • The adjacency-matrix representation of G 4 5 6 1 2 3 4 5 6 1 0 1 0 1 0 0 2 0 0 0 0 1 0 3 0 0 0 0 1 1 4 0 1 0 0 0 0 5 0 0 0 1 0 0 6 0 0 0 0 0 1

r s t u r s t u (b) (a) 0 1 0 • The operation of BFS on an undirected graph 1 x v w y x v w y Q w r s Q 0 1 1 r s t u r s t u (c) (d) 0 1 2 0 1 2 1 1 2 2 2 x v w y x v w y Q Q x r v t t x 2 1 2 2 1 2 r s t u r s t u (e) (f) 0 0 1 1 2 2 3 3 1 1 3 2 2 2 2 x x v w y v w y Q Q u y x v v u 3 3 2 2 2 3

r s t u r s t u (g) (h) 0 0 1 1 2 2 3 3 1 1 3 2 2 2 3 2 x x v w y v w y Q u y y 3 3 3 r s t u (i) 0 1 2 3 1 3 2 2 x v w y Q

Breadth-first search： • Initially, vertices are colored white. • Discovered vertices are colored green. • When done, discovered vertices are colored black. • d[u] stores the distance from s to u. • is a predecessor to u on its shortest path. • Q is a first-in first-out queue.

BFS(G,s) for each vertex u in V[G] – {s} 1. do color[u] <- white 2. • Algorithm： • Complexity： d[u] <- infinity 3. pi[u] <- NIL 4. 5. color[s] <- gray 6. d[s] <- 0 7. pi[s] <- NIL 8. Q <- {s} 9. While Q .ne. {} 10. do u <- head[Q] 11. for each v in Adj[u] 12. do if color[v] = white 13. then color[v] <- gray 14. d[v] <- d[u]+1 15. Pi[v] <- u 16. Enqueue(Q,v) 17. Dequeue(Q) 18. color[u] <- black

Properties of Breadth-first search： • After execution of BFS, all nodes reachable from the source s are colored black. • After execution of BFS, d[v] is the distance of a shortest path from the source s to v for vertices v reachable from s. • After execution of BFS, if v is reachable from s, then one of the shortest paths to v passes through the edge ( ) at the end. • After execution of BFS, the edges( ) for v reachable from s from a breadth-first tree.

u 1 s v • Lemma1： • G=(V,E)：a directed or undirected graph. • s：an arbitrary vertex • ： the shortest-path distance from s to v. Then for any Proof：

Lemma2： • G=(V,E)：a directed or undirected graph. • s：an arbitrary vertex • ： the distance from s to u computed by the algorithm Suppose that BFS is run on G from s. Then on termination, for each vertex , the value d[v] computed by BFS satisfies Proof： By induction on the number of times a vertex is placed in the queue Q. Basis： when s is placed in Q. for all Induction Step： Consider a white vertex v discovered during the search from a vertex u. Inductive hypothesis implies By lemma1, From then on, d[v] wont be changed again.

head tail Then and for i=1,2,…,r-1. • Lemma3： • Q：<v1,v2,…,vr>~ the queue when BFS executes on a graph G=(V,E). Proof： By induction on the number of queue operations. Basis： when Q has only s. Induction Step： 1>： after dequeue： v2 becomes the new head in Q. by inductive hypothesis. 2>：after enqueue a new vertex v into the Q. Let vr+1 be v. neighbors Note that vi’s adjacency list is being scanned. Thus, And The rest , for I=1,…,r-1, remain unaffected.

Thm： • 1. During the execution of BFS on G=(V,E), BFS discovers every vertex that is reachable from s, and on termination • 2. For any vertex reachable form s, one of the shortest paths from s to v is the shortest path form s to followed by the edge

Proof： If v unreachable from s, By BFS, v is never discovered. Let By induction on k, we want to prove for each there is exactly on point during The execution of BFS at which 1. V is grayed 2. D[v]=k 3. if then 4. v is inserted into Q. Basis： k=0, Vk={s} ~ clear ! Induction Step： until BFS terminates. Once u is inserted into Q, d[u] and never change. By lemma 3, grayed Let , then by lemma 2.

The monotonicity property, with and the inductive hypothesis implies that v must be discovered after all vertices in Vk-1 are enqueued, if v is discovered at all. • Thm： Since a path of k edges from s to v, such that At some point u must be the head in Q. Then u’s neighbors are scanned and v is discovered. Line 13 grays v, line 14 sets d[v]=d[u]+1=k. Line 15 sets Line 15 enqueues v. Thus, the inductive hypothesis holds. If then Thus, we can obtain a shortest path from s to v by taking a shortest path from s to then traversing the edge

Depth-first search： • Nodes are initially white • Nodes become green when first discovered • Nodes become black when they are done • d[v] records when v is first discovered • F[v] records when v is done • d[u] < f[u]

Discovery time (a) (b) v w v w u u 1/ 1/ 2/ y z y z x x (c) (d) v w v w u u 1/ 2/ 1/ 2/ 3/ 4/ 3/ y z y z x x

(e) (f) v w v w u u 1/ 2/ 1/ 2/ B B 4/ 3/ 4/5 3/ y z y z x x (g) v w u (h) v w u 1/ 2/ 1/ 2/7 B B 4/5 3/6 4/5 3/6 y z x y z x

(i) v w u (j) v w u 1/ 2/7 1/8 2/7 B B F F 4/5 3/6 4/5 3/6 y z x y z x (k) v w u (l) v w u 1/8 2/7 9/ 1/8 2/7 9/ B F B C F 4/5 3/6 4/5 3/6 y z x y z x

(m) v w u (n) v w u 1/8 2/7 9/ 1/8 2/7 9/ B C B F C F B 4/5 3/6 10/ 4/5 3/6 10/ y z x y z x (o) (o) v w v w u u 1/8 2/7 9/ 1/8 2/7 9/12 B B C C F F B B 4/5 3/6 10/11 4/5 3/6 10/11 y z y z x x

(u,v) • Black edges: if u is connected to an ancestor v in a depth- first tree. (eg self-loop) • Forward edges: if u is connected to an descendant v in a depth-first tree. • Cross edges: if u is not connected to an ancestor v in the same depth-first tree. OR if v is not connected to an ancestor u in the same depth-first tree. OR if u and v belong to different depth-first trees.

DFS(G) for each vertex u in V[G] do color[u]  white pi[u]  NIL time  0 for each vertex u in V[G] do if color[u] = white then DFS-Visit(u) O(V) O(E)

DFS-Visit(u) color[u]  gray d[u]  time  time + 1 for each v in Adj[u] do if color[v] = white then pi[v]  u DFS-Visit(v) color[u]  black f[u]  time  time + 1 Discovery time Finishing time

The running time of DFS is O(V+E) • After execution of DFS, all nodes are colored black • After execution of DFS, the edges( ) form a collection of depth-first tree, called a depth-first forest.

Edge Classification • 1. Tree edges( u, v ): u was discovered first using ( u,v ) • 2. Back edges( u, v ): v is an ancestor of u in the DFS tree • 3. Forward edges( u, v ): v is a descendent of u, not a tree edge • 4. Cross edges( u, v ): Other edges • Example • In a depth-first search of an undirected graph, every edge is either a tree edge or a back edge

(a) t z s y 11/16 3/6 2/9 1/10 B F C B 14/15 4/5 7/8 12/13 C C C u w v x (b) s t z v u w y x 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 ( s (z (y (x x) y) (w w) z) s) (t (v v) (u u) t)

(c) s t C B F z C u v B C y w C x

Thm6: In any DFS of a graph G=( V,E ), for any two vertices u,v, exactly one of the following 3 conditions holds: (1). The intervals [d[u], f[u]] and [d[u], f[u]] are disjoint, (2). The interval [d[u], f[u]] is contained entirely within the interval [d[v], f[v]], and u is a descendant of v in the depth-first tree, (3). or as above with v as a descendant of u

Pf: 1. If d[u] < d[v] case 1: d[v] < f[u] : So v is finished before finishing u. Thus (3) holds case 2: f[u] < d[v] d[u] < f[u] < d[v] < f[v]  (1) holds 2. If d[v] < d[u]: Similarly, by switching the roles of u and v v was discovered while u was still gray. • v is a descendant of u, and all v’s outgoing edges are explored

Cor7: v is a proper descendant of vertex u in the depth-first forest for a graph G iff d[u] < d[v] < f[v] < f[u] • Thm8: In a DF forest of G=( V,E ), vertex v is a descendant of u iff at the time d[u] that the search discovers u, v can be reached from u along a path consisting entirely of white vertices

u v w w v u Descendant of u • Pf: “” v: descendant of u “” Assume at time d[u], v is reachable from u along a path of white vertices, but v does not become a descendant of u in DF tree Thus, d[u] < d[v] < f[w] <= f[u] Thm6 implies [d[v], f[v]] is contained entirely in [d[u], f[u]] By Cor7, v is a descendant of u. d[u] < d[w], by the above cor. Thus w is white at d[u] f[w] <= f[u] v must be discovered after u is discovered, but before w is finished.

Thm 9: In a DFS of an undirected graph G, every edge of G is either a tree edge or a back edge Pf: Let and suppose d[u] < d[v]. Then v must be discovered and finished before finishing u • If (u, v) is explored first in the direction from u to v, then (u, v) becomes a tree edge • If (u, v) is explored first in the direction from v to u, then (u, v) is a back edge, since u is still gray at the time (u, v) is first explored

undershorts socks 17/18 11/16 watch 9/10 12/15 pants shoes 13/14 1/8 shirt belt tie 2/5 6/7 jacket 3/4 socks undershorts pants shoes watch shirt belt tie jacket • Topological sort • 定義：A topological sort of a dag G=(V, E) is a linear ordering of all its vertices. (dag: Directed acyclic graph)如 edge(u, v), u appears before v in the ordering

undershorts socks 17/18 11/16 watch 9/10 12/15 pants shoes 13/14 1/8 shirt belt tie 2/5 6/7 jacket 3/4 socks undershorts pants shoes watch shirt belt tie jacket 17/18 11/16 12/15 13/14 9/10 1/8 6/7 2/5 3/4 • TOPOLOGICAL-SORT(G): (V+E) 1. Call DFS(G) to compute finishing times f[v] for each vertex v (V+E) 2. As each vertex is finished, insert it onto the front of a linked list O(1) 3. Return the linked list of vertices

Lemma 23.10 A directed graph G is acyclic iff DFS(G) yields no back edges. pf:  Suppose there is a back edge (u,v), v is an ancestor of u. Thus there is a path from v to u and a cycle exists.  Suppose G has a cycle c. We show DFS(G) yields a back edge. Let v be the first vertex to be discovered in c, and (u,v) be the preceding edge in c. At time d[v], there is a path of white vertices from v to u. By the white-path thm., u becomes a descendant of v in the DF forest. (u,v)is a back edge.

Thm 23.11 TOPOLOGICAL-SORT(G) produces a topological sort of G pf: Suppose DFS is run to determinate finishing times for vertices. It suffices to show that for any pair of u,v , if there is an edge from u to v, then f[v]<f[u]. When (u,v) is explored by DFS(G), v cannot be gray. Therefore v must be either white or black. 1. If v is white, it becomes a descendant of u, so f[v]<f[u] 2. If v is black, then f[v]<f[u]

Strongly connected components: • A strongly connected component of a directed graph G(V,E) is a maximal set of vertices U  V s.t. for every pair u, vU, u and v are reachable from each other. • Given G=(V,E), define GT=(V,ET), where ET={(u,v): (v,u)E}Given a G with adjacency-list representation, it takes O(V+E) to create GT. • G and GT have the same strongly connected components.

a b c d 13/14 11/16 1/10 8/9 G: 12/15 3/4 2/7 5/6 cd e f g h abe h a b c d 13/14 11/16 1/10 8/9 fg GT: 12/15 3/4 2/7 5/6 e f g h

StronglyConnectedComponents(G) 1. Call DFS(G) to compute finishing times f[u] for each vertex u 2. Compute GT 3. Call DFS(GT), but in the main loop of DFS, consider the vertices in order of decreasing f[u] 4. Output the vertices of each tree in the depth-first forest of step 3 as a separate strongly connected component • Time: (V+E)

w u v • Lemma 12: If 2 vertices are in the same strongly connected component, then no path between them ever leaves the strongly connecter component. pf: Assume that: u, v in the same component w is a vertex on the path from u to v uw wvu  u, w are in the same component

Thm 13 In any depth-first search, all vertices in the same strongly connected component are placed in the same depth-first tree pf: By lemma 12 and thm 8, every vertex in the strongly connected component becomes a descendant of r in the depth-first tree r: the first discovered vertex in the component

(u): forefather of u  the vertex w such that u w and f[w] is maximized • f[u]  f[(u)] (*) • ((u) )= (u), for any u,vV, u  v implies f[(v)]  f[(u)] {w: v w}{w: u w} (1) u  v  w (2) u  w’ Since u (u)  f[((u))]  f[(u)] By (*), we have f[(u)]  f[((u))] Thus, f[(u)]= f[((u))] and (u)= ((u)), since only one vertex can be finished at a time. w’ w

(u) all white u • Thm 14 In a directed graph G=(V,E), the forefather (u) of any vertex uV in any depth-first search of G is an ancestor of u. pf: • (u)=u : trivial, u is reachable from itself. • (u)  u : at time d[u], (u) can be (i) black (ii) gray (iii) white (i) If (u) is black, then f[(u)]<f[u]. impossible!! (ii) If (u) is gray, then (u) is an ancestor of u. Done!! (iii) Claim (u) can’t be white: 1. If every intermediate vertex is white, then (u) becomes a descendant of u. But then f[(u)] < f[u]. 2. If some intermediate vertex is nonwhite, t must be gray, since there is no blackwhite edge. Then there is a white path from t to (u). So (u) is a descendant of t. Thus, f[t] > f[(u)]  Since (u) should have the maximum finishing time (u) t u Last nonwhite vertex on this path

Corollary 15 In any DFS of a directed graph G, vertices u and (u), for all uV, lie in the same strongly connected component. pf: u (u), by definition. (u) u, some (u) is an ancestor of u, by Thm 14

Graph Algorithms