F96943167 施信瑋 F97943070 方劭云 R98943086 莊舜翔 R98943090 曹蕙芳

Special Topics on Graph Algorithms
Finding the Diameter in Real-World Graphs: Experimentally Turning a Lower Bound into an Upper Bound
F96943167 施信瑋 F97943070 方劭云 R98943086 莊舜翔 R98943090 曹蕙芳 R98943088 周邦彥 R98921072 金蘊 R99921040 林國偉 R99942061 葉書豪

Outline Introduction R98943086 莊舜翔 R98943090 曹蕙芳 R98943088 周邦彥 Previous Work R99921040 林國偉 R99942061 葉書豪 R98943086 莊舜翔 Finding the Diameter in Real-World Graphs F96943167 施信瑋 F97943070 方劭云 Other Related Topics Conclusion and Future Work R98921072 金蘊

Diameter
• The length of the "longest shortest path" between any two vertices in a graph or a tree
• Given a connected graph G = (V, E) with n = |V| vertices and m = |E| edges, the diameter D is max d(u, v) over all u, v in V, where d(u, v) denotes the distance between vertices u and v
[Figure: a weighted tree with D = 13 and a weighted graph with D = 9]

Diameter of a Tree
• The diameter of a tree can be computed by applying the double-sweep algorithm:
• 1. Choose a random vertex r, run a BFS from r, and find a vertex a farthest from r
• 2. Run a BFS from a and find a vertex b farthest from a
• 3. Return D = d(a, b)
[Figure: double sweep on a weighted tree, with vertices r, a, and b marked and distance labels from r and from a]
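The three steps above can be sketched as follows for an unweighted tree. This is a minimal illustration, not the presenters' code: the names `bfs_distances` and `double_sweep` and the example tree are assumptions made here.

```python
from collections import deque

def bfs_distances(adj, source):
    """Return a dict of hop distances from source in an unweighted graph."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        v = queue.popleft()
        for w in adj[v]:
            if w not in dist:
                dist[w] = dist[v] + 1
                queue.append(w)
    return dist

def double_sweep(adj, r):
    """BFS from r to find a farthest vertex a, then BFS from a to find b."""
    dist_r = bfs_distances(adj, r)
    a = max(dist_r, key=dist_r.get)   # a vertex farthest from r
    dist_a = bfs_distances(adj, a)
    b = max(dist_a, key=dist_a.get)   # a vertex farthest from a
    return a, b, dist_a[b]

# Hypothetical example tree: the path 0-1-2-3 with an extra leaf 4 attached to 1.
tree = {0: [1], 1: [0, 2, 4], 2: [1, 3], 3: [2], 4: [1]}
a, b, d = double_sweep(tree, r=1)
print(a, b, d)
```

On this tree the second sweep starts at vertex 3 and reaches a vertex at distance 3, which is the tree's diameter.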

Diameter of a Graph
• The double-sweep algorithm might not correctly compute the diameter of a graph
• It provides a lower bound instead
[Figure: double sweep on a weighted graph with D = 9, with vertices r, a, and b marked]

Naïve Algorithm
• Perform a breadth-first search (BFS) from each of the n vertices to obtain the distance matrix of the graph
• Θ(n(n+m)) time and Θ(m) space
• By using matrix multiplication, the distance matrix can be computed in O(M(n) log n) time and Θ(n^2) space [Seidel, ACM STOC'92]
• M(n): the complexity of matrix multiplication involving small integers only (O(n^2.376))
=> Too slow for massive graphs, with a prohibitive space cost
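A minimal sketch of the BFS-per-vertex approach, assuming an unweighted graph stored as an adjacency dictionary. The function names and the 6-cycle example are illustrative assumptions; only one distance map is alive at a time, so the full distance matrix is never stored.

```python
from collections import deque

def bfs_eccentricity(adj, source):
    """Run one BFS and return the largest hop distance reached from source."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        v = queue.popleft()
        for w in adj[v]:
            if w not in dist:
                dist[w] = dist[v] + 1
                queue.append(w)
    return max(dist.values())

def naive_diameter(adj):
    """n BFS runs in total; the diameter is the largest eccentricity."""
    return max(bfs_eccentricity(adj, v) for v in adj)

# Hypothetical example: a cycle on 6 vertices.
cycle6 = {i: [(i - 1) % 6, (i + 1) % 6] for i in range(6)}
print(naive_diameter(cycle6))
```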

All Pairs Shortest Path
• Compute the distances between all pairs of vertices without resorting to matrix products
• [Feder, ACM STOC'91]: Θ(n^3 / log n) time and O(n^2) space
• [Chan, ACM-SIAM'06]: O(n^2 (log log n)^2 / log n) time and O(n^2) space
=> Still too slow and too space-consuming for massive graphs

All Pairs Almost Shortest Path (1/2)
• Compute almost shortest paths between all pairs of vertices [Dor, ECCC'97]
• Additive error 2
=> Each reported distance exceeds the true distance by at most 2
• Treat high-degree vertices and low-degree vertices separately

All Pairs Almost Shortest Path (2/2)
• Additive error 2: apasp2
• O(min(n^(3/2) m^(1/2), n^(7/3)) log n) time and Θ(n^2) space
=> Still too expensive
[Figure: two diagrams on vertices u, v, w, and w']

Self-checking Heuristics
• Obtaining the exact value, or an accurate estimate, of the diameter is too expensive for massive graphs
=> Empirically establish lower and upper bounds by executing a suitably small number of BFSes
• L ≤ D ≤ U
• When L = U, the actual value of D for G is obtained
=> Self-checking heuristics

Self-checking Heuristics
• No guarantee of success for every feasible input, but:
• 1) It requires few BFSes in practice, and thus its complexity is linear [Magnien, JEA'09]
• 2) An empirical upper bound is possible
• 3) Large graphs can be analyzed, since BFS has a good external-memory implementation [Mayer, AESA'02] and works on graphs stored in compressed format [Vigna, IWWWC'04]

A Work for Comparison
• "Fast Computation of Empirically Tight Bounds for the Diameter of Massive Graphs" [Magnien, JEA'09]
• Various bounds to confine the solution range:
  • Trivial bounds
  • Double sweep lower bound
  • Tree upper bound
• Iterative algorithm to obtain the actual diameter

Trivial Bounds
• The eccentricity of any vertex v gives trivial bounds on the diameter: ecc(v) ≤ D ≤ 2·ecc(v)
• Trivial bounds can be computed in Θ(m) space and time, where m is the number of edges in the graph
• Proof of D ≤ 2·ecc(v):
  • Suppose D > 2·ecc(v), and let a and b be the endpoints of a diameter path, so D = d(a, b)
  • Then d(a, v) + d(v, b) ≤ 2·ecc(v) < d(a, b), which contradicts the triangle inequality
  • Therefore, D ≤ 2·ecc(v)
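The bounds ecc(v) ≤ D ≤ 2·ecc(v) cost a single BFS. A minimal sketch, assuming an unweighted graph; `eccentricity`, `trivial_bounds`, and the 5-vertex path are names and data chosen here for illustration.

```python
from collections import deque

def eccentricity(adj, v):
    """Largest hop distance from v, computed with one BFS."""
    dist = {v: 0}
    queue = deque([v])
    while queue:
        x = queue.popleft()
        for y in adj[x]:
            if y not in dist:
                dist[y] = dist[x] + 1
                queue.append(y)
    return max(dist.values())

def trivial_bounds(adj, v):
    """Return (lower, upper) = (ecc(v), 2 * ecc(v))."""
    e = eccentricity(adj, v)
    return e, 2 * e

# Hypothetical example: the path 0-1-2-3-4, whose diameter is 4.
path5 = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
for v in (0, 2):
    print(v, trivial_bounds(path5, v))
```

From the endpoint 0 the lower bound is already tight, while from the middle vertex 2 the upper bound is tight, which shows how strongly the bounds depend on the starting vertex.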

Double Sweep Lower Bound
• On chordal graphs, AT-free graphs, and trees, if a vertex v is chosen such that d(u, v) = ecc(u) for a vertex u (i.e. v is among the vertices at maximal distance from u), then D = ecc(v) [Corneil'01, Handler'73]
• The diameter may therefore be computed by a BFS from any vertex u followed by a BFS from a vertex at maximal distance from u, thus in Θ(m) space and time, where m is the number of edges
• In general, the value obtained in this way may differ from the diameter, but it is still at least as good as the trivial lower bound

Double Sweep Lower Bound: An Example
[Figure: example graph with BFS distance labels; the double sweep yields D = 2, while the actual diameter is D = 4]
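A separate, smaller illustration of a strict lower bound (not the graph from the slide): a 4-cycle r-s-a-y with a pendant vertex t attached to y. The graph, the vertex names, and the helper functions are assumptions made for this sketch, and the outcome depends on which of the tied farthest vertices the first sweep picks; here `max` picks the first one in BFS order.

```python
from collections import deque

def bfs_distances(adj, source):
    """Hop distances from source in an unweighted graph."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        v = queue.popleft()
        for w in adj[v]:
            if w not in dist:
                dist[w] = dist[v] + 1
                queue.append(w)
    return dist

def double_sweep_lower_bound(adj, r):
    """ecc(a), where a is the first farthest vertex found from r."""
    dist_r = bfs_distances(adj, r)
    a = max(dist_r, key=dist_r.get)
    return max(bfs_distances(adj, a).values())

def exact_diameter(adj):
    """Exhaustive check: one BFS per vertex."""
    return max(max(bfs_distances(adj, v).values()) for v in adj)

# Hypothetical graph: 4-cycle r-s-a-y plus pendant vertex t on y.
graph = {
    "r": ["s", "y"],
    "s": ["r", "a"],
    "a": ["s", "y"],
    "y": ["r", "a", "t"],
    "t": ["y"],
}
lower = double_sweep_lower_bound(graph, "r")
exact = exact_diameter(graph)
print(lower, exact)
```

Starting from r, both a and t are at distance 2. The sweep continues from a, whose eccentricity is 2, so it misses the pair (s, t) at distance 3.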

Tree Upper Bound
• The diameter of any connected spanning subgraph of G is greater than or equal to the diameter of G
• The diameter of a tree can be obtained in Θ(m) time and space [Handler'73], where m is the number of edges in G
• Spanning trees of G are therefore good candidates for obtaining an upper bound
• A tree upper bound is the diameter of a BFS tree rooted at a vertex v
• It is never worse than the corresponding trivial upper bound 2·ecc(v)

Tree Upper Bound: An Example
[Figure: a BFS tree with diameter D' = 5 in a graph whose actual diameter is D = 4]
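A minimal sketch of the tree upper bound, assuming an unweighted graph: build the BFS tree from a vertex, then measure the tree's diameter with a double sweep (exact on trees). The 2x3 grid graph and the function names are assumptions for illustration, not the graph in the figure.

```python
from collections import deque

def bfs(adj, source):
    """BFS from source; returns (dist, parent) dictionaries."""
    dist, parent = {source: 0}, {source: None}
    queue = deque([source])
    while queue:
        v = queue.popleft()
        for w in adj[v]:
            if w not in dist:
                dist[w], parent[w] = dist[v] + 1, v
                queue.append(w)
    return dist, parent

def tree_upper_bound(adj, v):
    """Diameter of the BFS tree rooted at v, which is at least diam(G)."""
    _, parent = bfs(adj, v)
    tree = {x: [] for x in parent}
    for child, p in parent.items():
        if p is not None:
            tree[child].append(p)
            tree[p].append(child)
    dist_v, _ = bfs(tree, v)
    a = max(dist_v, key=dist_v.get)      # first sweep inside the tree
    dist_a, _ = bfs(tree, a)             # second sweep inside the tree
    return max(dist_a.values())

# Hypothetical 2x3 grid: rows 0-1-2 and 3-4-5, with vertical edges 0-3, 1-4, 2-5.
grid = {0: [1, 3], 1: [0, 2, 4], 2: [1, 5], 3: [0, 4], 4: [1, 3, 5], 5: [2, 4]}
print(tree_upper_bound(grid, 1), tree_upper_bound(grid, 4))
```

The grid has diameter 3. With this adjacency order the BFS tree rooted at 1 has diameter 4, while the one rooted at 4 has diameter 3, so the choice of root matters.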

Tighten the Bounds
• Iteratively choose different initial vertices to obtain tighter tree upper bounds
• Random tree upper bound (rtub)
  • Iterate the tree upper bound from random vertices
• Highest degree tree upper bound (hdtub)
  • Consider vertices in decreasing order of degree when iterating the algorithm

The Iterative Algorithm
• Iterate the double sweep lower bound and the highest degree tree upper bound until the difference between the best bounds obtained is lower than or equal to a given threshold value
• There are multiple choices for this threshold value
  • It may depend on the graph considered or on the desired quality of the bounds, or it may be set to a given relative precision (e.g. (D' - D)/D < p)
• All heuristics have Θ(m) time complexity and Θ(m+n) space complexity
• Does the tree upper bound eventually converge to the exact diameter?
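The loop can be sketched as below. This is an assumption-laden simplification: it runs the double sweep and the tree upper bound from the same vertex in each round, visiting vertices in decreasing order of degree, and it uses an absolute threshold. The function names and the 2x3 grid example are illustrative.

```python
from collections import deque

def bfs(adj, source):
    """BFS from source; returns (dist, parent) dictionaries."""
    dist, parent = {source: 0}, {source: None}
    queue = deque([source])
    while queue:
        v = queue.popleft()
        for w in adj[v]:
            if w not in dist:
                dist[w], parent[w] = dist[v] + 1, v
                queue.append(w)
    return dist, parent

def double_sweep_lower_bound(adj, r):
    dist_r, _ = bfs(adj, r)
    a = max(dist_r, key=dist_r.get)
    return max(bfs(adj, a)[0].values())

def tree_upper_bound(adj, v):
    _, parent = bfs(adj, v)
    tree = {x: [] for x in parent}
    for child, p in parent.items():
        if p is not None:
            tree[child].append(p)
            tree[p].append(child)
    dist_v, _ = bfs(tree, v)
    a = max(dist_v, key=dist_v.get)
    return max(bfs(tree, a)[0].values())

def iterate_bounds(adj, threshold=0):
    """Tighten (lower, upper) until upper - lower <= threshold or vertices run out."""
    lower, upper = 0, float("inf")
    # Highest degree first; ties keep the dictionary order (sorted is stable).
    for v in sorted(adj, key=lambda x: len(adj[x]), reverse=True):
        lower = max(lower, double_sweep_lower_bound(adj, v))
        upper = min(upper, tree_upper_bound(adj, v))
        if upper - lower <= threshold:
            break
    return lower, upper

# Hypothetical 2x3 grid graph with diameter 3.
grid = {0: [1, 3], 1: [0, 2, 4], 2: [1, 5], 3: [0, 4], 4: [1, 3, 5], 5: [2, 4]}
print(iterate_bounds(grid), iterate_bounds(grid, threshold=1))
```

With threshold 0 the loop needs two rounds on this graph before the bounds meet; with threshold 1 it stops after the first round with a gap of one.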

Possibly Unmatching Upper Bound
• There is no guarantee of obtaining the exact diameter, as all the tree upper bounds may be strictly larger than D
• E.g. if G is a cycle of n vertices, its diameter is ⌊n/2⌋, while the tree upper bound is n-1 whichever vertex one starts from
• Is there an algorithm that provides upper bounds that match more often?
[Figure: a cycle with tree upper bound D' = 5 and diameter D = 3]
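The cycle claim can be checked numerically. The sketch below is illustrative only (helper names and the chosen cycle sizes are assumptions): it computes the BFS-tree diameter from every vertex of a cycle and compares it with the exact diameter.

```python
from collections import deque

def bfs(adj, source):
    """BFS from source; returns (dist, parent) dictionaries."""
    dist, parent = {source: 0}, {source: None}
    queue = deque([source])
    while queue:
        v = queue.popleft()
        for w in adj[v]:
            if w not in dist:
                dist[w], parent[w] = dist[v] + 1, v
                queue.append(w)
    return dist, parent

def tree_upper_bound(adj, v):
    """Diameter of the BFS tree rooted at v."""
    _, parent = bfs(adj, v)
    tree = {x: [] for x in parent}
    for child, p in parent.items():
        if p is not None:
            tree[child].append(p)
            tree[p].append(child)
    dist_v, _ = bfs(tree, v)
    a = max(dist_v, key=dist_v.get)
    return max(bfs(tree, a)[0].values())

def cycle(n):
    """Adjacency dictionary of a cycle on n vertices."""
    return {i: [(i - 1) % n, (i + 1) % n] for i in range(n)}

for n in (5, 6, 9):
    g = cycle(n)
    exact = max(max(bfs(g, v)[0].values()) for v in g)
    bounds = {tree_upper_bound(g, v) for v in g}
    print(n, exact, bounds)
```

Every BFS tree of a cycle is a path through all n vertices, so each root yields the same bound n-1, far from the diameter ⌊n/2⌋.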

The Fringe Algorithm
• The fringe method is used to improve the upper bound U so that it possibly matches the lower bound L obtained by the double sweep method

The Fringe Algorithm
• Input: an unweighted, undirected, and connected graph G = (V, E)
• For any vertex u, Tu denotes an unordered BFS tree rooted at u
• The eccentricity ecc(u) is the height of Tu
=> 2·ecc(u) ≥ diam(G)

The Fringe Algorithm
• Proof of 2·ecc(u) ≥ diam(G), i.e. ecc(u) ≥ diam(G)/2:
  • Suppose ecc(u) < diam(G)/2, and let diam(G) = d(a, b)
  • Then d(u, v) < diam(G)/2 for all v in V, so d(u, a) < diam(G)/2 and d(u, b) < diam(G)/2
  => d(u, a) + d(u, b) < d(a, b), a contradiction with the triangle inequality
  • ∴ 2·ecc(u) ≥ diam(G)
[Figure: vertex u and a diameter path between a and b]

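The Fringe Algorithm
• Tu denotes an unordered BFS tree rooted at u; Tu is a subgraph of G
• Distances in Tu are therefore at least as large as the corresponding distances in G
=> diam(Tu) ≥ diam(G)
[Figure: BFS tree rooted at u]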

The Fringe Algorithm
• The fringe of u, denoted F(u), is the set of vertices v such that d(u, v) = ecc(u)
[Figure: BFS tree rooted at u with |F(u)| = 3]

The Fringe Algorithm
• Run a BFS from each fringe vertex: BFS(A) => ecc(A), BFS(B) => ecc(B), BFS(C) => ecc(C)
• B(u) = max {ecc(A), ecc(B), ecc(C)}
[Figure: BFS tree rooted at u with fringe vertices A, B, and C]

The Fringe Algorithm
• Lemma: U(u) ≥ D, where D is the diameter of G

The Fringe Algorithm
• Case 1: |F(u)| = 1 => U(u) = diam(Tu)
• Case 2: |F(u)| > 1, B(u) = 2ecc(u) => U(u) = diam(Tu)
• Case 3: |F(u)| > 1, B(u) = 2ecc(u) - 1 => U(u) = 2ecc(u) - 1
• Case 4: |F(u)| > 1, B(u) < 2ecc(u) - 1 => U(u) = 2ecc(u) - 2

The Fringe Algorithm
• Case 1: |F(u)| = 1
[Figure: BFS tree rooted at u]

The Fringe Algorithm
• Case 2: |F(u)| > 1, B(u) = 2ecc(u)
• Example: ecc(u) = 3, diam(Tu) = 6, so the diameter upper bound is 6
• B(u) provides a lower bound
=> If B(u) = 2ecc(u), then diameter = diam(Tu)
[Figure: BFS tree rooted at u]

The Fringe Algorithm
• Case 3: |F(u)| > 1, B(u) = 2ecc(u) - 1
• Non-leaf nodes: upper bound = 2ecc(u) - 2, since d(a, u) ≤ ecc(u) - 1 and d(b, u) ≤ ecc(u) - 1
• Leaf (fringe) nodes: upper bound = 2ecc(u)
• If B(u) = 2ecc(u) - 1 => diameter = 2ecc(u) - 1
[Figure: BFS tree rooted at u with non-fringe vertices a and b]

The Fringe Algorithm
• Case 4: |F(u)| > 1, B(u) < 2ecc(u) - 1
• Non-leaf nodes: upper bound = 2ecc(u) - 2, since d(a, u) ≤ ecc(u) - 1 and d(b, u) ≤ ecc(u) - 1
• Leaf (fringe) nodes: upper bound = 2ecc(u)
• If B(u) < 2ecc(u) - 1 => diameter ≤ 2ecc(u) - 2
[Figure: BFS tree rooted at u with non-fringe vertices a and b]

The Fringe Algorithm
• The fringe algorithm correctly computes an upper bound for the diameter of the input graph G, using at most |F(u)| + 3 BFSes

The Fringe Algorithm
• Let r, a, and b be the vertices identified by the double sweep (using two BFSes)
• Find the vertex u that is halfway along the path connecting a and b inside the BFS tree Ta
• Compute the BFS tree Tu and its eccentricity ecc(u)
• If |F(u)| > 1, find the BFS tree Tz for each z in F(u) and compute B(u)
  • If B(u) = 2ecc(u) - 1, return 2ecc(u) - 1
  • If B(u) < 2ecc(u) - 1, return 2ecc(u) - 2
• Otherwise, return diam(Tu)
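The steps above can be sketched as follows for an unweighted, connected graph. This is a hedged illustration rather than the authors' implementation: the function names, the choice to walk ⌊d(a, b)/2⌋ steps from b toward a to find u, and the 2x3 grid example are all assumptions.

```python
from collections import deque

def bfs(adj, source):
    """BFS from source; returns (dist, parent) dictionaries."""
    dist, parent = {source: 0}, {source: None}
    queue = deque([source])
    while queue:
        v = queue.popleft()
        for w in adj[v]:
            if w not in dist:
                dist[w], parent[w] = dist[v] + 1, v
                queue.append(w)
    return dist, parent

def tree_diameter(parent, root):
    """Diameter of the BFS tree given by a parent map (double sweep is exact on trees)."""
    tree = {x: [] for x in parent}
    for child, p in parent.items():
        if p is not None:
            tree[child].append(p)
            tree[p].append(child)
    dist_root, _ = bfs(tree, root)
    a = max(dist_root, key=dist_root.get)
    return max(bfs(tree, a)[0].values())

def fringe(adj, r):
    """Return (lower, upper) bounds on the diameter, starting the double sweep at r."""
    # Double sweep: two BFSes give a, b and the lower bound d(a, b).
    dist_r, _ = bfs(adj, r)
    a = max(dist_r, key=dist_r.get)
    dist_a, parent_a = bfs(adj, a)
    b = max(dist_a, key=dist_a.get)
    lower = dist_a[b]
    # u: halfway along the a-b path inside T_a (walking up from b is an assumption).
    u = b
    for _ in range(lower // 2):
        u = parent_a[u]
    # BFS tree T_u, ecc(u), and the fringe F(u).
    dist_u, parent_u = bfs(adj, u)
    ecc_u = max(dist_u.values())
    fringe_set = [v for v, d in dist_u.items() if d == ecc_u]
    if len(fringe_set) > 1:
        # B(u): largest eccentricity among fringe vertices, one BFS each.
        b_u = max(max(bfs(adj, z)[0].values()) for z in fringe_set)
        if b_u == 2 * ecc_u - 1:
            return lower, 2 * ecc_u - 1
        if b_u < 2 * ecc_u - 1:
            return lower, 2 * ecc_u - 2
    return lower, tree_diameter(parent_u, u)

# Hypothetical 2x3 grid graph with diameter 3.
grid = {0: [1, 3], 1: [0, 2, 4], 2: [1, 5], 3: [0, 4], 4: [1, 3, 5], 5: [2, 4]}
print(fringe(grid, r=1))
```

On this grid the midpoint is u = 1 with ecc(u) = 2 and two fringe vertices, B(u) = 3 = 2ecc(u) - 1, so the upper bound 3 matches the double sweep lower bound.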

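Example (1/2)
• Double sweep (DS) on a graph with vertices x1, …, xp (when p is large), y1, A, and B
• We choose x1 as r: d(x1, A) = 3, d(x1, B) = 3, d(x1, y1) = 4 => choose y1 as a
• From y1: d(y1, A) = 4, d(y1, B) = 4, d(y1, x1) = 4 => choose A, B, or x1 as b, so DS reports diameter = 4
• The actual diameter is 6 => DS is wrong!
[Figure: example graph with row = 3 and column = 6, with vertices x1 … xp, y1, A, and B marked]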

Example (2/2)
• Fringe
• I. Use DS to find a and b: x1 as a, y1 as b
• II. Find a vertex u that is halfway along the path connecting a and b
• III. Case 1: ecc(u) = 4, |F(u)| > 1, B(u) = 6; Case 2: ecc(u) = 3, |F(u)| > 1, B(u) = 6
• IV. Case 1: B(u) < 2ecc(u) - 1, since 6 < (2·4) - 1 => return 2ecc(u) - 2, diameter = 6
• IV. Case 2: B(u) = 2ecc(u), since 6 = (2·3) => return 2ecc(u), diameter = 6
[Figure: example graph with row = 3 and column = 6, with vertices x1 … xp, y1, and u marked]

A Bad Case for Fringe
[Figure: example graph shown in three steps, with vertices r, a, b, u, and the fringe F(u) marked]
• ecc(u) = 3
• B(u) = 3
• B(u) < 2ecc(u) - 1 = 5
• Return 2ecc(u) - 2 = 4
• Real diameter = 3
• ∴ Fringe fails: the upper bound 4 does not match the diameter 3

Experimental Results (1/2)
• Implemented in C on a 2.93 GHz Linux workstation with 24 GB of memory
• 44 real-world graphs are tested
  • Each has between 4,000 and 50 million nodes and between 20,000 and 3,000 million edges
• The real diameter is found by exhaustive search to check the obtained upper bounds

Experimental Results (2/2)
• For the 7 mismatches, the proposed method generates the tightest upper bound compared with the approaches in previous work

Finding the Diameter on Weighted Graphs
• Consider a large complete graph in which every edge has weight 1 except for a single edge
• The eccentricities of most vertices are 1
• However, the diameter of the graph is larger than 1
• The fringe algorithm may not efficiently find tight diameter bounds for weighted graphs
[Figure: complete graph with edge weights 1 and a single edge of weight 1.5]
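A small numeric check of this example, using the weight 1.5 from the figure on a complete graph with 5 vertices. Floyd-Warshall is used here only because the graph is tiny; the function name, the vertex numbering, and the choice of edge (0, 1) as the heavier edge are assumptions for illustration.

```python
from itertools import combinations

def all_pairs_shortest_paths(n, weight):
    """Floyd-Warshall on an undirected graph given as {(u, v): w} with u < v."""
    inf = float("inf")
    dist = [[0.0 if i == j else inf for j in range(n)] for i in range(n)]
    for (u, v), w in weight.items():
        dist[u][v] = dist[v][u] = w
    for k in range(n):
        for i in range(n):
            for j in range(n):
                if dist[i][k] + dist[k][j] < dist[i][j]:
                    dist[i][j] = dist[i][k] + dist[k][j]
    return dist

n = 5
weight = {edge: 1.0 for edge in combinations(range(n), 2)}
weight[(0, 1)] = 1.5                      # the single heavier edge
dist = all_pairs_shortest_paths(n, weight)
ecc = [max(row) for row in dist]          # weighted eccentricities
diameter = max(ecc)
print(ecc, diameter)
```

Only the two endpoints of the heavier edge see the distance 1.5 (the direct edge is still shorter than any two-edge detour of length 2), so a BFS-style probe from almost any other vertex reports eccentricity 1.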

Minimum Diameter Spanning Trees
• Minimum diameter spanning tree (MDST) problem
  • Given a graph G = (V, E) with edge weights
  • Find a spanning tree T for G such that the diameter of T is minimized
[Figure: weighted spanning trees labeled Diameter = 3 and Diameter = 5; MDST]

Outline Introduction R98943086 莊舜翔 R98943090 曹蕙芳 R98943088 周邦彥 Previous Work R99921040 林國偉 R99942061 葉書豪 R98943086 莊舜翔 Finding the Diameter in Real-World Graphs Other Related Topics F97943070 方劭云 Geometric MDST MDST F96943167 施信瑋 Conclusion and Future Work R98921072 金蘊

Geometric MDST
• Geometric MDST (GMDST): given a set of n points in the Euclidean space, find a spanning tree connecting these points so that the length of its diameter is minimum
• GMDST corresponds to finding an MDST on a complete graph in which each edge weight is the Euclidean distance between its two endpoints

Monopolar and Dipolar
• A spanning tree is said to be monopolar if there exists a point (called the monopole) s.t. all remaining points are connected to it
• A spanning tree is said to be dipolar if there exist two points (called the dipole) s.t. all remaining points are connected to one of the two points in the dipole
[Figure: a dipolar spanning tree with its dipole and a monopolar spanning tree with its monopole]
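To make the monopolar definition concrete, the sketch below evaluates every candidate monopole for a small planar point set. In a monopolar tree every path between two other points passes through the monopole, so its diameter is the sum of the two largest distances from the monopole. The function names and the point set are assumptions, and this only finds the best monopolar tree, which is not claimed to be the GMDST in general.

```python
import math

def monopolar_diameter(points, center):
    """Diameter of the star joining every other point directly to center."""
    lengths = sorted(math.dist(center, p) for p in points if p != center)
    return sum(lengths[-2:])              # the two longest spokes

def best_monopole(points):
    """Point whose star has the smallest diameter (brute force over all points)."""
    return min(points, key=lambda c: monopolar_diameter(points, c))

# Hypothetical point set in the plane.
points = [(0, 0), (4, 0), (2, 0), (2, 1)]
center = best_monopole(points)
print(center, monopolar_diameter(points, center))
```

Here the point (2, 0) is the best monopole: its two longest spokes have length 2 each, giving a diameter of 4.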
