Chapter 5: Tree Constructions

• Breadth-First Search (BFS)
  • layer-based, using Dijkstra’s algorithm
  • update-based, using the Bellman-Ford algorithm
• Distributed Depth-First Search (DFS)
• Minimum spanning trees (MST)
• Matroid problems
  • solutions using synchronous upcasting yield faster alternatives to MST

Breadth-First Search tree construction
• recall that the flooding algorithm can be used to construct a BFS tree in the synchronous model
• lower bounds for BFS algorithms
  • message complexity = Ω(|edges|)
  • time complexity = Ω(diameter)
• the flooding algorithm may not construct a BFS tree in the asynchronous model
• large messages are disallowed so that we can focus on time/message tradeoffs

Layer-synchronized BFS construction (Dijkstra’s algorithm)
• build the tree layer by layer: at each stage, add all vertices adjacent to a vertex of a previously constructed layer
• r0 initiates construction by issuing phases, with one new layer constructed in each phase
• at phase p + 1, assume the tree has already been constructed on p layers (denoted Tp); then do the following:
  • r0 generates a pulse message and broadcasts it on Tp
  • each leaf v of Tp (a vertex in the last constructed layer), upon receiving the pulse, sends an exploration message Ex to all of its neighbors (except its parent)
  • a vertex w that receives Ex for the first time picks one of the sending neighbors, v, as its parent (parent(w) = v) and sends Ack to this parent
  • if w has already selected a parent, then upon receipt of an Ex message it replies with an Ack that also carries parent(w)
  • each leaf v of Tp collects the Ack messages for its Ex messages; if an Ack arrives from a vertex w with parent(w) = v, then v adds w to its set child(v)

Dijkstra’s algorithm (cont’d.)
• algorithm description (cont’d.)
  • once a leaf v in Tp has received all of its Ack messages, it upcasts an Ack to its parent in Tp; these Ack messages are then convergecast on Tp back to r0
  • once this convergecast terminates at r0, r0 may begin the next phase
• termination detection
  • each Ack message has a field new (initially set to 0) that indicates whether any new vertices were added to the tree in the current phase
  • a vertex v sets new(v) = 1 if any new vertices have responded to its Ex message by joining the tree as its children
  • the OR of these new bits is convergecast on the tree
  • if a phase ends with r0 receiving new(v) = 0 in every Ack message from its children, then the layer explored by the leaves in that phase is empty and the tree is complete
• inefficiencies
  • certain Ex messages can be avoided: if only the left subtree of a node is unexplored, we still send Ex messages to the right subtree as well
  • some of the Ack messages can be omitted
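
The phase structure described above can be sketched as a centralized simulation. This is only an illustration under assumptions: the function name `layered_bfs` and the adjacency-dictionary input are hypothetical, and the pulse broadcast and Ack convergecast are collapsed into one loop iteration per phase rather than modeled as messages.

```python
from collections import defaultdict

def layered_bfs(adj, root):
    """Build a BFS tree one layer per phase (centralized sketch)."""
    parent = {root: None}
    children = defaultdict(set)
    frontier = [root]            # leaves of the current tree Tp
    phases = 0
    while True:
        phases += 1              # r0 issues a pulse for this phase
        new_layer = []
        for v in frontier:       # each leaf sends Ex to its neighbors
            for w in adj[v]:
                if w == parent[v]:
                    continue     # no Ex is sent to the parent
                if w not in parent:
                    # first Ex received: w picks v as its parent and Acks
                    parent[w] = v
                    children[v].add(w)
                    new_layer.append(w)
        if not new_layer:        # all "new" bits are 0: the tree is complete
            break
        frontier = new_layer
    return parent, dict(children), phases

if __name__ == "__main__":
    adj = {0: [1, 2], 1: [0, 3], 2: [0, 3], 3: [1, 2, 4], 4: [3]}
    parent, children, phases = layered_bfs(adj, 0)
    print(parent, phases)
```

The final phase adds no vertices; it exists only so that r0 can detect termination, which is why the number of phases is one more than the depth of the tree.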

Dijkstra’s algorithm (cont’d.)
• complexities
  • (lemma 5.2.1) after phase p is completed, the variables parent and child correspond to a legal BFS tree spanning Γp(r0), the p-neighborhood of r0
  • time = O(diam²(G))
  • message = O(n * diam(G) + |E|)
• analysis
  • time
    • time(phase p) = 2p + 2: broadcast and convergecast take p time units each; exploration takes two time units
    • summing over 1 ≤ p ≤ diam(G):
    • Σp time(phase p) = Σp (2p + 2) = O(diam²(G))
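
As a check on the time bound, the sum can be evaluated in closed form. The shorthand D = diam(G) is introduced here only for readability.

```latex
\sum_{p=1}^{D} \mathrm{time}(\text{phase } p)
  = \sum_{p=1}^{D} (2p + 2)
  = D(D+1) + 2D
  = D^2 + 3D
  = O(D^2)
```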

Dijkstra’s algorithm (cont’d.)
• analysis (cont’d.)
  • message
    • for p ≥ 0, let Vp be the set of vertices of T at layer p, let Ep be the internal edges of Vp, and let Ep,p+1 be the edges connecting Vp to Vp+1
    • at phase p, exploration messages are sent only over Ep and Ep,p+1, and the edges of Tp are traversed twice, giving message(phase p) = O(n) + O(|Ep|) + O(|Ep,p+1|)
    • summing over 1 ≤ p ≤ diam(G):
    • Σp message(phase p) = Σp (O(n) + O(|Ep|) + O(|Ep,p+1|)) = O(n * diam(G) + |E|)
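
The last step of the message bound splits the sum into two parts, written here with the shorthand D = diam(G) (introduced only for this derivation). The edge sets E_p and E_{p,p+1} for different values of p are pairwise disjoint, so their sizes add up to at most |E|.

```latex
\sum_{p=1}^{D} \mathrm{message}(\text{phase } p)
  = \sum_{p=1}^{D} O(n) \;+\; \sum_{p=1}^{D} O\bigl(|E_p| + |E_{p,p+1}|\bigr)
  = O(nD) + O(|E|)
```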

Update-based BFS construction (distributed Bellman-Ford algorithm)
• a modified flooding algorithm that ensures a BFS tree is constructed in the asynchronous model
• algorithm:
  • each vertex v keeps a variable L(v) (initially set to ∞), its current estimate of its distance to the root
  • as flooding progresses, each vertex v sends L(v) to each neighbor w along with the flooded message
  • if a vertex w receives L(v) from its neighbor v and L(v) + 1 < L(w), then w chooses v as its parent and sets L(w) = L(v) + 1
  • whenever this change occurs, w also informs all of its other neighbors of its new (shorter) path to the root
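
The update rule can be tried out with a small simulation. The sketch below rests on assumptions that are not in the slides: the name `async_bellman_ford_bfs` is hypothetical, the root is given L = 0 at the start, and asynchrony is modeled by delivering a randomly chosen pending message at each step.

```python
import math
import random

def async_bellman_ford_bfs(adj, root, seed=0):
    """Simulate update-based BFS with arbitrary message delivery order."""
    rng = random.Random(seed)
    L = {v: math.inf for v in adj}       # distance estimates, initially infinity
    parent = {v: None for v in adj}
    L[root] = 0                          # assumption: the root starts at 0
    # pending messages: (sender, receiver, value of L(sender) when sent)
    pending = [(root, w, 0) for w in adj[root]]
    messages = len(pending)
    while pending:
        # deliver any pending message, in no particular order
        v, w, lv = pending.pop(rng.randrange(len(pending)))
        if lv + 1 < L[w]:
            L[w] = lv + 1                # shorter path found through v
            parent[w] = v
            for u in adj[w]:
                if u != v:               # inform all other neighbors
                    pending.append((w, u, L[w]))
                    messages += 1
    return L, parent, messages

if __name__ == "__main__":
    adj = {0: [1, 2], 1: [0, 3], 2: [0, 3], 3: [1, 2, 4], 4: [3]}
    print(async_bellman_ford_bfs(adj, 0, seed=1))
```

Whatever the delivery order, the final values of L are the BFS distances; only the number of messages depends on the schedule.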

Bellman-Ford distributed algorithm (cont’d.)
• complexities
  • time = O(diam(G))
  • message = O(n * |edges|)
• analysis
  • synchronous: complexities are the same as in the flooding algorithm; once a vertex changes L(v) from ∞, it won’t change it again
  • asynchronous
    • time:
      • assume d ≥ 1
      • by d time units into the execution, each vertex v at distance d from the root has already received a message carrying the value d - 1 from some neighbor
      • v will then set L(v) = d and choose a parent w such that L(w) = d - 1
      • induction on d gives O(diam(G))

Bellman-Ford distributed algorithm (cont’d.)
• asynchronous model analysis (cont’d.)
  • message:
    • for a vertex v, the first value it assigns to L(v) is at most n - 1 (the longest possible path in the network)
    • L(v) then changes at most n - 2 times
    • each change to L(v) results in v sending messages on each of its outgoing edges
    • thus each v sends at most n * degree(v) messages
    • total messages = Σv n * degree(v) = O(n * |edges|)
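
The final sum follows from the fact that the vertex degrees add up to twice the number of edges:

```latex
\sum_{v \in V} n \cdot \deg(v)
  = n \sum_{v \in V} \deg(v)
  = n \cdot 2|E|
  = O(n\,|E|)
```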

Distributed Depth-First Search
• general overview
  • algorithm
    • begin at some source vertex, r0
    • when reaching any vertex v
      • if v has unvisited neighbors, then visit one of them
      • otherwise, return to parent(v)
    • when the search must return from a vertex v with parent(v) = NULL, we terminate, since v = r0
  • DFS defines a tree, rooted at r0, which reaches all vertices of the graph
  • “back edges” = graph edges not in the tree
  • sequential time complexity = O(|edges|)

Distributed DFS (cont’d.)
• distributed version = token-based
  • the token traverses the graph in a depth-first manner using the algorithm described above
• complexities
  • message = time = Θ(|edges|)
  • note that edges are not examined from both endpoints; when edge (v,w) is examined by v, w then knows that v has been visited
• analysis
  • message:
    • lower bound of Ω(|edges|) to explore every edge
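
A small simulation shows where the Θ(|edges|) message count comes from. This is a sketch under assumptions: the name `token_dfs` and the `known` sets are hypothetical, and a token sent to an already visited vertex is assumed to be returned immediately, costing two messages for that edge.

```python
def token_dfs(adj, root):
    """Simulate token-based DFS and count token transfers."""
    parent = {root: None}
    known = {v: set() for v in adj}    # neighbors that v knows are visited
    next_idx = {v: 0 for v in adj}     # next neighbor each vertex will try
    messages = 0
    v = root
    while True:
        advanced = False
        while next_idx[v] < len(adj[v]):
            w = adj[v][next_idx[v]]
            next_idx[v] += 1
            if w in known[v]:
                continue               # edge already examined from w's side
            messages += 1              # token travels v -> w
            known[w].add(v)            # w learns that v has been visited
            if w in parent:            # w was already visited: back edge
                messages += 1          # token is returned to v
                continue
            parent[w] = v              # first visit: w joins the tree
            v = w
            advanced = True
            break
        if advanced:
            continue
        if parent[v] is None:          # back at r0 with nothing left to try
            return parent, messages
        messages += 1                  # token returns to parent(v)
        v = parent[v]

if __name__ == "__main__":
    adj = {0: [1, 2], 1: [0, 3], 2: [0, 3], 3: [1, 2, 4], 4: [3]}
    print(token_dfs(adj, 0))
```

In this model every edge carries the token exactly twice, whether it ends up as a tree edge or a back edge, so the total is 2 * |edges| transfers.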

Distributed DFS (cont’d.)
• analysis (cont’d.)
  • time:
    • ensure that vertices visited for the first time know which of their neighbors have/have not been visited; thus we make no unnecessary vertex explorations
    • algorithm: when v is first visited, freeze the DFS process; inform all neighbors of v that v has been visited; collect Ack messages from those neighbors; restart the DFS process
    • additional time cost each time a vertex is first visited = O(1)
    • only edges of the DFS tree are traversed
    • therefore, time complexity = O(n)

Minimum spanning trees (MST)
• evaluate a spanning tree by its total weight
• subgraph:
  • let G’ be a subgraph of the graph G with edge set E’, and let w( ) be the weight function
  • then w(G’) = Σe∈E’ w(e)
  • define an MST of G as a spanning tree TM that minimizes w(TM)
• MST problem
  • given a weighted graph G = (V,E,w), compute an MST for G
  • edge weights are assumed to be distinct, thus yielding a unique MST for G
  • if the weights are not distinct, distinct weights can be created using vertex identifiers
  • however, in anonymous networks without distinct edge weights or distinct vertex identifiers, no distributed algorithm exists for computing an MST with a bounded number of messages
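
One common way to create distinct weights from vertex identifiers is to compare edges lexicographically by weight and then by endpoint identifiers. The helper below is a hypothetical illustration of that idea; it assumes comparable, distinct vertex identifiers and no parallel edges.

```python
def distinct_weight(w, u, v):
    """Extend the weight w of edge (u, v) to a key that is unique per edge.

    Ties in w are broken by the smaller endpoint identifier, then the larger,
    so both endpoints compute the same key for the same edge.
    """
    return (w, min(u, v), max(u, v))

if __name__ == "__main__":
    # three edges of equal weight still receive a strict order
    edges = [(1, 2), (2, 3), (1, 3)]
    print(sorted(distinct_weight(5, u, v) for u, v in edges))
```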

MST (cont’d.)
• in the worst case, distributed MST construction requires
  • Ω(|E|) messages for weighted n-vertex graphs
  • Ω(n log n) messages for arbitrary n-vertex graphs
• definitions
  • an MST fragment is a tree T in G for which there exists an MST TM of G such that T is a subtree of TM
  • edge e = (v,w) is an outgoing edge of fragment T if exactly one of v and w belongs to T
  • MWOE(T) = minimum-weight outgoing edge of fragment T
• blue rule:
  • given fragment T and e = MWOE(T), create T’ = T ∪ {e}
  • lemma 5.5.6: T’ is a fragment as well

MST (cont’d.)
• Prim’s algorithm (distributed version)
  • works by repeatedly applying the blue rule to each resulting T’ and its e’ = MWOE(T’), as above, to yield the MST for G
  • works with both asynchronous and synchronous models
• algorithm
  • let vertex r0 be the source as well as the first fragment T
  • use pulse messages broadcast on the current fragment T to synchronously add MWOE(T): each vertex in T determines its own minimum-weight outgoing edge
  • convergecast these candidates toward r0 (each vertex forwards the minimum it has seen)
  • MWOE(T) is then selected by r0 and broadcast on the tree
• complexities
  • time = message = O(n²)
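
Repeated application of the blue rule can be sketched sequentially as follows. The names `mwoe` and `prim_by_blue_rule` and the `(weight, u, v)` edge tuples are assumptions made for this illustration; the broadcast and convergecast that a distributed version would use to find the MWOE are replaced by a direct scan over the edge list.

```python
def mwoe(fragment, edges):
    """Return the minimum-weight edge with exactly one endpoint in fragment."""
    outgoing = [(w, u, v) for (w, u, v) in edges
                if (u in fragment) != (v in fragment)]
    return min(outgoing) if outgoing else None

def prim_by_blue_rule(vertices, edges, root):
    """Grow a single fragment from root by adding its MWOE until it spans G."""
    fragment = {root}
    tree = []
    while len(fragment) < len(vertices):
        e = mwoe(fragment, edges)
        if e is None:              # no outgoing edge: graph is disconnected
            break
        _, u, v = e
        tree.append(e)             # blue rule: T' = T plus MWOE(T)
        fragment |= {u, v}
    return tree

if __name__ == "__main__":
    edges = [(1, 0, 1), (4, 0, 2), (2, 1, 2), (5, 1, 3), (3, 2, 3)]
    print(prim_by_blue_rule({0, 1, 2, 3}, edges, 0))
```

The loop runs n - 1 times for a connected graph, which reflects the sequential nature of the approach: only one edge is added per round.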

MST (cont’d.)
• synchronous GHS algorithm
  • Prim’s algorithm is still fairly sequential
  • GHS (distributed version of Kruskal’s algorithm) is less sequential and thus more efficient
• Kruskal’s algorithm
  • each vertex v is initially a fragment
  • at each step, the minimum-weight outgoing edge over all fragments is selected and added to the tree, thus merging the two fragments it touches
  • when a single fragment remains, it is the MST for G
  • sequential: n - 1 steps are still needed

MST (cont’d.)
• GHS algorithm overview
  • works with the synchronous model
  • vertices are partitioned into fragments, with each fragment Fi being a rooted tree
  • each fragment has an identifier (possibly the identifier of its root)
  • each vertex in a fragment knows its parent, its children and the identifier of the fragment
  • works in phases; each phase takes the fragment structure from the previous phase as input and outputs larger fragments
• description of each phase:
  • all vertices of a fragment F cooperate to find MWOE(F), carried out as in Prim; it is assumed that each vertex knows which of its edges are outgoing
  • a Request_to_merge message is sent over e = MWOE(F) to fragment F’, carrying the identifier of F
  • the two fragments then combine (possibly with several other fragments if MWOE(F’) ≠ MWOE(F)) into a larger fragment

MST (cont’d.)
• description of each phase (cont’d.)
  • once connected, the two fragments (now one) proceed as follows
    • assume fragments F1 and F2, where e = MWOE(F1) = MWOE(F2)
    • assume e = (v1,v2), where v1 ∈ F1 and v2 ∈ F2
    • the root of the new fragment is whichever of v1 and v2 has the higher identifier, say v1
    • the new root, v1, broadcasts a New_fragment message throughout the combined fragment F’, informing all vertices of its identifier (the new identifier of F’)
    • each vertex updates its identifier and root entries and redirects its fragment edges to point to its new parent (the vertex which sent the message), thus now “pointing” toward the new root of F’
    • each vertex then informs its neighbors of its new fragment identifier
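
The phase structure can be sketched as a centralized simulation. Everything named here is an assumption made for illustration: `ghs_phases`, the `(weight, u, v)` edge tuples (assumed distinct), and the union-find bookkeeping, which stands in for the Request_to_merge and New_fragment messages. The new fragment identifier is taken from the edge that is the MWOE of both fragments it joins, as in the description above.

```python
def ghs_phases(vertices, edges):
    """Merge fragments along their MWOEs, one synchronous phase at a time."""
    frag = {v: v for v in vertices}          # fragment identifier per vertex
    tree, phases = set(), 0
    while len(set(frag.values())) > 1:
        best = {}                            # fragment id -> its MWOE
        for e in edges:
            _, u, v = e
            if frag[u] != frag[v]:           # e is outgoing for both fragments
                for f in (frag[u], frag[v]):
                    if f not in best or e < best[f]:
                        best[f] = e
        if not best:                         # no outgoing edges: disconnected
            break
        phases += 1
        group = {f: f for f in set(frag.values())}

        def find(f):                         # representative of a merged group
            while group[f] != f:
                f = group[f]
            return f

        for e in best.values():              # combine fragments along MWOEs
            _, u, v = e
            tree.add(e)
            group[find(frag[u])] = find(frag[v])
        new_id = {}
        for f, e in best.items():
            _, u, v = e
            other = frag[v] if frag[u] == f else frag[u]
            if best[other] == e:             # e is the MWOE of both fragments
                new_id[find(f)] = max(u, v)  # higher identifier becomes root
        frag = {x: new_id.get(find(f), f) for x, f in frag.items()}
    return tree, phases

if __name__ == "__main__":
    path = [(1, 0, 1), (3, 1, 2), (2, 2, 3)]
    print(ghs_phases([0, 1, 2, 3], path))
```

Every fragment merges with at least one other fragment in each phase, so the number of fragments at least halves per phase and at most log2(n) phases are needed on a connected graph.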

MST (cont’d.)
• complexities
  • message = O(|E| log n)
  • time = O(n log n)
