
Lecture 11 - PowerPoint PPT Presentation


PowerPoint Slideshow about 'Lecture 11' - roza


Lecture 11

Graph Algorithms

A graph is a set of vertices V together with a set of edges E connecting some of the vertices.

An edge connects two vertices.

An edge between vertices u and v is denoted (u, v).

• In a directed graph (digraph), every edge has a direction. In an undirected graph, edges have no direction.

• Unless otherwise mentioned, a graph is undirected.

[Figure: example graph on vertices A, B, C, D, E, F]

Real life examples where graphs are useful

• Networks:

• Routers are vertices

• Shortest path between vertices.

• Constraint representation:

• If A is present, B cannot be present, etc.


A vertex v is adjacent to a vertex u if there is an edge (u, v), e.g., B is adjacent to A.

In an undirected graph, the existence of edge (u, v) means that u and v are adjacent to each other.

In a digraph, the existence of edge (u, v) means v is adjacent to u, but it does not mean u is adjacent to v.

For example, with the directed edge (B, E), E is adjacent to B, but B is not adjacent to E.


An edge may have a weight, e.g., the length of a highway.

[Figure: example graph with edge weights 0.4, 0.5, 0.1, 0.1, 0.3, 0.2]

A path in a graph is a sequence of vertices w_1, w_2, …, w_P such that consecutive vertices w_i, w_{i+1} have an edge between them, i.e., w_{i+1} is adjacent to w_i.

ABE is a path.

A path is simple if all its vertices are distinct, except possibly the first and the last one. For example, ABE is a simple path, but ABEB is not. Unless otherwise stated, paths are simple.

The length of a path is the number of edges in the path.

What is the length of the path ABE? 2

A cycle is a path of length at least 1 whose first and last vertices are equal, e.g., BCDEB.

A cycle is simple if the path is simple, e.g., BCDEB.

For an undirected graph, we require a cycle to have distinct edges.

The length of a cycle is the number of edges in the cycle.

What is the length of a simple cycle of P vertices? P
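The definitions of path, simple path, and cycle can be checked mechanically. The following Python sketch is only an illustration: the edge set is an assumption (the undirected example graph taken to have edges AB, BC, BE, CD, DE, EF), and all function names are hypothetical.

```python
# Assumed edge set of the undirected example graph (an illustration only).
EDGES = {("A", "B"), ("B", "C"), ("B", "E"), ("C", "D"), ("D", "E"), ("E", "F")}


def has_edge(u, v):
    """Undirected edge test: (u, v) and (v, u) are the same edge."""
    return (u, v) in EDGES or (v, u) in EDGES


def is_path(seq):
    """Every pair of consecutive vertices must be joined by an edge."""
    return len(seq) >= 1 and all(has_edge(a, b) for a, b in zip(seq, seq[1:]))


def is_simple_path(seq):
    """All vertices distinct, except possibly the first and the last."""
    if not is_path(seq):
        return False
    closed = len(seq) > 1 and seq[0] == seq[-1]
    inner = seq[:-1] if closed else seq
    return len(set(inner)) == len(inner)


def path_length(seq):
    """Length is the number of edges, one less than the number of vertices listed."""
    return len(seq) - 1


def is_cycle(seq):
    """Closed path of length >= 1; in an undirected graph the edges must be distinct."""
    if len(seq) < 2 or seq[0] != seq[-1] or not is_path(seq):
        return False
    used = [frozenset(pair) for pair in zip(seq, seq[1:])]
    return len(set(used)) == len(used)
```

Under these assumptions, `is_simple_path("ABEB")` is false because B repeats, and `is_cycle("ABA")` is false because it reuses the edge AB.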


Connected Graphs

A graph is connected if there is a path from every vertex to every other vertex.

A directed graph with this property is strongly connected.

If a directed graph is not strongly connected, but underlying undirected graph is connected then the directed graph is weakly connected.

Undirected example: connected.

Directed example: weakly connected, since we cannot go from E to A.

Subgraphs and Components

A subgraph of a graph is a graph whose vertices and edges are subsets of the vertices and edges of the original graph.

• A component is a subgraph that satisfies two properties:

• it is connected, and

• it is maximal with respect to this property.

For digraphs, "connected" is replaced by "strongly connected".


Example graph on vertices A to F: the entire graph is one component.

Example graph with additional vertices G and H: two components, ABCDEF and GH.
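As a rough illustration of "connected and maximal", the components of an undirected graph can be collected by repeatedly growing a set of reachable vertices from an unvisited start vertex. The sketch below is one possible way to do this in Python; the adjacency dictionary (including the edge GH) is an assumption made for the example, and `components` is a hypothetical name.

```python
from collections import deque

# Assumed undirected example graph with the extra vertices G and H.
ADJ = {
    "A": ["B"], "B": ["A", "C", "E"], "C": ["B", "D"],
    "D": ["C", "E"], "E": ["B", "D", "F"], "F": ["E"],
    "G": ["H"], "H": ["G"],
}


def components(adj):
    """Return the connected components as sorted lists of vertices."""
    seen = set()
    result = []
    for start in adj:
        if start in seen:
            continue
        # Grow the component of `start` until no new vertex can be reached,
        # which is exactly the maximality requirement.
        component = []
        queue = deque([start])
        seen.add(start)
        while queue:
            u = queue.popleft()
            component.append(u)
            for v in adj[u]:
                if v not in seen:
                    seen.add(v)
                    queue.append(v)
        result.append(sorted(component))
    return result
```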

Digraph example: 3 components, namely BCDE, A, and F.


Complete Graph

A graph that has an edge between every pair of vertices is complete, e.g., the graph on vertices B, C, D, E.

A complete digraph has directed edges between every two vertices.

How many edges does a complete graph of N vertices have? N(N-1)/2

Representation of Graphs

Let there be N vertices, numbered 0, 1, …, N-1.

Declare an N x N array A (the adjacency matrix):

A[j][k] = 1 if there is an edge (j, k)

A[j][k] = 0 otherwise

Storage? O(N^2)

Adjacency matrix of the undirected example graph:

    A B C D E F
A   0 1 0 0 0 0
B   1 0 1 0 1 0
C   0 1 0 1 0 0
D   0 0 1 0 1 0
E   0 1 0 1 0 1
F   0 0 0 0 1 0

Adjacency matrix of the directed example graph:

    A B C D E F
A   0 1 0 0 0 0
B   0 0 1 0 1 0
C   0 0 0 1 0 0
D   0 0 0 0 1 0
E   0 0 0 0 0 1
F   0 0 0 0 0 0

What is the structure of A for an undirected graph?

A[j][k] = A[k][j], i.e., A is symmetric.

Weighted graph

A[j][k] = weight of edge (j, k) if the edge exists

A[j][k] = a sentinel value (very large or very small, one that cannot be a valid weight) if edge (j, k) does not exist
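A minimal Python sketch of the adjacency matrix follows. It assumes the example edges AB, BC, BE, CD, DE, EF; the helper names are hypothetical, and the use of `math.inf` as the "no edge" sentinel in the weighted case is one possible choice, not something the lecture prescribes.

```python
import math

VERTICES = ["A", "B", "C", "D", "E", "F"]
# Assumed edge list of the example graph (read as directed pairs when directed=True).
EDGES = [("A", "B"), ("B", "C"), ("B", "E"), ("C", "D"), ("D", "E"), ("E", "F")]


def adjacency_matrix(vertices, edges, directed=False):
    """Build an N x N 0/1 matrix; for undirected graphs set both A[j][k] and A[k][j]."""
    index = {v: i for i, v in enumerate(vertices)}
    n = len(vertices)
    a = [[0] * n for _ in range(n)]
    for u, v in edges:
        a[index[u]][index[v]] = 1
        if not directed:
            a[index[v]][index[u]] = 1
    return a


def weighted_matrix(vertices, weighted_edges):
    """Undirected weighted matrix; math.inf marks a missing edge (an assumed sentinel)."""
    index = {v: i for i, v in enumerate(vertices)}
    n = len(vertices)
    a = [[math.inf] * n for _ in range(n)]
    for u, v, w in weighted_edges:
        a[index[u]][index[v]] = w
        a[index[v]][index[u]] = w
    return a


def is_symmetric(a):
    """True when A[j][k] == A[k][j] for all j, k."""
    n = len(a)
    return all(a[j][k] == a[k][j] for j in range(n) for k in range(n))
```

Looking up whether an edge exists is a single array access, at the cost of O(N^2) storage regardless of how many edges there are.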

Adjacency List

If the graph is complete or almost complete, the adjacency matrix representation is fine. Otherwise it uses O(N^2) storage even though there are far fewer edges.

An adjacency list is a more space-efficient representation.

Every vertex has a linked list of the vertices that are adjacent to it.

Adjacency lists of the undirected example graph:

A: B

B: A, C, E

C: B, D

D: C, E

E: B, D, F

F: E

Adjacency lists of the directed example graph:

A: B

B: C, E

C: D

D: E

E: F

F: (empty)

Storage: O(V+E)

However, a drawback of the adjacency list is that one may have to traverse the entire linked list of a vertex in order to locate an edge.
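The adjacency list and its edge-lookup cost can be sketched in Python as follows. This is an illustration under assumptions: Python lists stand in for the linked lists, the edge list is the assumed example graph, and the function names are hypothetical.

```python
VERTICES = ["A", "B", "C", "D", "E", "F"]
# Assumed edge list of the example graph.
EDGES = [("A", "B"), ("B", "C"), ("B", "E"), ("C", "D"), ("D", "E"), ("E", "F")]


def adjacency_list(vertices, edges, directed=False):
    """One list of neighbors per vertex: O(V + E) storage in total."""
    adj = {v: [] for v in vertices}
    for u, v in edges:
        adj[u].append(v)
        if not directed:
            adj[v].append(u)
    return adj


def has_edge(adj, u, v):
    """Linear scan of u's list: may touch every neighbor of u before answering."""
    for w in adj[u]:
        if w == v:
            return True
    return False
```

In the undirected case every edge is stored twice (once per endpoint), so the lists hold 2E entries in total.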

Sparse and Dense Graphs

Sparse graphs have Θ(V) edges.

Dense graphs have Θ(V^2) edges.

What is the storage in an adjacency list for sparse graphs? Θ(V)

For dense graphs? Θ(V^2)

What is the storage in an adjacency matrix for sparse graphs? Θ(V^2)

For dense graphs? Θ(V^2)

An adjacency list is better for sparse graphs, an adjacency matrix for dense graphs.


Degree Relations

The number of edges incident on a vertex is the degree of the vertex in a graph.

Deg(A)? 1

Deg(B)? 3

The number of edges ending at a vertex is the indegree of the vertex in a digraph.

The number of edges originating from a vertex is the outdegree of the vertex in a digraph.

Indeg(A)? 0

Indeg(B)? 1

Outdeg(A)? 1

Outdeg(B)? 2


For a graph, the sum of the degrees of all vertices = 2 × (number of edges), because every edge contributes to the degree of both of its endpoints.

Sum of degrees? 12

Number of edges? 6

For a digraph, the sum of the indegrees of all vertices = the sum of the outdegrees of all vertices = the number of edges.

Sum of indegrees? 6

Sum of outdegrees? 6
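These degree relations are easy to verify on the example. The Python sketch below assumes the adjacency lists of the undirected and directed example graphs; the function names are hypothetical.

```python
# Assumed adjacency lists of the undirected and directed example graphs.
UNDIRECTED = {
    "A": ["B"], "B": ["A", "C", "E"], "C": ["B", "D"],
    "D": ["C", "E"], "E": ["B", "D", "F"], "F": ["E"],
}
DIRECTED = {
    "A": ["B"], "B": ["C", "E"], "C": ["D"],
    "D": ["E"], "E": ["F"], "F": [],
}


def degree(adj, v):
    """Degree in an undirected graph: length of v's adjacency list."""
    return len(adj[v])


def outdegree(adj, v):
    """Edges originating from v."""
    return len(adj[v])


def indegree(adj, v):
    """Edges ending at v: count how often v appears in any adjacency list."""
    return sum(targets.count(v) for targets in adj.values())


degree_sum = sum(degree(UNDIRECTED, v) for v in UNDIRECTED)
edge_count = degree_sum // 2  # each undirected edge is counted at both endpoints
in_sum = sum(indegree(DIRECTED, v) for v in DIRECTED)
out_sum = sum(outdegree(DIRECTED, v) for v in DIRECTED)
```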

Graph Traversal

• Starts from a vertex s (the source) and discovers all vertices that are reachable from s.

• A vertex v is reachable from s if there is a path from s to v.

Breadth First Search (BFS)

Start from a vertex v.

Go to all vertices adjacent to v.

For each of these adjacent vertices, go to all vertices adjacent to it that have not been visited yet, and so on.


Start from C

Go to B, D

From B go to A, E

From E go to F

Vertices are discovered in order of increasing shortest-path length from the start vertex.

Order of discovery starting from C: B, D, A, E, F

Shortest path lengths from C: B and D are at distance 1, A and E at distance 2, F at distance 3.

We use a FIFO queue in the search, which is initially empty.

Initially all vertices are colored white.

When a vertex is discovered, it is colored gray and put in the FIFO queue.

First the source is discovered, then the vertices adjacent to the source, etc.

Remove vertices from the queue in FIFO order; for each removed vertex, discover its undiscovered neighbors and put them in the FIFO queue.

When all vertices adjacent to a vertex v have been discovered, the algorithm finishes processing v and colors it black.


Discover source C, make it gray, enqueue it. Queue = C

Dequeue C. Discover B, D, make them gray. Queue = B, D. Blacken C.

Dequeue B. Discover A, E, make them gray. Queue = D, A, E. Blacken B.

Dequeue D. Nothing new to discover. Queue = A, E. Blacken D.

Dequeue A. Nothing new to discover. Queue = E. Blacken A.

Dequeue E. Discover F, make it gray. Queue = F. Blacken E.

Dequeue F. Nothing new to discover. Queue = empty. Blacken F.

Notation:

d[u] is the length of the shortest path from s to u

color[u] is the color of vertex u

pred[u] is the predecessor of u in the search

BFS computes shortest path lengths by maintaining d[u].

BFS generates a tree by maintaining the predecessors.

In the example: pred[C] = NULL, pred[B] = pred[D] = C, pred[A] = B, pred[E] = B, pred[F] = E

Pseudocode

BFS(G, s)
{
    For each v in V, {color[v] = white; d[v] = INFINITY; pred[v] = NULL}
    color[s] = gray; d[s] = 0;
    Queue = {s};
    While Queue is nonempty
    {
        u = Dequeue(Queue);
        For each v in Adj[u], {
            if (color[v] == white)       /* v is still undiscovered */
            {
                color[v] = gray;         /* discover v */
                d[v] = d[u] + 1;         /* set distance of v */
                pred[v] = u;             /* set pred of v */
                Enqueue(Queue, v);       /* put v in Queue */
            }
        }
        color[u] = black;                /* done with u */
    }
}

Work out the predecessors and distance labels in the previous example.
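For checking the exercise, here is a minimal runnable Python sketch of the same procedure. It assumes the adjacency lists of the undirected example graph, uses `collections.deque` as the FIFO queue, and keeps the white/gray/black coloring; the function name `bfs` and its return shape are choices made for this illustration.

```python
from collections import deque
import math

# Assumed adjacency lists of the undirected example graph.
ADJ = {
    "A": ["B"], "B": ["A", "C", "E"], "C": ["B", "D"],
    "D": ["C", "E"], "E": ["B", "D", "F"], "F": ["E"],
}


def bfs(adj, s):
    """Breadth first search from s; returns (d, pred, discovery order)."""
    color = {v: "white" for v in adj}
    d = {v: math.inf for v in adj}
    pred = {v: None for v in adj}
    color[s] = "gray"
    d[s] = 0
    order = [s]
    queue = deque([s])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if color[v] == "white":   # v is still undiscovered
                color[v] = "gray"     # discover v
                d[v] = d[u] + 1       # one edge further than u
                pred[v] = u
                order.append(v)
                queue.append(v)
        color[u] = "black"            # all neighbors of u are discovered
    return d, pred, order


d, pred, order = bfs(ADJ, "C")
```

Vertices that are not reachable from the source keep the distance `math.inf` and the predecessor `None`.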

Complexity Analysis

• A vertex is visited once.

• Thus the while loop is executed at most V times.

The complexity of the operations inside the for loop is constant.

We want to compute the number of times the for loop is executed.

For each vertex v, the for loop is executed at most (deg(v) + 1) times.

The additional 1 accounts for the constant work needed even for a vertex of degree 0.

Thus the for loop is executed at most Σ_v (deg(v) + 1) times.

This equals V + 2E, since the degrees sum to 2E.

The initialization complexity is V.

Thus the overall complexity is V + V + 2E, i.e., O(V+E).

Depth First Search

Another graph traversal process.

We want to visit all rooms in a castle.

Start from a room.

Move from room to room until you reach an undiscovered room.

Draw graffiti in each undiscovered room you enter (this marks it as discovered).

Once you reach a discovered room, take a door that you have not taken before.


Start from C

Go to B

From B go to A

From B go to E

From E go to F

From E go to D

Each vertex has one of 3 possible colors:

white for an undiscovered vertex

gray for a discovered vertex

black for a finished vertex

We store the predecessor of each vertex.

We store 2 numbers (timestamps) for each vertex:

When we first discover a vertex u, store the counter value in d[u].

When we finish u, store the counter value in f[u].

Note that here d[u] is not the distance.

Pseudocode

DFS(G)
{
    For each v in V, {color[v] = white; pred[v] = NULL}
    time = 0;
    For each u in V
        If (color[u] == white) DFSVISIT(u)
}

time is a global variable

DFSVISIT(u)
{
    color[u] = gray;          /* discover u */
    d[u] = ++time;            /* discovery timestamp */
    For each v in Adj(u) do
        If (color[v] == white)
        {
            pred[v] = u;
            DFSVISIT(v);
        }
    color[u] = black;         /* finish u */
    f[u] = ++time;            /* finish timestamp */
}
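The same procedure can be sketched as runnable Python. Two assumptions are made so that the example starts from C and visits F before D: the outer loop takes an explicit vertex order, and E's neighbors are listed as B, F, D. The function name `dfs` and the `order` parameter are hypothetical.

```python
# Assumed adjacency lists of the undirected example graph; E lists F before D
# so that the visiting order C, B, A, E, F, D is obtained.
ADJ = {
    "A": ["B"], "B": ["A", "C", "E"], "C": ["B", "D"],
    "D": ["C", "E"], "E": ["B", "F", "D"], "F": ["E"],
}


def dfs(adj, order=None):
    """Depth first search; returns discovery times d, finish times f, and pred."""
    color = {v: "white" for v in adj}
    pred = {v: None for v in adj}
    d, f = {}, {}
    time = 0

    def visit(u):
        nonlocal time
        color[u] = "gray"
        time += 1
        d[u] = time               # discovery timestamp
        for v in adj[u]:
            if color[v] == "white":
                pred[v] = u
                visit(v)
        color[u] = "black"
        time += 1
        f[u] = time               # finish timestamp

    for u in (order if order is not None else adj):
        if color[u] == "white":
            visit(u)
    return d, f, pred


d, f, pred = dfs(ADJ, order=["C", "A", "B", "D", "E", "F"])
```

The recursion depth can reach V, so for large graphs an explicit stack would be a safer choice than Python recursion.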

Discover source C, make it gray, d[C] = 1

Discover B, make it gray, d[B] = 2

Discover A, make it gray, d[A] = 3

Finish A, f[A] = 4, blacken A

Discover E, make it gray, d[E] = 5

Discover F, d[F] = 6

Finish F, f[F] = 7

Discover D, d[D] = 8

Finish D, f[D] = 9

Finish E, f[E] = 10

Finish B, f[B] = 11

Finish C, f[C] = 12

pred[B] = C, pred[A] = B, pred[E] = B, pred[F] = E, pred[D] = E

Complexity Analysis

Note down example from board

There is only one DFSVISIT(u) call for each vertex u.

Let us analyze the complexity of a DFSVISIT(u).

Ignoring the recursive calls, the complexity is O(deg(u)+1).

The recursive calls are accounted for in the separate DFSVISIT(v) calls.

The initialization complexity is O(V).

Summing over all vertices, the overall complexity is O(V + E).


DFS Tree Structure

Consider a directed graph.

Observe that if u is the predecessor of v in DFS, there is an edge (u, v) in the graph. All such edges are called predecessor edges or tree edges.

The predecessor edges constitute an acyclic graph (the DFS tree, or a forest of DFS trees when DFS is restarted from several vertices).


If there is a path from u to v in the DFS tree (following predecessor edges from parent to child), then u is an ancestor of v in the DFS tree and v is a descendant of u.

In the example, B is an ancestor of F, and F is a descendant of B.


An edge (u, v) in the graph such that v is an ancestor of u is a back edge, e.g., the edge from D to C in the DFS from C above.

An edge (u, v) that is not a tree edge, where v is a descendant of u, is a forward edge.

An edge (u, v) where u is neither an ancestor nor a descendant of v is a cross edge.

Do cross edges exist in undirected graphs? No

Are all edges either tree, cross, or back edges? No in a digraph (e.g., AC in the example below is a forward edge); yes in an undirected graph.

The predecessor edges are the tree edges.

Example: a digraph on vertices A, B, C, D, E with edges AB, BC, AC, CA, ED, DB.

DFS starts from A, order of discovery A, B, C, E, D

Tree edges: AB, BC, ED

Forward edge: AC

Cross edge: DB

Back edge: CA

d[A] = 1, d[B] = 2, d[C] = 3, f[C] = 4, f[B] = 5, f[A] = 6, d[E] = 7, d[D] = 8, f[D] = 9, f[E] = 10

Relation between timestamps and ancestry

u is an ancestor of v if and only if [d[u], f[u]] ⊇ [d[v], f[v]]

u is a descendant of v if and only if [d[u], f[u]] ⊆ [d[v], f[v]]

u and v are not related if and only if [d[u], f[u]] and [d[v], f[v]] are disjoint

These relations are called the parenthesis lemma.
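The interval test can be written down directly. The Python sketch below uses the timestamps of the five-vertex example (d[A] = 1, …, f[E] = 10); the function names are hypothetical, and with closed intervals every vertex counts as an ancestor of itself.

```python
# Discovery and finish times from the five-vertex DFS example.
D_TIME = {"A": 1, "B": 2, "C": 3, "E": 7, "D": 8}
F_TIME = {"C": 4, "B": 5, "A": 6, "D": 9, "E": 10}


def is_ancestor(u, v, d, f):
    """u is an ancestor of v iff [d[v], f[v]] lies inside [d[u], f[u]]."""
    return d[u] <= d[v] and f[v] <= f[u]


def relation(u, v, d, f):
    """Classify the pair (u, v) using only the timestamps."""
    if is_ancestor(u, v, d, f):
        return "ancestor"
    if is_ancestor(v, u, d, f):
        return "descendant"
    # By the parenthesis lemma the only remaining case is disjoint intervals.
    return "unrelated"
```

This gives an O(1) ancestry test after a single DFS, with no need to walk the predecessor pointers.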

Applications of DFS

DFS can be used to find out whether a graph or a digraph contains a cycle.

Consider a digraph. It has a cycle if and only if the graph has a back edge. The same holds for graphs.

Run DFS

Check the nature of every edge (How do you know whether an edge is a back edge or not?)

If there is a back edge, then the graph has a cycle.

Complexity?

O(V + E)
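One common way to recognize a back edge during DFS in a digraph is to check the color of the edge's endpoint: if v is gray when edge (u, v) is examined, v is still being processed and is therefore an ancestor of u. The Python sketch below applies this idea to digraphs only (for undirected graphs the edge back to the predecessor would need separate handling); the two example digraphs and the name `has_cycle` are assumptions for illustration.

```python
# Assumed example digraph with edges AB, AC, BC, CA, DB, ED (contains the cycle A, B, C, A).
CYCLIC = {"A": ["B", "C"], "B": ["C"], "C": ["A"], "D": ["B"], "E": ["D"]}
# The same digraph without the edge CA.
ACYCLIC = {"A": ["B", "C"], "B": ["C"], "C": [], "D": ["B"], "E": ["D"]}


def has_cycle(adj):
    """Return True iff the digraph has a cycle, i.e., iff DFS meets a back edge."""
    color = {v: "white" for v in adj}

    def visit(u):
        color[u] = "gray"
        for v in adj[u]:
            if color[v] == "gray":
                return True        # (u, v) is a back edge: v is an ancestor of u
            if color[v] == "white" and visit(v):
                return True
        color[u] = "black"
        return False

    for u in adj:
        if color[u] == "white" and visit(u):
            return True
    return False
```

Each vertex is visited once and each edge examined once, matching the O(V + E) bound.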

Now we show that a digraph has a cycle if and only if there is a back edge.

If there is a back edge (u, v), there is a cycle: v is an ancestor of u, so the tree path from v to u followed by the edge (u, v) closes a cycle.

It remains to show that if there is a cycle, there is a back edge.

Consider an edge (u, v) in a digraph. If it is a back edge, then f[u] ≤ f[v]. Otherwise (for tree, forward, and cross edges), f[u] > f[v].

We show this as follows. For tree, back, and forward edges, the result follows intuitively as well as from the parenthesis lemma.

For a cross edge (u, v), note that the intervals [d[u], f[u]] and [d[v], f[v]] are disjoint.

When we were processing u, v was not white; otherwise (u, v) would be a predecessor edge.

Thus processing of v started before processing of u.

Thus d[v] < d[u].

Since the intervals are disjoint, this means f[v] < f[u].

Now we show that if there is a cycle, there is a back edge.

Suppose there is no back edge.

Move along any path. All edges are tree, forward, or cross edges. Thus the finish times decrease strictly along the path.

Hence we never come back to the same vertex. Thus there is no cycle.

Topological Sort

A DAG (directed acyclic graph) is a digraph without any cycle.

A topological sort of a DAG is an ordering of the vertices such that if there is an edge (u, v), then u comes before v in the order.

Application: You have a set of tasks to be completed in a factory. There are relations between some tasks, such that task A must be finished before task B begins. (Example: to build the second floor you must construct the first floor first, but there is no relation between electrical wiring and plumbing.)

We need to order the tasks.

We represent the tasks by vertices, and there is an edge (u, v) if u must be finished before v begins.

Next we do a topological sort on them.

(There is something wrong with the task relations if the representation has a cycle. This can be detected by DFS cycle detection. So we assume that the graph is a DAG.)

Note that any ordering of the vertices such that there is no edge from a vertex later in the order to one earlier in the order is a valid topological order.

A DAG has no back edges, so for any edge (u, v), f[u] > f[v].

Thus the vertices ordered in decreasing order of their finish times form a topological order.

This can be attained as follows.

While running DFS, whenever a vertex is colored black, add it to the front of a linked list.

Output the linked list at the end.

Complexity? O(V+E)

Pseudocode

Topological Sort(G)
{
    For each v in V, {color[v] = white; pred[v] = NULL}
    time = 0; L = empty linked list;
    For each u in V
        If (color[u] == white) Topology-DFSVISIT(u)
    Output L
}

Topology-DFSVISIT(u)
{
    color[u] = gray;
    d[u] = ++time;
    For each v in Adj(u) do
        If (color[v] == white)
        {
            pred[v] = u;
            Topology-DFSVISIT(v);
        }
    color[u] = black; f[u] = ++time;
    Add u to the front of L;      /* L stays in decreasing order of finish time */
}
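A runnable Python sketch of this DFS-based topological sort follows. It assumes the input is a DAG (no cycle check is performed), uses `collections.deque.appendleft` in place of inserting at the front of a linked list, and takes as an example an assumed DAG with edges AB, AC, BC, DB, ED. The function name is hypothetical.

```python
from collections import deque

# Assumed example DAG with edges AB, AC, BC, DB, ED.
DAG = {"A": ["B", "C"], "B": ["C"], "C": [], "D": ["B"], "E": ["D"]}


def topological_sort(adj):
    """Return the vertices in decreasing order of DFS finish time (input must be a DAG)."""
    visited = set()
    result = deque()

    def visit(u):
        visited.add(u)
        for v in adj[u]:
            if v not in visited:
                visit(v)
        # u is finished: everything reachable from u is already in the list,
        # so putting u at the front places it before all of them.
        result.appendleft(u)

    for u in adj:
        if u not in visited:
            visit(u)
    return list(result)


order = topological_sort(DAG)
position = {v: i for i, v in enumerate(order)}
```

A topological order is generally not unique; the result depends on the order in which vertices and neighbors are visited.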

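[Figure: DFS-based topological sort of an example DAG on vertices A, B, C, D, E, with discovery/finish time pairs 1,6; 2,5; 3,4; 7,10; 8,9. Steps shown: go to B, go to C, finish C, finish B, finish A, then visit and finish D and E.]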

Strong Components

• A strong component of a digraph is a subgraph which is

• Strongly connected

• Maximal w.r.t. this property

How many strong components does a strongly connected digraph have?

1

A variant of DFS gives all strong components of a digraph


If we start DFS from any vertex in a strongly connected component, we will finish all other vertices in that strong component before finishing this vertex,

and possibly finish a few other vertices as well (leaking into other components).

Strong components of the example: A, BCDE, F

Start DFS from B: it covers C, D, E and leaks to F.

If there is no leaking, we are done!

If we knew the strong components a priori, we could prevent leaking by choosing the DFS start vertices in a proper order, e.g., selecting F, then B, then A would yield the strong components.

[Figure: the component graph with vertices A, BCDE, and F]

Replace each strong component by a single vertex.

There is an edge from vertex A to vertex B if there is an edge from a vertex u in component A to another vertex v in component B.

The resulting graph is a DAG. Why?

Order the vertices of the new DAG in a reverse topological order (reverse order of the normal topological order).

Example: F, BCDE, A

Now, if you choose vertices in the above order, you are done

However, we don’t know the components ahead of time. But, the previous argument tells us that we need to use reverse topological order somehow.

The following actually works!

Run DFS

Sort the vertices in decreasing order of finish times

Reverse the digraph.

Run DFS. Each time you need to choose a vertex choose it in the sorted order.

Whenever you retrace to the main loop, you actually start a new strongly connected component.
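The five steps above can be sketched in Python as follows. The example digraph (edges AB, BC, CD, DE, EB, EF) is an assumption chosen so that the strong components are A, BCDE, and F; the function name is hypothetical. Instead of sorting, the sketch records vertices in the order they finish and reads that list backwards, which gives the decreasing finish-time order directly.

```python
# Assumed example digraph: A -> B -> C -> D -> E -> B, and E -> F.
DIGRAPH = {"A": ["B"], "B": ["C"], "C": ["D"], "D": ["E"], "E": ["B", "F"], "F": []}


def strong_components(adj):
    """Two-pass DFS: finish order on G, then DFS on the reversed digraph."""
    # Pass 1: DFS on the original digraph, recording vertices as they finish.
    visited = set()
    finish_order = []

    def visit(u):
        visited.add(u)
        for v in adj[u]:
            if v not in visited:
                visit(v)
        finish_order.append(u)

    for u in adj:
        if u not in visited:
            visit(u)

    # Reverse every edge.
    reversed_adj = {u: [] for u in adj}
    for u in adj:
        for v in adj[u]:
            reversed_adj[v].append(u)

    # Pass 2: DFS on the reversed digraph, choosing start vertices in
    # decreasing order of finish time; each restart begins a new component.
    assigned = set()
    components = []

    def collect(u, component):
        assigned.add(u)
        component.append(u)
        for v in reversed_adj[u]:
            if v not in assigned:
                collect(v, component)

    for u in reversed(finish_order):
        if u not in assigned:
            component = []
            collect(u, component)
            components.append(sorted(component))
    return components
```

On the assumed digraph this yields the components in the order A, BCDE, F.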


First DFS order: A, B, C, D, E, F

Sorted order (decreasing finish time): A, B, C, D, E, F

[Figure: the reversed digraph]

Strong components are: A, BEDC, F

Complexity Analysis

What sort would you use?

Integer sort

What is the overall complexity?

O(V+E)

First DFS: O(V+E)

Sorting: O(V)

Reversing digraph: O(V+E)

Second DFS: O(V+E)