Graphs

1 / 129

Graphs - PowerPoint PPT Presentation

Graphs. 1843. ORD. SFO. 802. 1743. 337. 1233. LAX. DFW. Outline and Reading. Graphs (§6.1) Definitions Applications Terminology Properties ADT Data structures for graphs (§6.2) Edge list structure Adjacency list structure Adjacency matrix structure.

I am the owner, or an agent authorized to act on behalf of the owner, of the copyrighted work described.

PowerPoint Slideshow about ' Graphs' - gefen

An Image/Link below is provided (as is) to download presentation

Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author.While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server.

- - - - - - - - - - - - - - - - - - - - - - - - - - E N D - - - - - - - - - - - - - - - - - - - - - - - - - -
Presentation Transcript

Graphs

1843

ORD

SFO

802

1743

337

1233

LAX

DFW

• Graphs (§6.1)
• Definitions
• Applications
• Terminology
• Properties
• Data structures for graphs (§6.2)
• Edge list structure
• Adjacency list structure
• Adjacency matrix structure
The Position ADT models the notion of the place within a data structure where a single object is stored

It gives a unified view of diverse ways of storing data, such as

a cell of an array

a node of a linked list

Just one method:

object element(): returns the element stored at the position

Position ADT- Recall from Chpt 2
A graph is a pair (V, E), where

V is a set of nodes, called vertices

E is a collection of pairs of vertices, called edges

Vertices and edges are positions and store elements

Edges can be directed or undirected.

If all the edges are directed, the graph is called a directed graph or digraph.

Graph

849

PVD

1843

ORD

142

SFO

802

LGA

1743

337

1387

HNL

2555

1099

1233

LAX

1120

DFW

MIA

Examples of Graphs
• Airline route map
• Vertices: an airport identified by a three-letter airport code
• Edges: a flight route between two airports and specified by the mileage of the route
• Flowcharts
• These represent the flow of control in a procedure.
• Vertices: symbolic flowchart boxes and diamonds
• Edges: connecting lines
• Binary relations
• We say xRy if (x,y) ε SxS, where S is a set.
• Vertices: elements in S
• Edges: connect x and y iff xRy.
Examples of Graphs
• Computer networks
• Vertices: computers
• Edges: communication links
• Electric circuits
• Vertices: diodes, transistors, capacitors, switches, ...
• Edges: Connecting wires
• These examplessuggest the type of questions that can be asked when working with graphs:
• What is the cheapest way to fly to NY
• What route involves the least flying time?
• If one city is closed due to bad weather, does a route to NY exist?
Edge Types
• Directed edge
• ordered pair of vertices (u,v)
• first vertex u is the origin
• second vertex v is the destination
• e.g., a flight
• Undirected edge
• unordered pair of vertices (u,v)
• e.g., a flight route
• Directed graph
• all the edges are directed
• e.g., flight network
• Undirected graph
• all the edges are undirected
• e.g., route network

flight

AA 1206

ORD

PVD

849

miles

ORD

PVD

Applications
• Electronic circuits
• Printed circuit board
• Integrated circuit
• Transportation networks
• Highway network
• Flight network
• Computer networks
• Local area network
• Internet
• Web
• Databases
• Entity-relationship diagram

V

a

b

h

j

U

d

X

Z

c

e

i

W

g

f

Y

Terminology
• End vertices (or endpoints) of an edge
• U and V are the endpoints of a
• Edges incident on a vertex
• a, d, and b are incident on V
• U and V are adjacent
• Degree of a vertex
• X has degree 5
• Parallel edges
• h and i are parallel edges
• Self-loop
• j is a self-loop
Terminology (cont.)
• Path
• sequence of alternating vertices and edges
• begins with a vertex
• ends with a vertex
• each edge is preceded and followed by its endpoints
• Simple path
• path such that all its vertices and edges are distinct
• Examples
• P1=(V,b,X,h,Z) is a simple path
• P2=(U,c,W,e,X,g,Y,f,W,d,V) is a path that is not simple

V

b

a

P1

d

U

X

Z

P2

h

c

e

W

g

f

Y

Terminology (cont.)
• Cycle
• circular sequence of alternating vertices and edges
• each edge is preceded and followed by its endpoints
• Simple cycle
• cycle such that all its vertices and edges are distinct
• Examples
• C1=(V,b,X,g,Y,f,W,c,U,a,) is a simple cycle
• C2=(U,c,W,e,X,g,Y,f,W,d,V,a,) is a cycle that is not simple

V

a

b

d

U

X

Z

C2

h

e

C1

c

W

g

f

Y

Notation

n number of vertices

m number of edges

deg(v)degree of vertex v

Property 1

Sv deg(v)= 2m

Proof: each edge is counted twice

Property 2

In an undirected graph with no self-loops and no multiple edges (i.e. a simple graph)

m  n (n -1)/2

Proof: No 2 edges can have the

same endpoints. Thus, each

vertex has degree at most (n-1).

Thus, by above, 2m≤n(n-1)

What is the bound for a directed graph?

m≤n(n-1)

Proof: No edge can have the same origin and destination.

Properties

Example

• n = 4
• m = 6
• deg(v)= 3
An iterator abstracts the process of scanning through a collection of elements

Methods of the ObjectIterator ADT:

object object()

boolean hasNext()

object nextObject()

reset()

Extends the concept of Position by adding a traversal capability

Implement with an array or singly linked list

An iterator is typically associated with an another data structure

We can augment the Stack, Queue, Vector, List and Sequence ADTs with method:

ObjectIterator elements()

Two notions of iterator:

snapshot: freezes the contents of the data structure at a given time

dynamic: follows changes to the data structure

Iterator ADT- Recall from Chpt 2
v=vertex & e=edge

are positions

store elements (o)

Update methods

insertVertex(o)

insertEdge(v, w, o)

insertDirectedEdge(v, w, o)

removeVertex(v)

removeEdge(e)

makeUndirected(e)

reverseDirection(e)

setDirectionFrom(e,v)

setDirectionTo(e,v)

Generic methods

numVertices()

numEdges()

vertices()

edges()

See pages 294 and 295 for definitions, although most are obvious.

Color key:

Returns an iterator

Boolean

Main Methods of the Graph ADT
Accessor methods

aVertex() - returns any vertex as a graph may not have a special vertex

incidentEdges(v)

endVertices(e)

opposite(v, e)

degree(v)

See pages 294 and 295 for definitions, although most are obvious.

Color key:

Returns an iterator

Boolean

Needed for graphs with directed edges

directedEdges()

undirectedEdges()

origin(e)

destination(e)

isDirected(e)

Relating methods

inDegree(v)

outDegree(v)

inIncidentEdges(v)

outIncidentEdges(v)

Main Methods of the Graph ADT
Why are there so many methods for graphs?
• Graphs are very rich and versatile structures.
• Therefore, the number of methods is unavoidable when describing the needed methods.
• Realize that graphs support two kinds of positions - vertices and edges.
• Moreover, edges can be directed or undirected.
• Consequently, we need different methods for accessing and updating all these different positions as well as handling relationships between the different positions.
Data Structures for Graphs
• Most commonly used
• Edge list structure
• Adjacency list structure
• Adjacency matrix structure
• Each has its own strengths and weaknesses in terms of what methods are easier to use and the cost of storage space.
• The edge list and the adjacency list actually store representations of the edges.
• The adjacency matrix format only uses a placeholder for every pair of vertices.
• This implies, as we will see, that the list structures use O(n+m) space and matrix format uses O(n2) where n is number of vertices and m is number of edges.
The Sequence ADT is the union of the Vector and List ADTs

Elements accessed by

Rank, or

Position

Generic methods:

size(), isEmpty()

Vector-based methods:

elemAtRank(r), replaceAtRank(r, o), insertAtRank(r, o), removeAtRank(r)

List-based methods:

first(), last(), before(p), after(p), replaceElement(p, o), swapElements(p, q), insertBefore(p, o), insertAfter(p, o), insertFirst(o), insertLast(o), remove(p)

Bridge methods:

atRank(r), rankOf(p)

Sequence ADT Recall from Chpt 2
Applications of Sequences
• The Sequence ADT is a basic, general-purpose, data structure for storing an ordered collection of elements
• Direct applications:
• Generic replacement for stack, queue, vector, or list
• small database (e.g., address book)
• Indirect applications:
• Building block of more complex data structures
Edge List Structure

u

• Vertex sequence
• sequence of vertex objects
• Vertex object
• element
• reference to position in vertex sequence
• Edge object
• element
• origin vertex object
• destination vertex object
• reference to position in edge sequence
• Edge sequence
• sequence of edge objects
• Note: Works for both directed and undirected graphs.

a

c

b

d

v

w

z

z

w

u

v

a

b

c

d

Edge List Structure - Notes
• This is the simplest and most used data structure for a graph.
• The vertex and edge sequence could be a list, vector, or dictionary.
• If we use a vector representation, we could think of the vertices as being numbered.
• Vertex and edge objects may contain additional information, especially for graphs with some directed edges i,e, an edge vertex could have a Boolean variable indicating whether or not it was directed.
• Main feature: Provides direct access from edges to the vertices to which they are incident.
• But, accessing the edges that are incident on a vertex requires that all edges be examined.
• Obviously, performance of the various methods is very dependent upon which representation is chosen.

a

v

b

• Vertex sequence
• sequence of vertex objects
• Vertex object
• element
• reference to position in vertex sequence
• reference to incident object
• Incidence sequence for each vertex
• sequence of references to edge objects of incident edges
• Augmented edge objects
• references to associated positions in incidence sequences of end vertices

u

w

w

u

v

b

a

Adjacency List Structure - Notes
• Typically the incident sequence is implemented as a list, but could use a dictionary or a priority queue if their characteristics are useful.
• Structure provides direct access from both the edges to the vertices and from the vertices to their incident edges.
• Structure matches the performance of the edge list representation, but has improved running time when the above characteristic can be exploited - i.e. given a vertex, return an iterator of incident edges.

a

v

b

• Vertex list structure
• Augmented vertex objects
• Integer key (index) associated with vertex
• 2D adjacency array
• Reference to edge object for adjacent vertices
• Null for non nonadjacent vertices
• The “old fashioned” version just has 0 for no edge and 1 for edge
• Edge objects
• Edge list structure

u

w

2

w

0

u

1

v

b

a

Notes for Adjacency Matrix
• Historically, the adjacency matrix was the first represented as a Boolean matrix with A[i,j] = 1 if an edge existed between vertex i and j and 0 otherwise.
• The representation shown here updates this to a more object-oriented environment where you use an interface to access the structure.
• In all of these structures, additional data may be stored in vertex or edge objects. For example, if edges are being used in a problem about distances between to cities, those distances could be stored in the edge objects.

A

B

D

E

C

Depth-First Search

• Definitions (§6.1)
• Subgraph
• Connectivity
• Spanning trees and forests
• Depth-first search (§6.3.1)
• Algorithm
• Example
• Properties
• Analysis
• Applications of DFS (§6.5)
• Path finding
• Cycle finding
Subgraphs
• A subgraph S of a graph G is a graph such that
• The vertices of S are a subset of the vertices of G
• The edges of S are a subset of the edges of G
• A spanning subgraph of G is a subgraph that contains all the vertices of G

Subgraph

Spanning subgraph

A connected component of a graph G is a maximal connected subgraph of G

Connectivity

Connected graph

Non connected graph with two connected components

Trees and Forests
• A (free) tree is an undirected graph T such that
• T is connected
• T has no cycles

This definition of tree is different from the one of a rooted tree

• A forest is an undirected graph without cycles
• The connected components of a forest are trees

Tree

Forest

Spanning Trees and Forests
• A spanning tree of a connected graph is a spanning subgraph that is a tree
• A spanning tree is not unique unless the graph is a tree
• Spanning trees have applications to the design of communication networks
• A spanning forest of a graph is a spanning subgraph that is a forest

Graph

Spanning tree

Traversing a Graph
• A traversal of a graph is a systematic procedure for exploring a graph by examining all of its vertices and edges.
• Example:
• Web spider or crawler is the data collecting part of a search engine that visits all the hypertext documents on the web. (The documents are vertices and the hyperlinks are the edges).
Depth-first search (DFS) is a general technique for traversing a graph

A DFS traversal of a graph G does the following:

Visits all the vertices and edges of G

Determines whether G is connected

Computes the connected components of G

Computes a spanning forest of G

DFS on a graph with n vertices and m edges takes O(n + m ) time

DFS can be further extended to solve other graph problems

Find and report a path between two given vertices

Find a cycle in the graph

Depth-first search is to graphs what Euler tour is to binary trees

Depth-First Search
Depth-First Search
• DFS utilizes backtracking as you would if you were wandering through a maze.
• Start at a vertex in G with string; attached the string at the vertex; mark the vertex as "visited".
• Explore G by moving out along the incident edges according to some predetermined scheme for ordering and unroll the string.
• If you can move to a new vertex, mark it as "visited".
• If you hit a "visited" node, backtrack to the last vertex rolling up the string as you go and see if we can move out from there.
• The process terminates when the backtracking returns to the place where the string was attached - i.e. the start vertex.
Depth-First Search
• It\'s useful to distinguish edges by their use:
• Discovery (or tree) edges - those that are followed to discover new vertices.
• Back edges - those that lead to already visited vertices.
• For back edges, you need to backtrack to the last visited node and decide if you can move out from it without hitting a visited node.
• The discovery edges form a spanning tree of the connected components of the starting vertex - called a DFS tree.
• The back edges are also useful - assuming the DFS tree is rooted at the starting vertex, each back edge leads back from a vertex in the tree to one of its ancestors in the tree.

A

B

D

E

C

A

A

B

D

E

B

D

E

C

C

Example

unexplored vertex

A

visited vertex

A

unexplored edge

discovery edge

back edge

A

A

A

B

D

E

B

D

E

B

D

E

C

C

C

A

B

D

E

C

Example (cont.)
DFS and Maze Traversal
• The DFS algorithm is similar to a classic strategy for exploring a maze
• We mark each intersection, corner and dead end (vertex) visited
• We mark each corridor (edge ) traversed
• We keep track of the path back to the entrance (start vertex) by means of a rope (recursion stack)
Often Posed as Programming Contest Problem
• Given a maze represented as follows:

1=wall , -1=exit. Start at the bottom.

To walk systematically, we\'ll try to go N, W, E, S

Often Posed as Programming Contest Problem
• Given a maze represented as follows:

1=wall , -1=exit. Start at the bottom.

To walk systematically, we\'ll try to go N, W, E, S

Often Posed as Programming Contest Problem
• Given a maze represented as follows:

1=wall , -1=exit. Start at the bottom.

To walk systematically, we\'ll try to go N, W, E, S

Often Posed as Programming Contest Problem
• Given a maze represented as follows:

1=wall , -1=exit. Start at the bottom.

To walk systematically, we\'ll try to go N, W, E, S

Often Posed as Programming Contest Problem
• Given a maze represented as follows:

1=wall , -1=exit. Start at the bottom.

To walk systematically, we\'ll try to go N, W, E, S

Often Posed as Programming Contest Problem
• Given a maze represented as follows:

1=wall , -1=exit. Start at the bottom.

To walk systematically, we\'ll try to go N, W, E, S

Often Posed as Programming Contest Problem
• Given a maze represented as follows:

1=wall , -1=exit. Start at the bottom.

To walk systematically, we\'ll try to go N, W, E, S

Often Posed as Programming Contest Problem
• Given a maze represented as follows:

1=wall , -1=exit. Start at the bottom.

To walk systematically, we\'ll try to go N, W, E, S

Often Posed as Programming Contest Problem
• Given a maze represented as follows:

1=wall , -1=exit. Start at the bottom.

To walk systematically, we\'ll try to go N, W, E, S

Often Posed as Programming Contest Problem
• Given a maze represented as follows:

1=wall , -1=exit. Start at the bottom.

To walk systematically, we\'ll try to go N, W, E, S

Often Posed as Programming Contest Problem
• Given a maze represented as follows:

1=wall , -1=exit. Start at the bottom.

To walk systematically, we\'ll try to go N, W, E, S

The exact path taken depends on this ordering.

DFS Algorithm
• The algorithm uses a mechanism for setting and getting “labels” of vertices and edges

AlgorithmDFS(G, v)

Inputgraph G and a start vertex v of G

Outputlabeling of the edges of G in the connected component of v as discovery edges and back edges

setLabel(v, VISITED)

for all e  G.incidentEdges(v)

ifgetLabel(e) = UNEXPLORED

w G.opposite(v,e)

if getLabel(w) = UNEXPLORED

setLabel(e, DISCOVERY)

DFS(G, w)

else

setLabel(e, BACK)

AlgorithmDFS(G)

Inputgraph G

Outputlabeling of the edges of G as discovery edges and back edges

for all u  G.vertices()

setLabel(u, UNEXPLORED)

for all e  G.edges()

setLabel(e, UNEXPLORED)

for all v  G.vertices()

ifgetLabel(v) = UNEXPLORED

DFS(G, v)

Property 1

DFS(G, v) visits all the vertices and edges in the connected component of v

Property 2

The discovery edges labeled by DFS(G, v) form a spanning tree of the connected component of v

A

B

D

E

C

Properties of DFS
Setting/getting a vertex/edge label takes O(1) time

Each vertex is labeled twice

once as UNEXPLORED

once as VISITED

Each edge is labeled twice

once as UNEXPLORED

once as DISCOVERY or BACK

Method incidentEdges is called once for each vertex

DFS runs in O(n + m) time provided the graph is represented by the adjacency list structure

Recall that Sv deg(v)= 2m

Analysis of DFS
Path Finding
• We can specialize the DFS algorithm to find a path between two given vertices u and z
• We call DFS(G, u) with u as the start vertex
• We use a stack S to keep track of the path between the start vertex and the current vertex
• As soon as destination vertex z is encountered, we return the path as the contents of the stack

AlgorithmpathDFS(G, v, z)

setLabel(v, VISITED)

S.push(v)

if v= z

return S.elements()

for all e  G.incidentEdges(v)

ifgetLabel(e) = UNEXPLORED

w opposite(v, e)

if getLabel(w) = UNEXPLORED

setLabel(e, DISCOVERY)

S.push(e)

pathDFS(G, w, z)

S.pop() { e gets popped }

else

setLabel(e, BACK)

S.pop() { v gets popped }

Cycle Finding

AlgorithmcycleDFS(G, v, z)

setLabel(v, VISITED)

S.push(v)

for all e  G.incidentEdges(v)

ifgetLabel(e) = UNEXPLORED

w opposite(v,e)

S.push(e)

if getLabel(w) = UNEXPLORED

setLabel(e, DISCOVERY)

pathDFS(G, w, z)

S.pop()

else

C new empty stack

repeat

o S.pop()

C.push(o)

until o= w

return C.elements()

S.pop()

• We can specialize the DFS algorithm to find a simple cycle
• We use a stack S to keep track of the path between the start vertex and the current vertex
• As soon as a back edge (v, w) is encountered, we return the cycle as the portion of the stack from the top to vertex w

Biconnectivity

SEA

PVD

ORD

FCO

SNA

MIA

• Definitions (§6.3.2)
• Separation vertices and edges
• Biconnected graph
• Biconnected components
• Equivalence classes
• Algorithms (§6.3.2)
• Auxiliary graph
• Proxy graph
Definitions

Let G be a connected graph

A separation edge of G is an edge whose removal disconnects G

A separation vertex of G is a vertex whose removal disconnects G

Applications

Separation edges and vertices represent single points of failure in a network and are critical to the operation of the network

Example

DFW, LGA and LAX are separation vertices

(DFW,LAX) is a separation edge

Separation Edges and Vertices

ORD

PVD

SFO

LGA

HNL

LAX

DFW

MIA

Biconnected Graph
• Equivalent definitions of a biconnected graph G
• Graph G has no separation edges and no separation vertices
• For any two vertices u and v of G,there are two disjoint simple paths between u and v (i.e., two simple paths between u and v that share no other vertices or edges)
• For any two vertices u and v of G, there is a simple cycle containing u and v
• Example

PVD

ORD

SFO

LGA

HNL

LAX

DFW

MIA

Biconnected Components
• Biconnected component of a graph G
• A maximal biconnected subgraph of G, or
• A subgraph consisting of a separation edge of G and its end vertices
• Interaction of biconnected components
• An edge belongs to exactly one biconnected component
• A nonseparation vertex belongs to exactly one biconnected component
• A separation vertex belongs to two or more biconnected components
• Example of a graph with four biconnected components

ORD

PVD

SFO

LGA

RDU

HNL

LAX

DFW

MIA

Equivalence Classes
• Given a set S, a relation R on S is a set of ordered pairs of elements of S, i.e., R is a subset of SS
• An equivalence relation R on S satisfies the following properties

Reflexive: (x,x)R

Symmetric: (x,y)R  (y,x)R

Transitive: (x,y)R (y,z)R (x,z)R

• An equivalence relation R on S induces a partition of the elements of S into equivalence classes (i.e. the classes are pairwise disjoint and their union is S.
• Example (connectivity relation among the vertices of a graph):
• Let V be the set of vertices of a graph G
• Define the relationC = {(v,w) VV such that G has a path from v to w}
• Relation C is an equivalence relation
• The equivalence classes of relation C are the vertices in each connected component of graph G
Equivalence Relations

These abstract the notion of equality for numbers.

Obviously, equality for numbers is an equivalence relation.

But, there are many such equivalence relations.

Example: For positive integers a,b,c,d, define for fractions, a/b = c/d if and only if ad=bc.

Proof:

reflexive: a/b = a/b because ab=ba

symmetric: a/b = c/d implies c/d = a/b because ad=bc implies cb=da. So, c/d=a/b

transitive: a/b = c/d and c/d = e/f implies a/b = e/f (complete this proof)

Equivalence Relations
• What are the equivalence classes of the last equivalence relation?
• [1/2] = {x/y | 1/2= x/y}
• i.e. all the positive fractions we view as having 1/2 as its reduced form.
• i.e. [1/2] = {1/2, 2/4, 3/6, 4/8,...}
• Note: There is really nothing special about 1/2, [2/4], [3/6], [4/8] ... are all [1/2].

i

g

e

b

a

d

j

f

c

• Edges e and f of connected graph Gare linked if
• e = f, or
• G has a simple cycle containing e and f

Theorem:

The link relation on the edges of a graph is an equivalence relation

Proof Sketch:

• The reflexive and symmetric properties follow from the definition
• For the transitive property, consider two simple cycles sharing an edge

Equivalence classes of linked edges:

{a} {b, c, d, e, f} {g, i, j}

i

g

e

b

a

d

j

f

c

The link components of a connected graph G are the equivalence classes of edges with respect to the link relation

A biconnected component of G is the subgraph of G induced by an equivalence class of linked edges

A separation edge is a single-element equivalence class of linked edges

A separation vertex has incident edges in at least two distinct equivalence classes of linked edge (Note this does not violate the property that equivalence classes are pairwise disjoint.)

ORD

PVD

SFO

LGA

RDU

HNL

LAX

DFW

MIA

Auxiliary Graph

h

g

i

e

b

• Auxiliary graph B for a connected graph G
• Associated with a DFS traversal of G
• The vertices of B are the edges of G
• For each back edge e of G, B has edges (e,f1), (e,f2) , …, (e,fk),where f1, f2, …, fk are the discovery edges of G that form a simple cycle with e
• Its connected components correspond to the the link components of G

i

j

d

c

f

a

DFS on graph G

g

e

i

h

b

f

j

d

c

a

Auxiliary graph B

Auxiliary Graph (cont.)
• In the worst case, the number of edges of the auxiliary graph is proportional to nm

DFS on graph G

Auxiliary graph B

Proxy Graph

h

g

i

e

b

• Proxy graph F for a connected graph G
• Spanning forest of the auxiliary graph B
• Has m vertices and O(m) edges
• Can be constructed in O(n +m) time
• Its connected components (trees) correspond to the the link components of G
• Given a graph G with n vertices and m edges, we can compute the following in O(n +m) time
• The biconnected components of G
• The separation vertices of G
• The separation edges of G

i

j

d

c

f

a

DFS on graph G

g

e

i

h

b

f

j

d

c

a

Proxy graph F

Proxy Graph Constructed Using DFS

AlgorithmproxyGraph(G)

Inputconnectedgraph GOutputproxy graph F for GFempty graph

DFS(G, s) { s is any vertex of G}

for all discovery edges e of G

F.insertVertex(e)

for all vertices v of G in DFS visit order

for all back edges e= (u,v)

F.insertVertex(e)

repeat

fdiscovery edge with dest. u

F.insertEdge(e,f,)

uorigin of edge f

else

uv{ ends the loop }

until u=v

returnF

h

g

i

e

b

i

j

d

c

f

a

DFS on graph G

g

e

i

h

b

f

j

d

c

a

Proxy graph F

L0

A

L1

B

C

D

L2

E

F

• Breadth-first search (§6.3.3)
• Algorithm
• Example
• Properties
• Analysis
• Applications
• DFS vs. BFS (§6.3.3)
• Comparison of applications
• Comparison of edge labels
Breadth-first search (BFS) is a general technique for traversing a graph

A BFS traversal of a graph G

Visits all the vertices and edges of G

Determines whether G is connected

Computes the connected components of G

Computes a spanning forest of G

BFS on a graph with n vertices and m edges takes O(n + m ) time

BFS can be further extended to solve other graph problems

Find and report a path with the minimum number of edges between two given vertices

Find a simple cycle, if there is one

BFS Algorithm

AlgorithmBFS(G, s)

L0new empty sequence

L0.insertLast(s)

setLabel(s, VISITED)

i  0

while Li.isEmpty()

Li +1new empty sequence

for all v  Li.elements() for all e  G.incidentEdges(v)

ifgetLabel(e) = UNEXPLORED

w opposite(v,e)

if getLabel(w) = UNEXPLORED

setLabel(e, DISCOVERY)

setLabel(w, VISITED)

Li +1.insertLast(w)

else

setLabel(e, CROSS)

i  i +1

• The algorithm uses a mechanism for setting and getting “labels” of vertices and edges

AlgorithmBFS(G)

Inputgraph G

Outputlabeling of the edges and partition of the vertices of G

for all u  G.vertices()

setLabel(u, UNEXPLORED)

for all e  G.edges()

setLabel(e, UNEXPLORED)

for all v  G.vertices()

ifgetLabel(v) = UNEXPLORED

BFS(G, v)

L0

A

L1

B

C

D

E

F

Example

unexplored vertex

A

visited vertex

A

unexplored edge

discovery edge

cross edge

L0

L0

A

A

L1

L1

B

C

D

B

C

D

E

F

E

F

L0

L0

A

A

L1

L1

B

C

D

B

C

D

L2

E

F

E

F

L0

L0

A

A

L1

L1

B

C

D

B

C

D

L2

L2

E

F

E

F

Example (cont.)

L0

L0

A

A

L1

L1

B

C

D

B

C

D

L2

L2

E

F

E

F

Example (cont.)

L0

A

L1

B

C

D

L2

E

F

Notation

Gs: connected component of s

Property 1

BFS(G, s) visits all the vertices and edges of Gs

Property 2

The discovery edges labeled by BFS(G, s) form a spanning tree Ts of Gs

Property 3

For each vertex v in Li

The path of Ts from s to v has i edges

Every path from s to v in Gshas at least i edges

Properties

A

B

C

D

E

F

L0

A

L1

B

C

D

L2

E

F

Setting/getting a vertex/edge label takes O(1) time

Each vertex is labeled twice

once as UNEXPLORED

once as VISITED

Each edge is labeled twice

once as UNEXPLORED

once as DISCOVERY or CROSS

Each vertex is inserted once into a sequence Li

Method incidentEdges is called once for each vertex

BFS runs in O(n + m) time provided the graph is represented by the adjacency list structure

Recall that Sv deg(v)= 2m

Analysis
Applications
• We can specialize the BFS traversal of a graph Gto solve the following problems in O(n + m) time
• Compute the connected components of G
• Compute a spanning forest of G
• Find a simple cycle in G, or report that G is a forest
• Given two vertices of G, find a path in G between them with the minimum number of edges, or report that no such path exists

L0

A

L1

B

C

D

L2

E

F

DFS vs. BFS

A

B

C

D

E

F

DFS

BFS

Back edge(v,w)

w is an ancestor of v in the tree of discovery edges

Cross edge(v,w)

w is in the same level as v or in the next level in the tree of discovery edges

L0

A

L1

B

C

D

L2

E

F

DFS vs. BFS (cont.)

A

B

C

D

E

F

DFS

BFS

Directed Graphs

BOS

ORD

JFK

SFO

DFW

LAX

MIA

Outline and Reading (§6.4)
• Reachability (§6.4.1)
• Directed DFS
• Strong connectivity
• Transitive closure (§6.4.2)
• The Floyd-Warshall Algorithm
• Directed Acyclic Graphs (DAG’s) (§6.4.4)
• Topological Sorting

E

D

C

B

A

Digraphs
• A digraph is a graph whose edges are all directed
• Short for “directed graph”
• Applications
• one-way streets
• flights

E

D

C

B

A

Digraph Properties
• A graph G=(V,E) such that
• Each edge goes in one direction:
• Edge (a,b) goes from a to b, but not b to a.
• If G is simple, m < n(n-1).
• If we keep in-edges and out-edges in separate adjacency lists, we can perform listing of of the sets of in-edges and out-edges in time proportional to their size.
Digraph Application
• Scheduling: edge (a,b) means task a must be completed before b can be started

ics21

ics22

ics23

ics51

ics53

ics52

ics161

ics131

ics141

ics121

ics171

The good life

ics151

Directed DFS
• We can specialize the traversal algorithms (DFS and BFS) to digraphs by traversing edges only along their direction
• In the directed DFS algorithm, we have four types of edges
• discovery edges
• back edges
• forward edges
• cross edges
• A directed DFS starting at a vertex s determines the vertices reachable from s

E

D

C

B

A

Reachability
• DFS tree rooted at v: vertices reachable from v via directed paths

E

D

E

D

C

A

C

F

E

D

A

B

C

F

A

B

a

g

c

d

e

b

f

Strong Connectivity
• Each vertex can reach all other vertices
Strong Connectivity Algorithm
• Pick a vertex v in G.
• Perform a DFS from v in G.
• If there’s a w not visited, print “no”.
• Let G’ be G with edges reversed.
• Perform a DFS from v in G’.
• If there’s a w not visited, print “no”.
• Else, print “yes”.
• Running time: O(n+m).

a

G:

g

c

d

e

b

f

a

g

G’:

c

d

e

b

f

a

g

c

d

e

b

f

Strongly Connected Components
• Maximal subgraphs such that each vertex can reach all other vertices in the subgraph
• Can also be done in O(n+m) time using DFS, but is more complicated (similar to biconnectivity).

{ a , c , g }

{ f , d , e , b }

Transitive Closure

D

E

• Given a digraph G, the transitive closure of G is the digraph G* such that
• G* has the same vertices as G
• if G has a directed path from u to v (u  v), G* has a directed edge from u to v
• The transitive closure provides reachability information about a digraph

B

G

C

A

D

E

B

C

A

G*

If there\'s a way to get from A to B and from B to C, then there\'s a way to get from A to C.

Computing the Transitive Closure
• We can perform DFS starting at each vertex
• O(n(n+m))

Alternatively ... Use dynamic programming: the Floyd-Warshall Algorithm

Floyd-Warshall Transitive Closure
• Idea #1: Number the vertices 1, 2, …, n.
• Idea #2: Consider paths that use only vertices numbered 1, 2, …, k, as intermediate vertices:

Uses only vertices numbered 1,…,k

(add this edge if it’s not already in)

i

j

Uses only vertices

numbered 1,…,k-1

Uses only vertices

numbered 1,…,k-1

k

Floyd-Warshall’s Algorithm

AlgorithmFloydWarshall(G)

Inputdigraph G

Outputtransitive closure G* of G

i 1

for all v  G.vertices()

denote v as vi

i  i+1

G0G

for k 1 to n do

GkGk -1

for i 1 to n (i  k)do

for j 1 to n (j  i, k)do

if Gk -1.areAdjacent(vi, vk) 

Gk.insertDirectedEdge(vi, vj , k)

return Gn

• Floyd-Warshall’s algorithm numbers the vertices of G as v1 , …, vn and computes a series of digraphs G0, …, Gn
• G0=G
• Gkhas a directed edge (vi, vj) if G has a directed path from vi to vjwith intermediate vertices in the set {v1 , …, vk}
• We have that Gn = G*
• In phase k, digraph Gk is computed from Gk -1
• Running time: O(n3), assuming areAdjacent is O(1) (e.g., adjacency matrix)
Floyd-Warshall Example

BOS

v

ORD

4

JFK

v

v

2

6

SFO

DFW

LAX

v

3

v

1

MIA

v

5

Floyd-Warshall, Iteration 1

BOS

v

ORD

4

JFK

v

v

2

6

SFO

DFW

LAX

v

3

v

1

MIA

v

5

Floyd-Warshall, Iteration 2

BOS

v

ORD

4

JFK

v

v

2

6

SFO

DFW

LAX

v

3

v

1

MIA

v

5

Floyd-Warshall, Iteration 3

BOS

v

ORD

4

JFK

v

v

2

6

SFO

DFW

LAX

v

3

v

1

MIA

v

5

Floyd-Warshall, Iteration 4

BOS

v

ORD

4

JFK

v

v

2

6

SFO

DFW

LAX

v

3

v

1

MIA

v

5

BOS

Floyd-Warshall, Iteration 5

v

ORD

4

JFK

v

v

2

6

SFO

DFW

LAX

v

3

v

1

MIA

v

5

BOS

Floyd-Warshall, Iteration 6

v

ORD

4

JFK

v

v

2

6

SFO

DFW

LAX

v

3

v

1

MIA

v

5

BOS

Floyd-Warshall, Conclusion

v

ORD

4

JFK

v

v

2

6

SFO

DFW

LAX

v

3

v

1

MIA

v

5

DAGs and Topological Ordering

D

E

• A directed acyclic graph (DAG) is a digraph that has no directed cycles
• A topological ordering of a digraph is a numbering

v1 , …, vn

of the vertices such that for every edge (vi , vj), we have i < j

• Example: in a task scheduling digraph, a topological ordering a task sequence that satisfies the precedence constraints

Theorem

A digraph admits a topological ordering if and only if it is a DAG

B

C

A

DAG G

v4

v5

D

E

v2

B

v3

C

v1

Topological ordering of G

A

Topological Sorting
• Number vertices, so that (u,v) in E implies u < v

1

A typical student day

wake up

3

2

eat

study computer sci.

5

4

nap

more c.s.

7

play

8

write c.s. program

6

9

work out

make cookies for professors

10

11

sleep

Algorithm for Topological Sorting
• Note: This algorithm is different than the one in Goodrich-Tamassia
• Running time: O(n + m). How…?

MethodTopologicalSort(G)

H G // Temporary copy of G

n G.numVertices()

whileH is not emptydo

Let v be a vertex with no outgoing edges

Label v  n

n  n - 1

Remove v from H

Topological Sorting Algorithm using DFS

AlgorithmtopologicalDFS(G, v)

Inputgraph G and a start vertex v of G

Outputlabeling of the vertices of G in the connected component of v

setLabel(v, VISITED)

for all e  G.incidentEdges(v)

ifgetLabel(e) = UNEXPLORED

w opposite(v,e)

if getLabel(w) = UNEXPLORED

setLabel(e, DISCOVERY)

topologicalDFS(G, w)

else

{e is a forward or cross edge}

Label v with topological number n

n n - 1

• Simulate the algorithm by using depth-first search
• O(n+m) time.

AlgorithmtopologicalDFS(G)

Inputdag G

Outputtopological ordering of Gn G.numVertices()

for all u  G.vertices()

setLabel(u, UNEXPLORED)

for all e  G.edges()

setLabel(e, UNEXPLORED)

for all v  G.vertices()

ifgetLabel(v) = UNEXPLORED

topologicalDFS(G, v)

Campus Tour

• Review
• Adjacency matrix structure (§6.2.3)
• Kruskal’s MST algorithm (§7.3.1)
• Partition ADT and implementation (§4.2.2)
• The decorator pattern (§6.5.1)
• The traveling salesperson problem
• Definition
• Approximation algorithm (§13.4.3)
Graph Assignment
• Goals
• Learn and implement the adjacency matrix structure an Kruskal’s minimum spanning tree algorithm
• Understand and use the decorator pattern and various JDSL classes and interfaces
• Implement the adjacency matrix structure for representing a graph
• Implement Kruskal’s MST algorithm
• Frontend
• Computation and visualization of an approximate traveling salesperson tour
• Edge list structure
• Augmented vertex objects
• Integer key (index) associated with vertex
• 2D-array adjacency array
• Reference to edge object for adjacent vertices
• Null for non nonadjacent vertices

a

v

b

u

w

2

w

0

u

1

v

b

a

Kruskal’s Algorithm

AlgorithmKruskalMSF(G)

Input weighted graph G

Output labeling of the edges of a minimum spanning forest of G

Q new heap-based priority queue

forallv  G.vertices() do

l makeSet(v) { elementary cloud }

setLocator(v,l)

foralle  G.edges() do

Q.insert(weight(e),e)

while Q.isEmpty()

e Q.removeMin()

[u,v]  G.endVertices(e)

A find(getLocator(u))

B find(getLocator(v))

ifA  B

setMSFedge(e)

{ merge clouds }

union(A, B)

• The vertices are partitioned into clouds
• We start with one cloud per vertex
• Clouds are merged during the execution of the algorithm
• makeSet(o): create set {o} and return a locator for object o
• find(l): return the set of the object with locator l
• union(A,B): merge sets A and B
Example

G

G

8

8

B

B

4

4

E

E

9

9

6

6

5

5

F

1

F

1

3

3

C

C

11

11

2

2

7

H

7

H

D

D

A

A

10

10

G

G

8

8

B

B

4

4

E

E

9

9

6

6

5

5

F

F

1

1

3

3

C

C

11

11

2

2

7

7

H

H

D

D

A

A

10

10

Example (contd.)

G

G

8

8

B

B

4

4

E

E

9

9

6

6

5

5

F

F

1

1

3

3

C

C

11

11

2

2

7

H

7

H

D

D

A

A

10

10

four steps

two steps

G

G

8

8

B

B

4

4

E

E

9

9

6

6

5

5

F

F

1

1

3

3

C

C

11

11

2

2

7

7

H

H

D

D

A

A

10

10

Partition implementation

A set is represented the sequence of its elements

A position stores a reference back to the sequence itself (for operation find)

The position of an element in the sequence serves as locator for the element in the set

In operation union, we move the elements of the smaller sequence into to the larger sequence

Worst-case running times

makeSet, find: O(1)

union: O(min(nA,nB))

Amortized analysis

Consider a series of k Partiton ADT operations that includes nmakeSet operations

Each time we move an element into a new sequence, the size of its set at least doubles

An element is moved at most log2ntimes

Moving an element takes O(1) time

The total time for the series of operations is O(kn log n)

Partition Implementation
Analysis of Kruskal’s Algorithm
• Graph operations
• Methods vertices and edges are called once
• Method endVertices is called m times
• Priority queue operations
• We perform m insert operations and mremoveMin operations
• Partition operations
• We perform nmakeSet operations, 2m find operations and no more than n -1union operations
• Label operations
• We set vertex labels n times and get them 2m times
• Kruskal’s algorithm runs in time O((n + m) log n) time provided the graph has no parallel edges and is represented by the adjacency list structure
Labels are commonly used in graph algorithms

Auxiliary data

Output

Examples

DFS: unexplored/visited label for vertices and unexplored/ forward/back labels for edges

Dijkstra and Prim-Jarnik: distance, locator, and parent labels for vertices

Kruskal: locator label for vertices and MSF label for edges

The decorator pattern extends the methods of the Position ADT to support the handling of attributes (labels)

has(a): tests whether the position has attribute a

get(a): returns the value of attribute a

set(a, x): sets to x the value of attribute a

destroy(a): removes attribute a and its associated value (for cleanup purposes)

The decorator pattern can be implemented by storing a dictionary of (attribute, value) items at each position

Decorator Pattern
Traveling Salesperson Problem
• A tour of a graph is a spanning cycle (e.g., a cycle that goes through all the vertices)
• A traveling salesperson tour of a weighted graph is a tour that is simple (i.e., no repeated vertices or edges) and has minimum weight
• No polynomial-time algorithms are known for computing traveling salesperson tours
• The traveling salesperson problem (TSP) is a major open problem in computer science
• Find a polynomial-time algorithm computing a traveling salesperson tour or prove that none exists

7

D

B

4

2

5

F

2

C

8

3

6

E

A

1

Example of travelingsalesperson tour

(with weight 17)

TSP Approximation
• We can approximate a TSP tour with a tour of at most twice the weight for the case of Euclidean graphs
• Vertices are points in the plane
• Every pair of vertices is connected by an edge
• The weight of an edge is the length of the segment joining the points
• Approximation algorithm
• Compute a minimum spanning tree
• Form an Eulerian circuit around the MST
• Transform the circuit into a tour