Chapter 8: Graphs

Objectives

Looking ahead – in this chapter, we’ll consider

  • Graph Representation
  • Graph Traversals
  • Shortest Paths
  • Cycle Detection
  • Spanning Trees
  • Connectivity

Data Structures and Algorithms in C++, Fourth Edition

Objectives (continued)

  • Topological Sort
  • Networks
  • Matching
  • Eulerian and Hamiltonian Graphs
  • Graph Coloring
  • NP-Complete Problems in Graph Theory

Introductory Remarks

Although trees are quite flexible, they have an inherent limitation in that they can only express hierarchical structures

Fortunately, we can generalize a tree to form a graph, in which this limitation is removed

Informally, a graph is a collection of nodes and the connections between them

Figure 8.1 illustrates some examples of graphs; notice there is typically no limitation on the number of vertices or edges

Consequently, graphs are extremely versatile and applicable to a wide variety of situations

Graph theory has developed into a sophisticated field of study since its origins in the early 1700s

Introductory Remarks (continued)

Fig. 8.1 Examples of graphs: (a–d) simple graphs; (c) a complete graph K4; (e) a multigraph; (f) a pseudograph; (g) a circuit in a digraph; (h) a cycle in the digraph

Introductory Remarks (continued)

And, while many results are theoretical, the applications of graphs are numerous and worth consideration

First, though, we need to consider some definitions

A simple graph G = (V, E) consists of a finite set V and a collection E of unordered pairs {u, v} of distinct elements of V

Each element of V is called a vertex (or a point or a node), and each element of E is called an edge (or a line or a link)

The number of vertices, the cardinality of V, is called the order of the graph and is denoted by |V|

The cardinality of E, called the size of the graph, is denoted by |E|

Introductory Remarks (continued)

A graph G = (V, E) is directed if the edge set is composed of ordered vertex (node) pairs

Now these definitions restrict the number of edges that can occur between any two vertices to one

If we allow multiple edges between any two vertices, we have a multigraph (Figure 8.1e)

Formally, a multigraph is defined as G = (V, E, f), where V is the set of vertices, E is the set of edges, and f : E → {{vi, vj} : vi, vj ∈ V and vi ≠ vj} is a function that maps each edge to a pair of distinct vertices

A pseudograph is a multigraph that drops the vi ≠ vj condition, allowing the graph to have loops (Figure 8.1f)

Introductory Remarks (continued)

A path from vertex v1 to vertex vn is a sequence of edges edge(v1v2), edge(v2v3), …, edge(vn-1vn), denoted v1, v2, …, vn-1, vn

If v1 = vn and no edge is repeated, the path is a circuit (Figure 8.1g); if, in addition, all the vertices in a circuit are different, it is a cycle (Figure 8.1h)

A weighted graph assigns a value to each edge, based on contextual usage

A complete graph of n vertices, denoted Kn, has exactly one edge between each pair of vertices (Figure 8.1c)

The edge count of Kn is $|E| = \binom{|V|}{2} = \frac{|V|!}{2!\,(|V|-2)!} = \frac{|V|(|V|-1)}{2} = O(|V|^2)$

Introductory Remarks (continued)

A subgraph of a graph G = (V, E), designated G’, is a graph (V’, E’) where V’ ⊆ V and E’ ⊆ E

If E’ contains every edge of E whose two endpoints both belong to V’, the subgraph is said to be induced by its vertices V’

Two vertices are adjacent if the edge defined by them is in E

Such an edge is said to be incident with those vertices

The number of edges incident with a vertex v, is the degree of the vertex; if the degree is 0, v is called isolated

Notice that the definition of a graph allows the set E to be empty, so a graph may be composed of isolated vertices

Graph Representation

Graphs can be represented in a number of ways

One of the simplest is an adjacency list, where each vertex adjacent to a given vertex is listed

This can be designed as a table (known as a star representation) or a linked list, shown in Figure 8.2b-c on page 393

Another representation is as a matrix, which can be designed in two ways

An adjacency matrix is a |V| x |V| binary matrix where aij = 1 if edge(vivj) exists and aij = 0 otherwise

Graph Representation (continued)

An example of an adjacency matrix is shown in Figure 8.2d

The order of the vertices in the matrix is arbitrary, so there are n! possible matrices for a graph of n vertices

It is also possible to generalize an adjacency matrix definition to handle a multigraph by defining aij = number of edges between vi and vj

A second matrix representation is based on incidences, hence the name incidence matrix

An incidence matrix is a |V| x |E| binary matrix where aij = 1 if edge ej is incident with vertex vi and aij = 0 otherwise

Graph Representation (continued)

An example of an incidence matrix is shown in Figure 8.2e

For a multigraph, columns that represent parallel edges are identical; if loops are allowed, a column with a single 1 represents a loop

As far as usage, the proper structure depends to a great extent on the kinds of operations that need to be done
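
The three representations can be sketched in C++ as follows. This is a minimal illustration for an undirected simple graph whose vertices are numbered from 0; the type aliases and function names (AdjacencyList, makeIncidenceMatrix, and so on) are assumptions made for this example and do not come from the text.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical aliases for illustration only.
using Edge = std::pair<int, int>;                    // unordered pair {u, v}
using AdjacencyList = std::vector<std::vector<int>>;
using AdjacencyMatrix = std::vector<std::vector<int>>;
using IncidenceMatrix = std::vector<std::vector<int>>;

// Adjacency list: adj[v] lists every vertex adjacent to v.
AdjacencyList makeAdjacencyList(int n, const std::vector<Edge>& edges) {
    AdjacencyList adj(n);
    for (const Edge& e : edges) {
        adj[e.first].push_back(e.second);
        adj[e.second].push_back(e.first);
    }
    return adj;
}

// Adjacency matrix: |V| x |V|, a[i][j] = 1 if edge(vi vj) exists.
AdjacencyMatrix makeAdjacencyMatrix(int n, const std::vector<Edge>& edges) {
    AdjacencyMatrix a(n, std::vector<int>(n, 0));
    for (const Edge& e : edges) {
        a[e.first][e.second] = 1;
        a[e.second][e.first] = 1;
    }
    return a;
}

// Incidence matrix: |V| x |E|, m[i][j] = 1 if edge ej is incident with vi.
IncidenceMatrix makeIncidenceMatrix(int n, const std::vector<Edge>& edges) {
    IncidenceMatrix m(n, std::vector<int>(edges.size(), 0));
    for (std::size_t j = 0; j < edges.size(); ++j) {
        m[edges[j].first][j] = 1;
        m[edges[j].second][j] = 1;
    }
    return m;
}
```

The list takes space proportional to |V| + |E|, while the two matrices take |V|·|V| and |V|·|E| entries, which is one reason the choice depends on the operations needed.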

Graph Traversals

Like tree traversals, graph traversals visit each node once

However, we cannot apply tree traversal algorithms to graphs because of cycles and isolated vertices

One algorithm for graph traversal, called the depth-first search, was developed by John Hopcroft and Robert Tarjan in 1974

In this algorithm, each vertex is visited and then all the unvisited vertices adjacent to that vertex are visited

If the vertex has no adjacent vertices, or if they have all been visited, we backtrack to that vertex’s predecessor

This continues until we return to the vertex where the traversal started

Graph Traversals (continued)

If any vertices remain unvisited at this point, the traversal restarts at one of the unvisited vertices

Although not necessary, the algorithm assigns unique numbers to the vertices, so they are renumbered

Pseudocode for this algorithm is shown on page 395

Figure 8.3 shows an example of this traversal; the numbers indicate the order in which the nodes are visited; the solid lines indicate the edges traversed during the search

Fig. 8.3 An example of application of the depthFirstSearch() algorithm to a graph
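
The traversal described above can be sketched as follows. This is not the pseudocode from page 395; it is a minimal recursive version that assumes an adjacency-list graph with vertices numbered from 0, and the names DfsResult and dfsVisit are invented for the example.

```cpp
#include <utility>
#include <vector>

// Hypothetical result type: visiting numbers plus the edges actually traversed.
struct DfsResult {
    std::vector<int> num;                     // num[v] = order of visit (1-based), 0 = unvisited
    std::vector<std::pair<int, int>> edges;   // edges of the resulting tree or forest
};

// Number v, then visit every still-unvisited vertex adjacent to v.
// Returning from the call is the "backtrack to the predecessor" step.
void dfsVisit(const std::vector<std::vector<int>>& adj, int v,
              int& counter, DfsResult& result) {
    result.num[v] = ++counter;
    for (int u : adj[v]) {
        if (result.num[u] == 0) {
            result.edges.push_back(std::make_pair(v, u));
            dfsVisit(adj, u, counter, result);
        }
    }
}

// Restart the search from each vertex that is still unvisited,
// so isolated vertices and separate pieces are also covered.
DfsResult depthFirstSearch(const std::vector<std::vector<int>>& adj) {
    DfsResult result;
    result.num.assign(adj.size(), 0);
    int counter = 0;
    for (int v = 0; v < static_cast<int>(adj.size()); ++v) {
        if (result.num[v] == 0) {
            dfsVisit(adj, v, counter, result);
        }
    }
    return result;
}
```

Edges to already numbered vertices are never added to result.edges, which is why the collected edges cannot contain a cycle.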

Graph Traversals (continued)

The algorithm guarantees that we will create a tree (or a forest, which is a set of trees) including the graph’s vertices

Such a tree is called a spanning tree

The guarantee is based on the algorithm not processing any edge that leads to an already visited node

Consequently, some edges are not included in the tree (marked with dashed lines)

The edges included in the tree are called forward edges; those omitted are called back edges

In Figure 8.4, we can see this algorithm applied to a digraph, which is a graph where the edges have a direction

Graph Traversals (continued)

Fig. 8.4 The depthFirstSearch() algorithm applied to a digraph

Notice in this case we end up with a forest of three trees, because the traversal must follow the direction of the edges

There are a number of algorithms based on depth-first searching

However, some are more efficient if the underlying mechanism is breadth-first instead

Graph Traversals (continued)

Recall from our consideration of tree traversals that depth-first traversals used a stack, while breadth-first used queues

This can be extended to graphs, as the pseudocode on page 397 illustrates

Figure 8.5 shows this applied to a graph; Figure 8.6 shows the application to a digraph

In both, the basic operation is to mark all the vertices accessible from a given vertex, placing them in a queue as they are visited

The first vertex in the queue is then removed, and the process repeated

No visited nodes are revisited; if a node has no accessible nodes, the next node in the queue is removed and processed
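
A queue-based version of these steps might look like the sketch below. It assumes an adjacency-list graph with vertices numbered from 0 and returns the visiting order; it is an illustration, not the pseudocode from page 397.

```cpp
#include <queue>
#include <vector>

// Returns the vertices in the order in which breadth-first search visits them.
// The outer loop restarts the search for vertices not reachable so far.
std::vector<int> breadthFirstSearch(const std::vector<std::vector<int>>& adj) {
    std::vector<bool> visited(adj.size(), false);
    std::vector<int> order;
    std::queue<int> pending;
    for (int start = 0; start < static_cast<int>(adj.size()); ++start) {
        if (visited[start]) continue;
        visited[start] = true;
        pending.push(start);
        while (!pending.empty()) {
            int v = pending.front();      // remove the first vertex in the queue
            pending.pop();
            order.push_back(v);
            for (int u : adj[v]) {        // mark and enqueue every unvisited neighbor
                if (!visited[u]) {
                    visited[u] = true;
                    pending.push(u);
                }
            }
        }
    }
    return order;
}
```

Marking a vertex when it is enqueued, rather than when it is dequeued, keeps any vertex from entering the queue twice.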

Graph Traversals (continued)

Fig. 8.5 An example of application of the breadthFirstSearch() algorithm to a graph

Fig. 8.6 The breadthFirstSearch() algorithm applied to a digraph

Shortest Paths

A classical problem in graph theory is finding the shortest path between two nodes, with numerous approaches suggested

The edges of the graph are associated with values denoting such things as distance, time, costs, amounts, etc.

If we’re determining the distance between two vertices, say v and u, we need to keep track of information about the distances to the intermediate vertices w on the path

This can be recorded as a label associated with the vertices

The label may simply be the distance between vertices, or the distance along with the current node’s predecessor in the path

Methods for finding shortest paths depend on these labels

Shortest Paths (continued)

Based on how many times the labels are updated, solutions to the shortest path problem fall into two groups

In label-setting methods, each pass through the vertices that remain to be processed sets the label of one vertex to a value that remains unchanged for the rest of the execution

The main drawback to this is that we cannot process graphs that have negative weights on any edges

In label-correcting methods, any label can be changed

This means it can be applied to graphs with negative weights as long as they don’t have negative cycles (a cycle where the sum of the edges is a negative value)

Shortest Paths (continued)

However, this method guarantees that once processing is complete, the current distance of every vertex is the length of its shortest path

Most of these forms (both label-setting and label-correcting) can nevertheless be looked at as part of the same general process

That is the task of finding the shortest paths from one vertex to all the other vertices, the pseudocode being on page 399

In this algorithm, a label is defined as:

label(v) = (currDist(v),predecessor(v))

Two open issues in the code are the design of the set called toBeChecked and the order new values are assigned to v

It is the design of the set that impacts both the choice of v and the efficiency of the algorithm

Shortest Paths (continued)

The distinction between label-setting and label-correcting algorithms is the way the value for vertex v is chosen

In label-setting algorithms, v is the vertex in the set toBeChecked with the smallest current distance

One of the first label-setting algorithms was developed by Edsger Dijkstra in 1956

In this algorithm, a number of paths from a vertex v are tried, and each time the shortest among them is chosen for extension

This means that a particular path may be extended by adding one more edge to it each time v is checked

However, if the path is longer than any other path from that point, it is dropped, and the other path is expanded

Shortest Paths (continued)

Since the vertices may have more than one outgoing edge, each new edge adds possible paths for exploration

Thus each vertex is visited, the new paths are started, and the vertex is then not used anymore

Once all the vertices are visited, the algorithm is done

Dijkstra’s algorithm is shown on page 400; it is derived from the general algorithm by changing the line

v=a vertex in toBeChecked;

to

v=a vertex in toBeChecked with minimal currDist(v);

It also extends the condition in the if to make permanent the current distance of vertices eliminated from the set
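
One way to realize this selection rule is sketched below. The text leaves the structure of toBeChecked open, so the binary-heap priority queue used here is an assumption of this example, as are the names Arc and Labels; it is not the pseudocode from page 400.

```cpp
#include <functional>
#include <limits>
#include <queue>
#include <utility>
#include <vector>

struct Arc { int to; long long weight; };     // hypothetical edge record

// label(v) = (currDist(v), predecessor(v)) for every vertex.
struct Labels {
    std::vector<long long> currDist;
    std::vector<int> predecessor;             // -1 when no predecessor is known
};

const long long INF = std::numeric_limits<long long>::max();

// Assumes all weights are nonnegative, as label-setting requires.
Labels dijkstra(const std::vector<std::vector<Arc>>& adj, int first) {
    Labels lab;
    lab.currDist.assign(adj.size(), INF);
    lab.predecessor.assign(adj.size(), -1);
    lab.currDist[first] = 0;

    typedef std::pair<long long, int> Item;   // (currDist, vertex)
    std::priority_queue<Item, std::vector<Item>, std::greater<Item>> toBeChecked;
    toBeChecked.push(Item(0, first));
    std::vector<bool> done(adj.size(), false);

    while (!toBeChecked.empty()) {
        int v = toBeChecked.top().second;     // vertex with minimal currDist(v)
        toBeChecked.pop();
        if (done[v]) continue;                // outdated queue entry
        done[v] = true;                       // the label of v is now permanent
        for (const Arc& a : adj[v]) {
            // Only vertices still to be checked may have their label changed.
            if (!done[a.to] && lab.currDist[v] + a.weight < lab.currDist[a.to]) {
                lab.currDist[a.to] = lab.currDist[v] + a.weight;
                lab.predecessor[a.to] = v;
                toBeChecked.push(Item(lab.currDist[a.to], a.to));
            }
        }
    }
    return lab;
}
```

Vertices that remain at INF are unreachable from first; following predecessor back from any other vertex recovers its shortest path.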

Shortest Paths (continued)

Notice that the set’s structure is not indicated; recall it is the structure that determines efficiency

Figure 8.7 illustrates this for the graph in part (a)

Fig. 8.7 An execution of DijkstraAlgorithm()

Shortest Paths (continued)

As a label-setting algorithm, Dijkstra’s approach may fail when negative weights are used in graphs

To deal with that, a label-correcting algorithm is needed

One of the first label-correcting algorithms was developed by Lester R. Ford, Jr. in the late 1950s

It uses the same technique as Dijkstra’s method to set the current distances, but postpones determining the shortest distance for any vertex until the entire graph is processed

While it is capable of handling graphs with negative weights, it cannot deal with negative cycles

In the algorithm, all edges are watched in an attempt to find an improvement for the current distance of the vertices

Shortest Paths (continued)

The pseudocode for the algorithm is shown on page 402

To facilitate monitoring the vertices, an alphabetic sequence can be used

That way the algorithm can go through the list repeatedly and adjust any vertex’s current distance as needed

Figure 8.8 contains an example of this; note that the graph does include negatively weighted edges

While a vertex may change its current distance during the same iteration, when done each vertex can be reached by the shortest path from the starting vertex
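
The repeated scanning of all edges can be sketched as below. This is an illustration in the spirit of FordAlgorithm(), not the pseudocode from page 402; the edge-list input and the name WeightedEdge are assumptions, and the pass limit is added here only so the loop also terminates if a negative cycle is present.

```cpp
#include <limits>
#include <vector>

struct WeightedEdge { int from; int to; long long weight; };   // hypothetical edge record

const long long INF = std::numeric_limits<long long>::max();

// Scan every edge repeatedly and lower currDist wherever an improvement exists.
// Assumes no negative cycle is reachable from first; in that case at most
// n - 1 passes can produce an improvement.
std::vector<long long> fordAlgorithm(int n, const std::vector<WeightedEdge>& edges,
                                     int first) {
    std::vector<long long> currDist(n, INF);
    currDist[first] = 0;
    bool improved = true;
    for (int pass = 0; improved && pass < n; ++pass) {
        improved = false;
        for (const WeightedEdge& e : edges) {
            if (currDist[e.from] != INF &&
                currDist[e.from] + e.weight < currDist[e.to]) {
                currDist[e.to] = currDist[e.from] + e.weight;
                improved = true;
            }
        }
    }
    return currDist;
}
```

No label is treated as final until a whole pass makes no change, which is what lets negative weights be handled.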

Shortest Paths (continued)


Fig. 8.8 FordAlgorithm() applied to a digraph with negative weights

In the case of Dijkstra’s algorithm, we observed that the efficiency can be improved by the choice of data structure

This in turn impacts the way the edges and vertices are scanned

Shortest Paths (continued)

This observation also holds for label-correcting algorithms; in particular, FordAlgorithm() specifies no order for edge checking

In the example of Figure 8.8, the approach was to visit all adjacency lists of all vertices in each iteration

However this requires that all the edges are checked every time, which is inefficient

A more sensible organization of the vertices can reduce the number of visits per vertex

The generic algorithm on page 399 suggests an improvement by explicitly accessing toBeChecked

In FordAlgorithm() this structure is used implicitly, and then only as the set of all vertices

Shortest Paths (continued)

So based on this, we can derive a general label-correcting algorithm, shown in pseudocode on page 403

As indicated before, the efficiency of the algorithm depends directly on the data structure used for toBeChecked

One possibility is a queue, which was the basis for one of the earliest implementations

With a queue, when a vertex v is removed, the current distances of its neighbors are checked

If any of those distances is updated, the vertex whose distance was changed is added to the queue

While straightforward, it can sometimes reevaluate the same labels excessively
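
The queue-based variant can be sketched as follows. It is a minimal illustration rather than the pseudocode from page 403; the names Arc and labelCorrectingQueue are invented, and the code assumes the graph has no negative cycle, since otherwise the loop would not terminate.

```cpp
#include <limits>
#include <queue>
#include <vector>

struct Arc { int to; long long weight; };     // hypothetical edge record

const long long INF = std::numeric_limits<long long>::max();

// Label-correcting shortest paths with toBeChecked organized as a queue.
// Assumes no negative cycle is reachable from first.
std::vector<long long> labelCorrectingQueue(const std::vector<std::vector<Arc>>& adj,
                                            int first) {
    std::vector<long long> currDist(adj.size(), INF);
    std::vector<bool> inQueue(adj.size(), false);
    std::queue<int> toBeChecked;
    currDist[first] = 0;
    toBeChecked.push(first);
    inQueue[first] = true;
    while (!toBeChecked.empty()) {
        int v = toBeChecked.front();
        toBeChecked.pop();
        inQueue[v] = false;
        for (const Arc& a : adj[v]) {
            if (currDist[v] + a.weight < currDist[a.to]) {
                currDist[a.to] = currDist[v] + a.weight;
                if (!inQueue[a.to]) {         // a changed label must be checked again
                    toBeChecked.push(a.to);
                    inQueue[a.to] = true;
                }
            }
        }
    }
    return currDist;
}
```

A vertex can enter the queue many times, once for each improvement of its label, which is the repeated reevaluation mentioned above.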

Shortest Paths (continued)

Figure 8.9 illustrates this problem for the graph of Figure 8.8a

Fig. 8.9 An execution of labelCorrectingAlgorithm(), which uses a queue

As can be seen, a number of vertices are updated multiple times

Shortest Paths (continued)

To avoid this situation, a deque can be used in place of the queue

In this approach, vertices that need to be checked for the first time are added at the end of the deque; vertices that have been in it before are placed at the front

The reasoning behind this is that if a given vertex, v, is included for the first time, the vertices accessible from it have yet to be processed, so they will be processed after v

However, if v has been processed, those vertices are likely still in the list awaiting processing, so putting v in front may avoid unnecessary updates

Figure 8.10 shows the result of using a deque instead of a queue
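
The deque rule can be sketched in code as below. This is an illustrative version with invented names (Arc, labelCorrectingDeque, wasQueued), again assuming that no negative cycle is reachable from the start vertex.

```cpp
#include <deque>
#include <limits>
#include <vector>

struct Arc { int to; long long weight; };     // hypothetical edge record

const long long INF = std::numeric_limits<long long>::max();

// Label-correcting shortest paths with toBeChecked organized as a deque.
// Assumes no negative cycle is reachable from first.
std::vector<long long> labelCorrectingDeque(const std::vector<std::vector<Arc>>& adj,
                                            int first) {
    std::vector<long long> currDist(adj.size(), INF);
    std::vector<bool> inDeque(adj.size(), false);
    std::vector<bool> wasQueued(adj.size(), false);   // has the vertex ever been included?
    std::deque<int> toBeChecked;
    currDist[first] = 0;
    toBeChecked.push_back(first);
    inDeque[first] = true;
    wasQueued[first] = true;
    while (!toBeChecked.empty()) {
        int v = toBeChecked.front();
        toBeChecked.pop_front();
        inDeque[v] = false;
        for (const Arc& a : adj[v]) {
            if (currDist[v] + a.weight < currDist[a.to]) {
                currDist[a.to] = currDist[v] + a.weight;
                if (inDeque[a.to]) continue;          // already waiting to be checked
                if (wasQueued[a.to]) {
                    toBeChecked.push_front(a.to);     // seen before: recheck it first
                } else {
                    toBeChecked.push_back(a.to);      // first time: wait at the end
                    wasQueued[a.to] = true;
                }
                inDeque[a.to] = true;
            }
        }
    }
    return currDist;
}
```

The only difference from the queue version is where a vertex is reinserted, so the two can be compared directly on the same input.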

Shortest Paths (continued)

Fig. 8.10 An execution of labelCorrectingAlgorithm(), which applies a deque

The use of a deque does suffer from one problem, however

Its worst case performance is exponential in the number of vertices

Shortest Paths (continued)

However, the average case is about 60% better than the queue version of the same algorithm

A variation of this approach uses two queues separately, rather than combined in a deque

In this variation, vertices enqueued for the first time are placed in the first queue; otherwise they are placed in the second

Vertices are then dequeued from the first queue if it is not empty; otherwise they are taken from the second

The threshold algorithm is another variation of the label-correcting method that uses two lists

Vertices are removed from the first list for processing

Shortest Paths (continued)

A vertex will be added to the end of the first list if the value of its label is below the threshold level

Otherwise it will be added to the second list

If the first list becomes empty, the threshold is modified to a value greater than the minimum label value of all vertices in the second list

Then those vertices whose labels are less than the new threshold are moved from the second list to the first list

Yet another approach is the small label first method

In this method, a vertex is placed at the front of the deque if its label is smaller than the label of the current front of the deque; otherwise it is placed at the rear

Shortest Paths (continued)
  • All-to-All Shortest Path Problem
    • Given the issues of finding the shortest path from one vertex to another, the problem of finding the shortest paths between all pairs of vertices might seem daunting
    • However, a method developed by Stephen Warshall in 1962 does it fairly easily, as long as an adjacency matrix that provides edge weights is available
    • This technique can also handle negative edge weights and the algorithm is shown on page 406
    • An example of the algorithm’s application, together with the accompanying adjacency matrix, is shown in Figure 8.11 on page 407
    • The algorithm can also detect cycles if the diagonal of the matrix is initialized to ∞ instead of 0
    • If any of the diagonal values get changed, the graph contains a cycle

Shortest Paths (continued)
  • All-to-All Shortest Path Problem (continued)
    • As it turns out, if an initial value of ∞ is not changed during processing, then one vertex cannot reach the other
    • The algorithm’s simplicity is reflected in the determination of its complexity; there are three nested loops, each executed |V| times, so it is $O(|V|^3)$
    • This is adequate for dense, near-complete graphs, but if they are sparse, it may be better to use a one-to-all method applied to each vertex
    • Generally this should be a label-setting algorithm, but recall that these types of routines cannot handle negative edge weights
    • Fortunately, there are transformations available that eliminate the negative weights while preserving the shortest paths of the original
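
The three nested loops can be sketched as follows. This is an illustration, not the algorithm text from page 406; the function name and the choice of a large constant INF to stand for ∞ are assumptions of the example, and the diagonal is expected to be initialized to 0.

```cpp
#include <cstddef>
#include <limits>
#include <vector>

// Stand-in for infinity; dividing by 4 leaves room to add two finite entries safely.
const long long INF = std::numeric_limits<long long>::max() / 4;

// weight[j][l] holds the edge weight from j to l, INF if there is no edge.
// On return it holds the length of the shortest path from j to l.
// Assumes the graph has no negative cycle.
void allPairsShortestPaths(std::vector<std::vector<long long>>& weight) {
    const std::size_t n = weight.size();
    for (std::size_t k = 0; k < n; ++k) {             // allow k as an intermediate vertex
        for (std::size_t j = 0; j < n; ++j) {
            for (std::size_t l = 0; l < n; ++l) {
                if (weight[j][k] < INF && weight[k][l] < INF &&
                    weight[j][k] + weight[k][l] < weight[j][l]) {
                    weight[j][l] = weight[j][k] + weight[k][l];
                }
            }
        }
    }
}
```

An entry that is still INF afterwards means the second vertex cannot be reached from the first.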

Cycle Detection

Numerous algorithms rely on their ability to detect cycles in graphs

Our consideration of the Warshall-Floyd algorithm in the previous section demonstrated that it can detect cycles

However, its cubic order makes it too inefficient to use in all circumstances, so other methods have to be considered

One algorithm, based on the depthFirstSearch() routine, works well for undirected graphs

The pseudocode for this is shown on page 408
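
A depth-first cycle check for undirected graphs might be sketched like this. It is not the pseudocode from page 408; it assumes a simple graph (no parallel edges) stored as an adjacency list, and the function names are invented for the example.

```cpp
#include <vector>

// Depth-first step: an edge to an already visited vertex other than the one
// we arrived from is a back edge, and a back edge closes a cycle.
bool findsCycleFrom(const std::vector<std::vector<int>>& adj, int v, int parent,
                    std::vector<bool>& visited) {
    visited[v] = true;
    for (int u : adj[v]) {
        if (!visited[u]) {
            if (findsCycleFrom(adj, u, v, visited)) return true;
        } else if (u != parent) {
            return true;
        }
    }
    return false;
}

// Checks every piece of the graph by restarting from unvisited vertices.
bool hasCycle(const std::vector<std::vector<int>>& adj) {
    std::vector<bool> visited(adj.size(), false);
    for (int v = 0; v < static_cast<int>(adj.size()); ++v) {
        if (!visited[v] && findsCycleFrom(adj, v, -1, visited)) return true;
    }
    return false;
}
```

Each vertex and edge is examined a constant number of times, in contrast to the cubic cost of the matrix-based method.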

Digraphs complicate matters, because the spanning subtrees might have edges between them (called side edges)

Cycle Detection (continued)

If two vertices already included in a subtree are joined by a back edge, it indicates a cycle

To take this case into account, a number greater than any number that subsequent searches can generate is assigned to the current vertex after all its descendants have been visited

A cycle is then detected if a vertex is about to be joined by an edge to a vertex with a lower number

With this modification, the algorithm appears in pseudocode on page 409

Cycle Detection (continued)
  • Union-Find Problem
    • We’ve seen that the depth-first search guarantees creating a spanning tree with no cycles
    • However, a problem occurs when the depth-first search algorithm is modified to determine if a specific edge is part of a cycle
    • If the modified algorithm is applied to each edge separately, the total running time could become $O(|V|^4)$ for dense graphs
    • This is unacceptable, and a better approach needs to be investigated
    • The basic task is to determine if two vertices are members of the same set
    • Two procedures are needed for this: first, to find the set to which a vertex v belongs, and second, to unite two sets into one if v belongs to one set and vertex w belongs to another

Cycle Detection (continued)
  • Union-Find Problem (continued)
    • This process is known as the union-find problem
    • Circular-linked lists are used to implement the sets involved in solving the union-find problem
    • The lists are identified by a vertex which is the root of the tree containing the vertices in that list
    • The vertices are numbered from 0 to |V| – 1, and these numbers serve as indices into three arrays
      • root[] stores the index of a vertex identifying a set of vertices
      • next[] indicates the next vertex on a list
      • length[] indicates the number of vertices in a list
    • The circular lists are used to enable combining the lists immediately
    • This is shown in Figure 8.12

Cycle Detection (continued)
  • Union-Find Problem (continued)

Fig. 8.12 Concatenating two circular linked lists

    • The two lists are merged into one by interchanging next pointers
    • However, all the vertices now have to have the same root, so the vertices of one of the lists need to have their root indicators changed
    • This should be the shorter of the two lists, which can be determined by the length[] array
    • Since the union operation performs all the needed tasks, the find operation is trivial

Cycle Detection (continued)
  • Union-Find Problem (continued)
    • By constantly updating the root[] array, the set to which a vertex v belongs can be identified immediately because it is the set identified by root[v]
    • Thus after initializations, the union algorithm can be defined as shown in pseudocode on page 410
    • An application of this is shown in Figure 8.13
    • After the initialization completes, the |V| one-node lists are as shown in Figure 8.13a
    • These smaller ones are merged into larger ones by repeated execution of the union algorithm, and the arrays updated as seen in Figure 8.13 b-d
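
The three arrays and the merge step can be sketched as a small class. This is an illustration built from the description above, not the pseudocode from page 410; the class name CircularListSets and its member functions are assumptions of the example.

```cpp
#include <utility>
#include <vector>

// Sets of vertices kept as circular linked lists inside three arrays.
class CircularListSets {
public:
    // Initialization: every vertex is a one-node list and its own root.
    explicit CircularListSets(int n) : root(n), next(n), length(n, 1) {
        for (int i = 0; i < n; ++i) root[i] = next[i] = i;
    }

    // find is trivial because root[] is always kept up to date.
    int find(int v) const { return root[v]; }

    // Number of vertices in the set that contains v.
    int size(int v) const { return length[root[v]]; }

    // Merges the sets of v and u; returns false if they already share a set.
    bool unite(int v, int u) {
        int longer = root[v];
        int shorter = root[u];
        if (longer == shorter) return false;
        if (length[longer] < length[shorter]) std::swap(longer, shorter);
        length[longer] += length[shorter];
        // Only the vertices of the shorter list get a new root.
        root[shorter] = longer;
        for (int j = next[shorter]; j != shorter; j = next[j]) root[j] = longer;
        // Interchanging two next pointers splices the circular lists together.
        std::swap(next[shorter], next[longer]);
        return true;
    }

private:
    std::vector<int> root;
    std::vector<int> next;
    std::vector<int> length;
};
```

When unite returns false for the endpoints of an edge, adding that edge to the structure built so far would close a cycle.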

Cycle Detection (continued)

Union-Find Problem (continued)

Fig. 8.13 An example of application of union() to merge lists

Spanning Trees

Consider an airline that has routes between seven cities represented as the graph in Figure 8.14a

Fig. 8.14 A graph representing (a) the airline connections between seven cities and (b–d) three possible sets of connections

If economic hardships force the airline to cut routes, which ones should be kept to preserve a route to each city, if only indirectly?

One possibility is shown in Figure 8.14b

Spanning Trees (continued)

However, we want to make sure we have the minimum connections necessary to preserve the routes

To accomplish this, a spanning tree should be used, specifically one created using depthFirstSearch()

There is a possibility of multiple spanning trees (Figure 8.14c-d), but each of these has the minimum number of edges

We don’t know which of these might be optimal, since we haven’t taken distances into account

The airline, wanting to minimize costs, will want to use the shortest distances for the connections

So what we want to find is the minimum spanning tree, where the sum of the edge weights is minimal

Spanning Trees (continued)

The problem we looked at earlier involving finding a spanning tree in a simple graph is a case of this where edge weights = 1

So each spanning tree is a minimum tree in a simple graph

There are a number of solutions to the minimum spanning tree problem, and we will consider two

One popular algorithm is Kruskal’s algorithm, developed by Joseph Kruskal in 1956

It orders the edges by weight, and then checks to see if they can be added to the tree under construction

It will be added if its inclusion doesn’t create a cycle

Spanning Trees (continued)

The algorithm is as follows:

KruskalAlgorithm(weighted connected undirected graph)
    tree = null;
    edges = sequence of all edges of graph sorted by weight;
    for (i = 1; i ≤ |E| and |tree| < |V| – 1; i++)
        if ei from edges does not form a cycle with edges in tree
            add ei to tree;

A step-by-step example of the application of this algorithm is shown in Figure 8.15ba–bf on page 413
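
A runnable sketch of the same steps is given below. The cycle test is done with a plain parent-array union-find, which is a technique swapped in for brevity rather than the circular-list version; WeightedEdge, findRoot and kruskal are names invented for this example.

```cpp
#include <algorithm>
#include <cstddef>
#include <numeric>
#include <vector>

struct WeightedEdge { int u; int v; int weight; };   // hypothetical edge record

// Follows parent links to the representative of x, shortening the chain on the way.
int findRoot(std::vector<int>& parent, int x) {
    while (parent[x] != x) {
        parent[x] = parent[parent[x]];
        x = parent[x];
    }
    return x;
}

// Returns the edges of a minimum spanning tree of a connected undirected graph.
std::vector<WeightedEdge> kruskal(int n, std::vector<WeightedEdge> edges) {
    std::sort(edges.begin(), edges.end(),
              [](const WeightedEdge& a, const WeightedEdge& b) {
                  return a.weight < b.weight;
              });
    std::vector<int> parent(n);
    std::iota(parent.begin(), parent.end(), 0);      // each vertex starts in its own set
    std::vector<WeightedEdge> tree;
    for (std::size_t i = 0;
         i < edges.size() && static_cast<int>(tree.size()) < n - 1; ++i) {
        int ru = findRoot(parent, edges[i].u);
        int rv = findRoot(parent, edges[i].v);
        if (ru != rv) {               // different sets: the edge cannot close a cycle
            parent[ru] = rv;
            tree.push_back(edges[i]);
        }
    }
    return tree;
}
```

The loop stops as soon as |V| – 1 edges have been accepted, matching the second condition in the pseudocode.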

It is not necessary to order the edges in order to build a minimum spanning tree; any order of edges can be used

An algorithm developed by Dijkstra in 1960 (and independently by Robert Kalaba) pursues this approach

Spanning Trees (continued)

This algorithm is shown below:

DijkstraMethod(weighted connected undirected graph)
    tree = null;
    edges = an unsorted sequence of all edges of graph;
    for i = 1 to |E|
        add ei to tree;
        if there is a cycle in tree
            remove an edge with maximum weight from this only cycle;

In this algorithm, edges are added to the tree one-by-one

If a cycle results, the edge in the cycle with maximum weight is removed

The use of this method is shown in Figure 8.15ca-cl on page 414

Connectivity

In many graph problems we want to find a path from a given vertex to any other vertex

In undirected graphs this means there are no separate pieces in the graph (subgraphs)

In a digraph, we may be able to get to some vertices in a particular direction, but not return to the starting vertex

Connectivity (continued)
  • Connectivity in Undirected Graphs
    • An undirected graph is considered to be connected if there is a path between any two vertices of the graph
    • We can use the depth-first search algorithm to determine connectivity if the while loop heading is removed, so that the search is not restarted from unvisited vertices
    • When the algorithm completes, we check whether the list edges includes all the vertices of the graph
    • Connectivity is described in terms of degrees; a graph is more or less connected depending on the number of different paths between vertices
    • An n-connected graph has at least n different paths between any two vertices
    • This means there are n paths between the vertices that have no vertices in common

Connectivity (continued)
  • Connectivity in Undirected Graphs (continued)
    • One special type of graph is the biconnected (or 2-connected) graph, which has at least two non-overlapping paths between any two vertices
    • If we can find a vertex that always has to be included in the path between vertices a and b, then the graph is not biconnected
    • Removing this vertex, and its incident edges, will split the graph into two subgraphs
    • These vertices are referred to as cut-vertices or articulation points
    • If the graph can be split on an edge, the edge is referred to as a cut-edge or bridge
    • If connected subgraphs have no articulation points or bridges, they are called blocks (if there are at least two vertices, they are biconnected components)

Connectivity (continued)
  • Connectivity in Undirected Graphs (continued)
    • We can detect articulation points by extending the depth-first algorithm to create a tree with forward and back edges
    • A vertex in the resulting tree is an articulation point if it has at least one subtree unconnected with any of its predecessors by a back edge
    • This is illustrated in Figure 8.16 on page 417
    • A special case of articulation points occurs when the vertex involved is a root with more than one descendant
    • In the case of the graph in Figure 8.16, a is the root, and has three incident edges; however, only one becomes a forward edge
    • This is because the other two are processed by the depth-first search

Connectivity (continued)
  • Connectivity in Undirected Graphs (continued)
    • Consequently, if a is reached again, there will be no untried edge, whereas if a were a cut-vertex there would be at least one such edge
    • So for a given vertex, v, the vertex is an articulation point if:
      • v is the root of a depth-first tree and has more than one child (direct descendant) in the tree OR
      • at least one of v’s subtrees includes no vertex connected by a back edge with any of v’s predecessors
    • To find articulation points, a parameter pred(v) is used, defined as the smallest num value among v itself and the vertices connected by a back edge with either v or a descendant of v
    • A stack is used to store the currently processed edges; after the cut-vertex is identified, the graph edges comprising the block are output
    • The pseudocode for the algorithm is on pages 416 and 418
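The articulation-point test above can be sketched in C++ as follows. This is a minimal illustration rather than the book's pseudocode: it only reports cut-vertices (it does not output blocks from an edge stack), it assumes a simple undirected graph stored as adjacency lists with vertices numbered from 0, and the names ArticulationFinder and articulationPoints are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// num[v]  = order in which the depth-first search first reaches v (0 = unvisited)
// pred[v] = smallest num reachable from v's subtree through back edges
struct ArticulationFinder {
    const std::vector<std::vector<int>>& adj;
    std::vector<int> num, pred;
    std::vector<bool> isCut;
    int counter = 0;

    explicit ArticulationFinder(const std::vector<std::vector<int>>& g)
        : adj(g), num(g.size(), 0), pred(g.size(), 0), isCut(g.size(), false) {}

    void dfs(int v, int parent) {
        num[v] = pred[v] = ++counter;
        int children = 0;
        for (int u : adj[v]) {
            assert(u >= 0 && u < (int)adj.size());
            if (num[u] == 0) {                      // forward (tree) edge
                ++children;
                dfs(u, v);
                pred[v] = std::min(pred[v], pred[u]);
                // u's subtree has no back edge to a vertex above v
                if (parent != -1 && pred[u] >= num[v]) isCut[v] = true;
            } else if (u != parent) {               // back edge
                pred[v] = std::min(pred[v], num[u]);
            }
        }
        // special case: a root with more than one child
        if (parent == -1 && children > 1) isCut[v] = true;
    }
};

std::vector<int> articulationPoints(const std::vector<std::vector<int>>& adj) {
    ArticulationFinder finder(adj);
    for (int v = 0; v < (int)adj.size(); ++v)
        if (finder.num[v] == 0) finder.dfs(v, -1);
    std::vector<int> result;
    for (int v = 0; v < (int)adj.size(); ++v)
        if (finder.isCut[v]) result.push_back(v);
    return result;
}
```

For a triangle 0, 1, 2 with a tail 1-3-4 attached, the sketch reports vertices 1 and 3, since removing either one disconnects the tail.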

Connectivity (continued)
  • Connectivity in Directed Graphs
    • With directed graphs, defining connectedness depends on whether or not the direction of the edges is considered
    • A weakly connected digraph is one where the undirected graph with the same edges and vertices is connected
    • A strongly connected digraph has, for every pair of vertices, a path between them in both directions
    • A digraph may not be strongly connected, yet contain strongly connected components (SCCs)
    • These are subsets of vertices in the digraph that of themselves represent a strongly connected digraph

Connectivity (continued)
  • Connectivity in Directed Graphs (continued)
    • Depth-first search can also be used in determining SCCs
    • The root of the SCC is the first vertex of the SCC for which the depth-first search is applied
    • Because every vertex in the SCC is reachable from this root, the value of the root will be less than the value of any other vertex in the SCC
    • Only after those vertices are visited will the depth-first search backtrack to the root
    • At that point the SCC that is accessible from this root can be output
    • The problem then is how to find these vertices in the digraph, which is a problem similar to finding cut-vertices in an undirected graph

Connectivity (continued)
  • Connectivity in Directed Graphs (continued)
    • To do this, the pred(v) parameter is used, which is the lower of num(v) and pred(u), u being a vertex reachable from v and in the same SCC
    • Of course this leads to the question of how we can determine if two vertices are in the same SCC before we determine if it is an SCC
    • This can be done using a stack to store the vertices of all SCCs under construction
    • The topmost vertices will be in the current SCC
    • This way we know what vertices are already in the SCC even though the construction isn’t finished
    • The algorithm, attributed to Robert Tarjan, is shown on page 419; an example of the execution is shown in Figure 8.17 on page 420
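The stack-based idea described above can be sketched as follows. This is an illustrative C++ version of the strongly connected components search, not a transcription of the book's pseudocode; the digraph is assumed to be given as adjacency lists with vertices numbered from 0, and SCCFinder and stronglyConnectedComponents are hypothetical names.

```cpp
#include <algorithm>
#include <cassert>
#include <stack>
#include <vector>

struct SCCFinder {
    const std::vector<std::vector<int>>& adj;
    std::vector<int> num, pred;          // num(v) and pred(v) as described above
    std::vector<bool> onStack;           // true while v's SCC is under construction
    std::stack<int> st;
    std::vector<std::vector<int>> components;
    int counter = 0;

    explicit SCCFinder(const std::vector<std::vector<int>>& g)
        : adj(g), num(g.size(), 0), pred(g.size(), 0), onStack(g.size(), false) {}

    void dfs(int v) {
        num[v] = pred[v] = ++counter;
        st.push(v);
        onStack[v] = true;
        for (int u : adj[v]) {
            if (num[u] == 0) {
                dfs(u);
                pred[v] = std::min(pred[v], pred[u]);
            } else if (onStack[u]) {     // u belongs to an SCC not yet output
                pred[v] = std::min(pred[v], num[u]);
            }
        }
        if (pred[v] == num[v]) {         // v is the root of an SCC: output it
            std::vector<int> component;
            int w;
            do {
                assert(!st.empty());
                w = st.top();
                st.pop();
                onStack[w] = false;
                component.push_back(w);
            } while (w != v);
            std::sort(component.begin(), component.end());
            components.push_back(component);
        }
    }
};

std::vector<std::vector<int>> stronglyConnectedComponents(
        const std::vector<std::vector<int>>& adj) {
    SCCFinder finder(adj);
    for (int v = 0; v < (int)adj.size(); ++v)
        if (finder.num[v] == 0) finder.dfs(v);
    return finder.components;
}
```

A component is output only when the search backtracks to its root, so components that cannot reach any other unfinished component appear first in the result.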

Topological Sort

A topological sort of a directed graph is a linear ordering of its vertices so that, for every edge uv, u comes before v in the ordering

For instance, the vertices of the graph may represent tasks to be performed

The edges may represent constraints that one task must be performed before another

In this application, a topological ordering is just a valid sequence for the tasks

A topological ordering is possible if and only if the graph has no directed cycles, that is, if it is a directed acyclic graph (DAG)

Topological Sort (continued)

The algorithm for the topological sort is a simple one:

topologicalSort(digraph)
    for i = 1 to |V|
        find a minimal vertex v;
        num(v) = i;
        remove from digraph vertex v and all edges incident with v;

As can be seen, we locate a vertex v with no outgoing edges

Such a vertex is called a minimal vertex or sink

We then remove any edges leading from a vertex to v

Figure 8.18 shows this process; the graph in Figure 8.18a goes through a series of deletions (Figure 8.18b-f) to produce the sequence g, e, b, f, d, c, a
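The sink-removal procedure can be sketched in C++ as shown below. Instead of physically deleting vertices, this version keeps a count of remaining outgoing edges per vertex, which is an implementation choice of the sketch and not something the slides prescribe. The function name removeSinksOrder and the adjacency-list input are assumptions. The result lists vertices in the order they are removed (sinks first), so reversing it gives an ordering in which u precedes v for every edge uv.

```cpp
#include <cassert>
#include <queue>
#include <vector>

// Returns the vertices in removal order (sinks first),
// or an empty vector if the digraph contains a directed cycle.
std::vector<int> removeSinksOrder(const std::vector<std::vector<int>>& adj) {
    int n = (int)adj.size();
    std::vector<int> outDegree(n);
    std::vector<std::vector<int>> incoming(n);   // incoming[v] = tails of edges into v
    for (int u = 0; u < n; ++u) {
        outDegree[u] = (int)adj[u].size();
        for (int v : adj[u]) {
            assert(v >= 0 && v < n);
            incoming[v].push_back(u);
        }
    }
    std::queue<int> sinks;                        // current minimal vertices
    for (int v = 0; v < n; ++v)
        if (outDegree[v] == 0) sinks.push(v);

    std::vector<int> order;
    while (!sinks.empty()) {
        int v = sinks.front();
        sinks.pop();
        order.push_back(v);                       // num(v) = position in this list
        for (int u : incoming[v])                 // "remove" the edges leading to v
            if (--outDegree[u] == 0) sinks.push(u);
    }
    if ((int)order.size() != n) order.clear();    // some vertices never became sinks
    return order;
}
```

If a cycle exists, the vertices on it never reach out-degree zero, which is why an incomplete order signals that no topological ordering is possible.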

Topological Sort (continued)

Fig. 8.18 Executing a topological sort

Topological Sort (continued)

It is not actually necessary to delete the edges and vertices from a digraph during this processing

If we can determine that all successors of the vertex v have been processed, they can be considered deleted

This is once again handled by applying the depth-first search techniques seen earlier

Basically, if the search backtracks to v, then all its successors can be assumed to have already been searched

The pseudocode for this algorithm is shown on pages 421 and 423

The table (Figure 8.18h) shows how the numbers are assigned for each vertex of the graph of Figure 8.18a
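A depth-first version along these lines might look like the following sketch. It records each vertex when the search backtracks from it, then reverses that list. The three-state marking used for cycle detection and the names Mark, visit and dfsTopologicalOrder are assumptions of this example, not the book's identifiers.

```cpp
#include <cassert>
#include <vector>

enum class Mark { Unvisited, InProgress, Done };

// Returns false if a directed cycle is reachable from v.
bool visit(int v, const std::vector<std::vector<int>>& adj,
           std::vector<Mark>& mark, std::vector<int>& finished) {
    mark[v] = Mark::InProgress;
    for (int u : adj[v]) {
        assert(u >= 0 && u < (int)adj.size());
        if (mark[u] == Mark::InProgress) return false;   // back edge: cycle
        if (mark[u] == Mark::Unvisited && !visit(u, adj, mark, finished))
            return false;
    }
    // The search backtracks from v, so all successors of v are processed.
    mark[v] = Mark::Done;
    finished.push_back(v);
    return true;
}

// Returns a topological ordering, or an empty vector if a cycle is found.
std::vector<int> dfsTopologicalOrder(const std::vector<std::vector<int>>& adj) {
    std::vector<Mark> mark(adj.size(), Mark::Unvisited);
    std::vector<int> finished;
    for (int v = 0; v < (int)adj.size(); ++v)
        if (mark[v] == Mark::Unvisited && !visit(v, adj, mark, finished))
            return {};
    return std::vector<int>(finished.rbegin(), finished.rend());
}
```

No vertex or edge is deleted here; a vertex marked Done plays the role of a deleted one.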

Networks
  • Maximum Flows
    • A network is a directed graph where each edge has a capacity and each edge receives a flow
    • The amount of flow on an edge cannot exceed the capacity of the edge
    • A flow must satisfy the restriction that the amount of flow into a node equals the amount of flow out of it, except when it is a source, which has more outgoing flow, or sink, which has more incoming flow
    • A network can be used to model traffic in a road system, fluids in pipes, currents in an electrical circuit, or anything similar in which something travels through a network of nodes
    • Delbert R. Fulkerson and Lester R. Ford, Jr. developed the first computational models of these flow problems in 1954

Networks (continued)
  • Maximum Flows (continued)
    • The central problem of these network models is to maximize the flow over the edges from the source to the sink
    • This is referred to as the maximum flow (or max-flow) problem
    • Figure 8.19 illustrates this problem for a small water-flow network of 8 pipes and 6 pumping stations; the edges are labeled with the capacity of the pipes in thousands of gallons

Figure 8.19 A pipeline with eight pipes and six pumping stations

Networks (continued)
  • Maximum Flows (continued)
    • A central aspect of the Ford-Fulkerson approach is the concept of a cut
    • A cut separating s and t is a set of edges between two sets, X and X̄
    • Every vertex of the graph is a member of one of these two sets; the source, s, is in X and the sink, t, is in X̄
    • In Figure 8.19, if we choose X = {s, a}, then X̄ = {b, c, d, t}, and the cut is the set of edges {(a, b), (s, c), (s, d)}
    • Thus, if all these edges are cut, there is no way to get from s to t
    • Now we can define the capacity of the cut as the sum of the capacities of the edges in this cut set, so

cap{(a,b),(s,c),(s,d)} = cap(a,b) + cap(s,c) + cap(s,d) = 19

Networks (continued)
  • Maximum Flows (continued)
    • From this, we can infer the max-flow min-cut theorem:

Theorem: In any network, the maximal flow from s to t is equal to the minimal capacity of any cut.

    • This makes it fairly clear that while there may be cuts with larger capacity, it is the cut with the smallest capacity that determines the flow of the network
    • For instance, although the capacity of our earlier cut was 19, the two edges coming to the sink can’t transfer more than 9 units
    • So we have to search all the cuts to find the one with the smallest capacity, and transfer through this as many units as the capacity allows
    • To achieve this, we’ll utilize a new idea

Networks (continued)
  • Maximum Flows (continued)
    • A flow-augmenting path is a sequence of edges from s to t such that, for each edge e in the path, the flow f(e) is less than the capacity cap(e) if e is a forward edge, and f(e) is greater than 0 if e is a backward edge
    • This means the path has excess capacity that isn’t being used
    • However if the flow for any edge in that path reaches capacity, the flow cannot be augmented
    • The path also does not have to exclusively use forward edges, so in Figure 8.19, we have paths s, a, b, t and s, d, b, t
    • Backward edges push back against the flow, decreasing the total flow of the network
    • Eliminating them can increase the overall flow in the network, so the goal of augmenting isn’t finished until the flows for those edges are 0

Networks (continued)
  • Maximum Flows (continued)
    • The task now is to find an augmenting path; however there may be a large number of paths from s to t, so this is a nontrivial problem
    • Ford and Fulkerson devised the first systematic algorithm for this in 1957
    • The first phase of the algorithm, labeling, assigns each vertex of the graph a label, defined as the pair label(v) = (parent(v), flow(v))
    • parent(v) is the node accessing v, and flow(v) is the flow amount from s to v
    • Forward and backward edges are treated differently; if v accesses vertex u via a forward edge, label(u) = (v+,min(flow(v),slack(edge(vu))))
    • Here, slack(edge(vu)) = cap(edge(vu)) – f(edge(vu)); this is the difference between the capacity of the edge vu and its current flow

Networks (continued)
  • Maximum Flows (continued)
    • Now if the edge between v and u is backward, then the value of label(u) = (v–,min(flow(v),f(edge(uv)))) where

flow(v) = min(flow(parent(v)), slack(edge(parent(v)v)))

    • Once a vertex is labeled, it is stored for subsequent processing
    • In this process, an edge vu is labeled only if it allows more flow to be added
    • This can be done for forward edges when slack(edge(vu)) > 0, and for backward edges when f(edge(uv)) > 0
    • However, finding this path may not complete the whole procedure
    • It is only finished if we are stuck somewhere in the network and unable to label any more edges

Networks (continued)
  • Maximum Flows (continued)
    • If we reach the sink, the flows in the augmenting path are adjusted by increasing flows on the forward edges, and decreasing them on the backward ones
    • Then we restart the task and look for another augmenting path
    • The pseudocode for the algorithm is presented on page 425
    • In examining the algorithm, notice there is no particular mechanism specified for scanning the graph
    • The question is in what order vertices should be added to labeled and detached from it; this implementation uses push and pop operations to process it depth-first
    • The operation of this algorithm is shown in Figure 8.20 on pages 426 and 427

Networks (continued)
  • Maximum Flows (continued)
    • A major issue with this implementation is the depth-first approach, which has a significant impact on its efficiency
    • Since the depth-first algorithm tries to reach the sink as soon as possible, we may end up choosing the same augmenting path several times as the algorithm proceeds
    • A better approach is to try and find the shortest augmenting path, which suggests a breadth-first approach
    • This concept was developed by Jack Edmonds and Richard Karp in 1972
    • It uses the same approach as the Ford-Fulkerson algorithm, but the labeled structure is now a queue
    • This modified approach is illustrated in Figure 8.22 on page 429
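The labeling procedure with a queue as the labeled structure can be sketched as follows. This is a hedged illustration, not the pseudocode from the book: capacities are assumed to be stored in an adjacency matrix cap (0 meaning no edge), flows are kept in a second matrix, and the name maxFlowBFS is hypothetical. The arrays parent and flow together play the role of label(u) = (parent(u), flow(u)).

```cpp
#include <algorithm>
#include <cassert>
#include <climits>
#include <queue>
#include <vector>

int maxFlowBFS(const std::vector<std::vector<int>>& cap, int s, int t) {
    int n = (int)cap.size();
    assert(s != t);
    std::vector<std::vector<int>> f(n, std::vector<int>(n, 0));   // current flows
    int total = 0;
    while (true) {
        // Labeling phase, breadth-first from the source.
        std::vector<int> parent(n, -1), flow(n, 0);
        std::vector<bool> forward(n, true);
        std::queue<int> labeled;
        parent[s] = s;
        flow[s] = INT_MAX;
        labeled.push(s);
        while (!labeled.empty() && parent[t] == -1) {
            int v = labeled.front();
            labeled.pop();
            for (int u = 0; u < n; ++u) {
                if (parent[u] != -1) continue;            // already labeled
                int slack = cap[v][u] - f[v][u];
                if (slack > 0) {                          // forward edge vu
                    parent[u] = v;
                    forward[u] = true;
                    flow[u] = std::min(flow[v], slack);
                    labeled.push(u);
                } else if (f[u][v] > 0) {                 // backward edge uv
                    parent[u] = v;
                    forward[u] = false;
                    flow[u] = std::min(flow[v], f[u][v]);
                    labeled.push(u);
                }
            }
        }
        if (parent[t] == -1) return total;                // stuck: flow is maximal
        // Augment along the path, walking back from the sink.
        for (int u = t; u != s; u = parent[u]) {
            int v = parent[u];
            if (forward[u]) f[v][u] += flow[t];
            else            f[u][v] -= flow[t];
        }
        total += flow[t];
    }
}
```

Replacing the queue with a stack would give the depth-first behavior discussed earlier; only the order in which labeled vertices are processed changes.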

Networks (continued)
  • Maximum Flows (continued)
    • Although this approach overcomes the problems associated with the depth-first search, it has its own inefficiencies
    • When we perform a breadth-first search, a large number of vertices are labeled in each iteration in order to find the shortest path
    • However, these labels are all discarded, only to be re-created when we start looking for another augmenting path
    • So to address this shortcoming we turn our attention to an algorithm developed by Efim Dinic in 1970
    • His approach used breadth-first search first to avoid the repetitive loops with the same paths and to make sure the depth-first search takes the shortest path
    • Once that is done, the depth-first component takes over to reach the sink

Networks (continued)
  • Maximum Flows (continued)
    • The algorithm makes up to |V| – 1 passes through the network, resolving all augmenting paths of the same length from source to sink
    • All the augmenting paths form a layered (or level) network
    • Starting from the lowest values, we first extract layered networks of length one if they exist, then length two, etc.
    • This is illustrated in Figure 8.23a-b on page 431
    • The augmenting paths in this layered network are all of length three; a single path of length one and paths of length two do not exist
    • Breadth-first processing is used to create the layered network, and it includes only forward edges with remaining capacity and backward edges that already carry some flow

Networks (continued)
  • Maximum Flows (continued)
    • Since the paths in a layered network are of the same length, we can avoid redundant tests of edges that are in augmenting paths
    • If we cannot reach any of the neighbors of a vertex v in a layered network, the same situation will exist in that network in later tests
    • Consequently, we won’t need to check the neighbors of v again
    • So if we run into a dead-end node v, we mark incident edges as blocked so we can’t get to v from any direction
    • Any saturated edges (those already at full capacity) are also blocked; these are shown as dashed lines in Figure 8.23
    • Because of the way this works, the layered network is built from the sink to the source

Networks (continued)
  • Maximum Flows (continued)
    • Next, the depth-first search proceeds to find as many augmenting paths as possible from the layered network
    • For each of these paths, one edge will become saturated, so eventually no more augmenting paths will be found
    • This process is illustrated in Figure 8.23c-f
    • Once no more augmenting paths are found, a higher-level layered network is created, and the search for augmenting paths begins again, eventually stopping when no layered network can be formed
    • Figure 8.23g-j shows this, as first a four-edge and then a five-edge path are created
    • The algorithm itself is shown on pages 432 and 433
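The combination of a breadth-first layering phase and a depth-first augmenting phase can be sketched as below. This follows the common residual-graph formulation of Dinic's algorithm rather than the book's pseudocode: levels are computed from the source instead of building the layered network from the sink, and backward edges are represented as reverse residual edges. The class name Dinic and its members are hypothetical. The per-vertex index next skips edges that already led to a dead end, which corresponds to marking edges as blocked.

```cpp
#include <algorithm>
#include <cassert>
#include <climits>
#include <queue>
#include <vector>

struct Edge { int to, cap; };            // cap = remaining (residual) capacity

class Dinic {
public:
    explicit Dinic(int n) : adj(n), level(n), next(n) {}

    void addEdge(int u, int v, int cap) {
        assert(cap >= 0);
        adj[u].push_back((int)edges.size()); edges.push_back({v, cap});
        adj[v].push_back((int)edges.size()); edges.push_back({u, 0});  // reverse edge
    }

    int maxFlow(int s, int t) {
        int total = 0;
        while (buildLayers(s, t)) {                      // one pass per path length
            std::fill(next.begin(), next.end(), 0);
            while (int pushed = push(s, t, INT_MAX)) total += pushed;
        }
        return total;
    }

private:
    std::vector<Edge> edges;             // edges[i ^ 1] is the reverse of edges[i]
    std::vector<std::vector<int>> adj;   // adj[v] = indices into edges
    std::vector<int> level;
    std::vector<std::size_t> next;       // first edge of v not yet known to be blocked

    bool buildLayers(int s, int t) {     // breadth-first phase
        std::fill(level.begin(), level.end(), -1);
        std::queue<int> q;
        level[s] = 0;
        q.push(s);
        while (!q.empty()) {
            int v = q.front();
            q.pop();
            for (int id : adj[v])
                if (edges[id].cap > 0 && level[edges[id].to] == -1) {
                    level[edges[id].to] = level[v] + 1;
                    q.push(edges[id].to);
                }
        }
        return level[t] != -1;
    }

    int push(int v, int t, int limit) {  // depth-first phase
        if (v == t) return limit;
        for (std::size_t& i = next[v]; i < adj[v].size(); ++i) {
            int id = adj[v][i];
            int u = edges[id].to;
            if (edges[id].cap > 0 && level[u] == level[v] + 1) {
                int pushed = push(u, t, std::min(limit, edges[id].cap));
                if (pushed > 0) {
                    edges[id].cap -= pushed;
                    edges[id ^ 1].cap += pushed;
                    return pushed;
                }
            }
        }
        return 0;                        // dead end: every edge of v is blocked
    }
};
```

A network is built with repeated addEdge calls and solved with maxFlow(source, sink); a new layered network is computed only when the current one yields no more augmenting paths.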

Networks (continued)
  • Maximum Flows of Minimum Cost
    • Edges in the previous examples had two parameters, capacity and flow
    • Choice of maximum flow was dictated by the algorithm used, even though there might be many maximum flows
    • This is illustrated in Figure 8.24

Fig. 8.24 Two possible maximum flows for the same network

    • In Figure 8.24a, the edge ab isn’t used at all, whereas in Figure 8.24b all the edges are carrying flow
    • Yet our breadth-first algorithm yields only the first result, then halts

Networks (continued)
  • Maximum Flows of Minimum Cost (continued)
    • However this may not be the best choice; not all paths of maximum flow are equally good ones
    • If we look at the example as road distances between locations, then capacity and flow may not be sufficient information to properly determine a route
    • For example, the distance from a to t may be quite long, while the distance from a to b and b to t may be shorter, making the second route preferable
    • But distance may not be the sole criterion; there may be many other factors that influence the choice of route
    • This leads us to consider a third factor in evaluating edges, the cost of moving a unit of flow through the edge

Networks (continued)
  • Maximum Flows of Minimum Cost (continued)
    • The problem now becomes how to find the maximum flow at minimum cost
    • Finding all the possible maximum flows and then comparing their costs is extremely inefficient
    • What is needed is an algorithm that can find a maximum flow while also determining the minimum cost
    • One possible approach is based on the following theorem:

Theorem. If f is a minimal-cost flow with the flow value v and p is the minimum cost augmenting path sending a flow of value 1 from the source to the sink, then the flow f + p is minimal and its flow value is v + 1.

Networks (continued)
  • Maximum Flows of Minimum Cost (continued)
    • The theorem says we first start with the cheapest way to move v units through the network
    • Then we find a path that is the cheapest way of sending a single unit from the source to the sink
    • On combining these, we have the route previously determined and the path just found, which transmits v + 1 units
    • Now if this augmenting path sends 1 unit at minimum cost, it can also send 2, 3, …, n units, where n is the capacity of the path
    • This also suggests a process for finding the cheapest maximum route
    • Starting with all flows 0, we find the cheapest way to send 1 unit and then maximize the flow along this path

Networks (continued)
  • Maximum Flows of Minimum Cost (continued)
    • On the next go-around, the path that sends 1 unit at least cost is determined again, and as many units as it can hold are sent, etc.
    • This continues until we can’t send anything more from the source, or the sink can’t receive any more flows
    • This is something like finding the shortest path, because it can be looked at as the path with minimum cost
    • So we want an algorithm to find the shortest path so we can send the maximum flow through the path
    • So a modification of Dijkstra’s one-to-one shortest path algorithm can be used
    • The pseudocode for this procedure is shown on page 435

Networks (continued)
  • Maximum Flows of Minimum Cost (continued)
    • The label for each vertex in this algorithm is the triple label(u) = (parent(u), flow(u), cost(u)) since it has to track three items
    • First, it records u’s predecessor, v, the vertex through which u is accessed from s
    • Then, for the path from s to u, it records the maximum flow
    • Finally, it stores the cost of passing all the edges from the source to u
    • cost(u), for the forward edge(vu), is the sum of accumulated costs in v plus the additional cost of pushing a unit through edge(vu)
    • The unit cost of passing through backward edge(vu) is subtracted from cost(v) and stored in cost(u)
    • The process is illustrated in Figure 8.25 on page 437
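The cheapest-path-then-maximize loop can be sketched as follows. One substitution should be noted plainly: where the slides mention a modification of Dijkstra's algorithm, this sketch uses Bellman-Ford style relaxation, because reverse (backward) edges carry negative unit costs in the residual representation used here. The class name MinCostFlow, the residual-edge layout, and the assumption of nonnegative original costs are all choices of this example. The three arrays parentEdge, flow and cost correspond to the triple label(u) = (parent(u), flow(u), cost(u)).

```cpp
#include <algorithm>
#include <cassert>
#include <climits>
#include <utility>
#include <vector>

struct CostEdge { int to, cap, cost; };   // cap = residual capacity, cost per unit

class MinCostFlow {
public:
    explicit MinCostFlow(int n) : adj(n) {}

    void addEdge(int u, int v, int cap, int cost) {
        assert(cap >= 0 && cost >= 0);
        adj[u].push_back((int)edges.size()); edges.push_back({v, cap, cost});
        adj[v].push_back((int)edges.size()); edges.push_back({u, 0, -cost});
    }

    // Returns {maximum flow value, minimum total cost of such a flow}.
    std::pair<int, int> run(int s, int t) {
        int n = (int)adj.size(), totalFlow = 0, totalCost = 0;
        while (true) {
            std::vector<int> parentEdge(n, -1), flow(n, 0), cost(n, INT_MAX);
            cost[s] = 0;
            flow[s] = INT_MAX;
            bool changed = true;
            for (int pass = 0; pass < n && changed; ++pass) {
                changed = false;
                for (int v = 0; v < n; ++v) {
                    if (cost[v] == INT_MAX) continue;     // v not labeled yet
                    for (int id : adj[v]) {
                        const CostEdge& e = edges[id];
                        if (e.cap > 0 && cost[v] + e.cost < cost[e.to]) {
                            cost[e.to] = cost[v] + e.cost;
                            flow[e.to] = std::min(flow[v], e.cap);
                            parentEdge[e.to] = id;
                            changed = true;
                        }
                    }
                }
            }
            if (cost[t] == INT_MAX) return {totalFlow, totalCost};
            // Send as many units as the cheapest path can hold.
            for (int u = t; u != s; u = edges[parentEdge[u] ^ 1].to) {
                edges[parentEdge[u]].cap -= flow[t];
                edges[parentEdge[u] ^ 1].cap += flow[t];
            }
            totalFlow += flow[t];
            totalCost += flow[t] * cost[t];
        }
    }

private:
    std::vector<CostEdge> edges;          // edges[i ^ 1] is the reverse of edges[i]
    std::vector<std::vector<int>> adj;
};
```

Pushing flow along a reverse edge cancels flow on the original edge, which is why its unit cost is subtracted rather than added.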

Matching

A particular company has a set of jobs {a, b, c, d, e}, and a set of applicants {p, q, r, s, t}

However, applicant p is only qualified for jobs a, b, and c; applicant q is only qualified for jobs b and d; similar restrictions exist for the other applicants

Our problem is how to match the applicants to the jobs such that each applicant has a job and all jobs are assigned

Numerous problems like this exist, and they are conveniently modeled using bipartite graphs

A bipartite graph is one where the vertices can be divided into two sets, such that any edge has one vertex in each set

Matching (continued)

For the company, we can construct a bipartite graph where each edge relates an applicant to the job(s) they qualify for

This is shown in Figure 8.26

Fig. 8.26 Matching five applicants with five jobs

The task is to match each applicant with a job; this may not always be possible, so we want to match as many as possible

For a given graph G = (V, E), a matching M is defined as a subset of edges M ⊆ E, where no two edges are adjacent

Matching (continued)

A maximum matching is a matching where the number of unmatched vertices is minimal

Consider Figure 8.27

Fig. 8.27 A graph with matchings M1 = {edge(cd), edge(ef)} and M2 = {edge(cd), edge(ge), edge(fh)}

Sets M1 = {edge(cd), edge(ef)} and M2 = {edge(cd), edge(ge), edge(fh)} are both matchings, but only M2 is a maximum matching

A perfect matching is one where all vertices in the graph are paired

Matching (continued)

A matching problem is the task of finding a maximum matching for a given graph

An alternating path for M is a sequence of edges that alternately belong to M and the set of edges not in M

An augmenting path for M is an alternating path where the end vertices are not incident with any edge in matching M

Augmenting paths have an odd number of edges, 2k + 1, of which k are in M and k + 1 are not in M

The symmetric difference of two sets, X ⊕ Y, is the set

X ⊕ Y = (X – Y) ∪ (Y – X) = (X ∪ Y) – (X ∩ Y)

In other words, the symmetric difference of two sets is the set of elements in their union, less the intersection

Matching (continued)

This leads us to the following lemma, the proof of which is shown on page 439:

Lemma 1. If for two matchings M and N in a graph G = (V,E) we define a set of edges M ⊕ N ⊆ E, then each connected component of the subgraph G′ = (V,M ⊕ N) is either (a) a single vertex, (b) a cycle with an even number of edges alternately in M and N, or (c) a path whose edges are alternately in M and N and such that each end vertex of the path is matched only by one of the two matchings M and N (i.e., the whole path should be considered, not just part, to cover the entire connected component)

Figure 8.28 shows an example of this

The symmetric difference between matching M (dashed lines) and matching N (dotted lines) contains one path and a cycle (Figure 8.28 b)

Matching (continued)

Notice that the vertices of the graph G not incident with any edges in the symmetric difference are isolated vertices in G’

Fig. 8.28 (a) Two matchings M and N in a graph G = (V,E) and (b) the graph G’ = (V, M ⊕ N)

Now consider the next lemma:

Lemma 2. If M is a matching and P is an augmenting path for M, then M ⊕ P is a matching of cardinality |M| + 1

Matching (continued)

The proof of this is on page 440; Figure 8.29 illustrates it

Fig. 8.29 (a) Augmenting path P and a matching M and (b) the matching M ⊕ P

For matching M (dashed lines) and augmenting path P for M (c, b, f, h, g, i, j, e), the resulting matching M ⊕ P is {edge(bc), edge(ej), edge(fh), edge(gi)}

This includes all the edges from P originally excluded from M
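Lemma 2 can be checked on the path from Figure 8.29 with a short C++ sketch using the standard library's set symmetric difference. The original matching is not listed in the text, so the set M = {edge(bf), edge(gh), edge(ij)} used in the check is inferred here from the path and the stated result; the helper names makeEdge, pathEdges and symmetricDifference are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>
#include <set>
#include <string>
#include <utility>

using EdgePair = std::pair<char, char>;   // undirected edge, endpoints stored in sorted order

EdgePair makeEdge(char a, char b) {
    return a < b ? EdgePair{a, b} : EdgePair{b, a};
}

// Edges of the path that visits the given vertices in order.
std::set<EdgePair> pathEdges(const std::string& vertices) {
    assert(vertices.size() >= 2);         // a path needs at least one edge
    std::set<EdgePair> result;
    for (std::size_t i = 0; i + 1 < vertices.size(); ++i)
        result.insert(makeEdge(vertices[i], vertices[i + 1]));
    return result;
}

// Edges that belong to exactly one of the two sets.
std::set<EdgePair> symmetricDifference(const std::set<EdgePair>& x,
                                       const std::set<EdgePair>& y) {
    std::set<EdgePair> result;
    std::set_symmetric_difference(x.begin(), x.end(), y.begin(), y.end(),
                                  std::inserter(result, result.end()));
    return result;
}
```

Taking the symmetric difference of M with pathEdges("cbfhgije") removes the three path edges that were in M and adds the four that were not, so the cardinality grows from 3 to 4.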

Matching (continued)

These two lemmas can then be used to construct the proof of the following important theorem:

Theorem (Claude Berge 1957). A matching M in a graph G is maximum iff there is no augmenting path connecting two unmatched vertices in G

The proof of this theorem is shown on page 441

This suggests an approach for finding a maximum matching

Starting from an initial matching (possibly empty), it repeatedly finds new augmenting paths to increase the cardinality of the matching until no such path can be found

This means we need an algorithm to determine augmenting paths

Fortunately, this is easier to do for bipartite graphs, so we’ll start with them

Matching (continued)

To find an augmenting path, the breadth-first algorithm is modified to allow for always finding the shortest path

A tree, called a Hungarian tree, is constructed with an unmatched vertex at the root

It consists of alternating paths, and success is determined as soon as another unmatched vertex is found

This indicates the presence of an augmenting path

The augmenting path increases the size of matching; once no such path can be found, the algorithm is finished

The algorithm is shown on pages 441 and 442; an example of this is shown in Figure 8.30 on page 443
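For bipartite graphs, the search for augmenting paths can be sketched as follows. This is an illustrative version, not the findMaximumMatching() pseudocode from the book: one side of the graph (say, applicants) is assumed to be numbered from 0 and the other side (jobs) separately from 0, adj[a] lists the jobs adjacent to applicant a, and the function name maximumBipartiteMatching is hypothetical.

```cpp
#include <cassert>
#include <queue>
#include <vector>

// Returns matchOfRight: for each right-side vertex, its matched left-side
// vertex, or -1 if it stays unmatched.
std::vector<int> maximumBipartiteMatching(const std::vector<std::vector<int>>& adj,
                                          int rightCount) {
    int leftCount = (int)adj.size();
    std::vector<int> matchOfLeft(leftCount, -1), matchOfRight(rightCount, -1);
    for (int root = 0; root < leftCount; ++root) {
        // Breadth-first search grows a tree of alternating paths from the
        // unmatched vertex root.
        std::vector<int> parentLeft(rightCount, -1);  // who reached each right vertex
        std::vector<bool> seen(rightCount, false);
        std::queue<int> q;
        q.push(root);
        int freeRight = -1;
        while (!q.empty() && freeRight == -1) {
            int a = q.front();
            q.pop();
            for (int j : adj[a]) {
                assert(j >= 0 && j < rightCount);
                if (seen[j]) continue;
                seen[j] = true;
                parentLeft[j] = a;
                if (matchOfRight[j] == -1) {          // unmatched vertex reached:
                    freeRight = j;                    // an augmenting path exists
                    break;
                }
                q.push(matchOfRight[j]);              // continue along the matched edge
            }
        }
        // Flip matched and unmatched edges along the augmenting path.
        while (freeRight != -1) {
            int a = parentLeft[freeRight];
            int previous = matchOfLeft[a];
            matchOfLeft[a] = freeRight;
            matchOfRight[freeRight] = a;
            freeRight = previous;
        }
    }
    return matchOfRight;
}
```

Each successful search increases the size of the matching by one; when a root yields no augmenting path, it is left unmatched and the next root is tried.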

Matching (continued)
  • Stable Matching Problem
    • In the example of matching applicants with jobs, any successful maximum matching was fine
    • However, this is typically not possible due to preferences for jobs among applicants, and for applicants among employers
    • The stable matching (also called stable marriage) problem uses two non-overlapping sets with the same cardinality, U and W
    • The elements of U have a ranking list of elements of W, and those of W have a preference list of elements of U
    • The ideal matching is to place elements with their highest preference, but because of possible conflicts, a stable matching is sought
    • A matching is unstable if two elements rank each other higher than those with which they are currently matched; otherwise it is stable

Matching (continued)
  • Stable Matching Problem (continued)
    • If we consider the two sets U = {u1, u2, u3, u4} and W = {w1, w2, w3, w4}, and the following ranking lists:

u1: w2 > w1 > w3 > w4        w1: u3 > u2 > u1 > u4
u2: w3 > w2 > w1 > w4        w2: u1 > u3 > u4 > u2
u3: w3 > w4 > w1 > w2        w3: u4 > u2 > u3 > u1
u4: w2 > w3 > w4 > w1        w4: u2 > u1 > u3 > u4

then we can see the matching (u1, w1), (u2, w2), (u3, w4), (u4, w3) is unstable because u1 and w2 prefer each other over the current match

    • David Gale and Lloyd Shapley designed a matching algorithm in 1962, and also showed that a stable matching always exists
    • This algorithm is shown on page 444, together with a discussion of its application to the sets and table above
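A proposal-based version of the stable matching algorithm, together with a stability check, can be sketched as follows. This is a minimal illustration and may differ in detail from the book's pseudocode; elements are renumbered from 0 (so u1 becomes 0, w2 becomes 1, and so on), both sets are assumed to have the same size with complete ranking lists, and the names Prefs, ranks, stableMatching and isStable are hypothetical.

```cpp
#include <cassert>
#include <queue>
#include <vector>

using Prefs = std::vector<std::vector<int>>;   // prefs[x] = x's choices, best first

// rank[x][y] = position of y in x's list (smaller means more preferred).
std::vector<std::vector<int>> ranks(const Prefs& prefs) {
    std::vector<std::vector<int>> rank(prefs.size(), std::vector<int>(prefs.size()));
    for (std::size_t x = 0; x < prefs.size(); ++x) {
        assert(prefs[x].size() == prefs.size());
        for (std::size_t i = 0; i < prefs[x].size(); ++i)
            rank[x][prefs[x][i]] = (int)i;
    }
    return rank;
}

// Elements of U propose; returns wOfU[u] = the element of W matched with u.
std::vector<int> stableMatching(const Prefs& uPrefs, const Prefs& wPrefs) {
    int n = (int)uPrefs.size();
    std::vector<std::vector<int>> wRank = ranks(wPrefs);
    std::vector<int> wOfU(n, -1), uOfW(n, -1), nextChoice(n, 0);
    std::queue<int> freeU;
    for (int u = 0; u < n; ++u) freeU.push(u);
    while (!freeU.empty()) {
        int u = freeU.front();
        freeU.pop();
        int w = uPrefs[u][nextChoice[u]++];     // best w that u has not tried yet
        int rival = uOfW[w];
        if (rival == -1 || wRank[w][u] < wRank[w][rival]) {
            uOfW[w] = u;
            wOfU[u] = w;
            if (rival != -1) {                  // w drops her previous partner
                wOfU[rival] = -1;
                freeU.push(rival);
            }
        } else {
            freeU.push(u);                      // rejected: u tries again later
        }
    }
    return wOfU;
}

// Unstable if some u and w prefer each other to their current partners.
bool isStable(const Prefs& uPrefs, const Prefs& wPrefs, const std::vector<int>& wOfU) {
    int n = (int)uPrefs.size();
    std::vector<std::vector<int>> uRank = ranks(uPrefs), wRank = ranks(wPrefs);
    std::vector<int> uOfW(n);
    for (int u = 0; u < n; ++u) uOfW[wOfU[u]] = u;
    for (int u = 0; u < n; ++u)
        for (int w = 0; w < n; ++w)
            if (uRank[u][w] < uRank[u][wOfU[u]] && wRank[w][u] < wRank[w][uOfW[w]])
                return false;
    return true;
}
```

With the ranking lists above, isStable rejects the matching (u1, w1), (u2, w2), (u3, w4), (u4, w3), and when the elements of U propose the sketch produces (u1, w2), (u2, w1), (u3, w4), (u4, w3), which passes the check.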

Matching (continued)
  • Stable Matching Problem (continued)
    • There is an asymmetry associated with the algorithm based on which rankings are considered more important
    • As given, the algorithm favors set U
    • If the roles of the two sets U and W are reversed, then the w’s will have their preferred choices immediately, instead of the u’s

Matching (continued)
  • Assignment Problem
    • Finding suitable matches becomes more difficult in a weighted graph
    • In these cases we want to find a matching with a maximum total weight
    • This is known as the assignment problem
    • If we consider complete bipartite graphs with two sets of vertices that are equal in size, then it is known as the optimal assignment problem
    • An algorithm known as the Hungarian algorithm was developed by Harold Kuhn in 1955, and further investigated by James Munkres in 1957
    • Kuhn’s original name was in honor of the work done by Dénes Kőnig and Jenő Egerváry on this problem in 1931

Matching (continued)
  • Assignment Problem (continued)
    • The algorithm is shown on pages 445 and 446
    • An example of its application is shown in Figure 8.31, together with a detailed treatment of its application on pages 446 and 447

Matching (continued)
  • Matching in Nonbipartite Graphs
    • The algorithm findMaximumMatching() (pages 441 and 442) is not general enough to correctly handle nonbipartite graphs
    • Considering the graph in Figure 8.32 and using breadth-first search to construct a tree to determine an augmenting path we run into a problem
    • Starting at vertex c, d is on an even level, e is on an odd level, and a and f are on an even level
    • a is then expanded by adding b, and f by adding g and then i, creating an augmenting path c, d, e, f, g, i
    • If i were not in the graph, however, the only augmenting path would not be detected because g, being labeled, blocks access to f and h
    • A similar problem would occur if we relied on depth-first search instead

Matching (continued)
  • Matching in Nonbipartite Graphs (continued)

Fig. 8.32 Application of the findMaximumMatching() algorithm to a nonbipartite graph

    • The problem is caused by certain cycles possessing an odd number of edges
    • It isn’t the odd number of edges specifically that leads to this; Figure 8.32b can be successfully processed

Matching (continued)
  • Matching in Nonbipartite Graphs (continued)
    • The type of cycle for which the problems occur is called a blossom
    • A technique for determining augmenting paths for graphs with blossoms was developed by Jack Edmonds in 1961 and published in 1965
    • A blossom is an alternating cycle where the first and last edges of the cycle are not in matching
    • In these cycles, the first vertex is called the base of the blossom
    • An alternating path of even length is called a stem; a path of length zero consisting of a single vertex is also a stem
    • If a blossom has a stem whose edge in matching is incident with the base, it is called a flower

Matching (continued)
  • Matching in Nonbipartite Graphs (continued)
    • In Figure 8.32a, path c, d, e and path e are stems; cycle e, a, b, g, f, e forms a blossom with base e
    • Blossoms cause problems when the potential augmenting path leads to a blossom through the base
    • Depending on the edge chosen to continue the path, an augmenting path may not be derived
    • If the blossom is entered through any other vertex, however, the problem is averted because only one of the two edges of the vertex can be chosen
    • So the idea is to detect when a blossom is being entered through its base
    • We can then temporarily remove the blossom by replacing it with a vertex and attach to this all edges connected to the blossom

Matching (continued)
  • Matching in Nonbipartite Graphs (continued)
    • At this point the search for an augmenting path continues
    • If one is found and it includes a vertex representing a blossom, the blossom is re-inserted
    • The path through it is then determined by going backwards from the edge that led to the blossom to an edge incident with the base
    • So first, we need to detect that a blossom has been entered through its base
    • The Hungarian tree in Figure 8.33a was generated using a breadth-first search on the graph of Figure 8.32a
    • Trying to find neighbors of b leads us to g: because edge(ab) is in the matching, only edges not in the matching can be followed from b

Matching (continued)
  • Matching in Nonbipartite Graphs (continued)
    • These edges lead to vertices on an even level in the tree, but g has already been labeled and is on an odd level, signaling a blossom
    • Thus, we trace paths back in the tree from g and b until we reach a common root, which is vertex e; this is the base of the blossom
    • We then replace the blossom with a vertex, A, leading to the graph of Figure 8.33b
    • The augmenting path search is then resumed, and continues until the path is found, which is c, d, A, h
    • Then the blossom is expanded, and the path traced through the blossom
    • This is done by starting from edge(hA) (now edge(hf))

Matching (continued)
  • Matching in Nonbipartite Graphs (continued)
    • That edge is not in matching, so from f only edge(fg) can be chosen so the augmenting path remains alternating
    • By moving through the vertices f, g, b, a, e, the part of the augmenting path corresponding to A is determined, as seen in Figure 8.33c
    • So the full augmenting path is c, d, e, a, b, g, f, h
    • Once the path is processed, a new matching is determined, shown in Figure 8.33d

Matching (continued)

Matching in Nonbipartite Graphs (continued)

Fig. 8.33 Processing a graph with a blossom

Eulerian and Hamiltonian Graphs
  • Eulerian Graphs
    • A trail in a graph that visits every edge exactly once is called an Eulerian trail (or Eulerian path)
    • Similarly, an Eulerian trail that starts and ends on the same vertex is called an Eulerian circuit or Eulerian cycle
    • They were first discussed by Leonhard Euler while solving the famous Seven Bridges of Königsberg problem in 1736
    • Euler proved that if every vertex of a connected graph is incident to an even number of edges, then the graph is Eulerian
    • In addition, if a connected graph has exactly two vertices incident with an odd number of edges, it contains an Eulerian trail
    • An algorithm developed by M. Fleury in 1883 is the oldest that allows us to find an Eulerian cycle if this is possible; it appears on page 450
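
The degree conditions above can be checked directly. The following is a minimal sketch, assuming the graph is connected and is given as a vertex count plus an edge list; the names `EulerKind` and `classifyEulerian` are illustrative and do not come from the book.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

enum class EulerKind { Circuit, Trail, None };

// Classifies a CONNECTED undirected (multi)graph by counting odd-degree vertices.
// Connectivity is assumed here, not checked.
EulerKind classifyEulerian(std::size_t n,
                           const std::vector<std::pair<std::size_t, std::size_t>>& edges) {
    std::vector<std::size_t> degree(n, 0);
    for (const auto& e : edges) {
        assert(e.first < n && e.second < n);  // endpoints must be valid vertices
        ++degree[e.first];
        ++degree[e.second];
    }
    std::size_t odd = 0;
    for (std::size_t d : degree)
        if (d % 2 == 1) ++odd;
    if (odd == 0) return EulerKind::Circuit;  // every vertex has even degree
    if (odd == 2) return EulerKind::Trail;    // exactly two odd-degree vertices
    return EulerKind::None;
}
```

For the Königsberg bridges (four land masses, seven bridges) all four vertices have odd degree, so the function reports that neither a circuit nor a trail exists.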

Eulerian and Hamiltonian Graphs (continued)
  • Eulerian Graphs (continued)
    • Figure 8.34 shows an example of finding an Eulerian cycle

Fig. 8.34 Finding an Eulerian cycle

    • Before an edge is chosen, a test needs to be made to see if that edge is a bridge in the untraversed subgraph
    • If it is, choosing it could make it impossible to complete the path, because certain vertices could become unreachable
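
The bridge test can be sketched as follows. This is an illustrative version of Fleury's idea rather than the book's pseudocode: it assumes a connected graph in which an Eulerian cycle or trail exists from `start`, and it detects a bridge by comparing how many vertices are reachable before and after tentatively removing the edge. The names `reachable` and `fleury` are assumptions.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

using Edge = std::pair<std::size_t, std::size_t>;

// Counts the vertices reachable from `from` using only untraversed edges.
std::size_t reachable(std::size_t from, std::size_t n, const std::vector<Edge>& edges,
                      const std::vector<bool>& used) {
    std::vector<bool> seen(n, false);
    std::vector<std::size_t> stack{from};
    seen[from] = true;
    std::size_t count = 0;
    while (!stack.empty()) {
        const std::size_t v = stack.back();
        stack.pop_back();
        ++count;
        for (std::size_t i = 0; i < edges.size(); ++i) {
            if (used[i]) continue;
            std::size_t w;
            if (edges[i].first == v) w = edges[i].second;
            else if (edges[i].second == v) w = edges[i].first;
            else continue;
            if (!seen[w]) { seen[w] = true; stack.push_back(w); }
        }
    }
    return count;
}

// Returns the sequence of vertices visited; a bridge is taken only when
// no other untraversed edge is available at the current vertex.
std::vector<std::size_t> fleury(std::size_t n, const std::vector<Edge>& edges,
                                std::size_t start) {
    assert(start < n);
    std::vector<bool> used(edges.size(), false);
    std::vector<std::size_t> path{start};
    std::size_t v = start;
    for (std::size_t step = 0; step < edges.size(); ++step) {
        const std::size_t before = reachable(v, n, edges, used);
        std::size_t chosen = edges.size();  // sentinel: nothing chosen yet
        for (std::size_t i = 0; i < edges.size(); ++i) {
            if (used[i] || (edges[i].first != v && edges[i].second != v)) continue;
            if (chosen == edges.size()) chosen = i;  // fallback, possibly a bridge
            used[i] = true;                          // tentatively remove edge i
            const std::size_t after = reachable(v, n, edges, used);
            used[i] = false;
            if (after == before) { chosen = i; break; }  // not a bridge
        }
        if (chosen == edges.size()) break;  // no untraversed edge left at v
        used[chosen] = true;
        v = (edges[chosen].first == v) ? edges[chosen].second : edges[chosen].first;
        path.push_back(v);
    }
    return path;
}
```

The repeated reachability searches make this sketch slow on large graphs; it is meant only to show where the bridge test sits in the loop.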

Eulerian and Hamiltonian Graphs (continued)
  • Eulerian Graphs (continued) – The Chinese Postman Problem
    • The Chinese postman problem is to find a shortest closed path or circuit that visits every edge of a (connected) undirected graph
    • Alan Goldman of the U.S. National Bureau of Standards first coined the name 'Chinese Postman Problem' for this problem, as it was originally studied by the Chinese mathematician Mei-Ku Kwan in 1962
    • When the graph has an Eulerian circuit, that circuit is an optimal solution
    • If it doesn’t, the graph can be amplified by including each edge as many times as it appears in the postman’s walk
    • If this is done, we need to construct the graph in such a way as to minimize the sum of the distances of the added edges

Eulerian and Hamiltonian Graphs (continued)
  • Eulerian Graphs (continued) – The Chinese Postman Problem
    • First we group the odd-degree vertices into pairs and, for each pair, add a path of new edges that duplicates an already existing path between the two vertices
    • The problem now is to find a grouping of the odd-degree vertices such that the total distance of the added paths is minimum
    • An algorithm to solve this was developed by Jack Edmonds and Ellis L. Johnson in 1973, based on earlier work by Edmonds in 1965
    • The pseudocode for this algorithm is shown on page 451
    • The task of finding a postman tour is illustrated in Figure 8.35 on page 452
    • The graph has six odd-degree vertices: c, d, f, g, h, and j

Eulerian and Hamiltonian Graphs (continued)
  • Eulerian Graphs (continued) – The Chinese Postman Problem
    • In Figure 8.35b–c the shortest paths between all pairs of these vertices are determined
    • A complete bipartite graph, H, is then found (Figure 8.35d), and an optimal assignment, M, is determined
    • A matching in an initial equality subgraph is found by using the optimalAssignment() algorithm (Figure 8.35e)
    • Two matchings are found (Figure 8.35f–g), and then a perfect matching (Figure 8.35h)
    • Using this, we amplify the original graph by adding new edges (dashed lines in Figure 8.35i), so there are no odd-degree vertices
    • Consequently, finding a closed Eulerian trail is possible
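
To make the pairing step concrete, here is a small sketch that finds the cheapest way to pair up the odd-degree vertices. It deliberately swaps in a brute-force dynamic program over subsets in place of the optimal-assignment method used above, so it is practical only for a handful of odd vertices. The function name `minPairingCost` and the input `dist` (shortest-path distances between the odd-degree vertices, assumed to be computed beforehand) are assumptions.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <limits>
#include <vector>

// dist[i][j] = shortest-path distance between odd-degree vertices i and j.
// Returns the minimum total length of the paths that must be duplicated.
double minPairingCost(const std::vector<std::vector<double>>& dist) {
    const std::size_t k = dist.size();
    assert(k % 2 == 0 && k <= 20);  // odd-degree vertices come in pairs; keep k small
    const double inf = std::numeric_limits<double>::infinity();
    // best[mask] = cheapest pairing of exactly the vertices whose bits are set in mask
    std::vector<double> best(std::size_t{1} << k, inf);
    best[0] = 0.0;
    for (std::size_t mask = 0; mask < best.size(); ++mask) {
        if (best[mask] == inf) continue;
        std::size_t i = 0;
        while (i < k && ((mask >> i) & 1u)) ++i;  // first vertex not yet paired
        if (i == k) continue;
        for (std::size_t j = i + 1; j < k; ++j) {
            if ((mask >> j) & 1u) continue;
            const std::size_t next = mask | (std::size_t{1} << i) | (std::size_t{1} << j);
            best[next] = std::min(best[next], best[mask] + dist[i][j]);
        }
    }
    return best.back();  // all vertices paired
}
```

With six odd-degree vertices, as in the example above, the table has only 64 entries.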

Eulerian and Hamiltonian Graphs (continued)
  • Hamiltonian Graphs
    • A Hamiltonian path is a path in an undirected graph that visits each vertex exactly once
    • A Hamiltonian cycle is a cycle that visits each vertex exactly once
    • Determining whether such paths and cycles exist in graphs is the Hamiltonian path problem, which is NP-complete
    • Hamiltonian graphs have no characterizing formula, but all complete graphs with at least three vertices are Hamiltonian
    • Hamiltonian paths and cycles are named after William Rowan Hamilton who studied them in 1857
    • The following theorem will prove useful in discussing Hamiltonian graphs

Eulerian and Hamiltonian Graphs (continued)
  • Hamiltonian Graphs (continued)

Theorem (Bondy and Chvátal 1976; Ore 1960). If $\mathrm{edge}(vu) \notin E$, graph $G^* = (V, E \cup \{\mathrm{edge}(vu)\})$ is Hamiltonian, and $\deg(v) + \deg(u) \ge |V|$, then graph $G = (V, E)$ is also Hamiltonian

    • The proof of this is shown on page 453; the theorem essentially says that some Hamiltonian graphs can be created from others by eliminating edges
    • This leads to an algorithm that first expands the graph with more edges until finding a Hamiltonian cycle is easy
    • Then the cycle is manipulated by adding and removing edges until a Hamiltonian cycle is found that uses only edges of the original graph
    • The algorithm is presented on pages 453 and 454
    • Figure 8.37 on page 455 shows an example of this
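
The expansion step suggested by the theorem can be sketched as below: keep joining nonadjacent vertices whose degrees sum to at least |V| until nothing more can be added. If the result is a complete graph, the theorem applied repeatedly says the original graph is Hamiltonian. This is only a sufficient test, and the function name `closureIsComplete` is an assumption; it does not reproduce the book's algorithm for recovering the cycle.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

using Edge = std::pair<std::size_t, std::size_t>;

// Returns true if repeatedly adding edge(vu) whenever deg(v) + deg(u) >= |V|
// turns the graph into a complete graph.
bool closureIsComplete(std::size_t n, const std::vector<Edge>& edges) {
    assert(n >= 3);
    std::vector<std::vector<bool>> adj(n, std::vector<bool>(n, false));
    std::vector<std::size_t> deg(n, 0);
    for (const Edge& e : edges) {
        assert(e.first < n && e.second < n && e.first != e.second);
        if (!adj[e.first][e.second]) {
            adj[e.first][e.second] = true;
            adj[e.second][e.first] = true;
            ++deg[e.first];
            ++deg[e.second];
        }
    }
    bool changed = true;
    while (changed) {  // repeat until no further edge qualifies
        changed = false;
        for (std::size_t u = 0; u < n; ++u) {
            for (std::size_t v = u + 1; v < n; ++v) {
                if (!adj[u][v] && deg[u] + deg[v] >= n) {
                    adj[u][v] = true;
                    adj[v][u] = true;
                    ++deg[u];
                    ++deg[v];
                    changed = true;
                }
            }
        }
    }
    for (std::size_t u = 0; u < n; ++u)
        for (std::size_t v = u + 1; v < n; ++v)
            if (!adj[u][v]) return false;
    return true;
}
```

A false result is inconclusive: a cycle on five vertices is Hamiltonian, yet no edge can be added to it under the degree condition.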

Eulerian and Hamiltonian Graphs (continued)
  • Hamiltonian Graphs (continued) – The Traveling Salesman Problem
    • The traveling salesman problem consists of finding the shortest possible route that visits each city (in a set of cities) exactly once and returns to the origin city
    • If the distance between each pair of the n cities is known, there are (n – 1)! possible routes
    • The problem is then to find a minimum Hamiltonian cycle
    • Many versions of this problem use the triangle inequality, $\mathrm{dist}(v_i v_j) \le \mathrm{dist}(v_i v_k) + \mathrm{dist}(v_k v_j)$
    • One possibility is to add to an already constructed path v_1, …, v_j a vertex v_{j+1} that is closest to v_j
    • The problem is that the last edge added may be as long as the total distance of the remaining edges
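
The closest-vertex strategy can be sketched in a few lines. The code assumes a complete graph given as a symmetric distance matrix; `nearestNeighborTour` and `tourLength` are illustrative names.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using Matrix = std::vector<std::vector<double>>;

// Builds a tour by repeatedly moving to the closest unvisited city.
// Ties are broken in favor of the lower index.
std::vector<std::size_t> nearestNeighborTour(const Matrix& dist, std::size_t start) {
    const std::size_t n = dist.size();
    assert(start < n);
    std::vector<bool> visited(n, false);
    std::vector<std::size_t> tour{start};
    visited[start] = true;
    std::size_t current = start;
    for (std::size_t step = 1; step < n; ++step) {
        std::size_t next = n;  // sentinel: no candidate yet
        for (std::size_t v = 0; v < n; ++v) {
            if (visited[v]) continue;
            if (next == n || dist[current][v] < dist[current][next]) next = v;
        }
        visited[next] = true;
        tour.push_back(next);
        current = next;
    }
    return tour;
}

// Length of the closed tour, including the edge back to the first city.
double tourLength(const Matrix& dist, const std::vector<std::size_t>& tour) {
    double total = 0.0;
    for (std::size_t i = 0; i < tour.size(); ++i)
        total += dist[tour[i]][tour[(i + 1) % tour.size()]];
    return total;
}
```

For four cities placed on a line at positions 0, 1, 3, and 6, starting from the first city gives a tour whose closing edge has length 6, exactly as long as the other three edges together.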

Eulerian and Hamiltonian Graphs (continued)
  • Hamiltonian Graphs (continued) – The Traveling Salesman Problem
    • Another possibility uses a minimum spanning tree
    • The length of the tree is defined to be the sum of the lengths of all the edges of the tree
    • Since removing an edge from the tour creates a spanning tree, the length of the minimum tour cannot be less than the length of the minimum spanning tree
    • Also, each edge of the tree is traversed twice in a depth-first search, so the length of the resulting tour is at most twice the length of the tree
    • However, a path that includes each edge twice includes some vertices more than once, and each vertex should be included only once

Eulerian and Hamiltonian Graphs (continued)
  • Hamiltonian Graphs (continued) – The Traveling Salesman Problem
    • So if a vertex is already in such a path, its second occurrence is eliminated, and the path contracted
    • This shortens the length of the path due to the triangle inequality
    • For example, Figure 8.38b (pages 456 and 457) shows the minimum spanning tree for the graph that connects the cities a through h in Figure 8.38a
    • Depth-first search yields the path in Figure 8.38c, and applying the triangle inequality repeatedly (Figure 8.38c–i) transforms it into the path in Figure 8.38i
    • This final path can be obtained directly from the minimum spanning tree in Figure 8.38b using preorder traversal
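
The shortcut through preorder traversal can be sketched as follows. The code assumes a complete graph given as a symmetric distance matrix that satisfies the triangle inequality; it builds the minimum spanning tree with a simple quadratic version of the Jarník-Prim method and then lists the vertices in preorder. The name `mstPreorderTour` is an assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <limits>
#include <vector>

using Matrix = std::vector<std::vector<double>>;

// Returns the order in which cities are visited; the tour closes by
// returning from the last city to `root`.
std::vector<std::size_t> mstPreorderTour(const Matrix& dist, std::size_t root) {
    const std::size_t n = dist.size();
    assert(root < n);
    const double inf = std::numeric_limits<double>::infinity();
    std::vector<double> key(n, inf);        // cheapest known edge into the tree
    std::vector<std::size_t> parent(n, n);  // n means "no parent"
    std::vector<bool> inTree(n, false);
    key[root] = 0.0;
    for (std::size_t step = 0; step < n; ++step) {
        std::size_t u = n;
        for (std::size_t v = 0; v < n; ++v)
            if (!inTree[v] && (u == n || key[v] < key[u])) u = v;
        inTree[u] = true;
        for (std::size_t v = 0; v < n; ++v) {
            if (!inTree[v] && dist[u][v] < key[v]) {
                key[v] = dist[u][v];
                parent[v] = u;
            }
        }
    }
    std::vector<std::vector<std::size_t>> children(n);
    for (std::size_t v = 0; v < n; ++v)
        if (parent[v] != n) children[parent[v]].push_back(v);
    // Iterative preorder traversal; children are visited in increasing index order.
    std::vector<std::size_t> tour;
    std::vector<std::size_t> stack{root};
    while (!stack.empty()) {
        const std::size_t v = stack.back();
        stack.pop_back();
        tour.push_back(v);
        for (auto it = children[v].rbegin(); it != children[v].rend(); ++it)
            stack.push_back(*it);
    }
    return tour;
}
```

Choosing a different `root` can change the tour and its length.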

Eulerian and Hamiltonian Graphs (continued)
  • Hamiltonian Graphs (continued) – The Traveling Salesman Problem
    • The tour in Figure 8.38i is obtained by considering a as the root of the tree, so the cities are visited in the order a, d, e, f, h, g, c, b, from which we return to a
    • This tour is minimum, which won’t always be the case
    • For example, if d is considered to be the root, the algorithm yields the path in Figure 8.38j, clearly not minimal
    • In another version of the algorithm, a tour is extended by adding to it the closest city
    • Since the tour is kept in one piece, it resembles a method developed by Vojtech Jarnik in 1930 (and separately by Robert C. Prim in 1957)

Eulerian and Hamiltonian Graphs (continued)
  • Hamiltonian Graphs (continued) – The Traveling Salesman Problem
    • This algorithm is shown on page 458
    • An example of its application is shown in Figure 8.39 on pages 458 and 459

Graph Coloring

Occasionally, we want to determine the minimum number of non-overlapping sets of vertices, where the vertices in each set are independent

By this we mean that the vertices are not connected by any edge

For example, we may have several tasks to be performed by several people

If each person can perform only one task at a time, the tasks must be scheduled so that no person is needed for two tasks at once

We can let the tasks be represented by the vertices of a graph, and join with an edge two tasks that require the same person

Graph Coloring (continued)

Then we try to construct the minimum number of sets of independent tasks

Because all the tasks in a given set can be done concurrently, the number of sets indicates the number of time slots needed

As a variation of this, we could join with an edge those tasks that cannot be performed concurrently

As before, the independent sets indicate the tasks that can be performed at the same time

However, in this case the minimum number of sets indicates the minimum number of people needed to perform the tasks

In general, two vertices are joined by an edge if they cannot be members of the same class

Graph Coloring (continued)

We can restate the problem to say that vertices of a graph are assigned colors so that vertices joined by an edge are different colors

So the task amounts to coming up with a graph coloring using a minimum number of colors

More formally, given a set of colors, C, we determine a function f : V → C so that if edge(vw) exists, f(v) ≠ f(w) and C is of minimum cardinality

The chromatic number of a graph G is the minimum number of colors needed to color the graph, denoted χ(G)

A graph is called k-colorable if it can be colored with k colors, that is, if χ(G) ≤ k

Graph Coloring (continued)
  • There may be many sets of minimum colors; no general formula exists for the chromatic number of an arbitrary graph
  • There are some special cases, however:
    • A complete graph, K_n, has the chromatic number χ(K_n) = n
    • For a cycle with an even number of edges, C_{2n}, χ(C_{2n}) = 2
    • For a cycle with an odd number of edges, C_{2n+1}, χ(C_{2n+1}) = 3
    • For a bipartite graph, G, χ(G) ≤ 2
  • The determination of a graph’s chromatic number is an NP-complete problem
  • Consequently, techniques need to be used that can color a graph with a number of colors close to the chromatic number

Graph Coloring (continued)

Sequential coloring is an approach that establishes sequences of vertices and colors before coloring the vertices

Then the next vertex in sequence is colored with the lowest-numbered color possible

This algorithm appears on page 460

The algorithm does not specify any ordering criteria for the vertices (order of colors makes no difference)

One possibility is to use the indices assigned to the vertices before the algorithm is executed, as shown in Figure 8.40b

This can result in a wide disparity between the coloring and the chromatic number, however
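
A minimal sketch of sequential coloring shows how much the vertex order matters. It assumes an adjacency-list graph without self-loops and colors numbered from 0; the names `sequentialColoring` and `colorsUsed` are illustrative and are not taken from the book's pseudocode.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Colors vertices in the given order, giving each the lowest-numbered color
// not already used by one of its neighbors.
std::vector<int> sequentialColoring(const std::vector<std::vector<std::size_t>>& adj,
                                    const std::vector<std::size_t>& order) {
    const std::size_t n = adj.size();
    assert(order.size() == n);
    std::vector<int> color(n, -1);  // -1 means "not yet colored"
    for (std::size_t v : order) {
        std::vector<bool> taken(n, false);
        for (std::size_t w : adj[v])
            if (color[w] >= 0) taken[static_cast<std::size_t>(color[w])] = true;
        int c = 0;
        while (taken[static_cast<std::size_t>(c)]) ++c;
        color[v] = c;
    }
    return color;
}

// Number of distinct colors in a coloring produced above.
int colorsUsed(const std::vector<int>& color) {
    return color.empty() ? 0 : *std::max_element(color.begin(), color.end()) + 1;
}
```

On the path 0, 1, 2, 3 the index order needs two colors, while the order 0, 3, 1, 2 forces a third color even though the chromatic number is 2.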

Graph Coloring (continued)

Fig. 8.40 (a) A graph used for coloring; (b) colors assigned to vertices with the sequential coloring algorithm that orders vertices by index number; (c) vertices are put in the largest first sequence; (d) graph coloring obtained with the Brélaz algorithm

Graph Coloring (continued)

A theorem due to Dominic Welsh and M. B. Powell (1967) will be of use (the proof is on page 460)

Theorem: For the sequential coloring algorithm, the number of colors needed to color the graph satisfies $\chi(G) \le \max_i \min(i, \deg(v_i) + 1)$

Applying this to the graph of Figure 8.40a, we have $\chi(G) \le \max(\min(1,4), \min(2,4), \min(3,3), \min(4,3), \min(5,3), \min(6,5), \min(7,6), \min(8,4)) = \max(1, 2, 3, 3, 3, 5, 6, 4) = 6$

The theorem suggests that vertices of higher degree be placed first, so the min value is their position in the sequence

Vertices of lower degree get placed last, so their minimum value is the degree of the vertex

This leads to the largest first approach, where the vertices are ordered in descending order by degree
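
The bound and the largest first ordering can be computed with a short sketch. The degree values used in the usage note are inferred from the min(i, deg + 1) terms in the worked example above (deg + 1 = 4, 4, 3, 3, 3, 5, 6, 4 for v1 through v8); the function names are assumptions, and vertices are numbered from 0 in the code.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>

// order[pos] is the vertex placed at position pos + 1 of the sequence.
// Returns max over positions i of min(i, deg(v_i) + 1).
std::size_t sequentialColoringBound(const std::vector<std::size_t>& degree,
                                    const std::vector<std::size_t>& order) {
    assert(order.size() == degree.size());
    std::size_t bound = 0;
    for (std::size_t pos = 0; pos < order.size(); ++pos)
        bound = std::max(bound, std::min(pos + 1, degree[order[pos]] + 1));
    return bound;
}

// Vertices in descending order of degree; ties keep their index order.
std::vector<std::size_t> largestFirstOrder(const std::vector<std::size_t>& degree) {
    std::vector<std::size_t> order(degree.size());
    std::iota(order.begin(), order.end(), std::size_t{0});
    std::stable_sort(order.begin(), order.end(),
                     [&degree](std::size_t a, std::size_t b) { return degree[a] > degree[b]; });
    return order;
}
```

With degrees 3, 3, 2, 2, 2, 4, 5, 3 the index order gives a bound of 6, while the largest first order (v7, v6, v1, v2, v8, v3, v4, v5) lowers it to 4.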

Graph Coloring (continued)

Doing it this way gives us the order v7, v6, v1, v2, v8, v3, v4, v5, where v7 gets colored first, as seen in Figure 8.40c

This also gives us a better sense of the chromatic number, because with this ordering χ(G) ≤ 4

Although this ordering method uses a single criterion, there is no restriction on the number of criteria that can be applied

This can be helpful in breaking ties, since in our example, two vertices with the same degree are chosen by their index order

In 1979, Daniel Brélaz proposed an algorithm where the saturation degree of a vertex (the number of different colors used to color the vertex’s neighbors) is used

Graph Coloring (continued)

If a tie occurs, it is broken by choosing the vertex with the largest uncolored degree, which is the number of uncolored vertices adjacent to the vertex

This algorithm is on page 462, and is applied in Figure 8.40d

First we choose v7 because it has the highest degree; then vertices 1, 3, 4, 6 and 8 have their saturations set to 1

From these, v6 is chosen, since it has the most uncolored neighbors

The saturations of vertices 1 and 8 are changed to 2, and since their saturations and numbers of uncolored neighbors are equal, we rely on the index to select v1; the remainder are as shown in the figure
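
The selection rule (highest saturation degree, then largest uncolored degree, then lowest index) can be sketched as follows. This is an unoptimized illustration that recomputes saturation from scratch at every step; it assumes an adjacency-list graph without self-loops, and the names `brelazColoring` and `colorsUsed` are assumptions rather than the book's identifiers.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

std::vector<int> brelazColoring(const std::vector<std::vector<std::size_t>>& adj) {
    const std::size_t n = adj.size();
    std::vector<int> color(n, -1);  // -1 means "not yet colored"
    for (std::size_t step = 0; step < n; ++step) {
        std::size_t best = n, bestSat = 0, bestUncolored = 0;
        std::vector<bool> bestTaken;
        for (std::size_t v = 0; v < n; ++v) {
            if (color[v] >= 0) continue;
            std::vector<bool> taken(n, false);  // colors seen among v's neighbors
            std::size_t sat = 0, uncolored = 0;
            for (std::size_t w : adj[v]) {
                assert(w < n);
                if (color[w] < 0) { ++uncolored; continue; }
                const std::size_t c = static_cast<std::size_t>(color[w]);
                if (!taken[c]) { taken[c] = true; ++sat; }
            }
            // Strict comparisons leave ties to the lowest index.
            if (best == n || sat > bestSat || (sat == bestSat && uncolored > bestUncolored)) {
                best = v;
                bestSat = sat;
                bestUncolored = uncolored;
                bestTaken = taken;
            }
        }
        int c = 0;
        while (bestTaken[static_cast<std::size_t>(c)]) ++c;  // lowest free color
        color[best] = c;
    }
    return color;
}

int colorsUsed(const std::vector<int>& color) {
    return color.empty() ? 0 : *std::max_element(color.begin(), color.end()) + 1;
}
```

In the first step every saturation is 0, so the vertex of highest degree is chosen, matching the choice of v7 described above.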

NP-Complete Problems in Graph Theory
  • The Clique Problem
    • A clique in an undirected graph is a subset of its vertices such that every two vertices in the subset are connected by an edge
    • The clique problem is to determine, for some graph G, whether or not it contains a clique K_m for some integer m
    • The problem is in NP because we can check in polynomial time whether a set of m vertices forming a subgraph is a clique
    • To show it is NP-complete, we can use the 3-satisfiability problem and reduce it to the clique problem
    • The reduction is performed by showing that for a Boolean expression BE in CNF with three variables (or their negations) in each alternative we can construct a graph such that the expression is satisfiable if and only if there is a clique of m vertices in the graph
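
The polynomial-time check that places the problem in NP can be sketched directly: every pair of vertices in the proposed set must be adjacent. The adjacency-matrix representation and the names `makeAdjacency` and `isClique` are assumptions.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

using AdjMatrix = std::vector<std::vector<bool>>;

AdjMatrix makeAdjacency(std::size_t n,
                        const std::vector<std::pair<std::size_t, std::size_t>>& edges) {
    AdjMatrix adj(n, std::vector<bool>(n, false));
    for (const auto& e : edges) {
        assert(e.first < n && e.second < n);
        adj[e.first][e.second] = true;
        adj[e.second][e.first] = true;
    }
    return adj;
}

// Checks all pairs of the m listed vertices: O(m^2) adjacency lookups.
bool isClique(const AdjMatrix& adj, const std::vector<std::size_t>& subset) {
    for (std::size_t i = 0; i < subset.size(); ++i) {
        for (std::size_t j = i + 1; j < subset.size(); ++j) {
            if (subset[i] == subset[j]) return false;         // repeated vertex
            if (!adj[subset[i]][subset[j]]) return false;     // missing edge
        }
    }
    return true;
}
```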

NP-Complete Problems in Graph Theory (continued)
  • The Clique Problem (continued)
    • We will let m be the number of alternatives in BE, such that we have BE = A1 ∧ A2 ∧ … ∧ Am
    • Each Ai = (p ∨ q ∨ r), where p, q, and r are Boolean variables or their negations
    • A graph is constructed where the vertices represent all the variables and their negations found in BE
    • An edge will join two vertices if they are not complements and they are in different alternatives
    • The expression BE = (x ∨ y ∨ ¬z) ∧ (x ∨ ¬y ∨ ¬z) ∧ (w ∨ ¬x ∨ ¬y) corresponds to the graph in Figure 8.41
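
The construction can be sketched as follows, under the assumption (standard for this reduction, but not spelled out above) that each occurrence of a variable or negation in an alternative gets its own vertex, so BE with m alternatives yields 3m vertices. The brute-force clique search is included only to exercise the construction on tiny inputs; all names are illustrative.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

struct Literal {
    char var;      // variable name
    bool negated;  // true for the negation of var
};
using Clause = std::array<Literal, 3>;
using AdjMatrix = std::vector<std::vector<bool>>;

// Vertex 3*i + j stands for the j-th literal of alternative i.
AdjMatrix buildCliqueGraph(const std::vector<Clause>& be) {
    const std::size_t n = 3 * be.size();
    AdjMatrix adj(n, std::vector<bool>(n, false));
    for (std::size_t u = 0; u < n; ++u) {
        for (std::size_t v = 0; v < n; ++v) {
            const Literal& a = be[u / 3][u % 3];
            const Literal& b = be[v / 3][v % 3];
            const bool complements = a.var == b.var && a.negated != b.negated;
            // Join literals from different alternatives unless they are complements.
            if (u / 3 != v / 3 && !complements) adj[u][v] = true;
        }
    }
    return adj;
}

// Tries every way of picking one vertex per alternative (3^m choices).
// Any m-clique must have this shape, because no edge joins two vertices
// of the same alternative.
bool hasCliqueAcrossAlternatives(const AdjMatrix& adj) {
    assert(adj.size() % 3 == 0);
    const std::size_t m = adj.size() / 3;
    std::size_t combos = 1;
    for (std::size_t i = 0; i < m; ++i) combos *= 3;
    for (std::size_t code = 0; code < combos; ++code) {
        std::vector<std::size_t> pick(m);
        std::size_t rest = code;
        for (std::size_t i = 0; i < m; ++i) { pick[i] = 3 * i + rest % 3; rest /= 3; }
        bool ok = true;
        for (std::size_t i = 0; i < m && ok; ++i)
            for (std::size_t j = i + 1; j < m; ++j)
                if (!adj[pick[i]][pick[j]]) { ok = false; break; }
        if (ok) return true;
    }
    return false;
}
```

The exhaustive search is exponential in m, which is consistent with the problem being NP-complete; only the graph construction itself is polynomial.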

NP-Complete Problems in Graph Theory (continued)
  • The Clique Problem (continued)

Fig. 8.41 A graph corresponding to the Boolean expression (x ∨ y ∨ ¬z) ∧ (x ∨ ¬y ∨ ¬z) ∧ (w ∨ ¬x ∨ ¬y)

    • An edge between variables represents the possibility that both variables are true at the same time

NP-Complete Problems in Graph Theory (continued)
  • The Clique Problem (continued)
    • An m-clique represents the possibility that a variable from each alternative is true, making the BE true
    • Each triangle in Figure 8.41 represents a 3-clique
    • This way, if BE is satisfiable, an m-clique exists, and if an m-clique exists, BE is satisfiable
    • So the satisfiability problem is reduced to the clique problem
    • Since the satisfiability problem is NP-complete, the clique problem is NP-complete as well

NP-Complete Problems in Graph Theory (continued)
  • The 3-Colorability Problem
    • The 3-colorability problem is the question of whether or not a graph can be colored with three colors
    • As with the clique problem, we’ll show this is NP-complete by reducing the 3-satisfiability problem to it
    • The problem is in NP because we can guess a coloring of the vertices in three colors and check that the coloring is correct in quadratic time
    • We will use an auxiliary 9-subgraph to reduce the 3-satisfiability problem to the 3-colorability problem
    • The 9-subgraph takes 3 vertices from an existing graph and adds 6 new vertices and 10 edges, as can be seen in Figure 8.42a
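
The check that puts 3-colorability in NP is simple to sketch: every vertex must receive one of the k colors, and no edge may join two vertices of the same color. The edge-list representation and the name `isProperColoring` are assumptions; with at most |V|² edges the check stays within the quadratic bound mentioned above.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

using Edge = std::pair<std::size_t, std::size_t>;

// color[v] must lie in 0..k-1; returns true if no edge is monochromatic.
bool isProperColoring(std::size_t n, const std::vector<Edge>& edges,
                      const std::vector<int>& color, int k) {
    if (color.size() != n) return false;
    for (int c : color)
        if (c < 0 || c >= k) return false;
    for (const Edge& e : edges) {
        assert(e.first < n && e.second < n);
        if (color[e.first] == color[e.second]) return false;
    }
    return true;
}
```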

NP-Complete Problems in Graph Theory (continued)

The 3-Colorability Problem (continued)

Fig. 8.42 (a) A 9-subgraph; (b) a graph corresponding to the Boolean expression (¬w ∨ x ∨ y) ∧ (¬w ∨ ¬y ∨ z) ∧ (w ∨ ¬y ∨ ¬z)

NP-Complete Problems in Graph Theory (continued)
  • The 3-Colorability Problem (continued)
    • Now, consider the set of three colors {f, t, n} corresponding to (fuchsia/false, turquoise/true, nasturtium/neutral) used to color the graph
    • The following lemma will help us in demonstrating the reducibility of the 3-satisfiability problem to the 3-colorability problem

Lemma. 1) If all three vertices, v1, v2, and v3, of a 9-subgraph are colored with f, then vertex v4 must also be colored with f to have the 9-subgraph colored correctly. 2) If only colors t and f can be used to color vertices v1, v2, and v3 of a 9-subgraph, and at least one is colored with t, then vertex v4 can be colored with t

NP-Complete Problems in Graph Theory (continued)
  • The 3-Colorability Problem (continued)
    • Now the graph for the given Boolean expression BE of k alternatives is constructed in the following way
    • There are two special vertices, a and b, and edge(ab) in the graph; there is also a vertex for each variable in BE and for its negation
    • The graph includes edge(ax), edge(a(¬x)), and edge(x(¬x)) for each variable, x, and its negation, ¬x
    • For each alternative p ∨ q ∨ r included in BE, the graph has a 9-subgraph whose vertices v1, v2, and v3 correspond to the three Boolean variables or their negations p, q, and r
    • Lastly, the graph includes edge(v4b) for each 9-subgraph

NP-Complete Problems in Graph Theory (continued)
  • The 3-Colorability Problem (continued)
    • The graph corresponding to (¬w ∨ x ∨ y) ∧ (¬w ∨ ¬y ∨ z) ∧ (w ∨ ¬y ∨ ¬z) is shown in Figure 8.42b
    • Now we can claim that if a Boolean expression BE is satisfiable, the graph corresponding to it is 3-colorable
    • For every variable x in BE, if x is true we set color(x) = t and color(¬x) = f; otherwise color(x) = f and color(¬x) = t
    • The Boolean expression is satisfied when every alternative in BE is true
    • This takes place when at least one variable or its negation is true in each alternative

NP-Complete Problems in Graph Theory (continued)
  • The 3-Colorability Problem (continued)
    • Since each neighbor of a has color t or f, and since at least one of the three vertices of each 9-subgraph has color t, each 9-subgraph is 3-colorable
    • Thus color(v4) = t, and the entire graph is 3-colorable by setting color(a) = n and color(b) = f
    • Now, suppose a graph as in Figure 8.42b is 3-colorable and that color(a) = n and color(b) = f
    • Since color(a) = n, the neighbors of a have color f or t, and this can be interpreted as the Boolean variable associated with the vertices being true or false

NP-Complete Problems in Graph Theory (continued)
  • The 3-Colorability Problem (continued)
    • If all three vertices v1, v2, and v3 of a 9-subgraph had color f, then by the lemma vertex v4 would also have color f, but this would conflict with color f of vertex b
    • So no 9-subgraph’s vertices can all have color f; one must be t
    • As a consequence, the alternative corresponding to each 9-subgraph is true, so the entire Boolean expression is satisfiable

NP-Complete Problems in Graph Theory (continued)
  • The Vertex Cover Problem
    • A vertex cover of a graph is a set of vertices such that each edge of the graph is incident to at least one vertex of the set
    • In this way the vertices in the set cover all the edges
    • The problem of determining whether a graph, G, has a vertex cover containing at most k vertices for some integer k is NP-complete
    • This problem is in NP because a solution can be guessed and checked in polynomial time
    • To show it is NP-complete, we’ll reduce the clique problem to the vertex cover problem

NP-Complete Problems in Graph Theory (continued)
  • The Vertex Cover Problem (continued)
    • The first thing to do is define the complement graph of G, which has the same vertices as G but whose edges are exactly those not in G
    • The reduction algorithm converts a graph G with a (|V| – k)-clique into its complement, which has a vertex cover of size k
    • If C = (VC, EC) is a clique in G, vertices from V – VC cover all the edges in the complement, because the complement has no edges with both endpoints in VC
    • As a result, V – VC is a vertex cover in the complement graph
    • Figure 8.43a shows a graph with a clique and 8.43b shows a complement graph with a vertex cover
    • Now suppose there is a vertex cover W for the complement graph

NP-Complete Problems in Graph Theory (continued)
  • The Vertex Cover Problem (continued)
    • If W contains neither endpoint of some edge(vu), that edge cannot be in the complement, so it must be in G, and both of its endpoints are in V – W
    • Therefore, VC = V – W forms a clique
    • As a result, a positive answer to the clique problem for G corresponds, through the conversion, to a positive answer to the vertex cover problem for the complement, and vice versa
    • And since the former is NP-complete, so is the latter
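
The correspondence between a clique in G and a vertex cover in the complement can be checked on small examples with the sketch below. The adjacency-matrix representation and all function names are assumptions; sets of vertices are passed as membership flags.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

using AdjMatrix = std::vector<std::vector<bool>>;

AdjMatrix makeGraph(std::size_t n,
                    const std::vector<std::pair<std::size_t, std::size_t>>& edges) {
    AdjMatrix g(n, std::vector<bool>(n, false));
    for (const auto& e : edges) {
        assert(e.first < n && e.second < n && e.first != e.second);
        g[e.first][e.second] = true;
        g[e.second][e.first] = true;
    }
    return g;
}

// Same vertices; two distinct vertices are adjacent exactly when they are not adjacent in g.
AdjMatrix complement(const AdjMatrix& g) {
    const std::size_t n = g.size();
    AdjMatrix c(n, std::vector<bool>(n, false));
    for (std::size_t u = 0; u < n; ++u)
        for (std::size_t v = 0; v < n; ++v)
            if (u != v) c[u][v] = !g[u][v];
    return c;
}

// Every two distinct members of the set must be adjacent.
bool isClique(const AdjMatrix& g, const std::vector<bool>& inSet) {
    for (std::size_t u = 0; u < g.size(); ++u)
        for (std::size_t v = u + 1; v < g.size(); ++v)
            if (inSet[u] && inSet[v] && !g[u][v]) return false;
    return true;
}

// Every edge must have at least one endpoint in the set.
bool isVertexCover(const AdjMatrix& g, const std::vector<bool>& inSet) {
    for (std::size_t u = 0; u < g.size(); ++u)
        for (std::size_t v = u + 1; v < g.size(); ++v)
            if (g[u][v] && !inSet[u] && !inSet[v]) return false;
    return true;
}
```

For a triangle 0, 1, 2 with a pendant vertex 3 attached to 2, the set {0, 1, 2} is a clique and its complement set {3} covers both edges of the complement graph.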

Fig. 8.43 (a) A graph with a clique; (b) a complement graph

NP-Complete Problems in Graph Theory (continued)
  • The Hamiltonian Cycle Problem
    • That the Hamiltonian cycle problem is NP-complete can be shown by reducing the vertex cover problem to the Hamiltonian cycle problem
    • We will make use of an auxiliary 12-subgraph, as shown in Figure 8.44a
    • Each edge(vu) of the graph G is converted into a 12-subgraph so that one side of the subgraph (vertices a and b) corresponds to a vertex v of G and the other side (vertices c and d) corresponds to vertex u
    • After entering a side of the 12-subgraph at vertex a, we can go through all 12 vertices in order a, c, d, b and exit at b on the same side
    • We can also go directly from a to b, and if there is a Hamiltonian cycle in the entire graph, vertices c and d are traversed in another visit of the 12-subgraph

NP-Complete Problems in Graph Theory (continued)

The Hamiltonian Cycle Problem (continued)

Fig. 8.44 (a) A 12-subgraph; (b) a graph G and (c) its transformation, graph GH

NP-Complete Problems in Graph Theory (continued)
  • The Hamiltonian Cycle Problem (continued)
    • Any other path through the 12-subgraph would render building a Hamiltonian cycle of the entire graph impossible
    • Now, assuming we have a graph G, we can proceed to build another graph, GH, in the following manner
    • We first create a set of vertices u1, u2, …, uk, where the value k is the parameter that corresponds to the vertex cover problem for graph G
    • Next, for each edge of G, we create a 12-subgraph, and those 12-subgraphs associated with a vertex v are connected together on the sides corresponding to v
    • Finally, each endpoint of such a string of 12-subgraphs is connected to the vertices u1, u2, …, uk

NP-Complete Problems in Graph Theory (continued)
  • The Hamiltonian Cycle Problem (continued)
    • The result of this transformation from G to GH for k = 3 is shown in Figure 8.44b–c
    • Figure 8.44c only shows some of the connections, to avoid clutter; the small segments from the other vertices indicate other connections
    • Now the claim is that there is a Hamiltonian cycle in GH if and only if there is a vertex cover of size k in G
    • We’ll start by assuming there is a vertex cover in G, designated by the set W = {v1, v2, …, vk}
    • Next, we’ll assert there is a Hamiltonian cycle in GH, which is formed in the following procedure

NP-Complete Problems in Graph Theory (continued)
  • The Hamiltonian Cycle Problem (continued)
    • Starting at u1, we go through the sides of 12-subgraphs corresponding to v1
    • We will go through all the 12 vertices of a particular 12-subgraph if the other side of it does not correspond to a vertex in set W
    • Otherwise we go straight through the 12-subgraph, which means we won’t traverse 6 of the vertices corresponding to a vertex w
    • However, we will traverse them when we process that part of the Hamiltonian cycle corresponding to w
    • Once we reach the end of the string of 12-subgraphs, we go to vertex u2 and repeat this process for vertex v2, etc.
    • For the last vertex uk, we process vk and end the path at u1

NP-Complete Problems in Graph Theory (continued)
  • The Hamiltonian Cycle Problem (continued)
    • The result of this is the creation of a Hamiltonian cycle
    • The thick line in Figure 8.44c represents the part of the Hamiltonian cycle matching v1 that starts at u1 and ends at u2
    • Because the cover in this case is W = {v1, v2, v6}, this processing continues at u2 and ends at u3 for v2, and then for v6 from u3 to u1
    • Conversely, if GH has a Hamiltonian cycle, the cycle includes subpaths through k strings of 12-subgraphs, and these strings correspond to k vertices in G that form a cover
    • Consequently, we have shown the reducibility of the vertex cover problem to the Hamiltonian cycle problem, and since the former is NP-complete, so is the latter

NP-Complete Problems in Graph Theory (continued)
  • The Hamiltonian Cycle Problem (continued)
    • As an afterthought, now consider the traveling salesman problem
    • Given a graph with a distance assigned to each edge, we try to identify a cycle through all the vertices with a total distance not greater than some integer, k
    • We can demonstrate this problem is NP-complete by reducing the Hamiltonian cycle problem to it
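
One common way to carry out this reduction, sketched here as an assumption since the details are not given above, is to give every edge of G distance 1 and every missing edge distance 2, and then ask for a tour of total distance at most |V|. Such a tour exists exactly when G has a Hamiltonian cycle. The brute-force tour search is included only so the construction can be tried on tiny graphs; all names are illustrative.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <numeric>
#include <utility>
#include <vector>

using Edge = std::pair<std::size_t, std::size_t>;
using Matrix = std::vector<std::vector<int>>;

// Distance 1 for edges of G, distance 2 for all other pairs of vertices.
Matrix tspInstanceFromGraph(std::size_t n, const std::vector<Edge>& edges) {
    Matrix dist(n, std::vector<int>(n, 2));
    for (std::size_t i = 0; i < n; ++i) dist[i][i] = 0;
    for (const Edge& e : edges) {
        assert(e.first < n && e.second < n && e.first != e.second);
        dist[e.first][e.second] = 1;
        dist[e.second][e.first] = 1;
    }
    return dist;
}

// Exhaustive check over all (n - 1)! tours that start and end at vertex 0.
bool hasTourWithin(const Matrix& dist, int k) {
    const std::size_t n = dist.size();
    assert(n >= 3 && n <= 10);  // brute force is only feasible for tiny inputs
    std::vector<std::size_t> perm(n - 1);
    std::iota(perm.begin(), perm.end(), std::size_t{1});
    do {
        int total = dist[0][perm.front()] + dist[perm.back()][0];
        for (std::size_t i = 0; i + 1 < perm.size(); ++i)
            total += dist[perm[i]][perm[i + 1]];
        if (total <= k) return true;
    } while (std::next_permutation(perm.begin(), perm.end()));
    return false;
}
```

For a cycle on four vertices the bound 4 is met, while for a path on four vertices every tour must use at least one missing edge and so has length at least 5.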
