Greedy Algorithms - PowerPoint PPT Presentation

Greedy Algorithms. Lecture 18. Prof. Sin-Min Lee Department of Computer Science. Greedy Algorithms:

Many real-world problems are optimization problems: they seek an optimal solution among many possible candidate solutions. A familiar scenario is the change-making problem that we often encounter at a cash register: receiving the fewest coins possible as change after paying for a purchase. For example, if a purchase costs \$5.27 and the customer pays \$6, how many coins, and which coins, should the cash register return?

The Make-Change algorithm:

For a given amount (e.g. \$0.73), use as many quarters (\$0.25) as possible without exceeding the amount. Then use as many dimes (\$0.10) as possible for the remainder, then as many nickels (\$0.05) as possible. Finally, use pennies (\$0.01) for the rest.

Example: To make change for the amount x = 67 (cents), use q = ⌊x/25⌋ = 2 quarters. The remainder is x – 25q = 17, for which we use d = ⌊17/10⌋ = 1 dime. The remainder is then 17 – 10d = 7, so we use n = ⌊7/5⌋ = 1 nickel. Finally, the remainder is 7 – 5n = 2, which requires p = 2 pennies. The total number of coins used = q + d + n + p = 6.

Note: The above algorithm is optimal in that it uses the fewest coins among all possible ways to make change for a given amount. (This fact can be proven formally.) However, this depends on the denominations of the US currency system. For example, consider a system with 1-cent, 6-cent, and 7-cent coins, and try to make change for x = 18 cents. The greedy strategy uses two 7-cent coins and four 1-cent coins, for a total of 6 coins. The optimal solution, however, uses three 6-cent coins.
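The Make-Change procedure and the 1/6/7-cent counterexample can be checked with a short Python sketch. The function name `greedy_change` and its dictionary return format are illustrative choices, not part of the lecture.

```python
def greedy_change(amount, denominations):
    """Return {coin: count}, always taking the largest coin that still fits."""
    counts = {}
    remaining = amount
    for coin in sorted(denominations, reverse=True):
        # Use as many of this coin as possible, then move to the next smaller one.
        counts[coin], remaining = divmod(remaining, coin)
    if remaining != 0:
        raise ValueError("amount cannot be formed with these denominations")
    return counts


us = greedy_change(67, [25, 10, 5, 1])    # 2 quarters, 1 dime, 1 nickel, 2 pennies
odd = greedy_change(18, [7, 6, 1])        # two 7s and four 1s, although three 6s suffice
print(sum(us.values()), sum(odd.values()))
```

Both calls use 6 coins; for the second system the optimum is 3 coins, which shows that the greedy choice is not optimal for every set of denominations.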

(1) Initialize C to be the set of candidate solutions
(2) Initialize a set S = ∅ (the set is to be the optimal solution we are constructing)
(3) While C ≠ ∅ and S is (still) not a solution do
(3.1) select x from set C using a greedy strategy
(3.2) delete x from C
(3.3) if S ∪ {x} is a feasible solution, then S = S ∪ {x} (i.e., add x to set S)
(4) if S is a solution then return S
(5) else return failure

In general, a greedy algorithm is efficient because it makes a sequence of (local) decisions and never backtracks. The solution is not always optimal, however.

We are given n objects, each with a weight wi and a value vi, and a knapsack of total capacity W. The problem is to pack the knapsack with these objects so as to maximize the total value of the objects packed without exceeding the knapsack’s capacity. More formally, let xi denote the fraction of object i to be included in the knapsack, 0 ≤ xi ≤ 1, for 1 ≤ i ≤ n. The problem is to find values for the xi such that

$$\sum_{i=1}^{n} x_i w_i \le W \quad \text{and} \quad \sum_{i=1}^{n} x_i v_i \ \text{is maximized.}$$

Note that we may assume $\sum_{i=1}^{n} w_i > W$, because otherwise we would choose xi = 1 for each i, which would be an obvious optimal solution.

(Max value) Sort the objects from the highest value to the lowest, then pick them in that order.

(Min weight) Sort the objects from the lowest weight to the highest, then pick them in that order.

(Max value/weight ratio) Sort the objects based on the value to weight ratios, from the highest to the lowest, then select.

Example: Given n = 5 objects and a knapsack capacity W = 100 as in Table I. The three solutions are given in Table II.

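Table I

| Object i | 1 | 2 | 3 | 4 | 5 |
| --- | --- | --- | --- | --- | --- |
| w | 10 | 20 | 30 | 40 | 50 |
| v | 20 | 30 | 66 | 40 | 60 |
| v/w | 2.0 | 1.5 | 2.2 | 1.0 | 1.2 |

Table II

| Select by | x1 | x2 | x3 | x4 | x5 | Value |
| --- | --- | --- | --- | --- | --- | --- |
| Max vi | 0 | 0 | 1 | 0.5 | 1 | 146 |
| Min wi | 1 | 1 | 1 | 1 | 0 | 156 |
| Max vi/wi | 1 | 1 | 1 | 0 | 0.8 | 164 |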

Input: an integer n, positive values wi and vi, for 1 ≤ i ≤ n, and another positive value W.

Output: n values xi such that 0 ≤ xi ≤ 1, $\sum_{i=1}^{n} x_i w_i \le W$, and $\sum_{i=1}^{n} x_i v_i$ is maximized.

Algorithm (of time complexity O(n lg n)):

(1) Sort the n objects from large to small based on the ratios vi/wi. We assume the arrays w[1..n] and v[1..n] store the respective weights and values after sorting.
(2) initialize array x[1..n] to zeros
(3) weight = 0; i = 1
(4) while (i ≤ n and weight < W) do
(4.1) if weight + w[i] ≤ W then x[i] = 1
(4.2) else x[i] = (W – weight) / w[i]
(4.3) weight = weight + x[i] * w[i]
(4.4) i++
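The three greedy strategies can be compared on the data of Table I with a small Python sketch. The function `fractional_knapsack` and its `key` parameter (a ranking function, highest first) are illustrative names; the loop body follows steps (2) to (4.4) of the algorithm, with the sort order made a parameter.

```python
def fractional_knapsack(weights, values, capacity, key):
    """Greedy fractional knapsack; key(w, v) ranks objects, highest rank first."""
    order = sorted(range(len(weights)),
                   key=lambda i: key(weights[i], values[i]), reverse=True)
    x = [0.0] * len(weights)          # x[i] = fraction of object i packed
    weight = 0.0
    for i in order:
        if weight >= capacity:
            break                     # knapsack is full
        if weight + weights[i] <= capacity:
            x[i] = 1.0                # the whole object fits
        else:
            x[i] = (capacity - weight) / weights[i]   # take only the part that fits
        weight += x[i] * weights[i]
    total = sum(xi * vi for xi, vi in zip(x, values))
    return x, total


w = [10, 20, 30, 40, 50]
v = [20, 30, 66, 40, 60]
by_value = fractional_knapsack(w, v, 100, key=lambda wi, vi: vi)
by_weight = fractional_knapsack(w, v, 100, key=lambda wi, vi: -wi)
by_ratio = fractional_knapsack(w, v, 100, key=lambda wi, vi: vi / wi)
print(by_value[1], by_weight[1], by_ratio[1])
```

The three totals correspond to the rows of Table II (146, 156, and 164); only the value/weight ordering reaches the best of the three.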

Example. Suppose there are 3 sorted lists L1, L2, and L3, of sizes 30, 20, and 10, respectively, which need to be merged into a combined sorted list, but we can merge only two at a time. We intend to find an optimal merge pattern which minimizes the total number of comparisons. (Merging two lists of sizes m and n takes at most m + n – 1 comparisons; we count it as m + n.) For example, we can merge L1 and L2, which uses 30 + 20 = 50 comparisons, resulting in a list of size 50. We can then merge this list with list L3, using another 50 + 10 = 60 comparisons, so the total number of comparisons is 50 + 60 = 110. Alternatively, we can merge lists L2 and L3, using 20 + 10 = 30 comparisons; the resulting list (size 30) can then be merged with list L1, for another 30 + 30 = 60 comparisons. So the total number of comparisons is 30 + 60 = 90. It doesn’t take long to see that this latter merge pattern is the optimal one.

Binary Merge Trees: We can depict the merge patterns using a binary tree, built from the leaf nodes (the initial lists) towards the root in which each merge of two nodes creates a parent node whose size is the sum of the sizes of the two children. For example, the two previous merge patterns are depicted in the following two figures:

[Figure: two binary merge trees, each with root 60 and leaves 30, 20, and 10.]

Merge L1 and L2, then with L3: Cost = 30*2 + 20*2 + 10*1 = 110

Merge L2 and L3, then with L1: Cost = 30*1 + 20*2 + 10*2 = 90

merge cost = sum of all weighted external path lengths

Input: n leaf nodes, each with an integer size, n ≥ 2.

Output: a binary tree with the given leaf nodes which has a minimum total weighted external path length

Algorithm:

(1) create a min-heap T[1..n] based on the n initial sizes
(2) while (the heap size ≥ 2) do
(2.1) delete from the heap the two smallest values, call them a and b, and create a parent node of size a + b for the nodes corresponding to these two values
(2.2) insert the value (a + b) into the heap, which corresponds to the node created in Step (2.1)

When the algorithm terminates, there is a single value left in the heap whose corresponding node is the root of the optimal binary merge tree. The algorithm’s time complexity is O(n lgn) because Step (1) takes O(n) time; Step (2) runs O(n) iterations, in which each iteration takes O(lgn) time.
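A minimal Python sketch of this heap-based procedure, using the standard `heapq` module, is given here. It tracks only the total merge cost rather than building the tree, and the function name `optimal_merge_cost` is an illustrative choice.

```python
import heapq


def optimal_merge_cost(sizes):
    """Total cost of repeatedly merging the two smallest remaining lists."""
    heap = list(sizes)
    heapq.heapify(heap)                  # Step (1): build a min-heap in O(n)
    total = 0
    while len(heap) >= 2:                # Step (2)
        a = heapq.heappop(heap)          # Step (2.1): the two smallest sizes
        b = heapq.heappop(heap)
        total += a + b                   # cost of this merge
        heapq.heappush(heap, a + b)      # Step (2.2): the new parent node
    return total


print(optimal_merge_cost([30, 20, 10]))      # three-list example
print(optimal_merge_cost([2, 3, 5, 7, 9]))   # five-leaf example
```

The sum of all parent sizes equals the total weighted external path length, so the two calls give 90 and 57, matching the costs worked out in the examples.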

Example: initially, 5 leaf nodes with sizes 2, 3, 5, 7, 9.

Iteration 1: merge 2 and 3 into 5

Iteration 2: merge 5 and 5 into 10

Iteration 3: merge 7 and 9 (chosen among 7, 9, and 10) into 16

Iteration 4: merge 10 and 16 into 26

[Figure: the resulting merge tree with root 26, internal nodes 10, 16, and 5, and leaves 2, 3, 5, 7, 9.]

Cost = 2*3 + 3*3 + 5*2 + 7*2 + 9*2 = 57.

We use induction on n ≥ 2 to show that the binary merge tree is optimal in that it gives the minimum total weighted external path length (among all possible ways to merge the given leaf nodes into a binary tree).

(Basis) When n = 2, there is only one way to merge two nodes.

(Induction Hypothesis) Suppose the merge tree is optimal when there are k leaf nodes, for some k ≥ 2.

(Induction) Consider (k + 1) leaf nodes a1, a2, …, ak+1. We may assume nodes a1 and a2 have the smallest values, so they are merged in the first step of the merge algorithm into node b. Call the resulting merge tree T, and call the part of T excluding a1 and a2 T’ (see figure). Suppose S is an optimal binary merge tree. We make two observations. (1) If node x of S is a deepest internal node, we may swap its two children with nodes a1 and a2 in S without increasing the total weighted external path length. Thus, we may assume x is the parent of a1 and a2 in S, so that removing a1 and a2 leaves a subtree S’ with k leaf nodes x, a3, …, ak+1, where x has size a1 + a2. (2) The tree S’ must be an optimal merge tree for these k nodes, since otherwise S could be improved. By the induction hypothesis, T’ is also optimal for the same k sizes, so S’ and T’ have equal total weighted external path lengths. Since putting a1 and a2 back adds a1 + a2 to the cost of either tree, the total weighted external path length of T equals that of S, proving the optimality of T.

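[Figure: tree T, in which a1 and a2 are the children of node b and the rest of the tree is T’; tree S, in which a1 and a2 are the children of node x and the rest of the tree is S’.]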

Suppose we wish to save a text (ASCII) file on the disk or to transmit it through a network using an encoding scheme that minimizes the number of bits required. Without compression, characters are typically encoded by their ASCII codes with 8 bits per character. We can do better if we have the freedom to design our own encoding.

Example. Given a text file that uses only 5 different letters (a, e, i, s, t), the space character, and the newline character. Since there are 7 different characters, we could use 3 bits per character because that allows 8 bit patterns ranging from 000 through 111 (so we still have one pattern to spare). The following table shows the encoding of characters, their frequencies, and the size of the encoded (compressed) file.

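| Character | Frequency | Fixed-length code | Total bits | Variable-length code | Total bits |
| --- | --- | --- | --- | --- | --- |
| a | 10 | 000 | 30 | 001 | 30 |
| e | 15 | 001 | 45 | 01 | 30 |
| i | 12 | 010 | 36 | 10 | 24 |
| s | 3 | 011 | 9 | 00000 | 15 |
| t | 4 | 100 | 12 | 0001 | 16 |
| space | 13 | 101 | 39 | 11 | 26 |
| newline | 1 | 110 | 3 | 00001 | 5 |
| Total | 58 | | 174 | | 146 |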

If we can use variable lengths for the codes, we can actually compress more, as shown in the table above. However, the codes must satisfy the property that no code is the prefix of another code; such a code is called a prefix code.

How to design an optimal prefix code (i.e., with minimum total length) for a given file?

We can depict the codes for the given collection of characters using a binary tree as follows: reading each code from left to right, we construct a binary tree from the root following the left branch when encountering a ‘0’, right branch when encountering a ‘1’. We do this for all the codes by constructing a single combined binary tree. For example,

[Figure: the code tree built up step by step: after code 001; after codes 001 and 01; after codes 001, 01, and 10; and the complete tree for codes 001, 01, 10, 00000, 0001, 11, and 00001.]

Note: each code terminates at a leaf node, by the prefix property.

We note that the encoded file size is equal to the total weighted external path lengths if we assign the frequency to each leaf node. For example,

[Figure: the code tree with each leaf labeled by its character and frequency: ‘a’ 10, ‘e’ 15, ‘i’ 12, ‘s’ 3, ‘t’ 4, ‘ ’ 13, ‘\n’ 1.]

Total file size = 3*5 + 1*5 + 4*4 + 10*3 + 15*2 + 12*2 + 13*2 = 146, which is exactly the total weighted external path length.

We also note that in an optimal prefix code, each node in the tree has either no children or two children. Thus, the optimal binary merge tree algorithm finds the optimal code (Huffman code).

[Figure: node x has only one child y. Merging x and y into one node shortens the codes below it, reducing the total size.]
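As a sketch of how the merge-tree algorithm yields a Huffman code, the following Python builds the tree with `heapq` and reads the codes off it. The names `huffman_codes` and `tiebreak`, and the choice of labeling left branches 0 and right branches 1, are illustrative assumptions; the individual codes it produces may differ from those in the table, but the total length is the same minimum.

```python
import heapq
from itertools import count


def huffman_codes(freq):
    """Return {symbol: bitstring} by repeatedly merging the two lightest subtrees."""
    tiebreak = count()   # unique counter so equal weights never compare the trees
    heap = [(f, next(tiebreak), sym) for sym, f in freq.items()]
    heapq.heapify(heap)
    while len(heap) >= 2:
        fa, _, a = heapq.heappop(heap)
        fb, _, b = heapq.heappop(heap)
        # An internal node is a pair (left subtree, right subtree).
        heapq.heappush(heap, (fa + fb, next(tiebreak), (a, b)))

    codes = {}

    def walk(node, prefix):
        if isinstance(node, tuple):          # internal node: 0 = left, 1 = right
            walk(node[0], prefix + "0")
            walk(node[1], prefix + "1")
        else:                                # leaf: a single character
            codes[node] = prefix or "0"

    walk(heap[0][2], "")
    return codes


freq = {"a": 10, "e": 15, "i": 12, "s": 3, "t": 4, " ": 13, "\n": 1}
codes = huffman_codes(freq)
total_bits = sum(freq[s] * len(codes[s]) for s in freq)
print(total_bits)
```

For the frequencies in the table, `total_bits` comes to 146, the same size as the variable-length encoding shown earlier.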

Greedy Strategies Applied to Graph Problems:

We first review some notations and terms about graphs. A graph consists of vertices (nodes) and edges (arcs, links), in which each edge “connects” two vertices (not necessarily distinct). More formally, a graph G = (V, E), where V and E denote the sets of vertices and edges, respectively.

In this example, V = {1, 2, 3, 4}, E = {a, b, c, d, e}. Edges c and d are parallel edges; edge e is a self-loop. A path is a sequence of “adjacent” edges, e.g., path abeb, path acdab.

[Figure: a graph with vertices 1, 2, 3, 4 and edges a, b, c, d, e.]

Directed graphs vs. (un-directed) graphs:

If every edge has an orientation, e.g., an edge starting from node x terminating at node y, the graph is called a directed graph, or digraph for short. If all edges have no orientation, the graph is called an undirected graph, or simply, a graph. When there are no parallel edges (two edges that have identical end points), we could identify an edge with its two end points, such as edge (1,2), or edge (3,3). In an undirected graph, edge (1,2) is the same as edge (2,1). We will assume no parallel edges unless otherwise stated.

[Figure: A directed graph with vertices 1, 2, 3, 4 and edges a, b, c, d, e.] Edges c and d are parallel (directed) edges. Some directed paths are ad, ebac.

Both directed and undirected graphs appear often and naturally in many scientific (call graphs in program analysis), business (query trees, entity-relation diagrams in databases), and engineering (CAD design) applications. The simplest data structure for representing graphs and digraphs is a 2-dimensional array. Suppose G = (V, E) and |V| = n. Declare an array T[1..n][1..n] so that T[i][j] = 1 if there is an edge (i, j) ∈ E, and 0 otherwise. (Note that in an undirected graph, edges (i, j) and (j, i) refer to the same edge.)

[Figure: A 2-dimensional array for the digraph, called the adjacency matrix, with rows i = 1..4 and columns j = 1..4.]

Sometimes, edges of a graph or digraph are given a positive weight or cost value. In that case, the adjacency matrix can easily be modified so that T[i][j] = the weight of edge (i, j), or 0 if there is no edge (i, j). Since the adjacency matrix may contain many zeros (when the graph has few edges, known as sparse), a space-efficient representation uses linked lists representing the edges, known as the adjacency list representation.

[Figure: The adjacency lists for the digraph, which can store edge weights by adding another field in the list nodes.]
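The two representations can be sketched in Python as follows. The small weighted digraph in `edges` is a made-up example (the edges of the digraph in the figure are not recoverable from the transcript), and the helper names are illustrative.

```python
def adjacency_matrix(n, edges):
    """T[i][j] = weight of edge (i, j), or 0 if there is no such edge (nodes 1..n)."""
    T = [[0] * (n + 1) for _ in range(n + 1)]   # row and column 0 are unused
    for i, j, w in edges:
        T[i][j] = w
    return T


def adjacency_lists(n, edges):
    """adj[i] = list of (neighbor, weight) pairs for the edges leaving node i."""
    adj = {i: [] for i in range(1, n + 1)}
    for i, j, w in edges:
        adj[i].append((j, w))
    return adj


# Hypothetical digraph: (from, to, weight)
edges = [(1, 2, 5), (2, 3, 1), (3, 1, 4), (3, 4, 2)]
T = adjacency_matrix(4, edges)
adj = adjacency_lists(4, edges)
print(T[1][2], T[2][1], adj[3])
```

The matrix always takes n × n cells, while the lists take space proportional to n plus the number of edges, which is the saving the text refers to for sparse graphs.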

The Minimum Spanning Tree (MST) Problem:

We are given a weighted (undirected) graph G = (V, E), where each edge e has a positive weight w(e). A spanning tree of G is a tree (connected graph without cycles, or circuits) which has V as its vertex set, i.e., the tree connects all vertices of the graph G. If |V| = n, then the tree has n – 1 edges (this is a fact which can be proved by induction). A minimum spanning tree of G is a spanning tree that has the minimum total edge weight.

[Figure: A weighted graph on nodes 1 to 5 with no parallel edges or self-loops.]

[Figure: A minimum spanning tree (of 4 edges), weight = 3 + 2 + 4 + 6 = 15.]

Prim’s Algorithm for the Minimum Spanning Tree problem:

Create an array B[1..n] to store the nodes of the MST, and an array T[1..n –1] to store the edges of the MST. Starting with node 1 (actually, any node can be the starting node), put node 1 in B[1], find a node that is the closest (i.e., an edge connected to node 1 that has the minimum weight, ties broken arbitrarily). Put this node as B[2], and the edge as T[1]. Next look for a node connected from either B[1] or B[2] that is the closest, store the node as B[3], and the corresponding edge as T[2]. In general, in the kth iteration, look for a node not already in B[1..k] that is the closest to any node in B[1..k]. Put this node as B[k+1], the corresponding edge as T[k]. Repeat this process for n –1 iterations (k = 1 to n –1). This is a greedy strategy because in each iteration, the algorithm looks for the minimum weight edge to include next while maintaining the tree property (i.e., avoiding cycles). At the end there are exactly n –1 edges without cycles, which must be a spanning tree.

Example: Prim’s MST Algorithm.

[Figure: A weighted graph.]

| Step | Next edge selected | Nodes in partial tree |
| --- | --- | --- |
| Initially | | 1 |
| 1 | (1,5), weight=3 | 1, 5 |
| 2 | (5,4), weight=2 | 1, 5, 4 |
| 3 | (4,2), weight=4 | 1, 5, 4, 2 |
| 4 | (1,3), weight=6 | 1, 5, 4, 2, 3 |

An adjacency matrix implementation of Prim’s algorithm:

Input: W[1..n][1..n] with W[i, j] = weight of edge (i, j); set W[i, j] = ∞ if there is no edge
Output: an MST with tree edges stored in T[1..n – 1]
Algorithm:

(1) declare nearest[2..n], minDist[2..n] such that minDist[i] = the minimum edge weight connecting node i to any node in the partial tree T, and nearest[i] = the node in T that gives the minimum distance for node i
(2) for i = 2 to n do nearest[i] = 1; minDist[i] = W[i, 1]
(3) for p = 1 to (n – 1) do
(3.1) min = ∞
(3.2) for j = 2 to n do if 0 ≤ minDist[j] < min then min = minDist[j]; k = j
(3.3) T[p] = edge (nearest[k], k) // select the next edge
(3.4) minDist[k] = –1 // a negative value means node k is “in”
(3.5) for j = 2 to n do // update minDist and nearest values
if W[j, k] < minDist[j] then minDist[j] = W[j, k]; nearest[j] = k

The time complexity is O(n²) because Step (3) runs O(n) iterations, and each iteration takes O(n) time in Steps (3.2) and (3.5).
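The adjacency-matrix version of Prim’s algorithm can be sketched in Python as follows. The four tree edges and their weights come from the example; the endpoints of the other three edges (weights 8, 5, and 7) are not recoverable from the transcript and are assumed here purely for illustration, as is the helper `build_matrix`.

```python
import math

INF = math.inf


def prim_mst(W):
    """W is an (n+1) x (n+1) matrix indexed from 1; W[i][j] = INF if no edge."""
    n = len(W) - 1
    nearest = [1] * (n + 1)
    min_dist = [INF] * (n + 1)
    for i in range(2, n + 1):                  # Step (2)
        min_dist[i] = W[i][1]
    tree = []
    for _ in range(n - 1):                     # Step (3)
        best, k = INF, None
        for j in range(2, n + 1):              # Step (3.2): closest node outside the tree
            if 0 <= min_dist[j] < best:
                best, k = min_dist[j], j
        if k is None:
            raise ValueError("graph is not connected")
        tree.append((nearest[k], k, best))     # Step (3.3)
        min_dist[k] = -1                       # Step (3.4): node k is now in the tree
        for j in range(2, n + 1):              # Step (3.5)
            if W[j][k] < min_dist[j]:
                min_dist[j], nearest[j] = W[j][k], k
    return tree


def build_matrix(n, edges):
    """Symmetric weight matrix for an undirected graph on nodes 1..n."""
    W = [[INF] * (n + 1) for _ in range(n + 1)]
    for i, j, w in edges:
        W[i][j] = W[j][i] = w
    return W


edges = [(1, 5, 3), (5, 4, 2), (4, 2, 4), (1, 3, 6),   # tree edges from the example
         (1, 2, 8), (2, 5, 5), (3, 4, 7)]              # assumed endpoints (hypothetical)
tree = prim_mst(build_matrix(5, edges))
print(tree, sum(w for _, _, w in tree))
```

With these assumed edges, the edges are selected in the same order as in the example table, for a total weight of 15.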

[Figure: a node i outside tree T, joined to node nearest[i] in T by an edge of weight minDist[i].]

The Single-Source Shortest Paths Problem:

We are given a directed graph and a single node called the source. For each of the remaining nodes, find a shortest path from the source (respecting the direction of the edges along the path).

A greedy algorithm due to Dijkstra, which finds these shortest paths in sequence, can be described as follows: find the shortest among all shortest paths (from the source), then find the second shortest, etc., breaking ties arbitrarily, until all shortest paths are found. During the process, the collection of all the shortest paths determined so far forms a tree; the next shortest path is selected by finding a node that is one edge away from the current tree and has the shortest distance measured from the source.

Example (Dijkstra’s shortest paths algorithm):

[Figure: A weighted directed graph, source node = 1.]

| Step | Node chosen | Remaining nodes C | Distances D from the source | Note |
| --- | --- | --- | --- | --- |
| Initially | | [2, 3, 4, 5] | [50, 30, 100, 10] | |
| 1 | 5 | [2, 3, 4] | [50, 30, 20] | D[4] changed from 100 |
| 2 | 4 | [2, 3] | [40, 30] | D[2] changed from 50 |
| 3 | 3 | [2] | [35] | D[2] changed from 40 |
| 4 | 2 | | | |

Shortest paths:

| To | Path | Distance |
| --- | --- | --- |
| 5 | (1,5) | 10 |
| 4 | (1,5,4) | 20 |
| 3 | (1,3) | 30 |
| 2 | (1,3,2) | 35 |

Implementation of Dijkstra’s algorithm:

Input: W[1..n][1..n] with W[i, j] = weight of edge (i, j); set W[i, j] = ∞ if there is no edge
Output: an array D[2..n] of distances of shortest paths to each node in [2..n]
Algorithm:

(1) C = {2, 3, …, n} // the set of remaining nodes
(2) for i = 2 to n do D[i] = W[1, i] // initialize distance from node 1 to node i
(3) repeat the following n – 2 times // determine the shortest distances
(3.1) select node v of set C that has the minimum value in array D
(3.2) C = C – {v} // delete node v from set C
(3.3) for each node w in C do
if (D[v] + W[v, w] < D[w]) then D[w] = D[v] + W[v, w] // update D[w] if found shorter path to w

The algorithm’s time complexity is O(n²) because Steps (1) and (2) each take O(n) time, and Step (3) runs O(n) iterations, each of which takes O(n) time.
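A direct Python sketch of these steps, run on a graph reconstructed from the example, is given here. The edge list is inferred from the distance updates in the example table; the endpoints of the second weight-50 edge (taken here as 3 → 4) are an assumption, and it does not affect any of the distances.

```python
import math

INF = math.inf


def dijkstra(W, n):
    """Return {node: shortest distance from node 1}; W[v][w] = INF if no edge."""
    C = set(range(2, n + 1))                   # Step (1): remaining nodes
    D = {i: W[1][i] for i in C}                # Step (2): direct distances from node 1
    for _ in range(n - 2):                     # Step (3)
        v = min(C, key=lambda u: D[u])         # Step (3.1): closest remaining node
        C.remove(v)                            # Step (3.2)
        for w in C:                            # Step (3.3): try going through v
            if D[v] + W[v][w] < D[w]:
                D[w] = D[v] + W[v][w]
    return D


n = 5
W = [[INF] * (n + 1) for _ in range(n + 1)]    # indexed from 1; row/column 0 unused
for u, v, weight in [(1, 2, 50), (1, 3, 30), (1, 4, 100), (1, 5, 10),
                     (5, 4, 10), (4, 2, 20), (3, 2, 5),
                     (3, 4, 50)]:              # last edge: assumed endpoints
    W[u][v] = weight
D = dijkstra(W, n)
print(D)
```

The resulting distances to nodes 5, 4, 3, and 2 are 10, 20, 30, and 35, as in the shortest-paths table of the example.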

[Figure: the tree of shortest paths rooted at node 1. For a tree node v and a remaining node w, D[w] is compared with D[v] + W[v,w].]

Graph (and Digraph) Traversal techniques:

Given a (directed) graph G = (V, E), determine all nodes that are connected from a given node v via a (directed) path.

There are essentially two graph traversal algorithms, known as breadth-first search (BFS) and depth-first search (DFS), both of which can be implemented efficiently.

BFS: From node v, visit each of its neighboring nodes in sequence, then visit their neighbors, etc., while avoiding repeated visits.

DFS: From node v, visit its first neighboring node and all its neighbors using recursion, then visit node v’s second neighbor applying the same procedure, until all v’s neighbors are visited, while avoiding repeated visits.

Breadth-First Search (BFS):

BFS(v) // visit all nodes reachable from node v
(1) create an empty FIFO queue Q, add node v to Q
(2) create a boolean array visited[1..n], initialize all values to false except for visited[v] to true
(3) while Q is not empty
(3.1) delete a node w from Q
(3.2) for each node z adjacent from node w: if visited[z] is false then add node z to Q and set visited[z] to true

The time complexity is O(n+e) with n nodes and e edges, if the adjacency lists are used. This is because in the worst case, each node is added once to the queue (O(n) part), and each of its neighbors gets considered once (O(e) part).
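A minimal Python sketch of BFS over adjacency lists, using `collections.deque` as the FIFO queue, is given here. The six-node digraph is a made-up example in which two nodes are unreachable from node 1; it is not the graph from the slide's figure.

```python
from collections import deque


def bfs(adj, v):
    """Return the nodes reachable from v, in the order BFS removes them from the queue."""
    visited = {v}                     # Step (2): only v is marked at the start
    order = []
    queue = deque([v])                # Step (1)
    while queue:                      # Step (3)
        w = queue.popleft()           # Step (3.1)
        order.append(w)
        for z in adj.get(w, []):      # Step (3.2)
            if z not in visited:
                visited.add(z)
                queue.append(z)
    return order


# Hypothetical digraph: nodes 5 and 6 cannot be reached from node 1.
adj = {1: [2, 3], 2: [4], 3: [4], 4: [], 5: [6], 6: []}
print(bfs(adj, 1))
```

Marking a node as visited when it is added to the queue, rather than when it is removed, is what guarantees each node enters the queue at most once.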

[Figure: BFS node search order starting with node 1, including two nodes not reached.]

Depth-First Search (DFS):

(1) create a boolean array visited[1..n], initialize all values to false except for visited[v] to true
(2) call DFS(v) to visit all nodes reachable via a path

DFS(v)
for each neighboring node w of v do
if visited[w] is false then set visited[w] to true; call DFS(w) // recursive call

[Figure: DFS node search order starting with node 1, including two nodes not reached.]

The algorithm’s time complexity is also O(n+e) using the same reasoning as in the BFS algorithm.
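The recursive procedure can be sketched in Python as follows, on a made-up digraph with two unreachable nodes (not the graph from the slide's figure). The wrapper name `dfs_order` is an illustrative choice.

```python
def dfs_order(adj, start):
    """Return the nodes reachable from start, in the order DFS first visits them."""
    visited = {start}                 # Step (1): only the start node is marked
    order = [start]

    def dfs(v):
        for w in adj.get(v, []):
            if w not in visited:
                visited.add(w)
                order.append(w)
                dfs(w)                # recursive call

    dfs(start)                        # Step (2)
    return order


# Hypothetical digraph: nodes 5 and 6 cannot be reached from node 1.
adj = {1: [2, 3], 2: [4], 3: [4], 4: [], 5: [6], 6: []}
print(dfs_order(adj, 1))
```

On this graph DFS goes from 1 to 2 to 4 before backing up to 3, whereas BFS would visit 2 and 3 before 4. For very deep graphs the recursion may exceed Python's default recursion limit, in which case an explicit stack can be used instead.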
