390 likes | 649 Views
Design of Algorithms by Induction Part 1. Algorithm Design and Analysis 2015 - Week 3 http://bigfoot.cs.upt.ro/~ioana/algo/ Bibliography: [Manber]- Chap 5. Algorithm Analysis <-> Design. Given an algorithm, we can Analyze its complexity Proof its correctness But:
E N D
Design of Algorithms by InductionPart 1 Algorithm Design and Analysis 2015 - Week 3 http://bigfoot.cs.upt.ro/~ioana/algo/ Bibliography: [Manber]- Chap 5
Algorithm Analysis <-> Design • Given an algorithm, we can • Analyze its complexity • Proof its correctness • But: • Given a problem, how do we design a solution that is both correct and efficient ? • Is there a general method for this ?
Induction proofs vs Design of algorithms by induction (1) • Induction used to prove that a statement P(n) holds for all integers n: • Base case: Prove P(0) • Assumption: assume that P(n-1) is true • Induction step: prove that P(n-1) implies P(n) for all n>0 • Strong induction:when we assume P(k) is true for all k<=n-1 and use this in proving P(n)
Induction proofs vs Design of algorithms by induction (2) • Induction used in algorithm design: • Base case: Solve a small instance of the problem • Assumption: assume you can solve smaller instances of the problem • Induction step: Show how you can construct the solution of the problem from the solution(s) of the smaller problem(s)
Design of algorithms by induction • Represents a fundamental design principle that underlies techniques such as divide & conquer, dynamic programming and even greedy • How to reduce the problem to a smaller problem or a set of smaller problems ? (n -> n-1, n/2, n/4, …?) • Examples: • The successful party problem [Manber 5.3] • The celebrity problem [Manber 5.5] • The skyline problem [Manber 5.6] • One knapsack problem [Manber 5.10] • The Max Consecutive Subsequence [Manber 5.8] Part 1 Part 2
The successful party problem • Problem: you are arranging a party and have a list of n persons that you could invite. In order to have a successful party, you want to invite as many people as possible, but every invited person must be friends with at least k of the other party guests. For each person, you know his/her friends. Find the set of invited people.
Successful party - Example K=3 Ann Bob Finn Chris Ellie Dan
Successful party – Direct approach • Direct approach for solving: remove persons who have less than k friends • But: which is the right order of removing ? • Remove all persons with less than k friends, then deal with the persons that are left without enough friends ? • Remove first one person, then continue with affected persons ? Instead of thinking about our algorithm as a sequence of steps to be executed, think of proving a theorem that the algorithm exists
The design by induction approach • Instead of thinking about our algorithm as a sequence of steps to be executed, think of proving a theorem that the algorithm exists • We need to prove that this “theorem” holds for a base case, and that if it holds for “n-1” this implies that it holds for “n”
Successful party - Solution • Induction hypothesis:We know how to find the solution (maximal list of invited persons that have at least k friends among the other invited persons), provided that the number of given persons is <n • Base case: • n<= k: no person can be invited • n=k+1:if every person knows all of the other persons, everybody gets invited. Otherwise, no one can be invited.
Successful party - Solution • Inductive step: • Assume we know how to select the invited persons out of a list of n-1, prove that we know how to select the invited persons out of a list of n, n>k+1 • If all the n persons have more than k friends among them, all n persons get invited => problem solved • Else, there exists at least one person that has less than k friends. Remove this person and the solution is what results from solving the problem for the remaining n-1 persons. By induction assumption, we know to solve the problem for n-1 => problem solved
Successful party - Conclusions • Designing by induction got us to a correct algorithm, because we designed it proving its correctness • Every invited person knows at least k other invited • We got the maximum possible number of invited persons • The best way to reduce a problem is to eliminate some of its elements. • In this case, it was clear which persons to eliminate • We will see examples where the elimination is not so straightforward
The Celebrity problem • Problem: A celebrity in a group of people is someone who is known by everybody but does not know anyone. You are allowed to ask anyone from the group a question such as “Do you know that person?” pointing to any other person from the group. Identify the celebrity (if one exists) by asking as few questions as possible • Problem: • Given a n*n matrix “know” with know[p, q] = true if p knows q and know[p, q] = false otherwise. • Determine whether there exists an i such that: • Know[j; i] = true (for all j, j≠ i) and Know[i; j] = false (for all j, j≠i )
Celebrity - Solution 1 • Brute force approach: ask questions arbitrary, for each person ask questions for all others => n*(n-1) questions asked
Celebrity - Solution 2 • Use induction: • Base case: Solution is trivial for n=1, one person is a celebrity by definition. • Assumption: Assume we leave out one person and that we can solve the problem for the remaining n-1 persons. • If there is a celebrity among the n-1 persons, we can find it • If there is no celebrity among the n-1 persons, we find this out • Induction step: • For the n'th person we have 3 possibilities: • The celebrity was among the n-1 persons • The n’th person is the celebrity • There is no celebrity in this group
Celebrity - Sol 2- Induction step • We must solve the problem for n persons, assuming that we know to solve it for n-1 • We leave out one person. • We choose this person randomly – let it be the n’th person • Case 1: There is a celebrity among the n-1 persons, say p. To check if this is also a celebrity for the n'th person • check if know[n, p] and not know[p, n] • Case 2: There is no celebrity among the n-1 persons. In this case, either person n is the celebrity, or there is no celebrity in the group. • To check this we have to find out if know[i, n] and not know[n, i], for all i <>n.
Celebrity – Solution 2 Function Celebrity_Sol2(n:integer) returns integer if n = 1 then return 1 else p = Celebrity_Sol2(n-1); if p != 0 then // p is the celebrity among n-1 if( knows[n,p] and not knows[p,n] ) then return p end if end if forall i = 1..n-1 if (knows[n, i] or not knows[i, n]) then return 0 // there is no celebrity end if end for return n // n is the celebrity
Celebrity – Solution 2 Analysis T(n) Function Celebrity_Sol2(n:integer) returns integer if n = 1 then return 1 else p = Celebrity_Sol2(n-1); if p != 0 then // p is the celebrity among n-1 if( knows[n,p] and not knows[p,n] ) then return p end if end if forall i = 1..n-1 if (knows[n, i] or not knows[i, n]) then return 0 // there is no celebrity end if end for return n // n is the celebrity T(n-1) O(n)
Celebrity – Solution 2 Analysis • T(n)=T(n-1) + n • T(1)=1 • T(n)=1+2+3+ +n=n*(n+1)/2 • T(n) is O(n2) • We have reduced a problem of size n to a problem of size n-1. We then still have to relate the n-th element to the n-1 other elements, and here this is done by a sequence which is O(n), so we get an algorithm of complexity O (n2), which is the same as the brute force. • If we want to reduce the complexity of the algorithm to O(n), we should have T(n)=T(n-1)+c
Celebrity – Solution 3 • The key idea here is to reduce the size of the problem from n persons to n-1, but in a clever way – by eliminating someone who is a non-celebrity. • After each question, we can eliminate a person • if knows[i,j] then i cannot be a celebrity => elim i • if not knows[i,j] then j cannot be a celebrity => elim j
Celebrity - Solution 3 • Use induction: • Base case: Solution is trivial for n=1, one person is a celebrity by definition. • Assumption: Assume we leave out one person who is not a celebrity and that we can solve the problem for the remaining n-1 persons. • If there is a celebrity among the n-1 persons, we can find it • If there is no celebrity among the n-1 persons, we find this out • Induction step: • For the n'th person we have only 2 possibilities left: • The celebrity was among the n-1 persons • There is no celebrity in this group
Celebrity - Sol 3- Induction step • We must solve the problem for n persons, assuming that we know to solve it for n-1 • We leave out one person. In order to decide which person to elim, we ask a question (to a random person i about a random person j). • The eliminated person is e=i or e=j, and we know that e is not a celebrity • Case 1: There is a celebrity among the n-1 persons that remain after eliminating e, say p. To check if this is also a celebrity for the person e • check if know[e, p] and not know[p, e] • Case 2: There is no celebrity among the n-1 persons. In this case, there is no celebrity in the group. It is no need any more to check if e is a celebrity !
Celebrity – Solution 3 Function Celebrity_Sol3(S:Set of persons) return person if card(S) = 1 return S(1) pick i, j any persons in S if knows[i, j] then // i no celebrity elim=i else // if not knows[i, j] then j no celebrity elim=j p = Celebrity_Sol3(S-elim) if p != 0 and knows[elim,p] and not knows[p,elim] return p else return 0 // no celebrity end if end if end function Celebrity_Sol3
Celebrity – Solution 3 Analysis Function Celebrity_Sol3(S:Set of persons) return person if card(S) = 1 return S(1) pick i, j any persons in S if knows[i, j] then // i no celebrity elim=i else // if not knows[i, j] then j no celebrity elim=j p = Celebrity_Sol3(S-elim) if p != 0 and knows[elim,p] and not knows[p,elim] return p else return 0 // no celebrity end if end if end function Celebrity_Sol3 T(n)=T(n-1)+c Algorithm is O(n)
Celebrity - Conclusions • The size of the problem should be reduced (from n to n-1) in a clever way • This example shows that it sometimes pays off to expend some effort ( in this case – one question) to perform the reduction more effectively
The Skyline Problem • Problem: Given the exact locations and heights of several rectangular buildings, having the bottoms on a fixed line, draw the skyline of these buildings, eliminating hidden lines.
The Skyline Problem • A building Bi is represented as a triplet (Li, Hi, Ri) • A skyline of a set of n buildings is a list of x coordinates and the heights connecting them • Input (1,11,5), (2,6,7), (3,13,9), (12,7,16), (14,3,25), (19,18,22), (23,13,29), (24,4,28) • Output (1, 11), (3, 13), (9, 0), (12, 7), (16, 3), (19, 18), (22, 3), (23, 13), (29, 0)
Skyline – Solution 1 • Base case: If number of buildings n=1, the skyline is the building itself • Inductive step: We assume that we know to solve the skyline for n-1 buildings, and then we add the n’th building to the skyline
Adding One Building to the Skyline • We scan the skyline, looking at one horizontal line after the other, and adjusting when the height of the building is higher than the skyline height Worst case: O(n) Bn
Skyline – Solution 1 Analysis T(n) Algorithm Skyline_Sol1(n:integer) returns Skyline if n = 1 then return Building[1] else sky1 = Skyline_Sol1(n-1); sky2 = MergeBuilding(Building[n], sky1); return sky2; end algorithm T(n-1) O(n) • T(n)=T(n-1)+O(n) • O(n2)
Skyline – Solution 2 • Base case: If number of buildings n=1, the skyline is the building itself • Inductive step: We assume that we know to solve the skylines for n/2 buildings, and then we merge the two skylines
Merging two skylines • We scan the two skylines together, from left to right, match x coordinates and adjust height where needed n1 n2 Worst case: O(n1+n2)
Skyline – Solution 2 Analysis T(n) Algorithm Skyline_Sol2(left, right:integer) returns Skyline if left=right then return Building[left] else middle=left+(left+right)/2 sky1 = Skyline_Sol2(left, middle); sky2 = Skyline_Sol2(middle+1, right); sky3 = MergeSkylines(sky1, sky2); return sky3; end algorithm T(n/2) T(n/2) O(n) • T(n)=2T(n/2)+O(n) • O(n log n)
Skyline - Conclusion • When the effort of combining the subproblems cannot be reduced, it is more efficient to split into several subproblems of the same type which are of equal sizes • T(n)=T(n-1)+O(n) … O(n2) • T(n)=2T(n/2)+O(n) … O(n log n) • This technique is Divide and conquer
Efficiently dividing and combining • Solving a problem takes following 3 actions: • Divide the problem into a number of subproblems that are smaller instances of the same problem. • Solve the subproblems • Combine the solutions to the subproblems into the solution for the original problem. • Execution time is defined by recurrences like • T(size)=Sum (T(subproblem size)) + CombineTime • Choosing the right subproblems sizes and the right way of combining them decides the performance !
Recurrences • T(n)=O(1), if n=1 • For n>1, we may have different recurrence relations, according to the way of splitting into subproblems and of combining them. Most common are: • T(n)=T(n-1)+O(n) … O(n2) • T(n)=T(n-1)+O(1) … O(n) • T(n)=2T(n/2)+O(n) … O(n log n) • T(n)=2T(n/2)+O(1) … O(n) • T(n)=a*T(n/b) + f (n) ;
Conclusions (1) • What is Design by induction ? • An algorithm design method that uses the idea behind induction to solve problems • Instead of thinking about our algorithm as a sequence of steps to be executed, think of proving a theorem that the algorithm exists • We need to prove that this “theorem” holds for a base case, and that if it holds for “n-1” this implies that it holds for “n”
Conclusions (2) • Why/when should we use Design by induction ? • It is a well defined method for approaching a large variety of problems (“where do I start ?”) • Just take the statement of the problem and consider it is a theorem to be proven by induction • This method is always safe: designing by induction gets us to a correct algorithm, because we designed it proving its correctness • Encourages abstract thinking vs coding => you can handle the reasoning in case of complex algorithms • A[1..n]; A[n/4+1 .. 3n/4] • L, R; L=(R+3L+1)/4, R=(L+3R+1)/4 • Wecan make it also efficient (see next slide)
Conclusions (3) • The inductive step is always based on a reduction from problem size n to problems of size <n. How to efficiently make the reduction to smaller problems: • Sometimes one has to spend some effort to find the suitable element to remove. (see Celebrity Problem) • If the amount of work needed to combine the subproblems is not trivial, reduce by dividing in subproblems of equal size –divide and conquer(see Skyline Problem)