CE 221 Data Structures and Algorithms

Chapter 2: Algorithm Analysis - II
Text: Read Weiss, §2.4.3 – 2.4.6
Izmir University of Economics

Solutions for the Maximum Subsequence Sum Problem: Algorithm 1
• Algorithm 1 exhaustively tries all possibilities: for every combination of starting and ending points (i and j, respectively), the partial sum (ThisSum) is calculated and compared with the maximum sum computed so far (MaxSum).
• The running time is O(N^3) and is entirely due to lines 5 and 6.
• A more precise analysis counts the number of times line 6 is executed.

    public static int MaxSubSum1( int[ ] A )
    {
        int ThisSum, MaxSum, i, j, k;
        int N = A.length;

    /* 1*/ MaxSum = 0;
    /* 2*/ for( i = 0; i < N; i++ )
    /* 3*/     for( j = i; j < N; j++ )
               {
    /* 4*/         ThisSum = 0;
    /* 5*/         for( k = i; k <= j; k++ )
    /* 6*/             ThisSum += A[ k ];
    /* 7*/         if( ThisSum > MaxSum )
    /* 8*/             MaxSum = ThisSum;
               }
    /* 9*/ return MaxSum;
    }
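The more precise analysis mentioned on this slide can be written out as a triple sum. The derivation below is a sketch that assumes the quantity being counted is the number of executions of line 6 in MaxSubSum1; the indices i, j, k and the size N are those of the code.

```latex
\begin{align*}
\sum_{i=0}^{N-1}\sum_{j=i}^{N-1}\sum_{k=i}^{j} 1
  &= \sum_{i=0}^{N-1}\sum_{j=i}^{N-1} (j - i + 1) \\
  &= \sum_{i=0}^{N-1} \frac{(N-i)(N-i+1)}{2} \\
  &= \frac{N(N+1)(N+2)}{6} = \frac{N^3 + 3N^2 + 2N}{6}
\end{align*}
```

The leading term N^3/6 confirms the O(N^3) bound and shows that the constant factor is smaller than a naive N * N * N estimate suggests.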

Solutions for the Maximum Subsequence Sum Problem: Algorithm 2
• We can improve upon Algorithm 1 and avoid the cubic running time by removing a for loop. This is not always possible, but in this case Algorithm 1 performs a great many unnecessary computations.
• Notice that A[i] + ... + A[j] = A[j] + (A[i] + ... + A[j-1]), so the computation at lines 5 and 6 in Algorithm 1 is unduly expensive.
• Algorithm 2 is clearly O(N^2); the analysis is even simpler than before.

    public static int MaxSubSum2( int[ ] A )
    {
        int ThisSum, MaxSum, i, j;
        int N = A.length;

    /* 1*/ MaxSum = 0;
    /* 2*/ for( i = 0; i < N; i++ )
           {
    /* 3*/     ThisSum = 0;
    /* 4*/     for( j = i; j < N; j++ )
               {
    /* 5*/         ThisSum += A[ j ];
    /* 6*/         if( ThisSum > MaxSum )
    /* 7*/             MaxSum = ThisSum;
               }
           }
    /* 8*/ return MaxSum;
    }

Solutions for the Maximum Subsequence Sum Problem: Algorithm 3
• Algorithm 3 is a recursive O(N log N) algorithm that uses a divide-and-conquer strategy.
  Divide: split the problem into two roughly equal subproblems, each half the size of the original, and solve them recursively.
  Conquer: patch together the two subproblem solutions, possibly doing a small amount of additional work, to arrive at a solution for the whole problem.
• The maximum subsequence sum can occur (1) entirely in the left half of the input, (2) entirely in the right half, or (3) across the middle, in both halves.
• Solve (1) and (2) recursively. For (3), add the largest sum in the first half that includes the last element of the first half to the largest sum in the second half that includes the first element of the second half.
• Example:
  (1) First half: 6 (A0 - A2).
  (2) Second half: 8 (A5 - A6).
  (3) Largest sum in the first half that includes its last element: 4 (A0 - A3); largest sum in the second half that includes its first element: 7 (A4 - A6). The maximum sum crossing the middle is therefore 4 + 7 = 11 (A0 - A6).
  The answer is the largest of the three values: 11.

Solutions for the Maximum Subsequence Sum Problem: Algorithm 3 – Implementation I

    /* Initial call */
    public static int MaxSubSum3( int[ ] a )
    {
        return maxSumRec( a, 0, a.length - 1 );
    }

Solutions for the Maximum Subsequence Sum Problem: Algorithm 3 – Implementation II

Solutions for the Maximum Subsequence Sum Problem: Algorithm 3 – Implementation III
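The divide-and-conquer steps described for Algorithm 3 can be sketched roughly as follows. This is a minimal illustration, not the original slide listing: the class name MaxSubSumDivideAndConquer and the local variable names are assumptions, and the line numbers quoted in the analysis slide refer to the original listing, not to this sketch. The test array in main is also an assumption; it is one input that is consistent with the numbers quoted in the example (6, 8, 4 + 7 = 11).

```java
final class MaxSubSumDivideAndConquer {

    /** Maximum subsequence sum of a[left..right]; 0 if every element is negative. */
    static int maxSumRec(int[] a, int left, int right) {
        if (left == right) {                       // base case: a single element
            return Math.max(a[left], 0);
        }

        int center = left + (right - left) / 2;
        int maxLeftSum = maxSumRec(a, left, center);        // case (1)
        int maxRightSum = maxSumRec(a, center + 1, right);  // case (2)

        // Case (3), left part: largest sum that ends at a[center].
        int maxLeftBorderSum = 0, leftBorderSum = 0;
        for (int i = center; i >= left; i--) {
            leftBorderSum += a[i];
            maxLeftBorderSum = Math.max(maxLeftBorderSum, leftBorderSum);
        }

        // Case (3), right part: largest sum that starts at a[center + 1].
        int maxRightBorderSum = 0, rightBorderSum = 0;
        for (int i = center + 1; i <= right; i++) {
            rightBorderSum += a[i];
            maxRightBorderSum = Math.max(maxRightBorderSum, rightBorderSum);
        }

        // The answer is the best of the three cases.
        return Math.max(Math.max(maxLeftSum, maxRightSum),
                        maxLeftBorderSum + maxRightBorderSum);
    }

    /** Initial call; an empty array is treated as having sum 0 (an assumption). */
    static int MaxSubSum3(int[] a) {
        return a.length == 0 ? 0 : maxSumRec(a, 0, a.length - 1);
    }

    public static void main(String[] args) {
        int[] a = {4, -3, 5, -2, -1, 2, 6, -2};   // assumed example input
        System.out.println(MaxSubSum3(a));        // prints 11
    }
}
```

The two border loops together touch each element of a[left..right] once, which is the O(N) term in the recurrence T(N) = 2T(N/2) + O(N).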

Solutions for the Maximum Subsequence Sum Problem: Algorithm 3 – Analysis
• T(N): time to solve a maximum subsequence sum problem of size N.
• T(1) = 1: a constant amount of time is needed to execute lines 6 to 12.
• Otherwise, the program must perform two recursive calls, the two for loops between lines 18 and 32, and a small amount of bookkeeping, such as lines 14 and 34. The two for loops combine to touch every element from a[0] to a[N-1], and there is constant work inside the loops, so the time spent in lines 18 to 32 is O(N). The remainder of the work is performed in lines 15 and 16, which solve two subsequence problems of size N/2 (assuming N is even). The total time for the algorithm then obeys:
  T(1) = 1
  T(N) = 2T(N/2) + O(N)
• We can replace the O(N) term in the equation above with N; since T(N) will be expressed in Big-Oh notation anyway, this does not affect the answer.
  T(N) = 2(2T(N/4) + N/2) + N = 4T(N/4) + 2N
       = 4(2T(N/8) + N/4) + 2N = 8T(N/8) + 3N
       = ...
       = 2^k T(N/2^k) + kN
• If N = 2^k, then k = log N and T(N) = N T(1) + kN = N + N log N = O(N log N).

Solutions for the Maximum Subsequence Sum Problem: Algorithm 4
• Algorithm 4 is O(N).
• Why does the algorithm actually work? It is an improvement over Algorithm 2 based on the following observations.
• Observation 1: If A[i] < 0, then A[i] cannot start an optimal subsequence. More generally, no subsequence with a negative sum can be a prefix of the optimal subsequence.
• Observation 2: If A[i] + ... + A[j] < 0, where j is the first index that makes the sum starting at i negative, then i can advance to j+1.
• Proof: Let p be any index with i < p ≤ j. Any subsequence starting at p is not larger than the corresponding subsequence starting at i, since A[i] + ... + A[p-1] ≥ 0 (j is the first index causing sum < 0).
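The two observations translate into a single left-to-right scan. The following is a minimal sketch of such a linear-time algorithm, written under the assumption that it follows the same conventions as Algorithms 1 and 2 (the answer is 0 when all entries are negative); the class name MaxSubSumLinear is hypothetical and the listing is not the original slide code.

```java
final class MaxSubSumLinear {

    /** Linear-time maximum subsequence sum; returns 0 if all entries are negative. */
    static int MaxSubSum4(int[] A) {
        int MaxSum = 0;
        int ThisSum = 0;

        for (int j = 0; j < A.length; j++) {
            ThisSum += A[j];
            if (ThisSum > MaxSum) {
                MaxSum = ThisSum;      // best sum seen so far
            } else if (ThisSum < 0) {
                ThisSum = 0;           // Observation 2: restart at index j + 1
            }
        }
        return MaxSum;
    }

    public static void main(String[] args) {
        int[] a = {-2, 11, -4, 13, -5, -2};   // illustrative input
        System.out.println(MaxSubSum4(a));    // prints 20 (11 - 4 + 13)
    }
}
```

Each element is examined exactly once and the loop body does constant work, which gives the O(N) bound. The scan also needs only the current element, so the input never has to be stored in full.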

Logarithms in the Running Time
• The most frequent appearance of logarithms centers around the following general rule: an algorithm is O(log n) if it takes constant (O(1)) time to cut the problem size by a fraction (which is usually 1/2).
• On the other hand, if constant time is required merely to reduce the problem by a constant amount (such as making the problem smaller by 1), then the algorithm is O(n).
• We usually presume that the input is preread; otherwise, reading the input alone takes Ω(n) time.

Binary Search - I
• Definition: Given an integer x and integers A[0], A[1], ..., A[N-1], which are presorted and already in memory, find i such that A[i] = x, or return i = -1 if x is not in the input.
• The loop is O(1) per iteration. It starts with high - low = N - 1 and ends with high - low ≤ -1. Every time through the loop, the value high - low must be at least halved from its previous value; thus, the loop is repeated at most O(log N) times.

Binary Search - II
• Initially, high - low = N - 1 = d.
• Assume 2^k ≤ d < 2^(k+1).
• After each iteration, the new value of d is either high - mid - 1 or mid - 1 - low. With mid = (low + high)/2 (integer division), both are bounded from above by (d - 1)/2, so d is at least halved.
• Hence, after k iterations, d is at most 1. The loop then iterates at most 2 more times, with d taking on the values 0 and -1 in this order. Thus, it is repeated at most k + 2 times.
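The loop analysed on these two slides can be sketched as below. This is an illustrative version rather than the original listing: the class name BinarySearchSketch, the constant NOT_FOUND and the iterations counter are assumptions added so that the k + 2 bound can be observed directly.

```java
final class BinarySearchSketch {
    static final int NOT_FOUND = -1;

    // Loop iterations used by the most recent call (added for illustration only).
    static int iterations;

    /** Returns an index i with A[i] == x, or NOT_FOUND; A must be sorted ascending. */
    static int binarySearch(int[] A, int x) {
        int low = 0;
        int high = A.length - 1;
        iterations = 0;

        while (low <= high) {                     // stops once high - low <= -1
            iterations++;
            int mid = low + (high - low) / 2;     // equals (low + high) / 2, avoids overflow
            if (A[mid] < x) {
                low = mid + 1;                    // new d = high - mid - 1
            } else if (A[mid] > x) {
                high = mid - 1;                   // new d = mid - 1 - low
            } else {
                return mid;
            }
        }
        return NOT_FOUND;
    }

    public static void main(String[] args) {
        int[] a = {1, 3, 5, 7, 9, 11, 13, 15};
        System.out.println(binarySearch(a, 7));   // prints 3
        System.out.println(binarySearch(a, 16));  // prints -1
        System.out.println(iterations);           // prints 4
    }
}
```

For this 8-element array, d = 7 initially, so k = 2 and the bound k + 2 = 4 is reached by the unsuccessful search for 16.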

Euclid’s Algorithm
• Euclid's algorithm computes the greatest common divisor. The greatest common divisor (gcd) of two integers is the largest integer that divides both. Thus, gcd(50, 15) = 5.
• It computes gcd(m, n), assuming m ≥ n (if n > m, the first iteration of the loop swaps them).
• Fact: If m > n, then m mod n < m/2.
• Proof: There are two cases:
  If n ≤ m/2, then since the remainder is always smaller than n, the theorem is true for this case.
  If n > m/2, then n goes into m once with a remainder m - n < m/2, proving the theorem.
• Consequently, after two iterations the remainder is at most half of its original value, so the algorithm takes O(log N) iterations.
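A minimal sketch of the loop described above, assuming non-negative arguments; the class name EuclidSketch is hypothetical and the listing is not the original slide code.

```java
final class EuclidSketch {

    /** Greatest common divisor of m and n (both assumed non-negative). */
    static long gcd(long m, long n) {
        while (n != 0) {
            long rem = m % n;   // if n > m, rem == m, so this iteration just swaps them
            m = n;
            n = rem;
        }
        return m;
    }

    public static void main(String[] args) {
        System.out.println(gcd(50, 15));      // prints 5
        System.out.println(gcd(1989, 1590));  // prints 3
    }
}
```

For gcd(1989, 1590) the remainders are 399, 393, 6, 3, 0. A single step may reduce the remainder only slightly (399 to 393), but over any two steps it at least halves, as the fact on this slide guarantees.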

Exponentiation - I

    public static boolean isEven( int n )
    {
        return n % 2 == 0;
    }

    public static long pow( long X, int N )
    {
    /* 1*/ if( N == 0 )
    /* 2*/     return 1;
    /* 3*/ if( N == 1 )
    /* 4*/     return X;
    /* 5*/ if( isEven( N ) )
    /* 6*/     return pow( X * X, N / 2 );
           else
    /* 7*/     return pow( X * X, N / 2 ) * X;
    }

• Algorithm pow(X, N) raises an integer to an integer power.
• The number of multiplications is used as the measure of running time.
• The obvious algorithm computes X^N with N - 1 multiplications.
• Lines 1 to 4 handle the base case of the recursion.
• X^N = X^(N/2) * X^(N/2) if N is even.
• X^N = X^((N-1)/2) * X^((N-1)/2) * X if N is odd.
• The number of multiplications required is clearly at most 2 log N, because at most two multiplications are required to halve the problem.

Exponentiation - II
• It is interesting to note how much the code can be tweaked.
• Lines 3 and 4 are unnecessary (line 7 does the right thing when N is 1).
• Line 7
    /* 7*/ return pow( X * X, N / 2 ) * X;
  can be rewritten as
    /* 7*/ return pow( X, N - 1 ) * X;
• Line 6, on the other hand,
    /* 6*/ return pow( X * X, N / 2 );
  cannot be replaced by any of the following:
    /*6a*/ return( pow( pow( X, 2 ), N / 2 ) );
    /*6b*/ return( pow( pow( X, N / 2 ), 2 ) );
    /*6c*/ return( pow( X, N / 2 ) * pow( X, N / 2 ) );
• Both lines 6a and 6b are incorrect: when N is 2, one of the recursive calls to pow has 2 as its second argument, so no progress is made and an infinite loop results. Using line 6c affects the efficiency, because there are now two recursive calls of size N/2 instead of only one. An analysis will show that the running time is no longer O(log N).
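The effect of line 6c can be made visible by counting multiplications. The sketch below is an illustration under stated assumptions: the class PowCountSketch, the counter mults and the method name powTwoCalls are hypothetical, and the counter tallies every multiplication on long values, including the squaring X * X.

```java
final class PowCountSketch {
    // Number of multiplications performed since the last reset (illustrative counter).
    static long mults;

    /** The algorithm from the slide: one recursive call per halving. */
    static long pow(long x, int n) {
        if (n == 0) return 1;
        if (n == 1) return x;
        mults++;                               // x * x
        long half = pow(x * x, n / 2);
        if (n % 2 == 0) return half;           // line 6
        mults++;                               // ... * x
        return half * x;                       // line 7
    }

    /** Variant with line 6 replaced by line 6c: two recursive calls of size n / 2. */
    static long powTwoCalls(long x, int n) {
        if (n == 0) return 1;
        if (n == 1) return x;
        if (n % 2 == 0) {
            mults++;                           // product of the two halves
            return powTwoCalls(x, n / 2) * powTwoCalls(x, n / 2);   // line 6c
        }
        mults += 2;                            // x * x and ... * x
        return powTwoCalls(x * x, n / 2) * x;  // line 7, unchanged
    }

    public static void main(String[] args) {
        mults = 0;
        pow(1, 1024);
        System.out.println(mults);             // prints 10

        mults = 0;
        powTwoCalls(1, 1024);
        System.out.println(mults);             // prints 1023
    }
}
```

With N = 1024 the original needs 10 multiplications, within the 2 log N = 20 bound, while the 6c variant satisfies T(N) = 2T(N/2) + 1 and needs N - 1 = 1023, which is linear rather than logarithmic.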

Checking Your Analysis - I
• Once an analysis has been performed, it is desirable to see if the answer is correct and as good as possible.
• One way to do this is to code up the program and see if the empirically observed running time matches the running time predicted by the analysis.
• When n doubles, the running time goes up by a factor of 2 for linear programs, 4 for quadratic programs, and 8 for cubic programs. Programs that run in logarithmic time take only an additive constant longer when n doubles, and programs that run in O(n log n) take slightly more than twice as long to run under the same circumstances.
• Another commonly used trick to verify that some program is O(f(n)) is to compute the values T(n)/f(n) for a range of n, where T(n) is the empirically observed running time. If f(n) is a tight answer for the running time, then the computed values converge to a positive constant. If f(n) is an over-estimate, the values converge to zero. If f(n) is an under-estimate and hence wrong, the values diverge.
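The T(n)/f(n) trick can be tried out with a small experiment. To keep the output reproducible, the sketch below uses a count of inner-loop executions as a stand-in for the measured running time T(n); that substitution, the class name RatioCheckSketch and the choice of a quadratic double loop (the loop structure of Algorithm 2) are all assumptions made for illustration.

```java
final class RatioCheckSketch {

    /** Number of inner-loop body executions of a double loop of the Algorithm 2 shape. */
    static long countOps(int n) {
        long ops = 0;
        for (int i = 0; i < n; i++) {
            for (int j = i; j < n; j++) {
                ops++;
            }
        }
        return ops;
    }

    public static void main(String[] args) {
        System.out.printf("%8s %14s %14s %14s%n", "n", "T/n", "T/n^2", "T/n^3");
        for (int n = 128; n <= 4096; n *= 2) {
            double t = countOps(n);
            double n2 = (double) n * n;
            // T/n diverges (under-estimate), T/n^2 settles near a constant (tight),
            // T/n^3 tends to zero (over-estimate).
            System.out.printf("%8d %14.3f %14.6f %14.9f%n", n, t / n, t / n2, t / (n2 * n));
        }
    }
}
```

Here T(n) = n(n + 1)/2, so the T/n^2 column approaches 1/2 as n doubles. With real timings the columns are noisier, but the same three behaviours should still be distinguishable.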

Checking Your Analysis - II
• This program segment computes the probability that two distinct positive integers, less than or equal to N and chosen randomly, are relatively prime (as N gets large, the answer approaches 6/π^2).
• What is the running time complexity?
• O(N^2 log N)
• Are you sure?
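A program segment of the kind described on this slide might look like the following. This is a sketch under assumptions, not the original listing: the names RelPrimeSketch, probRelPrime, rel and tot are hypothetical, and the segment is assumed to enumerate all pairs i < j ≤ N and test each with Euclid's algorithm.

```java
final class RelPrimeSketch {

    /** Euclid's algorithm for non-negative arguments. */
    static int gcd(int m, int n) {
        while (n != 0) {
            int rem = m % n;
            m = n;
            n = rem;
        }
        return m;
    }

    /** Fraction of pairs 1 <= i < j <= n that are relatively prime (requires n >= 2). */
    static double probRelPrime(int n) {
        int rel = 0;   // pairs with gcd 1
        int tot = 0;   // all pairs examined
        for (int i = 1; i <= n; i++) {
            for (int j = i + 1; j <= n; j++) {
                tot++;
                if (gcd(i, j) == 1) {
                    rel++;
                }
            }
        }
        return (double) rel / tot;
    }

    public static void main(String[] args) {
        System.out.println(probRelPrime(1000));
        System.out.println(6.0 / (Math.PI * Math.PI));   // limiting value 6/pi^2
    }
}
```

The double loop examines N(N - 1)/2 pairs and each gcd call takes O(log N) time, which is where the O(N^2 log N) estimate comes from.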

Checking Your Analysis - III
• As the table indicates, the last column is most likely the correct one.

Homework Assignments
• 2.13, 2.14, 2.15, 2.22, 2.25, 2.26, 2.28
• You are requested to study and solve the exercises. Note that these are for practice only; you do not need to submit your solutions.
