CS 201 Data Structures and Algorithms

CS 201 Data Structures and Algorithms
Chapter 2: Algorithm Analysis - II
Text: Read Weiss, §2.4.3 – 2.4.6
Izmir University of Economics

Solutions for the Maximum Subsequence Sum Problem: Algorithm 1

    int MaxSubSum1( const int A[ ], int N )
    {
        int ThisSum, MaxSum, i, j, k;

    /* 1*/  MaxSum = 0;
    /* 2*/  for( i = 0; i < N; i++ )
    /* 3*/      for( j = i; j < N; j++ )
                {
    /* 4*/          ThisSum = 0;
    /* 5*/          for( k = i; k <= j; k++ )
    /* 6*/              ThisSum += A[ k ];

    /* 7*/          if( ThisSum > MaxSum )
    /* 8*/              MaxSum = ThisSum;
                }
    /* 9*/  return MaxSum;
    }

• The algorithm exhaustively tries all possibilities: for every combination of starting point i and ending point j, the partial sum (ThisSum) is calculated and compared with the maximum sum (MaxSum) found so far. The running time is $O(N^3)$ and is entirely due to lines 5 and 6.
• A more precise analysis counts how many times line 6 executes: $\sum_{i=0}^{N-1} \sum_{j=i}^{N-1} \sum_{k=i}^{j} 1 = \frac{N^3 + 3N^2 + 2N}{6}$.
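The number of times line 6 of Algorithm 1 runs has the closed form N(N+1)(N+2)/6 = (N^3 + 3N^2 + 2N)/6. The following is a minimal sketch, not part of the original slides, that counts those executions directly and compares them with the closed form. The function names `count_line6_executions` and `closed_form_count` are hypothetical.

```c
/* Counts how many times the innermost statement (line 6 of Algorithm 1)
   would execute for an input of size n. Hypothetical helper. */
long count_line6_executions( int n )
{
    long count = 0;
    int i, j, k;

    for( i = 0; i < n; i++ )
        for( j = i; j < n; j++ )
            for( k = i; k <= j; k++ )
                count++;        /* stands in for ThisSum += A[ k ] */
    return count;
}

/* Closed form of the same triple sum: (n^3 + 3n^2 + 2n) / 6. */
long closed_form_count( int n )
{
    long m = n;

    return ( m * m * m + 3 * m * m + 2 * m ) / 6;
}
```

For n = 10 both functions give 220, which is already more than twice the 100 array elements-squared one might naively expect from "two nested loops".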

Solutions for the Maximum Subsequence Sum Problem: Algorithm 2
• We can improve upon Algorithm 1 and avoid the cubic running time by removing a for loop. This is not always possible, but in this case Algorithm 1 performs a great many unnecessary computations.
• Notice that $\sum_{k=i}^{j} A_k = A_j + \sum_{k=i}^{j-1} A_k$, so the computation at lines 5 and 6 in Algorithm 1 is unduly expensive.
• Algorithm 2 is clearly $O(N^2)$; the analysis is even simpler than before.

    int MaxSubSum2( const int A[ ], int N )
    {
        int ThisSum, MaxSum, i, j;

    /* 1*/  MaxSum = 0;
    /* 2*/  for( i = 0; i < N; i++ )
            {
    /* 3*/      ThisSum = 0;
    /* 4*/      for( j = i; j < N; j++ )
                {
    /* 5*/          ThisSum += A[ j ];

    /* 6*/          if( ThisSum > MaxSum )
    /* 7*/              MaxSum = ThisSum;
                }
            }
    /* 8*/  return MaxSum;
    }

Solutions for the Maximum Subsequence Sum Problem: Algorithm 3
• This is a recursive O(N log N) algorithm using a divide-and-conquer strategy.
  Divide: Split the problem into two roughly equal subproblems, each half the size of the original, and solve them recursively.
  Conquer: Patch together the two subproblem solutions, possibly doing a small amount of additional work, to arrive at a solution for the whole problem.
• The maximum subsequence sum can (1) occur entirely in the left half of the input, (2) occur entirely in the right half, or (3) cross the middle and lie in both halves.
• Solve (1) and (2) recursively. For (3), add the largest sum in the first half that includes the last element of the first half to the largest sum in the second half that includes the first element of the second half.
• Example, with first half 4, -3, 5, -2 and second half -1, 2, 6, -2:
  (1) First half: 6 ($A_0$ to $A_2$).
  (2) Second half: 8 ($A_5$ to $A_6$).
  (3) The maximum sum in the first half that includes its last item is 4 ($A_0$ to $A_3$); the maximum sum in the second half that includes its first item is 7 ($A_4$ to $A_6$). Thus, the maximum sum crossing the middle is 4 + 7 = 11 ($A_0$ to $A_6$), which is the answer.

Solutions for the Maximum Subsequence Sum Problem: a) Algorithm 3 – Implementation

    /* Initial Call */
    int MaxSubSum3( const int A[ ], int N )
    {
        return MaxSubSum( A, 0, N - 1 );
    }

    /* Utility Function */
    static int Max3( int A, int B, int C )
    {
        return A > B ? ( A > C ? A : C ) : ( B > C ? B : C );
    }

    /* Implementation */
    static int MaxSubSum( const int A[ ], int Left, int Right )
    {
        int MaxLeftSum, MaxRightSum;
        int MaxLeftBorderSum, MaxRightBorderSum;
        int LeftBorderSum, RightBorderSum;
        int Center, i;

    /* 1*/  if( Left == Right )    /* Base case */
    /* 2*/      if( A[ Left ] > 0 )
    /* 3*/          return A[ Left ];
                else
    /* 4*/          return 0;

Solutions for the Maximum Subsequence Sum Problem: b) Algorithm 3 – Implementation

            /* Calculate the center */
    /* 5*/  Center = ( Left + Right ) / 2;

            /* Make recursive calls */
    /* 6*/  MaxLeftSum = MaxSubSum( A, Left, Center );
    /* 7*/  MaxRightSum = MaxSubSum( A, Center + 1, Right );

            /* Find the max subsequence sum in the left half where the */
            /* subsequence spans the last element of the left half */
    /* 8*/  MaxLeftBorderSum = 0; LeftBorderSum = 0;
    /* 9*/  for( i = Center; i >= Left; i-- )
            {
    /*10*/      LeftBorderSum += A[ i ];
    /*11*/      if( LeftBorderSum > MaxLeftBorderSum )
    /*12*/          MaxLeftBorderSum = LeftBorderSum;
            }

Solutions for the Maximum Subsequence Sum Problem: c) Algorithm 3 – Implementation

            /* Find the max subsequence sum in the right half where the */
            /* subsequence spans the first element of the right half */
    /*13*/  MaxRightBorderSum = 0; RightBorderSum = 0;
    /*14*/  for( i = Center + 1; i <= Right; i++ )
            {
    /*15*/      RightBorderSum += A[ i ];
    /*16*/      if( RightBorderSum > MaxRightBorderSum )
    /*17*/          MaxRightBorderSum = RightBorderSum;
            }

            /* The function Max3 returns the largest of */
            /* its three arguments */
    /*18*/  return Max3( MaxLeftSum, MaxRightSum,
    /*19*/               MaxLeftBorderSum + MaxRightBorderSum );
    }
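Because the implementation of Algorithm 3 is spread over three slides, it may help to see it as one compilable unit. The sketch below assembles the same logic without the line-number comments and orders the functions so that each is defined before it is used. The requirement N >= 1 is an added assumption: the slides do not say what happens for an empty array.

```c
/* Returns the largest of its three arguments. */
static int Max3( int A, int B, int C )
{
    return A > B ? ( A > C ? A : C ) : ( B > C ? B : C );
}

/* Maximum subsequence sum of A[ Left..Right ]; assumes Left <= Right. */
static int MaxSubSum( const int A[ ], int Left, int Right )
{
    int MaxLeftSum, MaxRightSum;
    int MaxLeftBorderSum, MaxRightBorderSum;
    int LeftBorderSum, RightBorderSum;
    int Center, i;

    if( Left == Right )                 /* Base case: one element */
        return A[ Left ] > 0 ? A[ Left ] : 0;

    Center = ( Left + Right ) / 2;
    MaxLeftSum = MaxSubSum( A, Left, Center );
    MaxRightSum = MaxSubSum( A, Center + 1, Right );

    /* Best sum that ends at the last element of the left half */
    MaxLeftBorderSum = 0; LeftBorderSum = 0;
    for( i = Center; i >= Left; i-- )
    {
        LeftBorderSum += A[ i ];
        if( LeftBorderSum > MaxLeftBorderSum )
            MaxLeftBorderSum = LeftBorderSum;
    }

    /* Best sum that starts at the first element of the right half */
    MaxRightBorderSum = 0; RightBorderSum = 0;
    for( i = Center + 1; i <= Right; i++ )
    {
        RightBorderSum += A[ i ];
        if( RightBorderSum > MaxRightBorderSum )
            MaxRightBorderSum = RightBorderSum;
    }

    return Max3( MaxLeftSum, MaxRightSum,
                 MaxLeftBorderSum + MaxRightBorderSum );
}

/* Initial call; assumes N >= 1 (assumption added for this sketch). */
int MaxSubSum3( const int A[ ], int N )
{
    return MaxSubSum( A, 0, N - 1 );
}
```

Calling `MaxSubSum3` on the eight-element array 4, -3, 5, -2, -1, 2, 6, -2 returns 11, and an array containing only negative values returns 0, since the empty subsequence is allowed.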

Solutions for the Maximum Subsequence Sum Problem: Algorithm 3 – Analysis
• T(N): time to solve a maximum subsequence sum problem of size N.
• T(1) = 1: a constant amount of time to execute lines 1 to 4.
• Otherwise, the program must perform two recursive calls, the two for loops between lines 9 and 17, and a small amount of bookkeeping, such as lines 5 and 18. The two for loops combine to touch every element from $A_0$ to $A_{N-1}$, and there is constant work inside the loops, so the time spent in lines 9 to 17 is O(N). The remainder of the work is performed in lines 6 and 7 to solve two subsequence problems of size N/2 (assuming N is even). The total time for the algorithm then obeys:
• T(1) = 1
• T(N) = 2T(N/2) + O(N)
• We can replace the O(N) term in the equation above with N; since T(N) will be expressed in Big-Oh notation anyway, this will not affect the answer.
• $T(N) = 2(2T(N/4) + N/2) + N = 4T(N/4) + 2N$
• $= 4(2T(N/8) + N/4) + 2N = 8T(N/8) + 3N = \dots = 2^k T(N/2^k) + kN$
• If $N = 2^k$, then $k = \log N$ and $T(N) = N \cdot T(1) + kN = N + N \log N = O(N \log N)$.
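The telescoping argument for T(N) = 2T(N/2) + N can be checked numerically. The sketch below, which is not part of the original slides, evaluates the recurrence directly and compares it with the closed form N log N + N for powers of two. The function names `RecurrenceT` and `ClosedFormT` are hypothetical, and the input is assumed to be a power of two.

```c
/* Evaluates T(1) = 1, T(n) = 2 T(n/2) + n. Assumes n is a power of two. */
long RecurrenceT( long n )
{
    if( n == 1 )
        return 1;
    return 2 * RecurrenceT( n / 2 ) + n;
}

/* Closed form n * log2(n) + n, with log2(n) found by repeated halving.
   Assumes n is a power of two. */
long ClosedFormT( long n )
{
    long k = 0, m = n;

    while( m > 1 )
    {
        m /= 2;
        k++;
    }
    return n * k + n;
}
```

For N = 8 both functions give 8 * 3 + 8 = 32.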

Solutions for the Maximum Subsequence Sum Problem: Algorithm 4
• Algorithm 4 is O(N).
• Why does the algorithm work? It improves on Algorithm 2 using the following observations:
• Observation 1: If A[i] < 0, then A[i] cannot start an optimal subsequence. Hence, no negative subsequence can be a prefix of the optimal subsequence.
• Observation 2: If $\sum_{k=i}^{j} A[k] < 0$, then i can advance to j+1.
• Proof: Let $p \in [i+1, j]$. Any subsequence starting at p is not larger than the corresponding subsequence starting at i, since j is the first index at which the sum becomes negative.

    int MaxSubSum4( const int A[ ], int N )
    {
        int ThisSum, MaxSum, j;

    /* 1*/  ThisSum = MaxSum = 0;
    /* 2*/  for( j = 0; j < N; j++ )
            {
    /* 3*/      ThisSum += A[ j ];

    /* 4*/      if( ThisSum > MaxSum )
    /* 5*/          MaxSum = ThisSum;
    /* 6*/      else if( ThisSum < 0 )
    /* 7*/          ThisSum = 0;
            }
    /* 8*/  return MaxSum;
    }
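Observation 2 says that the start index i may jump to j+1 as soon as the running sum turns negative. One way to make that visible is to track the start index explicitly. The sketch below is a variant of Algorithm 4, not the slide's own code; the function name `MaxSubSumRange` and the output parameters `BestStart` and `BestEnd` are hypothetical additions for illustration.

```c
/* Variant of Algorithm 4 that also reports where the best subsequence lies.
   On return, A[ *BestStart .. *BestEnd ] has the returned sum; if every
   element is negative, the result is 0 and *BestEnd is -1 (empty range). */
int MaxSubSumRange( const int A[ ], int N, int *BestStart, int *BestEnd )
{
    int ThisSum = 0, MaxSum = 0;
    int Start = 0;      /* the index i of Observation 2 */
    int j;

    *BestStart = 0;
    *BestEnd = -1;
    for( j = 0; j < N; j++ )
    {
        ThisSum += A[ j ];

        if( ThisSum > MaxSum )
        {
            MaxSum = ThisSum;
            *BestStart = Start;
            *BestEnd = j;
        }
        else if( ThisSum < 0 )
        {
            ThisSum = 0;
            Start = j + 1;      /* i advances to j + 1 */
        }
    }
    return MaxSum;
}
```

On the input -2, 11, -4, 13, -5, -2 the running sum is negative after the first element, so the start index moves to 1, and the function reports the sum 20 for indices 1 to 3.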

Logarithms in the Running Time
• The most frequent appearance of logarithms centers around the following general rule: an algorithm is O(log n) if it takes constant (O(1)) time to cut the problem size by a fraction (which is usually 1/2).
• On the other hand, if constant time is required merely to reduce the problem by a constant amount (such as making the problem smaller by 1), then the algorithm is O(n).
• We usually presume that the input is already read in; otherwise, reading the input alone takes Ω(n).
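The contrast between halving and shrinking by one can be seen by counting steps. The following is a small illustrative sketch, not from the slides; the function names `HalvingSteps` and `DecrementSteps` are hypothetical.

```c
/* Number of constant-time steps needed to shrink n to 1 by halving it. */
unsigned int HalvingSteps( unsigned int n )
{
    unsigned int steps = 0;

    while( n > 1 )
    {
        n /= 2;         /* problem size cut by a fraction: O(log n) steps */
        steps++;
    }
    return steps;
}

/* Number of constant-time steps needed to shrink n to 1 by subtracting 1. */
unsigned int DecrementSteps( unsigned int n )
{
    unsigned int steps = 0;

    while( n > 1 )
    {
        n -= 1;         /* problem size cut by a constant: O(n) steps */
        steps++;
    }
    return steps;
}
```

For n = 1024, halving takes 10 steps while decrementing takes 1023.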

Binary Search - I
• Definition: Given an integer x and integers $A_0, A_1, \dots, A_{N-1}$, which are presorted and already in memory, find i such that $A_i = x$, or return i = -1 if x is not in the input.

    typedef int ElementType;
    #define NotFound (-1)

    int BinarySearch( const ElementType A[ ], ElementType X, int N )
    {
        int Low, Mid, High;

    /* 1*/  Low = 0; High = N - 1;
    /* 2*/  while( Low <= High )
            {
    /* 3*/      Mid = ( Low + High ) / 2;
    /* 4*/      if( A[ Mid ] < X )
    /* 5*/          Low = Mid + 1;
    /* 6*/      else if( A[ Mid ] > X )
    /* 7*/          High = Mid - 1;
                else
    /* 8*/          return Mid;    /* Found */
            }
    /* 9*/  return NotFound;
    }

• The loop is O(1) per iteration. It starts with High - Low = N - 1 and ends with High - Low ≤ -1. Every time through the loop, the value High - Low must be at least halved from its previous value; thus, the loop is repeated at most $\lceil \log(N-1) \rceil + 2 = O(\log N)$ times.

Binary Search - II
• Initially, High – Low = N – 1 = d.
• Assume $2^k \le d < 2^{k+1}$.
• After each iteration, the new value of d is either High – Mid – 1 or Mid – 1 – Low. Since Mid = ⌊(Low + High)/2⌋, both are bounded from above by $(d-1)/2$, which is less than $2^k$.
• Hence, after k iterations, d is at most 1. The loop then iterates at most 2 more times, with d taking the values 0 and -1 in this order. Thus, it is repeated at most k+2 times.
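The k+2 bound on the number of loop iterations can be checked empirically. The sketch below, which is not part of the original slides, reuses the binary search loop but returns the number of iterations instead of an index, and computes the bound k+2 from N. The names `BinarySearchIterations` and `IterationBound` are hypothetical, and `IterationBound` assumes N >= 2 so that d = N - 1 is at least 1.

```c
typedef int ElementType;

/* Same loop as BinarySearch, but returns how many times the loop body ran. */
int BinarySearchIterations( const ElementType A[ ], ElementType X, int N )
{
    int Low = 0, High = N - 1, Mid;
    int Iterations = 0;

    while( Low <= High )
    {
        Iterations++;
        Mid = ( Low + High ) / 2;
        if( A[ Mid ] < X )
            Low = Mid + 1;
        else if( A[ Mid ] > X )
            High = Mid - 1;
        else
            break;      /* Found */
    }
    return Iterations;
}

/* Returns k + 2, where 2^k <= N - 1 < 2^(k+1). Assumes N >= 2. */
int IterationBound( int N )
{
    int d = N - 1, k = 0;

    while( d > 1 )
    {
        d /= 2;
        k++;
    }
    return k + 2;
}
```

With N = 16 sorted elements, d = 15 and k = 3, so the bound is 5; an unsuccessful search for a key larger than every element reaches exactly 5 iterations.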

Euclid’s Algorithm

    unsigned int gcd( unsigned int M, unsigned int N )
    {
        unsigned int Rem;

    /* 1*/  while( N > 0 )
            {
    /* 2*/      Rem = M % N;
    /* 3*/      M = N;
    /* 4*/      N = Rem;
            }
    /* 5*/  return M;
    }

• The greatest common divisor (gcd) of two integers is the largest integer that divides both. Thus, gcd(50, 15) = 5.
• The algorithm computes gcd(M, N), assuming M ≥ N. (If N > M, the first iteration of the loop swaps them.)
• Fact: If M > N, then M mod N < M/2.
• Proof: There are two cases:
• If N ≤ M/2, then since the remainder is always smaller than N, the claim holds.
• If N > M/2, then N goes into M once with a remainder M - N < M/2, proving the claim.
• Thus, after two iterations the remainder is at most half of its original value, so the number of iterations is at most 2 log N and the algorithm takes O(log N) time.
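To see the logarithmic behavior on concrete inputs, the loop can be instrumented with an iteration counter. The sketch below is an illustrative variant, not the slide's own code; the function name `GcdCountingIterations` and the output parameter `Iterations` are hypothetical.

```c
/* Euclid's algorithm with an added iteration counter.
   The gcd is returned; the number of loop iterations is stored
   in *Iterations. */
unsigned int GcdCountingIterations( unsigned int M, unsigned int N,
                                    unsigned int *Iterations )
{
    unsigned int Rem;

    *Iterations = 0;
    while( N > 0 )
    {
        Rem = M % N;    /* by the Fact above, Rem < M / 2 when M > N */
        M = N;
        N = Rem;
        ( *Iterations )++;
    }
    return M;
}
```

For M = 1989 and N = 1590 the remainders are 399, 393, 6, 3, 0, so the gcd is 3 after 5 iterations, well below the bound 2 log N (about 21 for N = 1590).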

Exponentiation - I

    #define IsEven( N ) ( ( N ) % 2 == 0 )

    long int pow( long int X, unsigned int N )
    {
    /* 1*/  if( N == 0 )
    /* 2*/      return 1;
    /* 3*/  if( N == 1 )
    /* 4*/      return X;
    /* 5*/  if( IsEven( N ) )
    /* 6*/      return pow( X * X, N / 2 );
            else
    /* 7*/      return pow( X * X, N / 2 ) * X;
    }

• Algorithm pow(X, N) raises an integer to an integer power.
• Count the number of multiplications as the measurement of running time.
• The obvious algorithm to compute $X^N$ uses N - 1 multiplications.
• Lines 1 to 4 handle the base case of the recursion.
• $X^N = X^{N/2} \cdot X^{N/2}$ if N is even.
• $X^N = X^{(N-1)/2} \cdot X^{(N-1)/2} \cdot X$ if N is odd.
• The number of multiplications required is clearly at most 2 log N, because at most two multiplications are required to halve the problem.

Exponentiation - II
• It is interesting to note how much the code can be tweaked.
• Lines 3 and 4 are unnecessary (line 7 does the right thing when N is 1).
• Line 7
    /* 7*/  return pow( X * X, N / 2 ) * X;
  can be rewritten as
    /* 7*/  return pow( X, N - 1 ) * X;
• Line 6, on the other hand,
    /* 6*/  return pow( X * X, N / 2 );
  cannot be substituted by any of the following:
    /*6a*/  return( pow( pow( X, 2 ), N / 2 ) );
    /*6b*/  return( pow( pow( X, N / 2 ), 2 ) );
    /*6c*/  return( pow( X, N / 2 ) * pow( X, N / 2 ) );
• Both lines 6a and 6b are incorrect because when N is 2, one of the recursive calls to pow has 2 as its second argument. No progress is made, and an infinite loop results.
• Using line 6c affects the efficiency, because there are now two recursive calls of size N/2 instead of only one. An analysis will show that the running time is no longer O(log N).
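The cost of the line 6c variant can be made concrete by counting multiplications. The sketch below is not part of the original slides: it renames the routine to avoid clashing with the standard library's `pow`, and the names `PowFast`, `PowTwoCalls`, and the global counter `MultCount` are hypothetical. The counter is a global only to keep the illustration short.

```c
unsigned long MultCount = 0;    /* multiplications performed so far */

/* The slide's algorithm: one recursive call per level. */
long int PowFast( long int X, unsigned int N )
{
    if( N == 0 )
        return 1;
    if( N == 1 )
        return X;
    if( N % 2 == 0 )
    {
        MultCount += 1;                     /* X * X */
        return PowFast( X * X, N / 2 );
    }
    MultCount += 2;                         /* X * X and the final * X */
    return PowFast( X * X, N / 2 ) * X;
}

/* Same routine with line 6 replaced by the 6c form: two recursive calls. */
long int PowTwoCalls( long int X, unsigned int N )
{
    if( N == 0 )
        return 1;
    if( N == 1 )
        return X;
    if( N % 2 == 0 )
    {
        MultCount += 1;                     /* product of the two halves */
        return PowTwoCalls( X, N / 2 ) * PowTwoCalls( X, N / 2 );
    }
    MultCount += 2;
    return PowTwoCalls( X * X, N / 2 ) * X;
}
```

Computing 2^16 with `PowFast` uses 4 multiplications, while `PowTwoCalls` uses 15, that is N - 1: for powers of two the two-call version satisfies M(N) = 2M(N/2) + 1, which is linear in N.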

Checking Your Analysis - I
• Once an analysis has been performed, it is desirable to see if the answer is correct and as good as possible.
• One way to do this is to code up the program and see if the empirically observed running time matches the running time predicted by the analysis.
• When n doubles, the running time goes up by a factor of 2 for linear programs, 4 for quadratic programs, and 8 for cubic programs. Programs that run in logarithmic time take only an additive constant longer when n doubles, and programs that run in O(n log n) take slightly more than twice as long to run under the same circumstances.
• Another commonly used trick to verify that some program is O(f(n)) is to compute the values T(n)/f(n) for a range of n, where T(n) is the empirically observed running time. If f(n) is a tight answer for the running time, then the computed values converge to a positive constant. If f(n) is an over-estimate, the values converge to zero. If f(n) is an under-estimate and hence wrong, the values diverge.
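The T(n)/f(n) trick can be tried without a timer by using an operation count as a stand-in for the observed running time. The sketch below is an illustration under that assumption, not a measurement method from the slides: `QuadraticSteps` counts the inner-loop executions of a doubly nested loop like the one in Algorithm 2, and `PrintRatioTable` prints the ratios against n, n^2, and n^3. Both names are hypothetical.

```c
#include <stdio.h>

/* Counts inner-loop executions of a doubly nested loop of the same
   shape as Algorithm 2; used here in place of a measured time T(n). */
long QuadraticSteps( int n )
{
    long steps = 0;
    int i, j;

    for( i = 0; i < n; i++ )
        for( j = i; j < n; j++ )
            steps++;
    return steps;
}

/* Prints T(n)/f(n) for f(n) = n, n^2, n^3 as n doubles from 100 to 3200. */
void PrintRatioTable( void )
{
    int n;

    printf( "%8s %12s %12s %14s\n", "n", "T/n", "T/n^2", "T/n^3" );
    for( n = 100; n <= 3200; n *= 2 )
    {
        double t = ( double ) QuadraticSteps( n );
        double x = ( double ) n;

        printf( "%8d %12.3f %12.5f %14.8f\n",
                n, t / x, t / ( x * x ), t / ( x * x * x ) );
    }
}
```

Since the count is n(n+1)/2, the T/n^2 column settles near 0.5 (tight), the T/n^3 column shrinks toward zero (over-estimate), and the T/n column keeps growing (under-estimate). Doubling n from 100 to 200 raises the count from 5050 to 20100, close to the factor of 4 expected for a quadratic program.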

Checking Your Analysis - II
• This program segment computes the probability that two distinct positive integers, less than or equal to N and chosen randomly, are relatively prime (as N gets large, the answer approaches $6/\pi^2$).
• What is the running time complexity?
• $O(N^2 \log N)$
• Are you sure?
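The program segment itself is not reproduced in this text. The following is a minimal sketch of a routine matching the description, assuming the natural structure of a double loop over all pairs i < j ≤ N with one gcd computation per pair. The function names `Gcd` and `RelativelyPrimeFraction` and the exact loop bounds are assumptions, not the slide's code.

```c
/* Euclid's algorithm, as on the earlier slide. */
static unsigned int Gcd( unsigned int M, unsigned int N )
{
    unsigned int Rem;

    while( N > 0 )
    {
        Rem = M % N;
        M = N;
        N = Rem;
    }
    return M;
}

/* Fraction of pairs (i, j) with 1 <= i < j <= N that are relatively prime.
   Assumes N >= 2 so that at least one pair exists. */
double RelativelyPrimeFraction( unsigned int N )
{
    unsigned long Rel = 0, Tot = 0;
    unsigned int i, j;

    for( i = 1; i <= N; i++ )
        for( j = i + 1; j <= N; j++ )
        {
            Tot++;
            if( Gcd( i, j ) == 1 )
                Rel++;
        }
    return ( double ) Rel / Tot;
}
```

Under these assumptions the loops visit about N^2/2 pairs and each gcd call costs O(log N), which is where the O(N^2 log N) estimate comes from. For N = 4, five of the six pairs are relatively prime (only 2 and 4 share a factor), giving 5/6.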

Checking Your Analysis - III
• As the table indicates, the last column is most likely the correct one.
