
Chapter 2: Algorithm Analysis


  1. Chapter 2: Algorithm Analysis
• Time Complexity
• Big-O, Big-Ω, Big-Θ, Little-o
• Running Time Calculation
• Analyzing Specific Functions

  2. Time Complexity
A fundamental question in determining an algorithm's efficiency is how the amount of computation needed to solve a problem grows with the size of the problem.

  3. Big-O Notation
Function T(n) is said to be O(f(n)) if there are positive constants c and n₀ such that T(n) ≤ c·f(n) for any n ≥ n₀ (i.e., T(n) is ultimately bounded above by c·f(n)).
• Example: n³ + 3n² + 6n + 5 is O(n³). (Use c = 15 and n₀ = 1.)
Proof:
For n ≥ 1, n² ≥ n and n³ ≥ n² (by mult. property of ineq.)
For n ≥ 1, n³ ≥ n and n³ ≥ 1 (by transitive law of ≥)
For n ≥ 1, 3n³ ≥ 3n², 6n³ ≥ 6n, and 5n³ ≥ 5 (by mult. property of ineq.)
For n ≥ 1, n³ + 3n³ + 6n³ + 5n³ ≥ n³ + 3n² + 6n + 5 (by add. property of ineq.)
For n ≥ 1, 15n³ ≥ n³ + 3n² + 6n + 5 (by distrib. law of · over +)
• Example: n² + n·log n is O(n²). (Use c = 2 and n₀ = 1.)
Proof:
For n ≥ 1, n ≥ log n (by inductive proof)
For n ≥ 1, n² ≥ n·log n (by mult. property of ineq.)
For n ≥ 1, n² + n² ≥ n² + n·log n (by add. property of ineq.)
For n ≥ 1, 2n² ≥ n² + n·log n (by distrib. law of · over +)
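
A minimal C++ sanity check of both bounds; base-2 logarithms are an assumption (the base only changes the constant c), and n is capped at 100,000 so that 15n³ fits in a 64-bit integer:

    #include <cmath>
    #include <cstdio>

    int main() {
        // Check T(n) = n^3 + 3n^2 + 6n + 5 <= 15n^3  for n >= 1,
        // and   U(n) = n^2 + n log n       <= 2n^2   for n >= 1.
        for (long long n = 1; n <= 100000; n *= 10) {
            long long T = n*n*n + 3*n*n + 6*n + 5;
            double U = (double)(n*n) + (double)n * std::log2((double)n);
            std::printf("n=%-7lld  T<=15n^3: %s   U<=2n^2: %s\n",
                        n,
                        (T <= 15*n*n*n) ? "yes" : "NO",
                        (U <= 2.0*(double)(n*n)) ? "yes" : "NO");
        }
        return 0;
    }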

  4. An Intuitive Big-O Example
[Figure: curves r(n) and g(n), with crossover points n_g and n_r]
r(n) is O(g(n)) since 1·g(n) exceeds r(n) for all n-values past n_g.
g(n) is O(r(n)) since 3·r(n) exceeds g(n) for all n-values past n_r.

  5. Demonstrating The Big-O Concept
Each of the algorithms below has O(n³) time complexity... (In fact, the execution time for Algorithm A is n³ + n² + n, and the execution time for Algorithm B is n³ + 101n² + n.)

    Input Size n                ALGORITHM A                ALGORITHM B
              10                      1,110                     11,110
             100                  1,010,100                  2,010,100
           1,000              1,001,001,000              1,101,001,000
          10,000          1,000,100,010,000          1,010,100,010,000
         100,000      1,000,010,000,100,000      1,001,010,000,100,000
       1,000,000  1,000,001,000,001,000,000  1,000,101,000,001,000,000
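
Both columns can be regenerated from the stated formulas; a small sketch (unsigned 64-bit arithmetic is an assumption so the n = 1,000,000 row fits, and the output omits thousands separators):

    #include <cstdio>

    int main() {
        for (unsigned long long n = 10; n <= 1000000ULL; n *= 10) {
            unsigned long long a = n*n*n + n*n + n;        // Algorithm A
            unsigned long long b = n*n*n + 101*n*n + n;    // Algorithm B
            std::printf("%10llu  %25llu  %25llu\n", n, a, b);
        }
        return 0;
    }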

  6. A Second Big-O Demonstration
Each of the algorithms below has O(n²) time complexity... (In fact, the execution time for Algorithm C is n² + 2n + 3, and the execution time for Algorithm D is n² + 1002n + 3.)

    Input Size n      ALGORITHM C      ALGORITHM D
              10              123           10,123
             100           10,203          110,203
           1,000        1,002,003        2,002,003
          10,000      100,020,003      110,020,003
         100,000   10,000,200,003   10,100,200,003
       1,000,000  1,000,002,000,003  1,001,002,000,003

  7. One More Big-O Demonstration
Each of the algorithms below has O(n log n) time complexity… (In fact, the execution time for Algorithm E is n·log n + 5n, and the execution time for Algorithm F is n·log n + 105n. Note that the linear term for Algorithm F will dominate until n reaches 2¹⁰⁵.)

    Input Size n   ALGORITHM E   ALGORITHM F
              10            83         1,083
             100         1,164        11,164
           1,000        14,966       114,966
          10,000       182,877     1,182,877
         100,000     2,160,964    12,160,964
       1,000,000    24,931,569   124,931,569
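
The E/F values can be reproduced the same way; here base-2 logarithms and rounding to the nearest integer are assumptions, but they match the table above:

    #include <cmath>
    #include <cstdio>

    int main() {
        for (double n = 10; n <= 1000000.0; n *= 10) {
            long long e = std::llround(n * std::log2(n) + 5*n);     // Algorithm E
            long long f = std::llround(n * std::log2(n) + 105*n);   // Algorithm F
            std::printf("%9.0f  %12lld  %12lld\n", n, e, f);
        }
        return 0;
    }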

  8. Big-O Represents An Upper Bound
[Figure: r(n) lying below 1·g(n), and g(n) lying below 2·r(n)]
If T(n) is O(f(n)), then f(n) is basically a cap on how bad T(n) will behave when n gets big.
Is r(n) O(g(n))? YES! r(n) ≤ 1·g(n) for all n ≥ 0, so try c = 1 starting at n = 0.
Is g(n) O(r(n))? YES! g(n) ≤ 2·r(n) for all n ≥ 0, so try c = 2 starting at n = 0.

  9. Another Example
[Figure: curves v(n) and y(n) crossing repeatedly before settling down past n₀]
Is v(n) O(y(n))? YES! v(n) ≤ 1·y(n) for all n ≥ n₀, so try c = 1 after the third crossing.
Is y(n) O(v(n))? YES! y(n) ≤ 2·v(n) for all n ≥ n₀, so try c = 2 after the second crossing.

  10. One More Example
[Figure: p(n) growing steadily while b(n) repeatedly returns to 0]
Is p(n) O(b(n))? NO! Since b(n) keeps returning to 0, c·b(n) fails to stay above p(n) for any c!
Is b(n) O(p(n))? YES! b(n) ≤ 1·p(n) for all n ≥ 0, so try c = 1 starting at n = 0.

  11. Big-Ω (Big-Omega) Notation
Function T(n) is said to be Ω(g(n)) if there are positive constants c and n₀ such that T(n) ≥ c·g(n) for any n ≥ n₀ (i.e., T(n) is ultimately bounded below by c·g(n)).
Example: n³ + 3n² + 6n + 5 is Ω(n³). (Use c = 1 and n₀ = 1.)
Example: n² + n·log n is Ω(n²). (Use c = 1 and n₀ = 1.)
[Figure: g(n) staying above 1·r(n) for all n past n_r]
r(n) is not Ω(g(n)) since, for every positive constant c, c·g(n) ultimately gets bigger than r(n).
g(n) is Ω(r(n)) since g(n) exceeds 1·r(n) for all n-values past n_r.

  12. Big-Θ (Big-Theta) Notation
Function T(n) is said to be Θ(h(n)) if T(n) is both O(h(n)) and Ω(h(n)).
Example: n³ + 3n² + 6n + 5 is Θ(n³).
Example: n² + n·log n is Θ(n²).
[Figure: r(n) squeezed between 1·g(n) and 2·g(n) past n₀]
r(n) is Θ(g(n)) since r(n) is squeezed between 1·g(n) and 2·g(n) once n exceeds n₀.
g(n) is Θ(r(n)) since g(n) is squeezed between ½·r(n) and 1·r(n) once n exceeds n₀.

  13. Little-o Notation
Function T(n) is said to be o(p(n)) if T(n) is O(p(n)) but not Θ(p(n)).
Example: n³ + 3n² + 6n + 5 is O(n⁴). (Use c = 15 and n₀ = 1.) However, n³ + 3n² + 6n + 5 is not Ω(n⁴), hence not Θ(n⁴), so it is o(n⁴).
Proof (by contradiction): Assume that there are positive constants c and n₀ such that n³ + 3n² + 6n + 5 ≥ c·n⁴ for all n ≥ n₀. Dividing both sides by n⁴ yields (1/n) + (3/n²) + (6/n³) + (5/n⁴) ≥ c for all n ≥ n₀. Since lim n→∞ ((1/n) + (3/n²) + (6/n³) + (5/n⁴)) = 0, we must conclude that 0 ≥ c, which contradicts the fact that c must be a positive constant.
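
To see the limit argument concretely, a quick sketch that tabulates the ratio (n³ + 3n² + 6n + 5)/n⁴ as n grows; it visibly sinks toward 0, so it eventually drops below any fixed positive c:

    #include <cstdio>

    int main() {
        for (double n = 10; n <= 1000000.0; n *= 10) {
            // (n^3 + 3n^2 + 6n + 5) / n^4, written term by term
            double ratio = 1/n + 3/(n*n) + 6/(n*n*n) + 5/(n*n*n*n);
            std::printf("n=%-9.0f  T(n)/n^4 = %.8f\n", n, ratio);
        }
        return 0;
    }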

  14. Mathematically Speaking…

  15. Computational Model For Algorithm Analysis
To formally analyze the performance of algorithms, we will use a computational model with a couple of simplifying assumptions:
• Each simple instruction (assignment, comparison, addition, multiplication, memory access, etc.) is assumed to execute in a single time unit. (For example, accessing a one-dimensional array element A[i] requires a multiplication, an addition, and a memory access – three time units in all. Similarly, accessing a two-dimensional array element B[j,k] requires two multiplications, two additions, and one memory access – five time units in all.)
• Memory is assumed to be limitless, so there is always room to store whatever data is needed.
The size of the input, n, will normally be used as our main variable, and we'll primarily be interested in "worst case" scenarios.
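
One way to internalize the model is to charge the stated costs by hand; a minimal sketch (the counter and the particular charges are illustrative assumptions, following the conventions above):

    #include <cstdio>

    long long timeUnits = 0;   // running total of charged time units

    int main() {
        const int n = 100;
        int A[n];
        for (int i = 0; i < n; i++) {
            timeUnits += 2;    // loop test + increment: 2 time units
            A[i] = i;          // 1-D array write (mult + add + access): 3 units
            timeUnits += 3;
        }
        timeUnits += 1;        // the final, failing loop test
        std::printf("charged %lld time units for n = %d (expect 5n + 1)\n",
                    timeUnits, n);
        return 0;
    }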

  16. General Rules For Running Time Calculation
Rule One: Loops
The running time of a loop is at most the running time of the statements inside the loop, multiplied by the number of iterations.
Example:
    for (i = 0; i < n; i++)          // n iterations,
        A[i] = (1-t)*X[i] + t*Y[i];  // 15 time units per iteration
(Retrieving X[i] requires one multiplication, one addition, and one memory access, as does retrieving Y[i]; the calculation involves a subtraction, two multiplications, and an addition; assigning A[i] the resulting value requires one multiplication, one addition, and one memory access; and each loop iteration requires a comparison and either an assignment or an increment. This totals fifteen primitive operations.)
Thus, the total running time is 15n time units, i.e., this part of the program is O(n).
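
A self-contained, runnable version of the loop (the array size, fill values, and t are illustrative assumptions):

    #include <cstdio>

    int main() {
        const int n = 5;
        double X[n], Y[n], A[n];
        double t = 0.25;
        for (int i = 0; i < n; i++) { X[i] = i; Y[i] = 2.0*i; }

        for (int i = 0; i < n; i++)          // n iterations
            A[i] = (1-t)*X[i] + t*Y[i];      // the 15-time-unit statement

        for (int i = 0; i < n; i++)
            std::printf("A[%d] = %g\n", i, A[i]);
        return 0;
    }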

  17. Rule Two: Nested Loops
The running time of a nested loop is at most the running time of the statements inside the innermost loop, multiplied by the product of the numbers of iterations of all of the loops.
Example:
    for (i = 0; i < n; i++)            // n iterations, 2 ops each
        for (j = 0; j < n; j++)        // n iterations, 2 ops each
            C[i,j] = j*A[i] + i*B[j];  // 14 time units per iteration
(3 for retrieving A[i], 3 for retrieving B[j], 3 for the RHS arithmetic, 5 for assigning C[i,j].)
Total running time: ((14+2)n + 2)n = 16n² + 2n time units, which is O(n²).
More complex example:
    for (i = 0; i < n; i++)                     // n iterations, 2 ops each
        for (j = i; j < n; j++)                 // n-i iterations, 2 ops each
            C[j,i] = C[i,j] = j*A[i] + i*B[j];  // 19 time units per iteration
Total running time: Σ_{i=0}^{n-1} (2 + Σ_{j=i}^{n-1} 21) = Σ_{i=0}^{n-1} (2 + 21(n-i)) = 2n + 21(Σ_{i=0}^{n-1} n − Σ_{i=0}^{n-1} i) = 2n + 21(n² − ½n(n-1)) = 10.5n² + 12.5n time units, which is also O(n²).
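
The closed form can be double-checked by brute-force counting; a sketch that charges 2 units per outer iteration and 21 per inner iteration, exactly as annotated above:

    #include <cstdio>

    int main() {
        for (int n = 1; n <= 1000; n *= 10) {
            long long units = 0;
            for (int i = 0; i < n; i++) {
                units += 2;                    // outer-loop overhead
                for (int j = i; j < n; j++)
                    units += 21;               // 19 for the body + 2 for the loop
            }
            double closed = 10.5*n*n + 12.5*n;
            std::printf("n=%-5d  counted=%lld  closed form=%.1f\n",
                        n, units, closed);
        }
        return 0;
    }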

  18. Rule Three: Consecutive Statements
The running time of a sequence of statements is merely the sum of the running times of the individual statements.
Example:
    for (i = 0; i < n; i++)            // 28n time units
    {                                  // for this
        A[i] = (1-t)*X[i] + t*Y[i];    // entire
        B[i] = (1-s)*X[i] + s*Y[i];    // loop
    }
    for (i = 0; i < n; i++)            // (16n+2)n time
        for (j = 0; j < n; j++)        // units for this
            C[i,j] = j*A[i] + i*B[j];  // nested loop
Total running time: 16n² + 30n time units, i.e., this code is O(n²).

  19. Rule Four: Conditional Statements
The running time of an if-else statement is at most the running time of the conditional test, added to the maximum of the running times of the if and else blocks of statements.
Example:
    if (amt > cost + tax)                    // 2 time units
    {
        count = 0;                           // 1 time unit
        while ((count<n) && (amt>cost+tax))  // At most n iterations, 4 TUs each;
        {                                    // (count<n) is also evaluated when
                                             // the (n+1)-st iteration is tried
            amt -= (cost + tax);             // 3 time units
            count++;                         // 1 time unit
        }
        cout << "CAPACITY:" << count;        // 2 time units
    }
    else
        cout << "INSUFFICIENT FUNDS";        // 1 time unit
Total running time: 2 + max(1 + (4+3+1)n + 1 + 2, 1) = 8n + 6 time units, i.e., this code is O(n).
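
A runnable version of the example (the starting values of amt, cost, tax, and n are illustrative assumptions):

    #include <cstdio>

    int main() {
        double amt = 100.0, cost = 12.0, tax = 0.60;
        const int n = 20;
        if (amt > cost + tax) {
            int count = 0;
            while ((count < n) && (amt > cost + tax)) {
                amt -= (cost + tax);
                count++;
            }
            std::printf("CAPACITY: %d\n", count);   // 7 purchases of 12.60 fit in 100
        } else {
            std::printf("INSUFFICIENT FUNDS\n");
        }
        return 0;
    }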

  20. Analysis Of Binary Search Function
    int binsrch(const etype A[], const etype x, const int n)
    {
        int low = 0, high = n-1;          // 3 time units
        int middle;                       // 0 time units
        while (low <= high)               // 1 time unit per iteration
        {
            middle = (low + high)/2;      // 3 time units
            if (A[middle] < x)            // 4 TU | <-- Worst Case
                low = middle + 1;         // 2 TU |
            else if (A[middle] > x)       // 4 TU | <-- Worst Case
                high = middle - 1;        // 2 TU | <-- Worst Case
            else                          // 0 TU |
                return middle;            // 1 TU |
        }
        return -1;  // If search is unsuccessful; 1 time unit
    }
In the worst case, the loop will keep dividing the distance between the low and high indices in half until they are equal, iterating at most log n times. Thus, the total running time is 17·log n + 4 time units, which is O(log n).
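
A concrete instantiation with etype = int, plus a tiny driver (the test data is an illustrative assumption):

    #include <cstdio>

    int binsrch(const int A[], int x, int n) {
        int low = 0, high = n - 1;
        while (low <= high) {
            int middle = (low + high) / 2;
            if (A[middle] < x)
                low = middle + 1;
            else if (A[middle] > x)
                high = middle - 1;
            else
                return middle;      // found x at index middle
        }
        return -1;                  // unsuccessful search
    }

    int main() {
        int A[] = {2, 3, 5, 7, 11, 13, 17, 19};
        std::printf("index of 11: %d\n", binsrch(A, 11, 8));  // prints 4
        std::printf("index of  4: %d\n", binsrch(A, 4, 8));   // prints -1
        return 0;
    }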

  21. Analysis Of Another Function: SuperFreq
    etype SuperFreq(const etype A[], const int n)
    {
        etype bestElement = A[0];        // 4 time units
        int bestFreq = 0;                // 1 time unit
        int currFreq;                    // 0 time units
        for (int i = 0; i < n; i++)      // n iterations; 2 TUs each
        {
            currFreq = 0;                // 1 time unit
            for (int j = i; j < n; j++)  // n-i iterations; 2 TUs each
                if (A[i] == A[j])        // 7 time units
                    currFreq++;          // 1 time unit
            if (currFreq > bestFreq)     // 1 time unit
            {
                bestElement = A[i];      // 4 time units
                bestFreq = currFreq;     // 1 time unit
            }
        }
        return bestElement;              // 1 time unit
    }
(Note that bestFreq must be updated along with bestElement; otherwise the function would not actually return the most frequent element.) The function is obviously O(n²) due to its familiar nested-loop structure. Specifically, its worst-case running time is 6 + Σ_{i=0}^{n-1} (9 + Σ_{j=i}^{n-1} 10) = 5n² + 14n + 6.

  22. What About Recursion?
    humongInt pow(const humongInt &val, const humongInt &n)
    {
        if (n == 0)
            return humongInt(1);
        if (n == 1)
            return val;
        if (n % 2 == 0)
            return pow(val*val, n/2);
        return pow(val*val, n/2) * val;
    }
The worst-case running time would require all three conditions to be checked and to fail (taking 4 time units for the comparisons and arithmetic). The last return statement requires 3 time units for arithmetic operations each time it's executed, which happens log n − 1 times (since it halves n with each execution, until it reaches a value of 1). When the parameterized n-value finally reaches 1, three last operations are performed. Thus, the worst-case running time is 7·log n − 4.
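
A runnable stand-in with humongInt replaced by unsigned long long (an assumption; the original is presumably an arbitrary-precision class), renamed pow2 to avoid clashing with the standard library's pow:

    #include <cstdio>

    // Recursive exponentiation by repeated squaring, O(log n) calls.
    unsigned long long pow2(unsigned long long val, unsigned long long n) {
        if (n == 0) return 1ULL;     // identity for exponent 0
        if (n == 1) return val;
        if (n % 2 == 0) return pow2(val * val, n / 2);
        return pow2(val * val, n / 2) * val;
    }

    int main() {
        std::printf("3^10 = %llu\n", pow2(3, 10));   // 59049, via ~log2(10) calls
        return 0;
    }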

  23. Recurrence Relations To Evaluate Recursion
    int powerOf2(const int &n)
    {
        if (n == 0)
            return 1;
        return powerOf2(n-1) + powerOf2(n-1);
    }
Assume that there is a function T(n) such that it takes T(k) time to execute powerOf2(k). Examining the code allows us to conclude the following:
T(0) = 2
T(k) = 5 + 2T(k-1) for all k > 0
The second fact tells us that:
T(n) = 5 + 2T(n-1) = 5 + 2(5 + 2T(n-2)) = 5 + 2(5 + 2(5 + 2T(n-3))) = …
     = 5(1 + 2 + 2² + 2³ + … + 2ⁿ⁻¹) + 2ⁿT(0)
     = 5(2ⁿ − 1) + 2ⁿ·2 = 7·2ⁿ − 5, which is O(2ⁿ).
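
The closed form is easy to check by unrolling the recurrence numerically; a quick sketch:

    #include <cstdio>

    int main() {
        long long T = 2;                       // T(0)
        for (int n = 1; n <= 20; n++) {
            T = 5 + 2 * T;                     // T(n) from T(n-1)
            long long closed = 7 * (1LL << n) - 5;
            std::printf("n=%-3d  T(n)=%-10lld  7*2^n-5=%lld\n", n, T, closed);
        }
        return 0;
    }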

  24. Another Recurrence Relation Example
    int alternatePowerOf2(const int &n)
    {
        if (n == 0)
            return 1;
        return 2*alternatePowerOf2(n-1);
    }
Assume that there is a function T(n) such that it takes T(k) time to execute alternatePowerOf2(k). Examining the code allows us to conclude that:
T(0) = 2
T(k) = 4 + T(k-1) for all k > 0
The second fact tells us that:
T(n) = 4 + T(n-1) = 4 + (4 + T(n-2)) = 4 + (4 + (4 + T(n-3))) = …
     = 4n + T(0) = 4n + 2, which is O(n).
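
And the same unrolling check for the linear recurrence:

    #include <cstdio>

    int main() {
        long long T = 2;                       // T(0)
        for (int n = 1; n <= 10; n++) {
            T = 4 + T;                         // T(n) from T(n-1)
            std::printf("n=%-3d  T(n)=%-6lld  4n+2=%lld\n", n, T, 4LL*n + 2);
        }
        return 0;
    }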
