
CSCE 3110 Data Structures & Algorithm Analysis


Presentation Transcript


  1. CSCE 3110 Data Structures & Algorithm Analysis. Algorithm Analysis II. Reading: Weiss, chap. 2

  2. Algorithm Analysis • We know: • Experimental approach – problems • Low level analysis – count operations • Abstract even further • Characterize an algorithm as a function of the “problem size” • E.g. • Input data = array → problem size is N (length of array) • Input data = matrix → problem size is N x M

  3. Asymptotic Notation • Goal: to simplify analysis by getting rid of unneeded information (like “rounding” 1,000,001 ≈ 1,000,000) • We want to say, in a formal way, 3n² ≈ n² • The “Big-Oh” Notation: • given functions f(n) and g(n), we say that f(n) is O(g(n)) if and only if there are positive constants c and n₀ such that f(n) ≤ c·g(n) for n ≥ n₀

  4. Graphic Illustration • f(n) = 2n+6 • Compare with the definition: • Need to find a function g(n), a constant c, and a constant n₀ such that f(n) < c·g(n) when n > n₀ • g(n) = n, c = 4, and n₀ = 3 • → f(n) is O(n) • The order of f(n) is n • [figure: f(n) = 2n+6 plotted against c·g(n) = 4n and g(n) = n]

  5. More examples • What about f(n) = 4n² ? Is it O(n)? • Find a c and n₀ such that 4n² < cn for any n > n₀ • 50n³ + 20n + 4 is O(n³) • Would be correct to say it is O(n³+n) • Not useful, as n³ far exceeds n for large values • Would be correct to say it is O(n⁵) • OK, but g(n) should be as close as possible to f(n) • 3log(n) + log(log(n)) = O( ? ) • Simple Rule: Drop lower order terms and constant factors

  6. Big-Oh Rules • If f1(n) = O(g1(n)) and f2(n) = O(g2(n)) • f1(n) + f2(n) = O(g1(n) + g2(n)) = O(max(g1(n), g2(n))) • f1(n) * f2(n) = O(g1(n) * g2(n)) • logᵏn = O(n) for any constant k • The relative growth rate of two functions can always be determined by computing their limit. But using this method is almost always overkill

  7. Big-Oh and Growth Rate • The big-Oh notation gives an upper bound on the growth rate of a function. • The statement “f(n) is O(g(n))” means that the growth rate of f(n) is no more than the growth rate of g(n). • We can use the big-Oh notation to rank functions according to their growth rate.

  8. Classes of Functions • Let {g(n)} denote the class (set) of functions that are O(g(n)) • We have {n} ⊂ {n²} ⊂ {n³} ⊂ {n⁴} ⊂ {n⁵} ⊂ … where the containment is strict • [figure: nested sets {n} inside {n²} inside {n³}]

  9. Big-Oh Rules • If f(n) is a polynomial of degree d, then f(n) is O(nᵈ), i.e., • Drop lower-order terms • Drop constant factors • Use the smallest possible class of functions • Say “2n is O(n)” instead of “2n is O(n²)” • Use the simplest expression of the class • Say “3n+5 is O(n)” instead of “3n+5 is O(3n)”

  10. Inappropriate Expressions

  11. Properties of Big-Oh • If f(n) is O(g(n)) then a·f(n) is O(g(n)) for any a • If f(n) is O(g(n)) and h(n) is O(g’(n)) then f(n) + h(n) is O(g(n) + g’(n)) • If f(n) is O(g(n)) and h(n) is O(g’(n)) then f(n)h(n) is O(g(n)g’(n)) • If f(n) is O(g(n)) and g(n) is O(h(n)) then f(n) is O(h(n)) • If f(n) is a polynomial of degree d, then f(n) is O(nᵈ) • n^x = O(aⁿ), for any fixed x > 0 and a > 1 • An algorithm of order n to a certain power is better than an algorithm of order a (> 1) to the power of n • log n^x is O(log n), for x > 0 – how? • log^x n is O(n^y) for x > 0 and y > 0 • An algorithm of order log n (to a certain power) is better than an algorithm of n raised to a power y.

  12. Asymptotic analysis - terminology • Special classes of algorithms: logarithmic: O(log n); linear: O(n); quadratic: O(n²); polynomial: O(nᵏ), k ≥ 1; exponential: O(aⁿ), a > 1 • Polynomial vs. exponential? • Logarithmic vs. polynomial? • Graphical explanation?

  13. Some Numbers

  14. Computing Prefix Averages • The i-th prefix average of an array X is the average of the first (i+1) elements of X: A[i] = (X[0] + X[1] + … + X[i])/(i+1) • Computing the array A of prefix averages of another array X has applications to financial analysis

  15. Prefix Averages (Quadratic) • The following algorithm computes prefix averages in quadratic time by applying the definition

Algorithm prefixAverages1(X, n)
  Input: array X of n integers
  Output: array A of prefix averages of X     #operations
  A ← new array of n integers                 n
  for i ← 0 to n−1 do                         n
    s ← X[0]                                  n
    for j ← 1 to i do                         1 + 2 + … + (n−1)
      s ← s + X[j]                            1 + 2 + … + (n−1)
    A[i] ← s/(i+1)                            n
  return A                                    1

  16. Arithmetic Progression • The running time of prefixAverages1 is O(1 + 2 + … + n) • The sum of the first n integers is n(n+1)/2 • There is a simple visual proof of this fact • Thus, algorithm prefixAverages1 runs in O(n²) time

  17. Prefix Averages (Linear) • The following algorithm computes prefix averages in linear time by keeping a running sum

Algorithm prefixAverages2(X, n)
  Input: array X of n integers
  Output: array A of prefix averages of X     #operations
  A ← new array of n integers                 n
  s ← 0                                       1
  for i ← 0 to n−1 do                         n
    s ← s + X[i]                              n
    A[i] ← s/(i+1)                            n
  return A                                    1

• Algorithm prefixAverages2 runs in O(n) time

  18. A table of functions with respect to input n, assuming that each primitive operation takes one microsecond (1 second = 10⁶ microseconds).

  19. Running Time Calculations General Rules • FOR loop • The number of iterations times the time of the inside statements. • Nested loops • The product of the number of iterations times the time of the inside statements. • Consecutive Statements • The sum of running time of each segment. • If/Else • The testing time plus the larger running time of the cases.

  20. Some Examples

Case 1:
for (i=0; i<n; i++)
    for (j=0; j<n; j++)
        k++;               // O(n²)

Case 2:
for (i=0; i<n; i++)
    k++;
for (i=0; i<n; i++)
    for (j=0; j<n; j++)
        k++;               // O(n²)

Case 3:
for (int i=0; i<n-1; i++)
    for (int j=0; j<i; j++)
        k++;               // O(n²)

  21. Maximum Subsequence Sum Problem • Given a set of integers A1, A2, …, AN, find the maximum value of Σ Aₖ, for k = i to j • For convenience, the maximum subsequence sum is zero if all the integers are negative.

  22. Algorithm 1

int MaxSubSum1(const vector<int> & a)
{
    int maxSum = 0;
    for (int i = 0; i < a.size(); i++)
        for (int j = i; j < a.size(); j++)
        {
            int thisSum = 0;
            for (int k = i; k <= j; k++)
                thisSum += a[k];
            if (thisSum > maxSum)
                maxSum = thisSum;
        }
    return maxSum;
}    // O(n³)

  23. Algorithm 2

int MaxSubSum2(const vector<int> & a)
{
    int maxSum = 0;
    for (int i = 0; i < a.size(); i++)
    {
        int thisSum = 0;
        for (int j = i; j < a.size(); j++)
        {
            thisSum += a[j];
            if (thisSum > maxSum)
                maxSum = thisSum;
        }
    }
    return maxSum;
}    // O(n²)

  24. Algorithm 3 • T(n) = 2T(n/2) + O(n)

int maxSubSum3(const vector<int> & a)
{
    return maxSumRec(a, 0, a.size()-1);
}

int maxSumRec(const vector<int> & a, int left, int right)
{
    if (left == right)
        if (a[left] > 0) return a[left];
        else return 0;
    int center = (left + right) / 2;
    int maxLeftSum = maxSumRec(a, left, center);
    int maxRightSum = maxSumRec(a, center + 1, right);
    int maxLeftBorderSum = 0, leftBorderSum = 0;
    for (int i = center; i >= left; i--)
    {
        leftBorderSum += a[i];
        if (leftBorderSum > maxLeftBorderSum)
            maxLeftBorderSum = leftBorderSum;
    }
    int maxRightBorderSum = 0, rightBorderSum = 0;
    for (int j = center + 1; j <= right; j++)
    {
        rightBorderSum += a[j];
        if (rightBorderSum > maxRightBorderSum)
            maxRightBorderSum = rightBorderSum;
    }
    return max3(maxLeftSum, maxRightSum,
                maxLeftBorderSum + maxRightBorderSum);
}

  25. • T(1) = 1 • T(n) = 2T(n/2) + O(n) • T(2) = ? T(4) = ? T(8) = ? • … • T(2) = 4 = 2·2, T(4) = 12 = 4·3, T(8) = 32 = 8·4 • If n = 2ᵏ, T(n) = n·(k+1) • k = log n • T(n) = n(log n + 1)

  26. Logarithms in the Running Time • An algorithm is O(log N) if it takes constant time to cut the problem size by a fraction (usually ½). • On the other hand, if constant time is required to merely reduce the problem by a constant amount (such as to make the problem smaller by 1), then the algorithm is O(N) • Examples of the O(log N) • Binary Search • Euclid’s algorithm for computing the greatest common divisor

  27. Analyzing recursive algorithms

function foo (param A, param B) {
    statement 1;
    statement 2;
    if (termination condition) {
        return;
    }
    foo(A’, B’);
}

  28. Solving recursive equations by repeated substitution

T(n) = T(n/2) + c                 substitute for T(n/2)
     = T(n/4) + c + c             substitute for T(n/4)
     = T(n/8) + c + c + c
     = T(n/2³) + 3c               in more compact form
     = …
     = T(n/2ᵏ) + kc               “inductive leap”
T(n) = T(n/2^(log n)) + c·log n   “choose k = log n”
     = T(n/n) + c·log n
     = T(1) + c·log n
     = b + c·log n
     = Θ(log n)

  29. Solving recursive equations by telescoping

T(n)   = T(n/2) + c       initial equation
T(n/2) = T(n/4) + c       so this holds
T(n/4) = T(n/8) + c       and this …
T(n/8) = T(n/16) + c      and this …
…
T(4)   = T(2) + c         eventually …
T(2)   = T(1) + c         and this …

T(n) = T(1) + c·log n     sum the equations, canceling the terms appearing on both sides
T(n) = Θ(log n)

  30. Algorithm 4

int MaxSubSum4(const vector<int> & a)
{
    int maxSum = 0, thisSum = 0;
    for (int j = 0; j < a.size(); j++)
    {
        thisSum += a[j];
        if (thisSum > maxSum)
            maxSum = thisSum;
        else if (thisSum < 0)
            thisSum = 0;
    }
    return maxSum;
}    // O(n)

  31. Problem: • Order the following functions by their asymptotic growth rates • n log n • log n³ • n² • n^(2/5) • 2^(log n) • log(log n) • sqrt(log n)

  32. Back to the original question • Which solution would you choose? • O(n²) vs. O(n) • Some math … • properties of logarithms: log_b(xy) = log_b x + log_b y; log_b(x/y) = log_b x − log_b y; log_b(xᵃ) = a·log_b x; log_b a = log_x a / log_x b • properties of exponentials: a^(b+c) = aᵇ·aᶜ; a^(bc) = (aᵇ)ᶜ; aᵇ/aᶜ = a^(b−c); b = a^(log_a b); bᶜ = a^(c·log_a b)

  33. “Relatives” of Big-Oh • Ω(f(n)): Big Omega – asymptotic lower bound • Θ(f(n)): Big Theta – asymptotic tight bound • Big-Omega – think of it as the inverse of O(n) • g(n) is Ω(f(n)) if f(n) is O(g(n)) • Big-Theta – combine both Big-Oh and Big-Omega • f(n) is Θ(g(n)) if f(n) is O(g(n)) and g(n) is O(f(n)) • Note the difference: • 3n+3 is O(n) and is Θ(n) • 3n+3 is O(n²) but is not Θ(n²)

  34. More “relatives” • Little-oh – f(n) is o(g(n)) if f(n) is O(g(n)) and f(n) is not Θ(g(n)) • 2n+3 is o(n²) • Is 2n+3 o(n)?

  35. Important Series • Sum of squares: 1² + 2² + … + N² = N(N+1)(2N+1)/6 • Sum of exponents: 1ᵏ + 2ᵏ + … + Nᵏ ≈ N^(k+1)/(k+1), for k ≠ −1 • Geometric series: A⁰ + A¹ + … + Aᴺ = (A^(N+1) − 1)/(A − 1) • Special case when A = 2 • 2⁰ + 2¹ + 2² + … + 2ᴺ = 2^(N+1) − 1

  36. Problem • Running time for finding a number in a sorted array [binary search] • Pseudo-code • Running time analysis

  37. ADT • ADT = Abstract Data Types • A logical view of the data objects together with specifications of the operations required to create and manipulate them. • Describe an algorithm – pseudo-code • Describe a data structure – ADT

  38. What is a data type? • A set of objects, each called an instance of the data type. Some objects are sufficiently important to be provided with a special name. • A set of operations. Operations can be realized via operators, functions, procedures, methods, and special syntax (depending on the implementing language) • Each object must have some representation (not necessarily known to the user of the data type) • Each operation must have some implementation (also not necessarily known to the user of the data type)

  39. What is a representation? • A specific encoding of an instance • This encoding MUST be known to implementors of the data type but NEED NOT be known to users of the data type • Terminology: “we implement data types using data structures”

  40. Two varieties of data types • Opaque data types, in which the representation is not known to the user. • Transparent data types, in which the representation is profitably known to the user, i.e. the encoding is directly accessible and/or modifiable by the user. • Which one do you think is better? • What are the means provided by C++ for creating opaque data types?

  41. Why are opaque data types better? • Representation can be changed without affecting user • Forces the program designer to consider the operations more carefully • Encapsulates the operations • Allows less restrictive designs which are easier to extend and modify • Design always done with the expectation that the data type will be placed in a library of types available to all.

  42. How to design a data type. Step 1: Specification • Make a list of the operations (just their names) you think you will need. Review and refine the list. • Decide on any constants which may be required. • Describe the parameters of the operations in detail. • Describe the semantics of the operations (what they do) as precisely as possible.

  43. How to design a data type Step 2: Application • Develop a real or imaginary application to test the specification. • Missing or incomplete operations are found as a side-effect of trying to use the specification.

  44. How to design a data type. Step 3: Implementation • Decide on a suitable representation. • Implement the operations. • Test, debug, and revise.

  45. Example - ADT Integer • Name of ADT: Integer

Operation   Description                                            C/C++
Create      Defines an identifier with an undefined value          int id1;
Assign      Assigns the value of one integer identifier            id1 = id2;
            or value to another integer identifier
isEqual     Returns true if the values associated with             id1 == id2
            two integer identifiers are the same

  46. Example – ADT Integer

Operation   Description                                            C/C++
LessThan    Returns true if an integer identifier is less          id1 < id2
            than the value of the second integer identifier
Negative    Returns the negative of the integer value              -id1
Sum         Returns the sum of two integer values                  id1 + id2

Operation Signatures
Create:   identifier → Integer
Assign:   Integer → Identifier
IsEqual:  (Integer, Integer) → Boolean
LessThan: (Integer, Integer) → Boolean
Negative: Integer → Integer
Sum:      (Integer, Integer) → Integer

  47. More examples • We’ll see more examples throughout the course • Stack • Queue • Tree • And more
