CSC401 – Analysis of Algorithms Chapter 1 Algorithm Analysis

CSC401 – Analysis of AlgorithmsChapter 1Algorithm Analysis Objectives: • Introduce algorithm and algorithm analysis • Discuss algorithm analysis methodologies • Introduce pseudo code of algorithms • Asymptotic notion of algorithm efficiency • Mathematics foundation for algorithm analysis • Amortization analysis techniques

What is an algorithm ? An algorithm is a step-by-step procedure for solving a problem in a finite amount of time. Input Algorithm Output What is algorithm analysis ? • Two aspects: • Running time – How much time is taken to complete the algorithm execution? • Storage requirement – How much memory is required to execute the program?

Running Time • Most algorithms transform input objects into output objects. • The running time of an algorithm typically grows with the input size. • Average case time is often difficult to determine. • We focus on the worst case running time. • Easier to analyze • Crucial to applications such as games, finance and robotics

How Is An Algorithm Analyzed? • Two methodologies • Experimental analysis • Method • Write a program implementing the algorithm • Run the program with inputs of varying size and composition • Use a method like System.currentTimeMillis() to get an accurate measure of the actual running time • Plot the results • Limitations • It is necessary to implement the algorithm, which may be difficult • Results may not be indicative of the running time on other inputs not included in the experiment. • In order to compare two algorithms, the same hardware and software environments must be used

How Is An Algorithm Analyzed? • Theoretical Analysis • Method • Uses a high-level description of the algorithm instead of an implementation • Characterizes running time as a function of the input size, n. • Takes into account all possible inputs • Allows us to evaluate the speed of an algorithm independent of the hardware/software environment • Characteristics • A description language -- Pseudo code • Mathematics

Example: find max element of an array AlgorithmarrayMax(A, n) Inputarray A of n integers Outputmaximum element of A currentMaxA[0] fori1ton  1do ifA[i]  currentMaxthen currentMaxA[i] returncurrentMax Pseudocode • High-level description of an algorithm • More structured than English prose • Less detailed than a program • Preferred notation for describing algorithms • Hides program design issues

Control flow if…then… [else…] switch … while…do… repeat…until… for…do… Indentation replaces braces Method declaration Algorithm method (arg [, arg…]) Input… Output… Array indexing: A[i] Method call var.method (arg [, arg…]) Return value returnexpression Expressions Assignment(like  in Java) Equality testing(like  in Java) n2 Superscripts and other mathematical formatting allowed Pseudocode Details

2 1 0 The Random Access Machine (RAM) Model • A CPU • An potentially unbounded bank of memory cells, each of which can hold an arbitrary number or character • Memory cells are numbered and accessing any cell in memory takes unit time.

Basic computations performed by an algorithm Identifiable in pseudocode Largely independent from the programming language Exact definition not important (we will see why later) Assumed to take a constant amount of time in the RAM model Examples: Evaluating an expression Assigning a value to a variable Performing an arithmetic operation Comparing two numbers Indexing into an array Following an object reference Calling a method Returning from a method Primitive Operations

By inspecting the pseudocode, we can determine the maximum number of primitive operations executed by an algorithm, as a function of the input size AlgorithmarrayMax(A, n) # operations currentMaxA[0] 2 fori1ton 1do 1+n ifA[i]  currentMaxthen 2(n 1) currentMaxA[i] 2(n 1) { increment counter i } 2(n 1) returncurrentMax 1 Total 7n 2 Counting Primitive Operations

Algorithm arrayMax executes 7n 2 primitive operations in the worst case. Define: a = Time taken by the fastest primitive operation b = Time taken by the slowest primitive operation Let T(n) be the worst-case time of arrayMax.Thena (7n 1) T(n)b(7n 1) Hence, the running time T(n) is bounded by two linear functions Estimating Running Time

Growth Rate of Running Time • Changing the hardware/ software environment • Affects T(n) by a constant factor, but • Does not alter the growth rate of T(n) • The linear growth rate of the running time T(n) is an intrinsic property of algorithm arrayMax • Growth rates of functions: • Linear  n • Quadratic  n2 • Cubic  n3 • In a log-log chart, the slope of the line corresponds to the growth rate of the function

Constant Factors • The growth rate is not affected by • constant factors or • lower-order terms • Examples • 102n+105is a linear function • 105n2+ 108nis a quadratic function

Big-Oh Notation • Given functions f(n) and g(n), we say that f(n) is O(g(n))if there are positive constantsc and n0 such that f(n)cg(n) for n n0 • Example: 2n+10 is O(n) • 2n+10cn • (c 2) n  10 • n  10/(c 2) • Pick c = 3 and n0 = 10 • Example: the function n2is not O(n) • n2cn • n c • The above inequality cannot be satisfied since c must be a constant

More Big-Oh Examples • 7n-2 • 7n-2 is O(n) • need c > 0 and n0 1 such that 7n-2  c•n for n  n0 • this is true for c = 7 and n0 = 1 • 3n3 + 20n2 + 5 • 3n3 + 20n2 + 5 is O(n3) • need c > 0 and n0 1 such that 3n3 + 20n2 + 5  c•n3 for n  n0 • this is true for c = 4 and n0 = 21 • 3 log n + log log n • 3 log n + log log n is O(log n) • need c > 0 and n0 1 such that 3 log n + log log n  c•log n for n  n0 • this is true for c = 4 and n0 = 2

Big-Oh and Growth Rate • The big-Oh notation gives an upper bound on the growth rate of a function • The statement “f(n) is O(g(n))” means that the growth rate of f(n) is no more than the growth rate of g(n) • We can use the big-Oh notation to rank functions according to their growth rate

Big-Oh Rules • If is f(n) a polynomial of degree d, then f(n) is O(nd), i.e., • Drop lower-order terms • Drop constant factors • Use the smallest possible class of functions • Say “2n is O(n)”instead of “2n is O(n2)” • Use the simplest expression of the class • Say “3n+5 is O(n)”instead of “3n+5 is O(3n)”

Asymptotic Algorithm Analysis • The asymptotic analysis of an algorithm determines the running time in big-Oh notation • To perform the asymptotic analysis • We find the worst-case number of primitive operations executed as a function of the input size • We express this function with big-Oh notation • Example: • We determine that algorithm arrayMax executes at most 7n 2 primitive operations • We say that algorithm arrayMax “runs in O(n) time” • Since constant factors and lower-order terms are eventually dropped anyhow, we can disregard them when counting primitive operations

Summations • Summation: • Geometric summation: • Natural summation: • Harmonic number: • Split summation:

Logarithms and Exponents • Logarithms • properties of logarithms: logb(xy) = logbx + logby logb (x/y) = logbx - logby logbxa = alogbx logba = logxa/logxb • Exponents • properties of exponentials: a(b+c) = aba c abc = (ab)c ab /ac = a(b-c) b = a logab bc = a c*logab • Floor and ceiling functions • the largest integer less than or equal to x • the smallest integer greater than or equal to x

Proof Techniques • By Example -- counterexample • Contrapositive • Principle: To justify “if p is true, then q is true”, show “if q is not true, then p is not true”. • Example: To justify “if ab is odd, a is odd or b is even”, assume “’a is odd or b is odd’ is not true”, then “a is even and b is odd”, then “ab is even”, then “’ab is odd’ is not true” • Contradiction • Principle: To justify “p is true”, show “if p is not true, then there exists a contradiction”. • Example: To justify “if ab is odd, then a is odd or b is even”, let “ab is odd”, and assume “’a is odd or b is even’ is not true”, then “a is even and b is odd”, thus “ab is even” which is a contradiction to “ab is odd”.

Proof by Induction • Induction -- To justify S(n) for n>=n0 • Principle • Base cases: Justify S(n) is true for n0<=n<=n1 • Assumption: Assume S(n) is true for n=N>=n1; or Assume S(n) is true for n1<=n<=N • Induction: Justify S(n) is true for n=N+1 • Example: • Question: Fibonacci sequence is defined as F(1)=1, F(2)=1, and F(n)=F(n-1)+F(n-2) for n>2. Justify F(n)<2^n for all n>=1 • Proof by induction where n0=1. Let n1=2. • Base cases: For n0<=n<=n1, n=1 or n=2. If n=1, F(1)=1<2=2^1. If n=2, F(2)=2<4=2^2. It holds. • Assumption: Assume F(n)<2^n for n=N>=2. • Induction: For n=N+1, F(N+1)=F(N)+F(N-1)<2^N+2^(N-1)<2 2^N=2^(N+1). It holds.

Proof by Loop Invariants • Principle: • To prove a statement S about a loop is correct, define S in terms of a series of smaller statements S0, S1, …, Sk, where • The initial claim S0 is true before the loop begins • If Si-1 is true before iteration i begins, then show that Si is true after iteration i is over • The final statement Sk is true • Example: Consider the algorithm arrayMax • Statement S: max is the maximum number when finished • A series of smaller statements: • Si: max is the maximum in the first i+1 elements of the array • S0 is true before the loop • If Si is true, then easy to show Si+1 is true • S=Sn-1 is also true AlgorithmarrayMax(A, n) maxA[0] fori1ton 1do ifA[i]  maxthen maxA[i] returnmax

Basic Probability • Sample space: the set of all possible outcomes from some experiment • Probability space: a sample space S together with a probability function that maps subsets of S to real numbers between 0 and 1 • Event: Each subset of A of S called an event • Properties of the probability function Pr • Pr(Ø)=0 • Pr(S)=1 • 0<=Pr(A)<=1 for any subset A of S • If A and B are subsets of S and AB=, then Pr(AB)=Pr(A)+Pr(B) • Independence • A and B are independent if Pr(AB)=Pr(A)Pr(B) • A1, A2, …, An are mutually independent if Pr(A1A2… An)=Pr(A1)Pr(A2)…Pr(An)

Basic Probability • Conditional probability • The conditional probability that A occurs, given B, is defined as: Pr(A|B)=Pr(AB)/Pr(B), assuming Pr(B)>0 • Random variables • Intuitively, Variables whose values depend on the outcomes of some experiment • Formally, a function X that maps outcomes from some sample space S to real numbers • Indicator random variable: a variable that maps outcomes to either 0 or 1 • Expectation:a random variable has random values • Intuitively, average value of a random variable • Formally, expected value of a random variable X is defined as E(X)=xxPr(X=x) • Properties: • Linearity: E(X+Y) = E(X) + E(Y) • Independence: If X and Y are independent, that is, Pr(X=x|Y=y)=Pr(X=x), then E(XY)=E(X)E(Y)

Computing Prefix Averages • We further illustrate asymptotic analysis with two algorithms for prefix averages • The i-th prefix average of an array X is average of the first (i+ 1) elements of X: A[i]= (X[0] +X[1] +… +X[i])/(i+1) • Computing the array A of prefix averages of another array X has applications to financial analysis

Prefix Averages (Quadratic) • The following algorithm computes prefix averages in quadratic time by applying the definition AlgorithmprefixAverages1(X, n) Inputarray X of n integers Outputarray A of prefix averages of X #operations A new array of n integers n fori0ton 1do n sX[0] n forj1toido 1 + 2 + …+ (n 1) ss+X[j] 1 + 2 + …+ (n 1) A[i]s/(i+ 1)n returnA 1

The running time of prefixAverages1 isO(1 + 2 + …+ n) The sum of the first n integers is n(n+ 1) / 2 There is a simple visual proof of this fact Thus, algorithm prefixAverages1 runs in O(n2) time Arithmetic Progression

Prefix Averages (Linear) • The following algorithm computes prefix averages in linear time by keeping a running sum AlgorithmprefixAverages2(X, n) Inputarray X of n integers Outputarray A of prefix averages of X #operations A new array of n integers n s 0 1 fori0ton 1do n ss+X[i] n A[i]s/(i+ 1)n returnA 1 • Algorithm prefixAverages2 runs in O(n) time

Amortization • Amortization • Typical data structure supports a wide variety of operations for accessing and updating the elements • Each operation takes a varying amount of running time • Rather than focusing on each operation • Consider the interactions between all the operations by studying the running time of a series of these operations • Average the operations’ running time • Amortized running time • The amortized running time of an operation within a series of operations is defined as the worst-case running time of the series of operations divided by the number of operations • Some operations may have much higher actual running time than its amortized running time, while some others have much lower

The Clearable Table Data Structure • The clearable table • An ADT • Storing a table of elements • Being accessing by their index in the table • Two methods: • add(e) -- add an element e to the next available cell in the table • clear() -- empty the table by removing all elements • Consider a series of operations (add and clear) performed on a clearable table S • Each add takes O(1) • Each clear takes O(n) • Thus, a series of operations takes O(n2), because it may consist of only clears

Amortization Analysis • Theorem: • A series of n operations on an initially empty clearable table implemented with an array takes O(n) time • Proof: • Let M0, M1, …, Mn-1 be the series of operations performed on S, where k operations are clear • Let Mi0, Mi1, …, Mik-1 be the k clear operations within the series, and others be the (n-k) add operations • Define i-1=-1,Mij takes ij-ij-1, because at most ij-ij-1-1 elements are added by add operations between Mij-1 and Mij • The total time of the series is: • (n-k) + ∑k-1j=0(ij-ij-1) = n-k + (ik-1 - i-1) <= 2n-k • Total time is O(n) • Amortized time is O(1)

Accounting Method • The method • Use a scheme of credits and debits: each operation pays an amount of cyber-dollar • Some operations overpay --> credits • Some operations underpay --> debits • Keep the balance at any time at least 0 • Example: the clearable table • Each operation pays two cyber-dollars • add always overpays one dollar -- one credit • clear may underpay a variety of dollars • the underpaid amount equals the number of add operations since last clear - 2 • Thus, the balance is at least 0 • So the total cost is 2n -- may have credits

Potential Functions • Based on energy model • associate a value with the structure, Ø, representing the current energy state • each operation contributes to Ø a certain amount of energy t’ and consumes a varying amount of energy t • Ø0 -- the initial energy, Øi -- the energy after the i-th operation • for the i-th operation • ti -- the actual running time • t’i -- the amortized running time • ti = t’i + Øi-1 - Øi • Overall: T=∑(ti), T’=∑(t’i) • The total actual time T = T’ + Ø0 - Øn • As long as Ø0 <= Øn, T<=T’

Potential Functions • Example -- The clearable table • the current energy state Ø is defined as the number of elements in the table, thus Ø>=0 • each operation contributes to Ø t’=2 • Ø0 = 0 • Øi = Øi-1 + 1, if the i-th operation is add • add: t = t’+ Øi-1 - Øi = 2 - 1 = 1 • Øi = 0, if the i-th operation is clear • clear: t =t’+ Øi-1 = 2 + Øi-1 • Overall: T=∑ (t), T’=∑(t’) • The total actual time T = T’ + Ø0 - Øn • Because Ø0 = 0 <= Øn, T<=T’ =2n

Extendable Array • ADT: Extendable array is an array with extendable size. One of its methods is add to add an element to the array. If the array is not full, the element is added to the first available cell. When the array is full, the add method performs the following • Allocate a new array with double size • Copy elements from the old array to the new array • Add the element to the first available cell in the new array • Replace the old array with the new array • Question: What is the amortization time of add? • Two situations of add: • add -- O(1) • Extend and add -- (n) • Amortization analysis: • Each add deposits 3 dollars and spends 1 dollar • Each extension spends k dollars from the size k to 2k (copy) • Amortization time of add is O(1)

Relatives of Big-Oh • big-Omega • f(n) is (g(n)) if there is a constant c > 0 and an integer constant n0  1 such that f(n)  c•g(n) for n  n0 • big-Theta • f(n) is (g(n)) if there are constants c’ > 0 and c’’ > 0 and an integer constant n0  1 such that c’•g(n)  f(n)  c’’•g(n) for n  n0 • little-oh • f(n) is o(g(n)) if, for any constant c > 0, there is an integer constant n0  0 such that f(n)  c•g(n) for n  n0 • little-omega • f(n) is (g(n)) if, for any constant c > 0, there is an integer constant n0  0 such that f(n)  c•g(n) for n  n0

Intuition for Asymptotic Notation Big-Oh • f(n) is O(g(n)) if f(n) is asymptotically less than or equal to g(n) big-Omega • f(n) is (g(n)) if f(n) is asymptotically greater than or equal to g(n) big-Theta • f(n) is (g(n)) if f(n) is asymptotically equal to g(n) little-oh • f(n) is o(g(n)) if f(n) is asymptotically strictly less than g(n) little-omega • f(n) is (g(n)) if is asymptotically strictly greater than g(n)

Example Uses of the Relatives of Big-Oh • 5n2 is (n2) f(n) is (g(n)) if there is a constant c > 0 and an integer constant n0  1 such that f(n)  c•g(n) for n  n0 let c = 5 and n0 = 1 • 5n2 is (n) f(n) is (g(n)) if there is a constant c > 0 and an integer constant n0  1 such that f(n)  c•g(n) for n  n0 let c = 1 and n0 = 1 • 5n2 is (n) f(n) is (g(n)) if, for any constant c > 0, there is an integer constant n0  0 such that f(n)  c•g(n) for n  n0 need 5n02  c•n0  given c, the n0 that satisfies this is n0  c/5  0

CSC401 – Analysis of Algorithms Chapter 1 Algorithm Analysis