
TCSS 342, Winter 2006 Lecture Notes



  1. TCSS 342, Winter 2006 Lecture Notes: Course Overview, Review of Math Concepts, Algorithm Analysis and Big-Oh Notation

  2. Course objectives • (broad) prepare you to be a good software engineer • (specific) learn basic data structures and algorithms • data structures – how data is organized • algorithms – unambiguous sequence of steps to compute something

  3. Software design goals • What are some goals one should have for good software?

  4. Course content • data structures • algorithms • data structures + algorithms = programs • algorithm analysis – determining how long an algorithm will take to solve a problem • Who cares? Aren't computers fast enough and getting faster?

  5. An example Given an array of 1,000,000 integers, find the maximum integer in the array. Now suppose we are asked to find the kth largest element (The Selection Problem)

  6. Candidate solutions • candidate solution 1 • sort the entire array (from small to large), using Java's Arrays.sort() • pick out the (1,000,000 − k)th element • candidate solution 2 • sort the first k elements • for each of the remaining 1,000,000 − k elements, keep the k largest in an array • pick out the smallest of the k survivors
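Candidate solution 1 can be sketched in a few lines of Java. The slides leave the code unwritten, so the class and method names (`Selection`, `kthLargest`) are illustrative names of ours:

```java
import java.util.Arrays;

public class Selection {
    // Candidate solution 1: sort ascending, then index from the end.
    // k = 1 means the maximum element.
    public static int kthLargest(int[] a, int k) {
        int[] copy = Arrays.copyOf(a, a.length); // avoid mutating the caller's array
        Arrays.sort(copy);                       // O(n log n)
        return copy[copy.length - k];            // the (n - k)th index, 0-based
    }

    public static void main(String[] args) {
        int[] data = {5, 1, 9, 3, 7};
        System.out.println(kthLargest(data, 1)); // the maximum: 9
        System.out.println(kthLargest(data, 2)); // second largest: 7
    }
}
```

Whether this beats candidate solution 2 depends on k and n, which is exactly the question the next slide raises.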

  7. Is either solution good? • Is there a better solution? • What makes a solution "better" than another? Is it entirely based on runtime? • How would you go about determining which solution is better? • could code them, test them • could somehow make predictions and analysis of each solution, without coding

  8. Why algorithm analysis? • as computers get faster and problem sizes get bigger, analysis will become more important • The difference between good and bad algorithms will get bigger • being able to analyze algorithms will help us identify good ones without having to program them and test them first

  9. Why data structures? • when programming, you are an engineer • engineers have a bag of tools and tricks – and the knowledge of which tool is the right one for a given problem • Examples: arrays, lists, stacks, queues, trees, hash tables, heaps, graphs

  10. Development practices • modular (flexible) code • appropriate commenting of code • each method needs a comment explaining its parameters and its behavior • writing code to match a rigid specification • being able to choose the right data structures to solve a variety of programming problems • using an integrated development environment (IDE) • incremental development

  11. Math background: exponents • X^Y = "X to the Yth power": X multiplied by itself Y times • Some useful identities • X^A · X^B = X^(A+B) • X^A / X^B = X^(A−B) • (X^A)^B = X^(AB) • X^N + X^N = 2X^N • 2^N + 2^N = 2^(N+1)

  12. Logarithms • definition: X^A = B if and only if log_X B = A • intuition: log_X B means "the power X must be raised to, to get B" • a logarithm with no base implies base 2: log B means log_2 B • Examples • log_2 16 = 4 (because 2^4 = 16) • log_10 1000 = 3 (because 10^3 = 1000)

  13. Logarithms, continued • log AB = log A + log B • Proof: (let's write it together!) • log A/B = log A − log B • log (A^B) = B log A • change of base: log_A B = (log_2 B) / (log_2 A) • example: log_4 32 = (log_2 32) / (log_2 4) = 5 / 2
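The change-of-base step in the example can be checked numerically. A small sketch (the class name `LogDemo` is ours); note that Java's `Math.log` is the natural log, but the ratio cancels whichever base is used:

```java
public class LogDemo {
    // log base `base` of x, via change of base: log_b(x) = ln(x) / ln(b)
    public static double logBase(double base, double x) {
        return Math.log(x) / Math.log(base);
    }

    public static void main(String[] args) {
        System.out.println(logBase(2, 16));  // ~4.0, since 2^4 = 16
        System.out.println(logBase(4, 32));  // ~2.5, matching (log_2 32)/(log_2 4) = 5/2
    }
}
```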

  14. Arithmetic series • Series • Σ_{i=j}^{k} Expr, for some expression Expr (possibly containing i), means the sum of all values of Expr with each value of i between j and k inclusive • Example: Σ_{i=0}^{4} (2i + 1) = (2(0) + 1) + (2(1) + 1) + (2(2) + 1) + (2(3) + 1) + (2(4) + 1) = 1 + 3 + 5 + 7 + 9 = 25

  15. Series identities • sum from 1 through N inclusive: Σ_{i=1}^{N} i = N(N + 1)/2 • is there an intuition for this identity? • sum of all numbers from 1 to N: 1 + 2 + 3 + ... + (N−2) + (N−1) + N • how many terms are in this sum? Can we rearrange them?
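The identity can be checked against a direct loop; a small sketch (the class name `SeriesCheck` is ours):

```java
public class SeriesCheck {
    // Sum 1 + 2 + ... + n the slow way, one term at a time.
    public static long sumByLoop(int n) {
        long sum = 0;
        for (int i = 1; i <= n; i++) sum += i;
        return sum;
    }

    // The closed form N(N+1)/2; the (long) cast avoids int overflow for large n.
    public static long sumByFormula(int n) {
        return (long) n * (n + 1) / 2;
    }

    public static void main(String[] args) {
        System.out.println(sumByLoop(100));    // 5050
        System.out.println(sumByFormula(100)); // 5050
    }
}
```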

  16. Series • sum of powers of 2: Σ_{i=0}^{N} 2^i = 2^(N+1) − 1 • example: 1 + 2 + 4 + 8 + 16 + 32 = 64 − 1 • think about binary representation of numbers... • sum of powers of any number a: Σ_{i=0}^{N} a^i = (a^(N+1) − 1) / (a − 1) ("geometric progression" for a > 0, a ≠ 1)

  17. Algorithm performance • How to determine how much time an algorithm A uses to solve problem X ? • Depends on input; use input size "N" as parameter • Determine function f(N) representing cost • empirical analysis: code it and use timer running on many inputs • algorithm analysis: Analyze steps of algorithm, estimating amount of work each step takes

  18. RAM model • Typically use a simple model for basic operation costs • RAM (Random Access Machine) model • RAM model has all the basic operations:+, -, *, / , =, comparisons • fixed sized integers (e.g., 32-bit) • infinite memory • All basic operations take exactly one time unit (one CPU instruction) to execute

  19. Critique of the model • Strengths: • simple • easier to prove things about the model than the real machine • can estimate algorithm behavior on any hardware/software • Weaknesses: • not all operations take the same amount of time in a real machine • does not account for page faults, disk accesses, limited memory, floating point math, etc

  20. Why use models? model real world • Idea: useful statements using the model translate into useful statements about real computers

  21. Relative rates of growth • most algorithms' runtime can be expressed as a function of the input size N • rate of growth: measure of how quickly the graph of a function rises • goal: distinguish between fast- and slow-growing functions • we only care about very large input sizes(for small sizes, most any algorithm is fast enough) • this helps us discover which algorithms will run more quickly or slowly, for large input sizes

  22. Growth rate example Consider these graphs of functions. Perhaps each one represents an algorithm: n^3 + 2n^2 versus 100n^2 + 1000 • Which grows faster?

  23. Growth rate example • How about now?

  24. Big-Oh notation • Defn: T(N) = O(f(N)) if there exist positive constants c, n_0 such that: T(N) ≤ c·f(N) for all N ≥ n_0 • idea: We are concerned with how the function grows when N is large. We are not picky about constant factors: coarse distinctions among functions • Lingo: "T(N) grows no faster than f(N)."

  25. Examples • n = O(2n)? • 2n = O(n)? • n = O(n^2)? • n^2 = O(n)? • n = O(1)? • 100 = O(n)? • 10 log n = O(n)? • 214n + 34 = O(2n^2 + 8n)?

  26. Preferred big-Oh usage • pick tightest bound. If f(N) = 5N, then: f(N) = O(N^5), f(N) = O(N^3), f(N) = O(N) ← preferred, f(N) = O(N log N) • ignore constant factors and low order terms • T(N) = O(N), not T(N) = O(5N) • T(N) = O(N^3), not T(N) = O(N^3 + N^2 + N log N) • Bad style: f(N) ≤ O(g(N)) • Wrong: f(N) ≥ O(g(N))

  27. Big-Oh of selected functions

  28. Ten-fold processor speedup (chart)

  29. Big omega, theta • Defn: T(N) = Ω(g(N)) if there are positive constants c and n_0 such that T(N) ≥ c·g(N) for all N ≥ n_0 • Lingo: "T(N) grows no slower than g(N)." • Defn: T(N) = Θ(h(N)) if and only if T(N) = O(h(N)) and T(N) = Ω(h(N)). • Big-Oh, Omega, and Theta establish a relative order among all functions of N

  30. Intuition, little-Oh • Defn: T(N) = o(p(N)) if T(N) = O(p(N)) and T(N) ≠ Θ(p(N))

  31. More about asymptotics • Fact: If f(N) = O(g(N)), then g(N) = Ω(f(N)). • Proof: Suppose f(N) = O(g(N)). Then there exist constants c and n_0 such that f(N) ≤ c·g(N) for all N ≥ n_0. Then g(N) ≥ (1/c) f(N) for all N ≥ n_0, and so g(N) = Ω(f(N))

  32. More terminology • T(N) = O(f(N)) • f(N) is an upper bound on T(N) • T(N) grows no faster than f(N) • T(N) = Ω(g(N)) • g(N) is a lower bound on T(N) • T(N) grows at least as fast as g(N) • T(N) = o(h(N)) • T(N) grows strictly slower than h(N)

  33. Facts about big-Oh • If T1(N) = O(f(N)) and T2(N) = O(g(N)), then • T1(N) + T2(N) = O(f(N) + g(N)) • T1(N) * T2(N) = O(f(N) * g(N)) • If T(N) is a polynomial of degree k, then: T(N) = Θ(N^k) • example: 17n^3 + 2n^2 + 4n + 1 = Θ(n^3) • log^k N = O(N), for any constant k

  34. Techniques • Algebra: e.g. f(N) = N / log N, g(N) = log N; comparing them is the same as asking which grows faster, N or log^2 N • Evaluate: lim_{N→∞} f(N) / g(N)

  35. Techniques, cont'd • L'Hôpital's rule: if lim_{N→∞} f(N) = ∞ and lim_{N→∞} g(N) = ∞, then lim_{N→∞} f(N)/g(N) = lim_{N→∞} f'(N)/g'(N) • example: f(N) = N, g(N) = log N. Use L'Hôpital's rule: f'(N) = 1, g'(N) = 1/N, so f(N)/g(N) → ∞ and therefore g(N) = o(f(N))

  36. Program loop runtimes
for (int i = 0; i < n; i += c)   // O(n)
    statement(s);
• Adding a constant to the loop counter means that the loop runtime grows linearly with its bound n.
for (int i = 1; i < n; i *= c)   // O(log n)
    statement(s);
• Multiplying the loop counter (starting from 1, since 0 * c would never grow) means that the bound n must grow exponentially to linearly increase the loop runtime; therefore, the loop is logarithmic.
for (int i = 0; i < n * n; i += c)   // O(n^2)
    statement(s);
• The loop bound is n^2, so the runtime is quadratic.

  37. More loop runtimes
for (int i = 0; i < n; i += c)   // O(n^2)
    for (int j = 0; j < n; j += c)
        statement;
• Nesting loops multiplies their runtimes.
for (int i = 0; i < n; i += c)
    statement;
for (int i = 0; i < n; i += c)   // O(n log n)
    for (int j = 1; j < n; j *= c)
        statement;
• Loops in sequence add together their runtimes, which means the loop with the larger runtime dominates.

  38. Maximum subsequence sum The maximum contiguous subsequence sum problem: Given a sequence of integers A_0, A_1, ..., A_{n−1}, find the maximum value of Σ_{k=i}^{j} A_k over all integers 0 ≤ i ≤ j < n. (This sum is taken to be zero if all numbers in the sequence are negative.)

  39. First algorithm (brute force) • try all possible combinations of subsequences
// implement together
function maxSubsequence(array[]):
    max sum = 0
    for each starting index i:
        for each ending index j:
            add up the sum from A_i to A_j
            if this sum is bigger than max, max sum = this sum
    return max sum
• What is the runtime (Big-Oh) of this algorithm? How could it be improved?
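A direct Java translation of the brute-force pseudocode above (the class name `MaxSub1` is ours). The three nested loops, over start index, end index, and the inner summation, are what drive the runtime question the slide asks:

```java
public class MaxSub1 {
    public static int maxSubsequence(int[] a) {
        int max = 0;                            // the empty subsequence counts as 0
        for (int i = 0; i < a.length; i++) {    // each starting index
            for (int j = i; j < a.length; j++) { // each ending index
                int sum = 0;
                for (int k = i; k <= j; k++) {  // re-add the whole run A_i..A_j
                    sum += a[k];
                }
                if (sum > max) max = sum;
            }
        }
        return max;
    }

    public static void main(String[] args) {
        int[] a = {-2, 11, -4, 13, -5, -2};
        System.out.println(maxSubsequence(a));  // 20, from the run 11, -4, 13
    }
}
```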

  40. Second algorithm (improved) • still try all possible combinations, but don't redundantly add the sums • key observation: Σ_{k=i}^{j} A_k = (Σ_{k=i}^{j−1} A_k) + A_j • in other words, we don't need to throw away partial sums • can we use this information to remove one of the loops from our algorithm? // implement together function maxSubsequence2(array[]): • What is the runtime (Big-Oh) of this new algorithm? Can it still be improved further?
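One possible `maxSubsequence2`, since the slides build it in class: keeping a running sum for each start index removes the innermost loop, leaving two nested loops. The class name is ours:

```java
public class MaxSub2 {
    public static int maxSubsequence2(int[] a) {
        int max = 0;
        for (int i = 0; i < a.length; i++) {
            int sum = 0;                          // partial sum for start index i
            for (int j = i; j < a.length; j++) {
                sum += a[j];                      // extend the old sum; no re-adding
                if (sum > max) max = sum;
            }
        }
        return max;
    }

    public static void main(String[] args) {
        int[] a = {-2, 11, -4, 13, -5, -2};
        System.out.println(maxSubsequence2(a));   // 20, same answer, fewer additions
    }
}
```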

  41. Third algorithm (improved!) • must avoid trying all possible combinations; to do this, we must find a way to broadly eliminate many potential combinations from consideration • For convenience, let • A_{i,j} denote the subsequence from index i to j • S_{i,j} denote the sum of subsequence A_{i,j}

  42. Third algorithm, continued • Claim #1: A subsequence with a negative sum cannot be the start of the maximum-sum subsequence.

  43. Third algorithm, continued • Claim #1, more formally: If A_{i,j} is a subsequence such that S_{i,j} < 0, then there is no q > j such that A_{i,q} is the maximum-sum subsequence. • Proof: (do it together in class) • Can this help us produce a better algorithm?

  44. Third algorithm, continued • Claim #2: When examining subsequences left to right, for some starting index i, searching over ending index j, suppose A_{i,j} is the first subsequence with a negative sum (so S_{i,j} < 0, and this is not true for smaller j). Then the maximum-sum subsequence cannot start at any index p satisfying i < p ≤ j. • Proof: (do it together in class)

  45. Third algorithm, continued • (figures: the possible contents of A_{i,j})

  46. Third algorithm, continued • Can we eliminate another loop from our algorithm? // implement together function maxSubsequence3(array[]): • What is its runtime (Big-Oh)? • Is there an even better algorithm than this third algorithm? Can you make a strong argument about why or why not?
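Since the slides build `maxSubsequence3` together in class, here is one sketch consistent with Claims #1 and #2 (the class name is ours): whenever the running sum goes negative, no subsequence beginning inside that run can start the maximum-sum subsequence, so the sum is reset and the scan continues. A single pass gives O(n) time:

```java
public class MaxSub3 {
    public static int maxSubsequence3(int[] a) {
        int max = 0, sum = 0;
        for (int j = 0; j < a.length; j++) {
            sum += a[j];
            if (sum > max) {
                max = sum;       // best subsequence seen so far ends at j
            } else if (sum < 0) {
                sum = 0;         // Claims #1/#2: discard the negative prefix
            }
        }
        return max;
    }

    public static void main(String[] args) {
        int[] a = {-2, 11, -4, 13, -5, -2};
        System.out.println(maxSubsequence3(a));  // 20, now in a single pass
    }
}
```

Since any algorithm must at least read all n inputs, this linear-time version is asymptotically as good as possible, which bears on the slide's final question.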

  47. Kinds of runtime analysis • Express the running time as f(N), where N is the size of the input • worst case: your enemy gets to pick the input • average case: need to assume a probability distribution on the inputs • amortized: your enemy gets to pick the inputs/operations, but you only have to guarantee speed over a large number of operations

  48. References • Lewis & Chase book, Chapter 1 (mostly sections 1.2, 1.6) • Rosen’s Discrete Math book, mostly sections 2.1-2.3. (Also various other sections for math review)
