Understanding Algorithm Efficiency: From Basics to Advanced Concepts

CMPT 420 Algorithms

What are Algorithms? • An algorithm is a sequence of computational steps that transform the input into the output. • An algorithm is also a tool for solving a well-specified computational problem. • E.g., sorting problem: • <31, 26, 41, 59, 58> is an instance of the sorting problem.

An algorithm is correct if, for every input instance, it halts with the correct output.

Analyzing Algorithms • Predict the amount of resources required: • memory: how much space is needed? • computational time: how fast the algorithm runs? • FACT: running time grows with the size of the input • Input size (number of elements in the input) • Size of an array, # of elements in a matrix, # of bits in the binary representation of the input, vertices and edges in a graph • Def: Running time = the number of primitive operations (steps) executed before termination • Arithmetic operations (+, -, *), data movement, control, decision making (if, while), comparison

Algorithm Efficiency vs. Speed • E.g.: sorting n numbers (n = 10^6) • Friend’s computer = 10^9 instructions/second • Friend’s algorithm = 2n^2 instructions (insertion sort) • Your computer = 10^7 instructions/second • Your algorithm = 50 n lg n instructions (merge sort)

Algorithm Efficiency vs. Speed • To sort 100 million numbers: • Insertion sort takes more than 23 days • Merge sort takes under 4 hours
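The comparison can be reproduced with a few lines of arithmetic. The sketch below assumes the instruction counts and machine speeds stated on the slides (2n^2 at 10^9 instructions/second versus 50 n lg n at 10^7 instructions/second); the function and constant names are illustrative only. For n = 10^6 it gives about 2000 seconds for insertion sort against roughly 100 seconds for merge sort, and for n = 10^8 the results stay within the "more than 23 days" and "under 4 hours" bounds quoted above.

```python
import math

FRIEND_SPEED = 10**9  # instructions/second (faster machine, insertion sort)
YOUR_SPEED = 10**7    # instructions/second (slower machine, merge sort)


def insertion_sort_seconds(n, instr_per_sec):
    """Time for an algorithm that executes 2n^2 instructions."""
    return 2 * n**2 / instr_per_sec


def merge_sort_seconds(n, instr_per_sec):
    """Time for an algorithm that executes 50 n lg n instructions."""
    return 50 * n * math.log2(n) / instr_per_sec


for n in (10**6, 10**8):
    t_ins = insertion_sort_seconds(n, FRIEND_SPEED)
    t_mrg = merge_sort_seconds(n, YOUR_SPEED)
    print(f"n = {n}: insertion sort {t_ins:.0f} s, merge sort {t_mrg:.0f} s")
```

The slower machine wins because the gap between n^2 and n lg n eventually outweighs any constant-factor difference in hardware speed.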

Typical Running Time Functions • 1 (constant running time) • Instructions are executed once or a few times • log N (logarithmic) • A big problem is solved by cutting the problem size down by a constant fraction at each step • N (linear) • A small amount of processing is done on each input element • N log N • A problem is solved by dividing it into smaller problems, solving them independently, and combining the solutions

Typical Running Time Functions • N^2 (quadratic) • Typical for algorithms that process all pairs of data items (double nested loops) • N^3 (cubic) • Processing of triples of data (triple nested loops) • N^K (polynomial) • 2^N (exponential) • Few exponential algorithms are appropriate for practical use

Why Faster Algorithms?

Insertion Sort • Idea: like sorting a hand of playing cards • Remove one card at a time from the table, and insert it into the correct position in the left hand • compare it with each of the cards already in the hand, from right to left

Example of insertion sort 5 2 4 6 1 3

INSERTION-SORT
INSERTION-SORT(A, n)    ⊳ A[1 . . n]
    for j ← 2 to n
        do key ← A[j]
           i ← j - 1
           while i > 0 and A[i] > key
               do A[i+1] ← A[i]
                  i ← i - 1
           A[i+1] ← key
[Figure: array A[1 . . n] with the sorted prefix A[1 . . j-1], the index i scanning that prefix, and key = A[j].]
Insertion sort sorts the elements in place.
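A minimal Python transcription of INSERTION-SORT, assuming a 0-indexed list instead of the slides' 1-indexed array (so the loop bounds shift by one). It is run on the example input 5 2 4 6 1 3 from the previous slide.

```python
def insertion_sort(a):
    """Sort the list a in place and return it (0-indexed INSERTION-SORT)."""
    for j in range(1, len(a)):
        key = a[j]
        i = j - 1
        # Shift every element of the sorted prefix that is larger than key
        # one position to the right.
        while i >= 0 and a[i] > key:
            a[i + 1] = a[i]
            i -= 1
        a[i + 1] = key  # drop key into the gap that opened up
    return a


data = [5, 2, 4, 6, 1, 3]
insertion_sort(data)
print(data)  # [1, 2, 3, 4, 5, 6]
```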

Loop Invariant for Insertion Sort • Invariant: at the start of each iteration of the for loop, the elements in A[1 . . j-1] are in sorted order
INSERTION-SORT(A, n)    ⊳ A[1 . . n]
    for j ← 2 to n
        do key ← A[j]
           i ← j - 1
           while i > 0 and A[i] > key
               do A[i+1] ← A[i]
                  i ← i - 1
           A[i+1] ← key

Proving Loop Invariants • Proving loop invariants works like induction • Initialization (base case): • It is true prior to the first iteration of the loop • Maintenance (inductive step): • If it is true before an iteration of the loop, it remains true before the next iteration • Termination: • When the loop terminates, the invariant gives us a useful property that helps show that the algorithm is correct

Loop Invariant for Insertion Sort • Initialization: • Just before the first iteration, j = 2: the subarray A[1 . . j-1] consists of the single element A[1] (the element originally in A[1]), which is trivially sorted

Loop Invariant for Insertion Sort • Maintenance: • the inner while loop moves A[j-1], A[j-2], A[j-3], and so on, by one position to the right until the proper position for key (which has the value that started out in A[j]) is found • At that point, the value of key is placed into this position, so A[1 . . j] is sorted and the invariant holds again once j is incremented.

Loop Invariant for Insertion Sort • Invariant: at the start of each iteration of the for loop, the elements in A[1 . . j-1] are in sorted order • Termination: • The outer for loop ends when j = n + 1 • Substitute j = n + 1 (that is, j - 1 = n) into the loop invariant: • the subarray A[1 . . n] consists of the elements originally in A[1 . . n], but in sorted order
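The three parts of the argument can be checked mechanically. The sketch below is an instrumented variant of insertion sort (0-indexed, so the slides' A[1 . . j-1] becomes a[0:j]); the function name and the use of `sorted` as a reference are illustrative choices, not part of the slides. Passing assertions on a few inputs are a sanity check, not a proof.

```python
def insertion_sort_checked(a):
    """Insertion sort that asserts the loop invariant on every iteration."""
    original = list(a)  # snapshot used only for checking
    for j in range(1, len(a)):
        # Initialization / maintenance: the prefix a[0:j] holds the elements
        # originally in those positions, in sorted order.
        assert a[:j] == sorted(original[:j])
        key = a[j]
        i = j - 1
        while i >= 0 and a[i] > key:
            a[i + 1] = a[i]
            i -= 1
        a[i + 1] = key
    # Termination: the invariant with the prefix extended to the whole array.
    assert a == sorted(original)
    return a


print(insertion_sort_checked([5, 2, 4, 6, 1, 3]))  # [1, 2, 3, 4, 5, 6]
```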

Analysis of Insertion Sort

Running time • The running time depends on the input: an already sorted sequence is easier to sort. • Parameterize the running time by the size of the input, since short sequences are easier to sort than long ones. • Generally, we seek upper bounds on the running time, because everybody likes a guarantee.

Kinds of analyses Worst-case: • T(n) = maximum time of algorithm on any input of size n. Average-case: • T(n) = expected time of algorithm over all inputs of size n. • Need assumption of statistical distribution of inputs. Best-case: • Cheat with a slow algorithm that works fast on some input.

Machine-independent time What is insertion sort’s worst-case time? • It depends on the speed of our computer: • relative speed (on the same machine), • absolute speed (on different machines). BIG IDEA: • Ignore machine-dependent constants. • Look at growth of T(n) as n → ∞. “Asymptotic Analysis”

Θ-notation Math: Θ(g(n)) = { f(n) : there exist positive constants c_1, c_2, and n_0 such that 0 ≤ c_1 g(n) ≤ f(n) ≤ c_2 g(n) for all n ≥ n_0 } Engineering: • Drop low-order terms; ignore leading constants. • Example: 3n^3 + 90n^2 - 5n + 6046 = Θ(n^3)
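To connect the math definition with the engineering shortcut, the sketch below checks the slide's example numerically. The witnesses c_1 = 3, c_2 = 4, n_0 = 91 are one workable choice made here for illustration, not values from the slides; a finite check like this only spot-checks the inequality, though for this polynomial the upper bound reduces to n^2 (n - 90) + 5n ≥ 6046, which holds for every n ≥ 91.

```python
def f(n):
    """The example polynomial from the slide."""
    return 3 * n**3 + 90 * n**2 - 5 * n + 6046


# Illustrative witnesses for f(n) = Theta(n^3); other choices work as well.
C1, C2, N0 = 3, 4, 91


def sandwiched(n):
    """True when 0 <= c1*g(n) <= f(n) <= c2*g(n) with g(n) = n^3."""
    return 0 <= C1 * n**3 <= f(n) <= C2 * n**3


print(all(sandwiched(n) for n in range(N0, 10_000)))  # True
print(sandwiched(90))  # False: the upper bound fails just below n0
```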

Asymptotic performance When n gets large enough, a Θ(n^2) algorithm always beats a Θ(n^3) algorithm. • We shouldn’t ignore asymptotically slower algorithms, however. • Real-world design situations often call for a careful balancing of engineering objectives. • Asymptotic analysis is a useful tool to help to structure our thinking. [Figure: T(n) plotted against n for the two algorithms; the curves cross at n_0.]

Best Case Analysis • The array is already sorted • A[i] ≤ key upon the first time the while loop test is run (when i = j-1) • t_j = 1, where t_j is the number of times the while loop test is executed for that value of j

Worst Case Analysis • The array is in reverse sorted order • Always A[i] > key in while loop test • Have to compare key with all elements to the left of the j-th position • compare with j-1 elements • t_j = j (j-1 successful tests plus the final failing test when i = 0)
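The values of t_j can be observed directly by counting how often the while loop test is evaluated. The sketch below is an instrumented copy of insertion sort (0-indexed; the helper name is illustrative). On sorted input every t_j is 1, giving n - 1 tests in total; on reverse-sorted input t_j = j for j = 2 . . n, giving n(n+1)/2 - 1 tests, which is where the quadratic worst case comes from.

```python
def while_test_counts(a):
    """Return [t_2, ..., t_n]: while-test evaluations per outer iteration."""
    a = list(a)  # work on a copy so the caller's list is untouched
    counts = []
    for j in range(1, len(a)):
        key = a[j]
        i = j - 1
        tests = 0
        while True:
            tests += 1  # one evaluation of the while condition
            if not (i >= 0 and a[i] > key):
                break
            a[i + 1] = a[i]
            i -= 1
        a[i + 1] = key
        counts.append(tests)
    return counts


print(while_test_counts([1, 2, 3, 4, 5]))  # [1, 1, 1, 1]
print(while_test_counts([5, 4, 3, 2, 1]))  # [2, 3, 4, 5]
```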

Average Case? • All permutations equally likely.

Insertion Sort Summary • Advantages • Good running time for “almost sorted” arrays: Θ(n) • Disadvantages • Θ(n^2) running time in worst and average case Is insertion sort a fast sorting algorithm? • Moderately so, for small n. • Not at all, for large n.

Worst-Case and Average-Case • We usually concentrate on finding only the worst-case running time • an upper bound on the running time • For some algorithms, the worst case occurs often. • E.g., searching when information is not present in the DB • The average case is often as bad as the worst case.

Merge Sort
MERGE-SORT A[1 . . n]
1. If n = 1, done.
2. Recursively sort A[1 . . ⌈n/2⌉] and A[⌈n/2⌉+1 . . n].
3. “Merge” the 2 sorted lists.
Key subroutine: MERGE

Divide-and-Conquer • Divide the problem into a number of subproblems • Similar sub-problems of smaller size • Conquer the sub-problems • Solve the sub-problems recursively • Sub-problem size small enough to solve the problems in straightforward manner • Combine the solutions to the sub-problems • Obtain the solution for the original problem

Merge Sort Approach • To sort an array A[p . . r]: • Divide • Divide the n-element sequence to be sorted into two subsequences of n/2 elements each • Conquer • Sort the subsequences recursively using merge sort • When the size of the sequences is 1 there is nothing more to do • Combine • Merge the two sorted subsequences
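The divide, conquer, and combine steps map onto a short recursive function. This is a sketch under two assumptions: it returns a new sorted list instead of sorting A[p . . r] in place, and it uses the standard library's `heapq.merge` as a stand-in for the MERGE subroutine so that the recursive structure stays in focus.

```python
from heapq import merge  # standard-library merge of already-sorted iterables


def merge_sort(a):
    """Return a new sorted list containing the elements of a."""
    n = len(a)
    if n <= 1:
        return list(a)  # size 0 or 1: nothing more to do
    mid = (n + 1) // 2  # divide: first half gets ceil(n/2) elements
    left = merge_sort(a[:mid])    # conquer: sort each half recursively
    right = merge_sort(a[mid:])
    return list(merge(left, right))  # combine: merge the sorted halves


print(merge_sort([5, 2, 4, 6, 1, 3]))  # [1, 2, 3, 4, 5, 6]
```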

Merge sort

Example

Analyzing merge sort
MERGE-SORT A[1 . . n] (total cost: T(n))
1. If n = 1, done. Cost: Θ(1)
2. Recursively sort A[1 . . ⌈n/2⌉] and A[⌈n/2⌉+1 . . n]. Cost: 2T(n/2)
3. “Merge” the 2 sorted lists. Cost: ?
Sloppiness: Should be T(⌈n/2⌉) + T(⌊n/2⌋), but it turns out not to matter asymptotically.

Merging two sorted arrays • Inputs (drawn as two columns in the figure, smallest element at the bottom): 20 13 7 2 and 12 11 9 1

Merging two sorted arrays [Figure: step-by-step merge of the columns 20 13 7 2 and 12 11 9 1; the smaller of the two front elements is removed at each step, producing 1, 2, 7, 9, 11, 12, ...] Time = Θ(n) to merge a total of n elements (linear time).
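A two-pointer merge makes the linear bound visible: every loop step appends exactly one element to the output, so n elements take n steps. The sketch below reads the slide's two columns as the ascending arrays 2, 7, 13, 20 and 1, 9, 11, 12, which is an interpretation of the figure; the function name is illustrative.

```python
def merge_sorted(left, right):
    """Merge two ascending lists into one ascending list in linear time."""
    out = []
    i = j = 0
    # Repeatedly take the smaller of the two front elements.
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            out.append(left[i])
            i += 1
        else:
            out.append(right[j])
            j += 1
    # One input is exhausted; the rest of the other is already in order.
    out.extend(left[i:])
    out.extend(right[j:])
    return out


print(merge_sorted([2, 7, 13, 20], [1, 9, 11, 12]))
# [1, 2, 7, 9, 11, 12, 13, 20]
```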

In place sort? Run time?

MERGE-SORT Running Time • Divide: • compute q as the average of p and r: D(n) = Θ(1) • Conquer: • recursively solve 2 subproblems, each of size n/2: 2T(n/2) • Combine: • MERGE on an n-element subarray takes Θ(n) time: C(n) = Θ(n) • T(n) = 2T(n/2) + Θ(n) if n > 1

Analyzing Divide and Conquer Algorithms • The recurrence is based on the three steps of the paradigm: • T(n) = running time on a problem of size n • Divide the problem into a subproblems (a is the number of subproblems), each of size n/b: takes D(n) • Conquer (solve) the subproblems: takes aT(n/b) • Combine the solutions: takes C(n) • T(n) = Θ(1) if n ≤ c for some constant c • T(n) = aT(n/b) + D(n) + C(n) otherwise

Recurrence for merge sort • T(n) = Θ(1) if n = 1 • T(n) = 2T(n/2) + Θ(n) if n > 1 • We shall usually omit stating the base case when T(n) = Θ(1) for sufficiently small n, but only when it has no effect on the asymptotic solution to the recurrence.

Recursion tree Solve T(n) = 2T(n/2) + cn, where c > 0 is constant.

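Recursion tree Solve T(n) = 2T(n/2) + cn, where c > 0 is constant. • Root: cn • Level 1: cn/2 + cn/2 = cn • Level 2: cn/4 + cn/4 + cn/4 + cn/4 = cn • ... • Leaves: Θ(1) each, #leaves = n, so Θ(n) in total • Height h = lg n, and each level sums to cn • Total = Θ(n lg n)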
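The recursion-tree total can be cross-checked by evaluating the recurrence directly. The sketch below assumes n is a power of two and picks the base case T(1) = c (one concrete instance of Θ(1), chosen here for illustration); under those assumptions the recurrence solves exactly to cn lg n + cn, that is, lg n levels of cost cn plus the leaf level.

```python
from functools import lru_cache

C = 1  # the constant c > 0 from the recurrence; any positive value works


@lru_cache(maxsize=None)
def T(n):
    """T(n) = 2 T(n/2) + c n for n a power of two, with T(1) = c (assumed)."""
    if n == 1:
        return C
    return 2 * T(n // 2) + C * n


for k in range(11):
    n = 2**k
    closed_form = C * n * k + C * n  # c n lg n + c n, since lg n = k
    print(n, T(n), closed_form)
```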

Conclusions • Θ(n lg n) grows more slowly than Θ(n^2). • Therefore, merge sort asymptotically beats insertion sort in the worst case. • Disadvantage • Requires extra space Θ(n) • In practice, merge sort beats insertion sort for n > 30 or so.

Divide-and-Conquer Example: Binary Search Find an element in a sorted array: 1. Divide: Check middle element. 2. Conquer: Recursively search 1 subarray. 3. Combine: Trivial. • A[8] = {1, 2, 3, 4, 5, 7, 9, 11} Find 7

Divide-and-Conquer Example: Binary Search • For an ordered array A, finds if x is in the array A[lo…hi]

Example • A[8] = {1, 2, 3, 4, 5, 7, 9, 11} • lo = 1 hi = 8 x = 6
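A recursive sketch of the search over A[lo…hi], using 0-indexed positions (so the slide's lo = 1, hi = 8 becomes lo = 0, hi = 7). Returning the index, or -1 when x is absent, is a convention chosen here; the slides only ask whether x is in the array.

```python
def binary_search(a, x, lo, hi):
    """Return an index of x in the sorted list a[lo..hi], or -1 if absent."""
    if lo > hi:
        return -1  # empty range: x is not present
    mid = (lo + hi) // 2  # divide: check the middle element
    if a[mid] == x:
        return mid
    if a[mid] < x:
        return binary_search(a, x, mid + 1, hi)  # conquer: right half only
    return binary_search(a, x, lo, mid - 1)      # conquer: left half only


A = [1, 2, 3, 4, 5, 7, 9, 11]
print(binary_search(A, 7, 0, len(A) - 1))  # 5
print(binary_search(A, 6, 0, len(A) - 1))  # -1
```

For x = 6 the search probes 4, then 7, then 5, and stops when the range becomes empty.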

Analysis of Binary Search ?
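One standard way to carry out this analysis, following the divide-and-conquer template used for merge sort: checking the middle element costs constant time, only one subproblem of half the size is searched, and combining is trivial. Treating n/2 loosely, as in the merge sort analysis, the recurrence and its solution are:

```latex
T(n) = 1 \cdot T(n/2) + \Theta(1), \qquad T(1) = \Theta(1)
\quad\Longrightarrow\quad T(n) = \Theta(\lg n)
```

In recursion-tree terms there is a single path of about lg n calls, each doing Θ(1) work.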

Divide-and-Conquer Example: Powering a Number
