CS 420 – Design of Algorithms

### CS 420 – Design of Algorithms

Basic Concepts

Design of Algorithms
• We need a mechanism to describe/define algorithms
• Independent of the language implementation of the algorithm
• Pseudo-code
Algorithms
• Algorithm – “any well-defined computational procedure that takes some value, or set of values, as input and produces some value, or set of values, as output” (Cormen et al.)
Algorithms
• Algorithm – “is a procedure (a finite set of well-defined instructions) for accomplishing some task which, given an initial state, will terminate in a defined end-state.”
• from Wikipedia.org
• http://en.wikipedia.org/wiki/Algorithm
Algorithms
• Human Genome Project
• Security/encryption for e-commerce
• Pulsar searches
• Search Engines
Algorithms
• Search engines
• Search algorithms
• linear search
• run-time is a linear function of n (n = size of the database)
• suppose the DB has 40,000,000 records
• at 1000 read-compare cycles per second: 40,000,000 / 1000 = 40,000 seconds ≈ 667 minutes ≈ 11 hours
Algorithms
• 730,000,000 hits in 0.1 seconds
Algorithms
• Binary tree search algorithm
• The keyword is indexed in a set of binary indexes – is keyword in left or right half of database?

[Figure: the database key range is halved at each level. Database → aaa-mon | moo-zxy → aaa-jaa | jaa-mon | moo-tev | tew-zxy]

Algorithms
• Binary search algorithm
• So, to search a 40,000,000 record database for a single term:
• T(40,000,000) = log2(40,000,000) ≈ 25.3
• rounded up, that is 26 read-compare cycles
• at 1000 read-compare cycles/sec = 0.026 seconds
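The cycle count above can be illustrated with a minimal sketch of binary search over a sorted list of keys. The function name `binary_search` and the choice to count one read-compare cycle per probe are assumptions made for illustration; they are not taken from the slides.

```python
from typing import Optional, Sequence, Tuple


def binary_search(keys: Sequence[str], target: str) -> Tuple[Optional[int], int]:
    """Return (index of target or None, number of read-compare cycles used).

    Assumes `keys` is sorted in ascending order.
    """
    lo, hi = 0, len(keys) - 1
    cycles = 0
    while lo <= hi:
        mid = (lo + hi) // 2
        cycles += 1  # one read of keys[mid] plus one comparison against target
        if keys[mid] == target:
            return mid, cycles
        if keys[mid] < target:
            lo = mid + 1  # target can only be in the right half
        else:
            hi = mid - 1  # target can only be in the left half
    return None, cycles
```

Each probe discards half of the remaining range, so the number of cycles is bounded by roughly log2(n) rather than n.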
Algorithms
• Binary Search Algorithm
• So, what about 730,000,000 records
• Search for a single keyword –
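The question on this slide can be worked with the same cost model as before. The sketch below assumes the slides' rate of 1000 read-compare cycles per second and rounds log2(n) up to a whole number of cycles; the helper names are hypothetical.

```python
CYCLES_PER_SECOND = 1000  # rate assumed in the slides


def linear_search_seconds(n: int) -> float:
    """Worst-case time for a linear scan of n records."""
    return n / CYCLES_PER_SECOND


def binary_search_cycles(n: int) -> int:
    """ceil(log2(n)) for n >= 1, computed exactly with integer arithmetic."""
    return (n - 1).bit_length()


def binary_search_seconds(n: int) -> float:
    """Time for a binary search over n records under the same cost model."""
    return binary_search_cycles(n) / CYCLES_PER_SECOND


for n in (40_000_000, 730_000_000):
    print(n, linear_search_seconds(n), binary_search_cycles(n), binary_search_seconds(n))
```

Under these assumptions, 730,000,000 records need about 30 read-compare cycles (0.03 seconds) with binary search, against 730,000 seconds (over 8 days) for a linear scan.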
Pseudo-code
• Like English – easily readable
• Clear and consistent
• Rough correspondence to language implementation
• Should give a clear understanding of what the algorithm does
Using Pseudo-code
• Use indentation to indicate block structure. Statements at the same level of indentation belong to the same block.
• Do not use “extra” statements like begin-end
• Looping constructs and conditionals are similar to Pascal (while, for, repeat, if-then-else). In for loops the loop counter is persistent: it keeps its value after the loop exits
Using Pseudo-code
• Use a consistent symbol to indicate comments. Anything on line after this symbol is a comment, not code
• Multiple assignment is allowed
• Variables are local to a procedure unless explicitly declared as global
• Array elements are specified by the array name followed by indices in square brackets… A[i]
Pseudo-code
• “..” indicates a range of values: A[1..4] means elements 1, 2, 3, and 4 of array A
• Compound data can be represented as objects with attributes or fields. Reference these attributes like array references: the attribute name followed by the object name in square brackets. For example, the length of the array A is length[A]
Pseudo-code
• An array reference is a pointer
• Parameters are passed by value
• assignments to parameters within a procedure are local to the procedure
• Boolean operators short-circuit
• Be consistent
• don’t use read in one place and input in another unless they have functionally different meanings
Insertion-Sort Algorithm

INSERTION-SORT(A)
for j = 2 to length[A]
    do key = A[j]
       C* Insert A[j] into the sorted sequence A[1..j-1]
       i = j - 1
       while i > 0 and A[i] > key
           do A[i+1] = A[i]
              i = i - 1
       A[i+1] = key
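As a rough correspondence to a language implementation, the pseudo-code above can be sketched in Python. The main difference is an assumption of this sketch: Python lists are 0-based, so the outer loop starts at index 1 and the inner test is `i >= 0` rather than `i > 0`.

```python
from typing import List


def insertion_sort(a: List[int]) -> None:
    """Sort the list `a` in place, in ascending order."""
    for j in range(1, len(a)):
        key = a[j]
        # Insert a[j] into the sorted sequence a[0..j-1].
        i = j - 1
        while i >= 0 and a[i] > key:
            a[i + 1] = a[i]  # shift the larger element one slot to the right
            i -= 1
        a[i + 1] = key


data = [5, 2, 4, 6, 1, 3]
insertion_sort(data)
print(data)
```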

Analysis of Algorithms
• Analysis may be concerned with any resource
• memory
• bandwidth
• runtime
• Need a model for describing the runtime performance of an algorithm
• RAM – Random Access Machine
RAM
• There are other models but for now…
• Assume that all instructions are sequential
• All data is accessible in one step
• Analyze performance (run-time) in terms of inputs
• meaning of inputs varies – size of an array, number of bits, vertices and edges, etc.
• Machine independent
• Language independent
RAM
• Need to base analysis on cost of instruction execution
• assign costs (run-time) to each instruction
INSERTION-SORT
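• Run-time = sum of the products of each instruction's cost (its runtime) and the number of times it executes
• $T(n) = c_1 n + c_2 (n-1) + c_4 (n-1) + c_5 \sum_{j=2}^{n} t_j + c_6 \sum_{j=2}^{n} (t_j - 1) + c_7 \sum_{j=2}^{n} (t_j - 1) + c_8 (n-1)$
• where $t_j$ is the number of times the while-loop test executes for that value of j (line 3 is a comment, so there is no $c_3$ term)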

INSERTION-SORT
• Best case vs Worst Case
• Best case
• Input array already sorted ($t_j = 1$ for every j)
• Worst case
• Input array sorted in reverse order ($t_j = j$ for every j)
INSERTION-SORT
• For sake of discussion…
• assume that all c = 2
• then, for best case
• $T(n) = 10n - 8$
• n = 1000, T(n) = 9,992
• for worst case …
• $T(n) = 3n^2 + 7n - 8$
• n = 1000, T(n) = 3,006,992
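These two figures can be checked by instrumenting the algorithm. The sketch below is an illustration under stated assumptions: it counts how often each pseudo-code line executes (the `for` header runs n times, including the final test that exits the loop), charges every line the same cost c = 2 as in the slides, and treats line 3 as a free comment. The function name `insertion_sort_cost` is hypothetical, and the input is assumed to have at least one element.

```python
from typing import Sequence


def insertion_sort_cost(values: Sequence[int], c: int = 2) -> int:
    """Return T(n) for insertion sort on `values`, with every line cost equal to c."""
    a = list(values)  # work on a copy so the caller's data is untouched
    n = len(a)
    # Execution counts keyed by pseudo-code line number (line 3 is a comment).
    counts = {1: n, 2: n - 1, 4: n - 1, 5: 0, 6: 0, 7: 0, 8: n - 1}
    for j in range(1, n):
        key = a[j]
        i = j - 1
        counts[5] += 1  # first evaluation of the while test for this j
        while i >= 0 and a[i] > key:
            a[i + 1] = a[i]
            i -= 1
            counts[6] += 1
            counts[7] += 1
            counts[5] += 1  # the test is evaluated again after each shift
        a[i + 1] = key
    return c * sum(counts.values())


n = 1000
print(insertion_sort_cost(range(n)))         # already sorted: best case
print(insertion_sort_cost(range(n, 0, -1)))  # reverse sorted: worst case
```

With n = 1000 the two calls reproduce the values 10n - 8 and 3n^2 + 7n - 8 from the slide.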
Insertion-sort Performance

* Best case is a linear function of n

So, what are we really interested in?
• the big picture
• the trend in run-time performance as the problem grows
• not concerned about small differences in algorithms
• what happens to the algorithm as the problem gets explosively large
• the order of growth
Abstractions and assumptions
• The cost coefficients will not vary that much… and will not contribute significantly to the growth of run-time performance
• so we can set them to a constant
• … and we can ignore them
• remember the earlier example –
• c1 = c2 = … = 2
Abstractions and assumptions
• In a polynomial run-time function the order of growth is controlled by the highest-order term
• $T(n) = 3n^2 + 7n - 8$
• so we can ignore (discard) the lower-order terms
• $T(n) \approx 3n^2$
Abstractions and assumptions
• It turns out that, for sufficiently large n, the coefficient of the highest-order term is not that important in characterizing the order of growth of a run-time function
• So, from that perspective, the run-time function of the Insertion-Sort algorithm (worst case) has order of growth
• $n^2$
Abstractions and assumptions
• Are these abstractions and assumptions correct?
• for small problems – no
• but for sufficiently large problems
• they do a pretty good job of characterizing the run-time function of algorithms
Design of Algorithms
• Incremental approach to algorithm design
• Design for a very small case
• expand the complexity of the problem and algorithm
• Divide and Conquer
• Divide it into smaller problems
• Solve smaller problems
• Combine results from smaller problems
Another look at Sort algorithms
• Suppose:
• you have an array whose length is evenly divisible by two
• in each half (left and right) values are already sorted in order
• but not in order across the whole array
• task: sort the array so that it is in order across the entire array
Merge Sorted subarrays
1. Split the array into two subarrays
2. Add a marker to each subarray to indicate the end
3. Set an index to the first value of each subarray
4. Compare the indexed (pointed-to) values of the two subarrays
5. If either indexed value is an end-marker: move all remaining values (except the end-marker) from the other subarray to the output array; stop
6. Move the smaller of the two values to the output array (sorted); increment the index into that subarray
7. Go to step 4
Merge(A, p, q, r)
• A is the array containing the values to be sorted; each half is already sorted from smallest to largest
• p is the starting index of the range in array A
• q is the end index of the left side of the range (the end of the first half… sort of)
• r is the end index of the range in array A
• So, sort the values from p to r by merging the two halves of array A, where q marks where to split the range into subarrays
Merge(A, p, q, r)
n1 = q - p + 1
n2 = r - q
C* create subarrays L[1..n1+1] and R[1..n2+1]
for i = 1 to n1
    do L[i] = A[p+i-1]
for j = 1 to n2
    do R[j] = A[q+j]
L[n1+1] = ∞
R[n2+1] = ∞
i = 1
j = 1
for k = p to r
    do if L[i] <= R[j]
        then A[k] = L[i]
             i = i + 1
        else A[k] = R[j]
             j = j + 1
MERGE_SORT(A,p,r)
if p < r
    then q = ⌊(p + r)/2⌋
         MERGE_SORT(A, p, q)
         MERGE_SORT(A, q+1, r)
         MERGE(A, p, q, r)
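The two procedures above can be sketched in Python as follows. This is an illustration under a few assumptions: indices are 0-based and inclusive, the end markers are `math.inf` (so every value must be a finite number), and the midpoint uses integer (floor) division.

```python
import math
from typing import List


def merge(a: List[float], p: int, q: int, r: int) -> None:
    """Merge the sorted runs a[p..q] and a[q+1..r] (inclusive) in place."""
    left = a[p:q + 1] + [math.inf]       # end marker for the left run
    right = a[q + 1:r + 1] + [math.inf]  # end marker for the right run
    i = j = 0
    for k in range(p, r + 1):
        if left[i] <= right[j]:
            a[k] = left[i]
            i += 1
        else:
            a[k] = right[j]
            j += 1


def merge_sort(a: List[float], p: int, r: int) -> None:
    """Sort a[p..r] (inclusive) in place by divide and conquer."""
    if p < r:
        q = (p + r) // 2  # floor of the midpoint
        merge_sort(a, p, q)
        merge_sort(a, q + 1, r)
        merge(a, p, q, r)


data = [5, 2, 4, 7, 1, 3, 2, 6]
merge_sort(data, 0, len(data) - 1)
print(data)
```

Because each end marker compares greater than any finite value, the merge loop never needs a separate "subarray exhausted" check.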
Asymptotic Notation
• Big Θ (theta)
• $\Theta(g(n)) = \{\, f(n) : \text{there exist positive constants } c_1, c_2, \text{ and } n_0 \text{ such that } 0 \le c_1 g(n) \le f(n) \le c_2 g(n) \text{ for all } n \ge n_0 \,\}$
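As a worked example of this definition, take the worst-case insertion-sort function T(n) = 3n^2 + 7n - 8 with g(n) = n^2. The constants c1 = 3, c2 = 4, n0 = 6 are one possible choice made for this sketch: 3n^2 ≤ T(n) whenever 7n ≥ 8, and T(n) ≤ 4n^2 whenever n^2 - 7n + 8 ≥ 0, which holds for all n ≥ 6. The code below only spot-checks the inequality over a finite range; it is a sanity check, not a proof.

```python
from typing import Callable


def satisfies_theta_bounds(
    f: Callable[[int], int],
    g: Callable[[int], int],
    c1: int,
    c2: int,
    n0: int,
    n_max: int,
) -> bool:
    """Check 0 <= c1*g(n) <= f(n) <= c2*g(n) for every integer n in [n0, n_max]."""
    return all(0 <= c1 * g(n) <= f(n) <= c2 * g(n) for n in range(n0, n_max + 1))


def T(n: int) -> int:
    return 3 * n * n + 7 * n - 8


def g(n: int) -> int:
    return n * n


print(satisfies_theta_bounds(T, g, c1=3, c2=4, n0=6, n_max=10_000))
print(satisfies_theta_bounds(T, g, c1=3, c2=4, n0=5, n_max=10_000))
```

The second call fails because T(5) = 102 exceeds 4 · 25 = 100, which shows why n0 matters in the definition.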

Asymptotic Notation
• Big O (oh)
• $O(g(n)) = \{\, f(n) : \text{there exist positive constants } c \text{ and } n_0 \text{ such that } 0 \le f(n) \le c\, g(n) \text{ for all } n \ge n_0 \,\}$
Asymptotic Notation
• Big Ω (Omega)
• $\Omega(g(n)) = \{\, f(n) : \text{there exist positive constants } c \text{ and } n_0 \text{ such that } 0 \le c\, g(n) \le f(n) \text{ for all } n \ge n_0 \,\}$
Asymptotic Notation
• Little o (oh)
• $o(g(n)) = \{\, f(n) : \text{for any positive constant } c > 0 \text{ there exists a constant } n_0 > 0 \text{ such that } 0 \le f(n) < c\, g(n) \text{ for all } n \ge n_0 \,\}$
Asymptotic Notation
• Little ω (omega)
• $\omega(g(n)) = \{\, f(n) : \text{for any positive constant } c > 0 \text{ there exists a constant } n_0 > 0 \text{ such that } 0 \le c\, g(n) < f(n) \text{ for all } n \ge n_0 \,\}$