
### CS 420 – Design of Algorithms

Basic Concepts

• We need a mechanism to describe/define algorithms

• Independent of the language implementation of the algorithm

• Pseudo-code

• Algorithm – “any well-defined computational procedure that takes some value, or set of values, as input and produces some value, or set of values, as output” – Cormen, et al.

• Algorithm – “a procedure (a finite set of well-defined instructions) for accomplishing some task which, given an initial state, will terminate in a defined end-state.”

• from Wikipedia.org

• http://en.wikipedia.org/wiki/Algorithm

• Human Genome Project

• Security/encryption for e-commerce

• Pulsar searches

• Search Engines

• Search algorithms

• Linear search

• run-time is a linear function of n (n = size of database)

• suppose the DB has 40,000,000 records

• at 1,000 read/compare cycles per second: 40,000 seconds ≈ 667 minutes ≈ 11 hours

• yet a real search engine can report 730,000,000 hits in 0.1 seconds
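To make the arithmetic concrete, here is a minimal linear-search sketch in Python; the database size and the 1,000 cycles/second speed are the slide's assumptions, not measurements:

```python
def linear_search(records, key):
    """Scan records one by one; the worst case examines all n records."""
    for i, record in enumerate(records):
        if record == key:
            return i
    return -1

# Worst-case time estimate under the slide's assumed speed.
N = 40_000_000            # records in the database
CYCLES_PER_SEC = 1_000    # assumed read/compare cycles per second

worst_case_seconds = N / CYCLES_PER_SEC        # 40,000 s
worst_case_hours = worst_case_seconds / 3600   # about 11 hours
print(f"{worst_case_seconds:.0f} s ≈ {worst_case_hours:.1f} hours")
```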

• Binary tree search algorithm

• The keyword is indexed in a set of binary indexes – is keyword in left or right half of database?

Database index ranges (each level splits the keyword range in half):

```
aaa–zxy
├── aaa–mon
│   ├── aaa–jaa
│   └── jaa–mon
└── moo–zxy
    ├── moo–tev
    └── tew–zxy
```

• Binary search algorithm

• So, to search a 40,000,000 record database

• for a single term –

• T(40,000,000) = ⌈log₂(40,000,000)⌉ = 26 read/compare cycles

• at 1000 read/compare cycles/sec = 0.026 seconds

• Binary Search Algorithm

• So, what about 730,000,000 records?

• Search for a single keyword: ⌈log₂(730,000,000)⌉ = 30 read/compare cycles ≈ 0.030 seconds
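A sketch of the idea in runnable Python; the cycle counts follow directly from ⌈log₂ n⌉, and the 1,000 cycles/second speed is the slide's assumption:

```python
import math

def binary_search(sorted_records, key):
    """Repeatedly halve the search range; at most about log2(n) probes."""
    lo, hi = 0, len(sorted_records) - 1
    while lo <= hi:
        mid = (lo + hi) // 2
        if sorted_records[mid] == key:
            return mid
        elif sorted_records[mid] < key:
            lo = mid + 1
        else:
            hi = mid - 1
    return -1

for n in (40_000_000, 730_000_000):
    cycles = math.ceil(math.log2(n))
    print(f"n = {n:>11,}: about {cycles} cycles = {cycles / 1_000} s")
```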

### Pseudo-code

• Like English – easily readable

• Clear and consistent

• Rough correspondence to language implementation

• Should give a clear understanding of what the algorithm does

• Use indentation to indicate block structure: statements at the same indentation level belong to the same block

• Do not use “extra” statements like begin-end

• Looping constructs and conditionals are similar to Pascal (while, for, repeat, if-then-else). In for loops the loop counter retains its final value after the loop ends

• Use a consistent symbol to indicate comments. Anything on line after this symbol is a comment, not code

• Multiple assignment is allowed

• Variables are local to a procedure unless explicitly declared as global

• Array elements are specified by the array name followed by indices in square brackets… A[i]

• .. indicates a range of values A[1..4] means elements 1,2,3,and 4 of array A

• Compound data can be represented as objects with attributes or fields. Reference an attribute with the attribute name followed by the object name in square brackets: for example, the length of array A is length[A]

• An array reference is a pointer

• Parameters are passed by value

• assignments to parameters within a procedure are local to the procedure

• Boolean operators short-circuit

• Be consistent

• don’t use “read” in one place and “input” in another unless they have functionally different meanings

```
INSERTION-SORT(A)
  for j = 2 to length[A]
    do key = A[j]
       C* Insert A[j] into the sorted sequence A[1..j-1]
       i = j - 1
       while i > 0 and A[i] > key
         do A[i+1] = A[i]
            i = i - 1
       A[i+1] = key
```
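The pseudocode translates almost line for line into runnable Python (0-based indexing replaces the pseudocode's 1-based arrays):

```python
def insertion_sort(a):
    """Sort list a in place, mirroring the INSERTION-SORT pseudocode."""
    for j in range(1, len(a)):   # j = 2 to length[A], shifted to 0-based
        key = a[j]
        # Insert a[j] into the sorted sequence a[0..j-1]
        i = j - 1
        while i >= 0 and a[i] > key:
            a[i + 1] = a[i]
            i = i - 1
        a[i + 1] = key
    return a

insertion_sort([5, 2, 4, 6, 1, 3])  # → [1, 2, 3, 4, 5, 6]
```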

• Analysis may be concerned with any resources

• memory

• bandwidth

• runtime

• Need a model for describing the runtime performance of an algorithm

• RAM – Random Access Machine

• There are other models but for now…

• Assume that all instructions are sequential

• All data is accessible in one step

• Analyze performance (run-time) in terms of inputs

• meaning of inputs varies – size of an array, number of bits, vertices and edges, etc.

• Machine independent

• Language independent

• Need to base analysis on cost of instruction execution

• assign costs (run-time) to each instruction

• Run-time = sum of products of costs (instruction runtimes) and execution occurrences

• T(n) = c₁n + c₂(n−1) + c₄(n−1) + c₅∑ⱼ₌₂ⁿ tⱼ + c₆∑ⱼ₌₂ⁿ (tⱼ−1) + c₇∑ⱼ₌₂ⁿ (tⱼ−1) + c₈(n−1)

• where tⱼ = the number of times the while-loop test is executed for that value of j

• Best case vs Worst Case

• Best case

• Worst case

• Input array sorted in reverse order

• For sake of discussion…

• assume that all cᵢ = 2

• then, for best case

• T(n) = 10n-8

• n=1000, T(n) = 9992

• for worst case …

• T(n) = 3n² + 7n − 8

• n=1000, T(n) = 3006992
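A quick check of the slide's arithmetic (with every cost cᵢ set to 2, as assumed above):

```python
def best_case(n):
    return 10 * n - 8            # linear: T(n) = 10n - 8

def worst_case(n):
    return 3 * n**2 + 7 * n - 8  # quadratic: T(n) = 3n^2 + 7n - 8

print(best_case(1000))   # 9992
print(worst_case(1000))  # 3006992
```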

• Best case is a linear function of n; worst case is a quadratic function of n

• the big picture

• the trend in run-time performance as the problem grows

• not concerned about small differences in algorithms

• what happens to the algorithm as the problem gets explosively large

• the order of growth

• The cost coefficients will not vary that much… and will not contribute significantly to the growth of run-time performance

• so we can set them to a constant

• … and we can ignore them

• remember the earlier example –

• c1 = c2 = … = 2

• In a polynomial run-time function the order of growth is controlled by the highest-order term

• T(n) = 3n² + 7n − 8

• so we can ignore (discard) the lower order terms

• T(n) = 3n²

• It turns out that with sufficiently large n the coefficient of the high order term is not that important in characterizing the order of growth of a run-time function

• So, from that perspective the run-time function of the Insertion-Sort algorithm (worst-case) is -

• T(n) = n²

• Are these abstraction assumptions correct?

• for small problems – no

• but for sufficiently large problems

• they do a pretty good job of characterizing the run-time function of algorithms

• Incremental approach to algorithm design

• Design for a very small case

• expand the complexity of the problem and algorithm

• Divide and Conquer

• Divide it into smaller problems

• Solve smaller problems

• Combine results from smaller problems

• Suppose:

• you have an array whose length is evenly divisible by two

• in each half (left and right) values are already sorted in order

• but not in order across the whole array

• task: sort the array so that it is in order across the entire array

1. Split the array into two subarrays

2. Add an end-marker to each subarray

3. Set an index to the first value of each subarray

4. Compare the indexed (pointed-to) value of each subarray

5. If either indexed value is an end-marker: move all remaining values (except the end-marker) from the other subarray to the output array; stop

6. Move the smaller of the two values to the output array (sorted); increment that subarray's index

7. Go to step 4
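The steps above can be sketched directly in Python, using float('inf') as the end-marker; when one subarray is exhausted, its ∞ sentinel loses every comparison, so the other subarray drains automatically:

```python
def merge_sorted_halves(a):
    """Merge an array whose two halves are each already sorted."""
    mid = len(a) // 2
    left = a[:mid] + [float('inf')]     # steps 1-2: split, append end-markers
    right = a[mid:] + [float('inf')]
    out = []
    i = j = 0                           # step 3: index each subarray's first value
    while len(out) < len(a):            # steps 4-7: compare, move smaller, repeat
        if left[i] <= right[j]:
            out.append(left[i])
            i += 1
        else:
            out.append(right[j])
            j += 1
    return out

merge_sorted_halves([1, 4, 7, 2, 3, 9])  # → [1, 2, 3, 4, 7, 9]
```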

• Where A is the array containing values to be sorted; each half is already sorted from smallest to largest

• p = the starting index for the array A

• q = the ending index for the left side of array A (end of first half… sort of)

• r = the ending index for array A

• So, sort values from p to r of array A, where q marks where to split the array into subarrays

```
MERGE(A, p, q, r)
  n1 = q - p + 1
  n2 = r - q
  c* create subarrays L[1..n1+1] and R[1..n2+1]
  for i = 1 to n1
    do L[i] = A[p+i-1]
  for j = 1 to n2
    do R[j] = A[q+j]
  L[n1+1] = ∞
  R[n2+1] = ∞
  i = 1
  j = 1
  for k = p to r
    do if L[i] <= R[j]
         then A[k] = L[i]
              i = i + 1
         else A[k] = R[j]
              j = j + 1
```

```
MERGE_SORT(A, p, r)
  if p < r
    then q = ⌊(p+r)/2⌋
         MERGE_SORT(A, p, q)
         MERGE_SORT(A, q+1, r)
         MERGE(A, p, q, r)
```
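Putting the two procedures together as runnable Python — a sketch mirroring the MERGE and MERGE_SORT pseudocode, with float('inf') standing in for the ∞ sentinels and 0-based inclusive indices:

```python
def merge(a, p, q, r):
    """Merge sorted slices a[p..q] and a[q+1..r] in place (inclusive indices)."""
    left = a[p:q + 1] + [float('inf')]      # sentinel marks the end
    right = a[q + 1:r + 1] + [float('inf')]
    i = j = 0
    for k in range(p, r + 1):
        if left[i] <= right[j]:
            a[k] = left[i]
            i += 1
        else:
            a[k] = right[j]
            j += 1

def merge_sort(a, p, r):
    """Sort a[p..r] by divide and conquer."""
    if p < r:
        q = (p + r) // 2
        merge_sort(a, p, q)
        merge_sort(a, q + 1, r)
        merge(a, p, q, r)

data = [38, 27, 43, 3, 9, 82, 10]
merge_sort(data, 0, len(data) - 1)
print(data)  # [3, 9, 10, 27, 38, 43, 82]
```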

• Big Θ (theta)

• Θ(g(n)) = {f(n) : there exist positive constants c₁, c₂, and n₀ such that
0 ≤ c₁g(n) ≤ f(n) ≤ c₂g(n) for all n ≥ n₀}

• Big O (oh)

• O(g(n)) = {f(n) : there exist positive constants c and n₀ such that 0 ≤ f(n) ≤ cg(n) for all n ≥ n₀}

• Big Ω (omega)

• Ω(g(n)) = {f(n) : there exist positive constants c and n₀ such that 0 ≤ cg(n) ≤ f(n) for all n ≥ n₀}

• Little o (oh)

• o(g(n)) = {f(n) : for any constant c > 0, there exists a constant n₀ > 0 such that 0 ≤ f(n) < cg(n) for all n ≥ n₀}

• Little ω (omega)

• ω(g(n)) = {f(n) : for any constant c > 0, there exists a constant n₀ > 0 such that
0 ≤ cg(n) < f(n) for all n ≥ n₀}
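For example, the worst-case insertion-sort function from earlier fits the Θ definition with g(n) = n²; the witnesses c₁ = 3, c₂ = 4, n₀ = 7 are chosen here for illustration (other constants work too):

```python
def T(n):
    return 3 * n**2 + 7 * n - 8   # worst-case insertion-sort cost from earlier

c1, c2, n0 = 3, 4, 7              # witnesses showing T(n) is in Theta(n^2)

# Check 0 <= c1*n^2 <= T(n) <= c2*n^2 for a range of n >= n0.
for n in range(n0, 10_000):
    assert 0 <= c1 * n**2 <= T(n) <= c2 * n**2
print("bounds hold for n =", n0, "to 9999")
```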