CS 420 – Design of Algorithms - PowerPoint PPT Presentation
CS 420 – Design of Algorithms

Basic Concepts


Design of Algorithms

  • We need a mechanism to describe/define algorithms

  • Independent of the language implementation of the algorithm

  • Pseudo-code


Algorithms

  • Algorithm – “any well defined computational procedure that takes some value, or set of values, as input and produces some value, or set of values, as output” – Cormen, et al.


Algorithms

  • Algorithm – “a procedure (a finite set of well-defined instructions) for accomplishing some task which, given an initial state, will terminate in a defined end-state.”

    • from Wikipedia.org

      • http://en.wikipedia.org/wiki/Algorithm


Algorithms

  • Human Genome Project

  • Security/encryption for e-commerce

  • Spacecraft navigation

  • Pulsar searches

  • Search Engines


Algorithms

  • Search engines

  • Search algorithms

    • linear search-

      • read-compare-read-…

      • run-time – linear function of n (n=size of database)

      • suppose the DB has 40,000,000 records

      • then 40,000,000 read-compare cycles

      • at 1000 read-compare cycles per second = 40,000 seconds = 667 minutes ~ 11 hours
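The read-compare loop described above can be sketched in Python; this is an illustration only, and the 40,000,000 records and 1000 cycles/second are the slide's hypothetical figures, not measurements:

```python
def linear_search(records, key):
    """Return the index of key in records, or -1 if absent."""
    for i, record in enumerate(records):  # one read-compare per record
        if record == key:
            return i
    return -1

# Worst case on n records is n read-compare cycles:
n = 40_000_000
seconds = n / 1000            # at 1000 read-compare cycles per second
print(seconds / 60)           # ≈ 666.7 minutes, the slide's ~11 hours
```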


Algorithms

  • Search Google for “house”

    • 730,000,000 hits in 0.1 seconds


Algorithms

  • Binary tree search algorithm

  • The keyword is indexed in a set of binary indexes – is keyword in left or right half of database?

Database (recursively halved key ranges):

  • aaa-mon

    • aaa-jaa

    • jaa-mon

  • moo-zxy

    • moo-tev

    • tew-zxy


Algorithms

  • Binary search algorithm

  • So, to search a 40,000,000 record database

  • for a single term –

  • T(40,000,000) = log₂(40,000,000) ≈ 26 read-compare cycles

  • at 1000 read-compare cycles/sec ≈ 0.026 seconds
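The halving search the slides describe is a standard binary search over sorted data; a minimal Python sketch (not from the slides) that also checks the 26-cycle figure:

```python
import math

def binary_search(sorted_records, key):
    """Return an index of key in sorted_records, or -1 if absent.
    Each loop iteration is one read-compare cycle and halves the range."""
    lo, hi = 0, len(sorted_records) - 1
    while lo <= hi:
        mid = (lo + hi) // 2
        if sorted_records[mid] == key:
            return mid
        elif sorted_records[mid] < key:
            lo = mid + 1
        else:
            hi = mid - 1
    return -1

# At most about ceil(log2 n) compares; for the slide's 40,000,000 records:
print(math.ceil(math.log2(40_000_000)))   # 26
```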


Algorithms

  • Binary Search Algorithm

  • So, what about 730,000,000 records

  • Search for a single keyword –

  • 30 read/compare cycles

  • or about 0.03 seconds


Pseudo-code

  • Like English – easily readable

  • Clear and consistent

  • Rough correspondence to language implementation

  • Should give a clear understanding of what the algorithm does


Using Pseudo-code

  • Use indentation to indicate block structure: statements in the same block share the same level of indentation

    • Do not use “extra” statements like begin-end

  • Looping constructs and conditionals are similar to Pascal (while, for, repeat, if-then-else). In for loops the loop counter retains its value after the loop exits


Using Pseudo-code

  • Use a consistent symbol to indicate comments. Anything on a line after this symbol is a comment, not code

  • Multiple assignment is allowed

  • Variables are local to a procedure unless explicitly declared as global

  • Array elements are specified by the array name followed by indices in square brackets… A[i]


Pseudo-code

  • .. indicates a range of values A[1..4] means elements 1,2,3,and 4 of array A

  • Compound data can be represented as objects with attributes or fields. Reference these attributes like array references: for example, a variable that is the length of array A is length[A]


Pseudo-code

  • An array reference is a pointer

  • Parameters are passed by value

    • assignments to parameters within a procedure are local to the procedure

  • Boolean operators short-circuit

  • Be consistent

    • don’t use read in one place and input in another unless they have functionally different meanings


Insertion-Sort Algorithm

INSERTION-SORT(A)
  for j = 2 to length[A]
      do key = A[j]
         c* Insert A[j] into the sorted sequence A[1..j-1]
         i = j - 1
         while i > 0 and A[i] > key
             do A[i+1] = A[i]
                i = i - 1
         A[i+1] = key
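The pseudocode above translates directly to Python; this sketch uses 0-based indexing in place of the pseudocode's 1-based A[1..n]:

```python
def insertion_sort(A):
    """Sort list A in place by insertion sort; returns A for convenience."""
    for j in range(1, len(A)):
        key = A[j]
        # Insert A[j] into the sorted sequence A[0..j-1]
        i = j - 1
        while i >= 0 and A[i] > key:
            A[i + 1] = A[i]   # shift larger elements right
            i -= 1
        A[i + 1] = key
    return A

print(insertion_sort([5, 2, 4, 6, 1, 3]))   # [1, 2, 3, 4, 5, 6]
```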


Analysis of Algorithms

  • Analysis may be concerned with any resources

    • memory

    • bandwidth

    • runtime

  • Need a model for describing runtime performance of an algorithm

  • RAM – Random Access Machine


RAM

  • There are other models but for now…

  • Assume that all instructions are sequential

  • All data is accessible in one step

  • Analyze performance (run-time) in terms of inputs

    • meaning of inputs varies – size of an array, number of bits, vertices and edges, etc.

  • Machine independent

  • Language independent


RAM

  • Need to base analysis on cost of instruction execution

  • assign costs (run-time) to each instruction


INSERTION-SORT

  • Run-time = sum of products of costs (instruction runtimes) and execution occurrences

  • T(n) = c1·n + c2(n−1) + c4(n−1) + c5·Σj=2..n tj + c6·Σj=2..n (tj − 1) + c7·Σj=2..n (tj − 1) + c8(n−1)

  • where tj is the number of times the while-loop test executes for that value of j


INSERTION-SORT

  • Best case vs Worst Case

  • Best case

    • Input array already sorted

  • Worst case

    • Input array sorted in reverse order


INSERTION-SORT

  • For sake of discussion…

  • assume that all c=2

  • then, for best case

    • T(n) = 10n − 8

    • n = 1000, T(n) = 9992

  • for worst case …

    • T(n) = 3n² + 7n − 8

    • n = 1000, T(n) = 3,006,992
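The two formulas can be checked directly; a quick sketch (the function names best and worst are chosen here, not from the slides):

```python
# Sanity check of the slide's run-time formulas with every cost c = 2.
def best(n):
    return 10 * n - 8            # best case: input already sorted

def worst(n):
    return 3 * n * n + 7 * n - 8  # worst case: input reverse-sorted

print(best(1000))    # 9992
print(worst(1000))   # 3006992
```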


Insertion-sort Performance

* Best case is a linear function of n; worst case is quadratic


So, what are we really interested in?

  • the big picture

  • the trend in run-time performance as the problem grows

  • not concerned about small differences in algorithms

  • what happens to the algorithm as the problem gets explosively large

  • the order of growth


Abstractions and assumptions

  • The cost coefficients will not vary that much… and will not contribute significantly to the growth of run-time performance

    • so we can set them to a constant

    • … and we can ignore them

    • remember the earlier example –

      • c1 = c2 = … = 2


Abstractions and assumptions

  • In a polynomial run-time function the order of growth is controlled by the highest-order term

    • T(n) = 3n² + 7n − 8

    • so we can ignore (discard) the lower-order terms

    • T(n) = 3n²


Abstractions and assumptions

  • It turns out that with sufficiently large n the coefficient of the high order term is not that important in characterizing the order of growth of a run-time function

    • So, from that perspective the run-time function of the Insertion-Sort algorithm (worst-case) is -

    • T(n) = n²


Abstractions and assumptions

  • Are these abstraction assumptions correct?

  • for small problems – no

  • but for sufficiently large problems

  • they do a pretty good job of characterizing the run-time function of algorithms


Design of Algorithms

  • Incremental approach to algorithm design

    • Design for a very small case

    • expand the complexity of the problem and algorithm

  • Divide and Conquer

    • Start with a large (full) problem

    • Divide it into smaller problems

    • Solve smaller problems

    • Combine results from smaller problems


Another look at Sort algorithms

  • Suppose:

    • you have an array evenly divisible by two

    • in each half (left and right) values are already sorted in order

    • but not in order across the whole array

    • task: sort the array so that it is in order across the entire array


Merge Sorted subarrays

  1. Split the array into two subarrays

  2. Add an end marker to each subarray

  3. Set an index to the first value of each subarray

  4. Compare the indexed (pointed-to) values of the two subarrays

  5. If either indexed value is an end marker: move all remaining values (except the end marker) from the other subarray to the output array; stop

  6. Move the smaller of the two values to the output (sorted) array; increment that subarray's index

  7. Go to step 4


Merge(A, p, q, r)

  • Where A is the array containing the values to be sorted; each half is already sorted from smallest to largest

  • p is the starting index for array A

  • q is the ending index for the left half of array A (end of the first half)

  • r is the ending index for array A

  • So, sort the values from p to r of array A, where q marks where to split the array into subarrays


Merge(A, p, q, r)

n1 = q - p + 1
n2 = r - q
c* create subarrays L[1..n1+1] and R[1..n2+1]
for i = 1 to n1
    do L[i] = A[p + i - 1]
for j = 1 to n2
    do R[j] = A[q + j]
L[n1+1] = ∞
R[n2+1] = ∞
i = 1
j = 1
for k = p to r
    do if L[i] <= R[j]
         then A[k] = L[i]
              i = i + 1
         else A[k] = R[j]
              j = j + 1


MERGE_SORT(A,p,r)

if p < r
    then q = ⌊(p + r)/2⌋
         MERGE_SORT(A, p, q)
         MERGE_SORT(A, q+1, r)
         MERGE(A, p, q, r)
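MERGE and MERGE_SORT translate to Python as follows; this is a sketch using 0-based, inclusive indices, with float("inf") standing in for the ∞ sentinels:

```python
INF = float("inf")   # sentinel marking the end of each subarray

def merge(A, p, q, r):
    """Merge sorted runs A[p..q] and A[q+1..r] in place (inclusive)."""
    L = A[p:q + 1] + [INF]        # left run plus sentinel
    R = A[q + 1:r + 1] + [INF]    # right run plus sentinel
    i = j = 0
    for k in range(p, r + 1):     # take the smaller head each step
        if L[i] <= R[j]:
            A[k] = L[i]
            i += 1
        else:
            A[k] = R[j]
            j += 1

def merge_sort(A, p, r):
    """Recursively sort A[p..r]: divide, conquer, then merge."""
    if p < r:
        q = (p + r) // 2
        merge_sort(A, p, q)
        merge_sort(A, q + 1, r)
        merge(A, p, q, r)

data = [5, 2, 4, 7, 1, 3, 2, 6]
merge_sort(data, 0, len(data) - 1)
print(data)   # [1, 2, 2, 3, 4, 5, 6, 7]
```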


Asymptotic Notation

  • Big  (theta)

  • (g(n)) = {f(n) : there exists two constants c1 and c2, n0 such that

    0<=c1g(n)<=f(n)<=c2g(n) for all n >=n0}


Asymptotic Notation

  • Big O (oh)

  • O(g(n)) = {f(n) : there exist positive constants c and n0 such that 0 <= f(n) <= c·g(n) for all n >= n0}


Asymptotic Notation

  • Big  (Omega)

  • (g(n)) = {f(n) : there positive constants c and n0 such that 0<=cg(n)<=f(n) for all n >=n0}


Asymptotic Notation

  • Little o (oh)

  • o(g(n)) = {f(n) : for any constant c > 0 there exists a constant n0 > 0 such that 0 <= f(n) < c·g(n) for all n >= n0}


Asymptotic Notation

  • Little  (omega)

  • (g(n)) = {f(n) : there positive constants c>0 there exists a constant n0 such that

    0<=cg(n)<f(n) for all n >=n0 }
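As a sanity check on these definitions, the earlier worst-case function T(n) = 3n² + 7n − 8 is Θ(n²); the witness constants below (c1 = 3, c2 = 4, n0 = 6) are chosen here for illustration, not given in the slides:

```python
# Verify the Θ definition: 0 <= c1·g(n) <= T(n) <= c2·g(n) for all n >= n0,
# with g(n) = n², c1 = 3, c2 = 4, n0 = 6 (witnesses chosen here).

def T(n):
    return 3 * n * n + 7 * n - 8

for n in range(6, 10_000):
    assert 0 <= 3 * n * n <= T(n) <= 4 * n * n

print("T(n) is bounded by 3n² and 4n² on the tested range")
```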

