Introduction

1 / 91

# Introduction - PowerPoint PPT Presentation

Introduction. Thinking about Algorithms Abstractly. Credits: Steve Rudich, Jeff Edmond, Ping Xuan. So you want to be a bioinformatician /mathematician /computer scientist? What is an Algorithm ? Grade school revisited: How to multiply two numbers. So you want to be a computer scientist?.

I am the owner, or an agent authorized to act on behalf of the owner, of the copyrighted work described.

## PowerPoint Slideshow about ' Introduction' - ori-bridges

Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author.While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server.

- - - - - - - - - - - - - - - - - - - - - - - - - - E N D - - - - - - - - - - - - - - - - - - - - - - - - - -
Presentation Transcript

### Introduction

Credits: Steve Rudich, Jeff Edmond, Ping Xuan

So you want to be a bioinformatician /mathematician /computer scientist?
• What is an Algorithm ?
• Grade school revisited: How to multiply two numbers

### Original Thinking

• Given today’s prices of pork, grain, sawdust, …
• Given constraints on what constitutes a hotdog.
• Make the cheapest hotdog.

• Um? Tell me what to code.

With more suffocated software engineering systems,the demand for mundane programmers will diminish.

• I learned this great algorithm that will work.

Soon all known algorithms will be available in libraries.

• I can develop a new algorithm for you.

Great thinkers will always be needed.

The future belongs to the computer scientist who has
• Content: An up to date grasp of fundamental problems and solutions
• Method: Principles and techniques to solve the vast array of unfamiliar problems that arise in a rapidly changing field
Course Content
• A list of algoirthms.
• Learn their code.
• Trace them until you are convenced that they work.
• Impliment them.

class InsertionSortAlgorithm extends SortAlgorithm {

void sort(int a[]) throws Exception {

for (int i = 1; i < a.length; i++) {

int j = i;

int B = a[i];

while ((j > 0) && (a[j-1] > B)) {

a[j] = a[j-1];

j--;}

a[j] = B;

}}

Course Content
• A survey of algorithmic design techniques.
• Abstract thinking.
• How to develop new algorithms for any problem that may arise.
Study:
• Many experienced programmers were asked to code up binary search.
Study:
• Many experienced programmers were asked to code up binary search.

80% got it wrong

Good thing is was not for a nuclear power plant.

What did they lack?
• Formal proof methods?
What did they lack?
• Formal proof methods?

Yes, likely

Industry is starting to realize that formal methods are important.

But even without formal methods …. ?

What did they lack?
• Fundamental understanding of the algorithmic design techniques.
• Abstract thinking.
Course Content

Notations, analogies, and abstractions

for developing,

and describing algorithms

Time Complexity

t(n) = Q(n2)

Recurrence Relations

T(n) = a T(n/b) + f(n)

You will see some Math …
What is an Algorithm?
• A step-by-step description of procedures performing certain task.
• Example: Sorting.

Given a list of numbers, put them in increasing

(non-decreasing) order.

• Properties: Generality, Termination, Correctness, Feasibility.
• Feasibility  Analysis of Algorithm

Complexity theory

Analysis of Algorithm
• Running time analysis
• Input size N
• Running time is a function of N: T(N)
• Worst case, average case, … , etc
• Time measured by number of computer operations
• Actual clock time differs by a constant factor
• Usually very complex
• The use of asymptotic bounds
• To study the rate of growth of T(N) compared to simpler functions f(N), such as N, N2, log(N), etc
• A constant factor can be ignored
Some Definitions: Big O Notation
• T(N) = O( f(N) )
• Exists constant c and n0 such that when N>n0, T(N) <= c * f(N)
• Asymptotic Upper bound

c f(N)

T(N)

n0

N

Some Definitions: Big Omega
• T(N) = (g(N))
• Exists constant c and n0 such that when N>n0, T(N) >= c * g(N)
• Asymptotic Lower bound

T(N)

c g(N)

n0

N

Some Definitions: Big Theta
• T(N) = ( h(N) )
• if and only if T(N)=O(h(N)) and T(N)= (h(N))
• tight bound

c1 h(N)

T(N)

c2 h(N)

N

c p(N)

T(N)

N

Some Definitions: Little “o”
• T(N) = o(p(N))

if lim N  ∞ = o. E.g. log(N) = o(N).

Example: Insertion Sort Algorithm
• class InsertionSortAlgorithm extends

SortAlgorithm {

• void sort(int a[ ]) throws Exception {
• for (int i = 1; i < a.length; i++) {
• int j = i;
• int B = a[i];
• while ((j > 0) && (a[j-1] > B)) {
• a[j] = a[j-1];
• j--;}
• a[j] = B;
• }}

0

i-1

i

T+1

i

i

9 km

5 km

Iterative Algorithms

<preCond>

codeA

loop

exit when <exit Cond>

codeB

codeC

<postCond>

One step at a time

Relay Race

Code

52

88

14

14,23,25,30,31,52,62,79,88,98

31

98

25

30

23

62

79

Problem Specification

• Precondition: The input is a list of n values with the same value possibly repeated.
• Post condition: The output is a list consisting of the same n values in non-decreasing order.

30

14

25

23,31,52,88

98

79

Define Step
• Select arbitrary element from side.
• Insert it where it belongs.

30

14

62

25

98

79

23,31,52,62,88

Exit

79 km

75 km

6 elements

to school

5 elements

to school

30

14

25

23,31,52,88

98

79

Making progress while Maintaining the loop invariant

30

14

62

25

98

79

23,31,52,62,88

Sorted sub-list

52

52

88

88

14

14

31

31

62

62

25

25

30

30

Exit

Exit

0 km

23

23

98

98

79

79

n elements

to school

0 elements

to school

14,23,25,30,31,52,62,79,88,98

14,23,25,30,31,52,62,79,88,98

Beginning &

Ending

30

14

25

23,31,52,88

98

79

Running Time

Inserting an element into a list of size i

takes O(i) time in the worst case.

Total = 1+2+3+…+n = n(n+1)/2 = (n2)

30

14

62

25

98

79

23,31,52,62,88

Explaining Insertion Sort

We maintain a subset of elements sorted within alist. The remaining elements are off to the sidesomewhere. Initially,think of the first element in the array as a sorted list of lengthone. One at a time, we take one of the elements that is off to theside and we insert it into the sorted list where itbelongs. This gives a sorted list that is one element longer than itwas before. When the last element has been inserted, the array iscompletely sorted.

23,25,31,52,88

52,23,88,31,25,30,98,62,14,79

23,31,52,88

Insertion Sort

The input consists of an array of integers

We read in the i+1st object.

We will pretend that this larger prefix is the entire input.

We extend the solution we have by one for this larger prefix.

Insertion Sort Algorithm: Pseudo-Code
• Input: an array (or list) a of numbers

// data structure: a[0] …a[n-1]

• Operations:
• Leave the first element (location 0) untouched.
• for each element from location 1 on

insert it to the appropriate place in the

(sorted) sub-array to its left.

a

0

1

2

n-1

Operations in an Algorithm
• Decision Making
• branching operation
• if … then … else …
• Repetition
• loops: for each element do { … }
• conditional loops: while ( …. ) do { … }
Insertion Sort Algorithm: Code
• class InsertionSortAlgorithm extends

SortAlgorithm {

• void sort(int a[ ]) throws Exception {
• for (int i = 1; i < a.length; i++) {
• int j = i;
• int b = a[i];
• while ( (j > 0) && (a[j-1] > b) ) {
• a[j] = a[j-1];
• j--;}
• a[j] = B;
• }}}

2 X 2 = 5

Another Algorithm

### Grade School Revisited:How To Multiply Two Numbers

Complex Numbers
• Remember how to multiply 2 complex numbers?
• (a+bi)(c+di) = [ac –bd] + [ad + bc] i
• Input: a,b,c,d Output: ac-bd, ad+bc
• If a real multiplication costs \$1 and an addition cost a penny. What is the cheapest way to obtain the output from the input?
• Can you do better than \$4.02?
Gauss’ \$3.05 Method:Input: a,b,c,d Output: ac-bd, ad+bc
• m1 = ac
• m2 = bd
• A1 = m1 – m2 = ac-bd
• m3 = (a+b)(c+d) = ac + ad + bc + bd
• A2 = m3 – m1 – m2 = ad+bc
Question:
• The Gauss “hack” saves one multiplication out of four. It requires 25% less work.
• Could there be a context where performing 3 multiplications for every 4 provides a more dramatic savings?

Odette

Bonzo

How to add 2 n-bit numbers.

**

**

**

**

**

**

**

**

**

**

**

+

How to add 2 n-bit numbers.

**

**

**

**

**

**

**

**

**

* **

**

*

+

How to add 2 n-bit numbers.

**

**

**

**

**

**

**

**

* **

* **

*

**

*

+

How to add 2 n-bit numbers.

**

**

**

**

**

**

**

* **

* **

*

* **

*

**

*

+

How to add 2 n-bit numbers.

**

**

**

**

**

**

* **

* **

*

* **

*

* **

*

**

*

+

How to add 2 n-bit numbers.

*

*

* **

*

* **

*

* **

*

* **

*

* **

*

***

*

* **

*

* **

*

* **

*

* **

*

**

*

+

*

*

* **

*

* **

*

* **

*

* **

*

* **

*

* **

*

* **

*

* **

*

* **

*

* **

*

**

*

+

On any reasonable computer adding 3 bits can be done in constant time.

= θ(n) = linear time.

Is there a faster way to add?
• QUESTION: Is there an algorithm to add two n-bit numbers whose time grows sub-linearly in n?
• Suppose there is a mystery algorithm that does not examine each bit
• Give the algorithm a pair of numbers. There must be some unexamined bit position i in one of the numbers
• If the algorithm is not correct on the numbers, we found a bug
• If the algorithm is correct, flip the bit at position i and give the algorithm the new pair of numbers. It give the same answer as before so it must be wrong since the sum has changed

So any algorithm for addition must use time at least linear in the size of the numbers.

n2

How to multiply 2 n-bit numbers.

* * * * * * * *

X

* * * * * * * *

* * * * * * * *

* * * * * * * *

* * * * * * * *

* * * * * * * *

* * * * * * * *

* * * * * * * *

* * * * * * * *

* * * * * * * *

* * * * * * * * * * * * * * * *

• No matter how dramatic the difference in the constants the quadratic curve will eventually dominate the linear curve

time

# of bits in numbers

Neat! We have demonstrated that multiplication is a harder problem than addition.Mathematical confirmation of our common sense.

To argue that multiplication is an inherently harder problem than addition we would have to show that no possible multiplication algorithm runs in linear time.

Is there a clever algorithm to multiply two numbers in linear time?

Tunghai will give you a PhD!

Divide and Conquer(an approach to faster algorithms)
• DIVIDE my instance to the problem into smaller instances to the same problem.
• Have a friend (recursively) solve them.Do not worry about it yourself.
Multiplication of 2 n-bit numbers

a

b

• X =
• Y =
• X = a 2n/2 + b Y = c 2n/2 + d
• XY = ac 2n + (ad+bc) 2n/2 + bd

c

d

Multiplication of 2 n-bit numbers

a

b

c

d

• X =
• Y =
• XY = ac 2n + (ad+bc) 2n/2 + bd

MULT( X, Y ) :

If |X| = |Y| = 1 then RETURN XY

Break X into a;b and Y into c;d

RETURN

MULT(a,c) 2n + (MULT(a,d) + MULT(b,c)) 2n/2 + MULT(b,d)

Time required by MULT
• T( n ) = time taken by MULT on two n-bit numbers
• What is T(n)? What is its growth rate? Is it θ(n2)?
Recurrence Relation
• T(1) = k for some constant k
• T(n) = 4 T(n/2) + k n + l for some constants k and l

MULT(X,Y):

If |X| = |Y| = 1 then RETURN XY

Break X into a;b and Y into c;d

RETURN

MULT(a,c) 2n + (MULT(a,d) + MULT(b,c)) 2n/2 + MULT(b,d)

Let’s be concrete
• T(1) = 1
• T(n) = 4 T(n/2) + n
• How do we unravel T(n) so that we can determine its growth rate?
Technique 1Guess and Verify
• Recurrence Relation:

T(1) = 1 & T(n) = 4T(n/2) + n

• Guess: G(n) = 2n2 – n
• Verify:

T(1) = 1

Technique 2: Decorate The Tree

T(1)

=

1

• T(n) = n + 4 T(n/2)
• T(n) = n + 4 T(n/2)

T(n)

T(n)

n

n

=

=

T(n/2)

T(n/2)

T(n/2)

T(n/2)

T(n/2)

T(n/2)

T(n/2)

T(n/2)

T(n)

n

=

T(n/2)

T(n/2)

T(n/2)

T(n/2)

n/2

T(n/4)

T(n/4)

T(n/4)

T(n/4)

T(n)

n

=

T(n/2)

T(n/2)

T(n/2)

n/2

n/2

n/2

n/2

T(n/4)

T(n/4)

T(n/4)

T(n/4)

T(n/4)

T(n/4)

T(n/4)

T(n/4)

T(n/4)

T(n/4)

T(n/4)

T(n/4)

T(n/4)

T(n/4)

T(n/4)

T(n/4)

T(n)

n

=

T(n)

n

=

n/2

n/2

n/2

n/2

n/4

n/4

n/4

n/4

n/4

n/4

n/4

n/4

n/4

n/4

n/4

n/4

n/4

n/4

n/4

n/4

11111111111111111111111111111111 . . . . . . 111111111111111111111111111111111

=1×n

= 4×n/2

= 16×n/4

= 4i ×n/2i

= 4logn×n/2logn =nlog4×1

Total: θ(nlog4) = θ(n2)

Divide and Conquer MULT: θ(n2) time Grade School Multiplication: θ(n2) time

All that work for nothing!

MULT revisited

MULT(X,Y):

If |X| = |Y| = 1 then RETURN XY

Break X into a;b and Y into c;d

RETURN

MULT(a,c) 2n + (MULT(a,d) + MULT(b,c)) 2n/2 + MULT(b,d)

• MULT calls itself 4 times. Can you see a way to reduce the number of calls?
Gauss’ Hack:Input: a,b,c,d Output: ac, ad+bc, bd
• A1 = ac
• A3 = bd
• m3 = (a+b)(c+d) = ac + ad + bc + bd
• A2 = m3– A1- A3 = ad + bc
Gaussified MULT(Karatsuba 1962)

MULT(X,Y):

If |X| = |Y| = 1 then RETURN XY

Break X into a;b and Y into c;d

e = MULT(a,c) and f =MULT(b,d)

RETURN e2n + (MULT(a+b, c+d) – e - f) 2n/2 + f

• T(n) = 3 T(n/2) + n
• Actually: T(n) = 2 T(n/2) + T(n/2 + 1) + kn

n/2

n/2

n/2

n/2

T(n/4)

T(n/4)

T(n/4)

T(n/4)

T(n/4)

T(n/4)

T(n/4)

T(n/4)

T(n/4)

T(n/4)

T(n/4)

T(n/4)

T(n/4)

T(n/4)

T(n/4)

T(n/4)

T(n)

n

=

T(n)

n

=

T(n/2)

T(n/2)

T(n/2)

n/2

T(n/4)

T(n/4)

T(n/4)

T(n)

n

=

T(n/2)

T(n/2)

n/2

n/2

n/2

T(n/4)

T(n/4)

T(n/4)

T(n/4)

T(n/4)

T(n/4)

T(n/4)

T(n/4)

T(n/4)

T(n)

n

=

0

1

2

n/4 + n/4 + n/4 + n/4 + n/4 + n/4 + n/4 + n/4 + n/4

i

=1×n

= 3×n/2

= 9×n/4

= 3i ×n/2i

= 3logn×n/2logn =nlog3×1

Total: θ(nlog3) = θ(n1.58..)

Dramatic improvement for large n

Not just a 25% savings!

θ(n2) vs θ(n1.58..)

You’re cool! Are you free sometime this weekend?

Not interested, Bonzo. I took the initiative and asked out a guy in my 4277 class.