Mergesort, Analysis of Algorithms

1 / 22

# Mergesort, Analysis of Algorithms - PowerPoint PPT Presentation

Mergesort, Analysis of Algorithms. Jon von Neumann and ENIAC (1945). Why Does It Matter?. Run time (nanoseconds). 1.3 N 3. 10 N 2. 47 N log 2 N. 48 N. Time to solve a problem of size. 1000. 1.3 seconds. 10 msec. 0.4 msec. 0.048 msec. 10,000. 22 minutes. 1 second. 6 msec.

I am the owner, or an agent authorized to act on behalf of the owner, of the copyrighted work described.

## PowerPoint Slideshow about 'Mergesort, Analysis of Algorithms' - norm

Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author.While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server.

- - - - - - - - - - - - - - - - - - - - - - - - - - E N D - - - - - - - - - - - - - - - - - - - - - - - - - -
Presentation Transcript

### Mergesort, Analysis of Algorithms

Jon von Neumann and ENIAC (1945)

Why Does It Matter?

Run time(nanoseconds)

1.3 N3

10 N2

47 N log2N

48 N

Time tosolve aproblemof size

1000

1.3 seconds

10 msec

0.4 msec

0.048 msec

10,000

22 minutes

1 second

6 msec

0.48 msec

100,000

15 days

1.7 minutes

78 msec

4.8 msec

million

41 years

2.8 hours

0.94 seconds

48 msec

10 million

41 millennia

1.7 weeks

11 seconds

0.48 seconds

Max sizeproblemsolvedin one

second

920

10,000

1 million

21 million

minute

3,600

77,000

49 million

1.3 billion

hour

14,000

600,000

2.4 trillion

76 trillion

day

41,000

2.9 million

50 trillion

1,800 trillion

N multiplied by 10,time multiplied by

1,000

100

10+

10

Orders of Magnitude

Seconds

Equivalent

Meters PerSecond

ImperialUnits

Example

1

1 second

10-10

Continental drift

10

10 seconds

10-8

1 ft / year

Hair growing

102

1.7 minutes

10-6

3.4 in / day

Glacier

103

17 minutes

10-4

1.2 ft / hour

Gastro-intestinal tract

104

2.8 hours

10-2

2 ft / minute

Ant

105

1.1 days

1

2.2 mi / hour

Human walk

106

1.6 weeks

102

220 mi / hour

Propeller airplane

107

3.8 months

104

370 mi / min

Space shuttle

108

3.1 years

106

620 mi / sec

Earth in galactic orbit

109

108

62,000 mi / sec

1/3 speed of light

1010

3.1 centuries

. . .

forever

Powersof 2

210

thousand

1021

age ofuniverse

220

million

230

billion

Impact of Better Algorithms
• Example 1: N-body-simulation.
• Simulate gravitational interactions among N bodies.
• physicists want N = # atoms in universe
• Brute force method: N2 steps.
• Appel (1981). N log N steps, enables new research.
• Example 2: Discrete Fourier Transform (DFT).
• Breaks down waveforms (sound) into periodic components.
• foundation of signal processing
• CD players, JPEG, analyzing astronomical data, etc.
• Grade school method: N2 steps.
• Runge-König (1924), Cooley-Tukey (1965).FFT algorithm: N log N steps, enables new technology.
Mergesort

A

L

G

O

R

I

T

H

M

S

divide

• Mergesort (divide-and-conquer)
• Divide array into two halves.

A

L

G

O

R

I

T

H

M

S

Mergesort
• Mergesort (divide-and-conquer)
• Divide array into two halves.
• Recursively sort each half.

A

L

G

O

R

I

T

H

M

S

A

L

G

O

R

I

T

H

M

S

divide

A

G

L

O

R

H

I

M

S

T

sort

Mergesort
• Mergesort (divide-and-conquer)
• Divide array into two halves.
• Recursively sort each half.
• Merge two halves to make sorted whole.

A

L

G

O

R

I

T

H

M

S

A

L

G

O

R

I

T

H

M

S

divide

A

G

L

O

R

H

I

M

S

T

sort

A

G

H

I

L

M

O

R

S

T

merge

Mergesort Analysis
• How long does mergesort take?
• Bottleneck = merging (and copying).
• merging two files of size N/2 requires N comparisons
• T(N) = comparisons to mergesort N elements.
• to make analysis cleaner, assume N is a power of 2
• Claim. T(N) = N log2 N.
• Note: same number of comparisons for ANY file.
• We'll prove several different ways to illustrate standard techniques.
Proof by Picture of Recursion Tree

T(N)

N

2(N/2)

T(N/2)

T(N/2)

T(N/4)

T(N/4)

T(N/4)

4(N/4)

T(N/4)

log2N

. . .

2k (N / 2k)

T(N / 2k)

. . .

T(2)

T(2)

T(2)

T(2)

T(2)

T(2)

T(2)

T(2)

N/2(2)

Nlog2N

Proof by Telescoping
• Claim. T(N) = N log2 N (when N is a power of 2).
• Proof. For N > 1:
Mathematical Induction
• Mathematical induction.
• Powerful and general proof technique in discrete mathematics.
• To prove a theorem true for all integers k  0:
• Base case: prove it to be true for N = 0.
• Induction hypothesis: assuming it is true for arbitrary N
• Induction step: show it is true for N + 1
• Claim: 0 + 1 + 2 + 3 + . . . + N = N(N+1) / 2 for all N  0.
• Proof: (by mathematical induction)
• Base case (N = 0).
• 0 = 0(0+1) / 2.
• Induction hypothesis: assume 0 + 1 + 2 + . . . + N = N(N+1) / 2
• Induction step: 0 + 1 + . . . + N + N + 1 = (0 + 1 + . . . + N) + N+1 = N (N+1) /2 + N+1 = (N+2)(N+1) / 2
Proof by Induction
• Claim. T(N) = N log2 N (when N is a power of 2).
• Proof. (by induction on N)
• Base case: N = 1.
• Inductive hypothesis: T(N) = N log2 N.
• Goal: show that T(2N) = 2N log2 (2N).
Proof by Induction
• What if N is not a power of 2?
• T(N) satisfies following recurrence.
• Claim. T(N)  Nlog2 N.
• Proof. See supplemental slides.
Computational Complexity
• Framework to study efficiency of algorithms. Example = sorting.
• MACHINE MODEL = count fundamental operations.
• count number of comparisons
• UPPER BOUND = algorithm to solve the problem (worst-case).
• N log2 N from mergesort
• LOWER BOUND = proof that no algorithm can do better.
• N log2 N - N log2 e
• OPTIMAL ALGORITHM: lower bound ~ upper bound.
• mergesort
Decision Tree

a1 < a2

YES

NO

a2 < a3

a1 < a3

YES

NO

YES

NO

a1 < a3

a2 < a3

YES

NO

YES

NO

printa1, a2, a3

printa2, a1, a3

printa1, a3, a2

printa3, a1, a2

printa2, a3, a1

printa3, a2, a1

Comparison Based Sorting Lower Bound
• Theorem. Any comparison based sorting algorithm must use(N log2N) comparisons.
• Proof. Worst case dictated by tree height h.
• N! different orderings.
• One (or more) leaves corresponding to each ordering.
• Binary tree with N! leaves must have height
• Food for thought. What if we don't use comparisons?
• Stay tuned for radix sort.

Stirling's formula

### Extra Slides

Proof by Induction
• Claim. T(N)  Nlog2 N.
• Proof. (by induction on N)
• Base case: N = 1.
• Define n1 = N / 2 , n2 = N / 2.
• Induction step: assume true for 1, 2, . . . , N – 1.
Implementing Mergesort

Item aux[MAXN];

void mergesort(Item a[], int left, int right) {

int mid = (right + left) / 2;

if (right <= left)

return;

mergesort(a, left, mid);

mergesort(a, mid + 1, right);

merge(a, left, mid, right);

}

mergesort (see Sedgewick Program 8.3)

uses scratch array

Implementing Mergesort

void merge(Item a[], int left, int mid, int right) {

int i, j, k;

for (i = mid+1; i > left; i--)

aux[i-1] = a[i-1];

for (j = mid; j < right; j++)

aux[right+mid-j] = a[j+1];

for (k = left; k <= right; k++)

if (ITEMless(aux[i], aux[j]))

a[k] = aux[i++];

else

a[k] = aux[j--];

}

merge (see Sedgewick Program 8.2)

copy to temporary array

merge two sorted sequences

Profiling Mergesort Empirically

void merge(Item a[], int left, int mid, int right) <999>{

int i, j, k;

for (<999>i = mid+1; <6043>i > left; <5044>i--)

<5044>aux[i-1] = a[i-1];

for (<999>j = mid; <5931>j < right; <4932>j++)

<4932>aux[right+mid-j] = a[j+1];

for (<999>k = left; <10975>k <= right; <9976>k++)

if (<9976>ITEMless(aux[i], aux[j]))

<4543>a[k] = aux[i++];

else

<5433>a[k] = aux[j--];

<999>}

void mergesort(Item a[], int left, int right) <1999>{

int mid = <1999>(right + left) / 2;

if (<1999>right <= left)

return<1000>;

<999>mergesort(a, aux, left, mid);

<999>mergesort(a, aux, mid+1, right);

<999>merge(a, aux, left, mid, right);

<1999>}

Mergesort prof.out

Striking feature:All numbers SMALL!

# comparisonsTheory ~ N log2 N = 9,966Actual = 9,976

Sorting Analysis Summary
• Running time estimates:
• Home pc executes 108 comparisons/second.
• Supercomputer executes 1012 comparisons/second.
• Lesson 1: good algorithms are better than supercomputers.
• Lesson 2: great algorithms are better than good ones.

Insertion Sort (N2)

Mergesort (N log N)

computer

thousand

million

billion

thousand

million

billion

home

instant

2.8 hours

317 years

instant

1 sec

18 min

super

instant

1 second

1.6 weeks

instant

instant

instant

Quicksort (N log N)

thousand

million

billion

instant

0.3 sec

6 min

instant

instant

instant