
# CSE 326: Data Structures: Advanced Topics


## CSE 326: Data Structures: Advanced Topics

Lecture 26: Wednesday, March 12th, 2003

### Today

• Dynamic programming for ordering matrix multiplication

• Very similar to Query Optimization in databases

• String processing

• Final review

### Ordering Matrix Multiplication

• Need to compute A × B × C × D

where A is 3×2, B is 2×4, C is 4×2, and D is 2×3 (dimensions recovered from the cost computations below; the matrices themselves were shown as figures on the slide).

### Ordering Matrix Multiplication

• One solution: (A × B) × (C × D)

Cost: (3 × 2 × 4) + (4 × 2 × 3) + (3 × 4 × 3) = 84

(Multiplying a p × q matrix by a q × r matrix takes p × q × r scalar multiplications.)

### Ordering Matrix Multiplication

• Another solution: (A × (B × C)) × D

Cost: (2 × 4 × 2) + (3 × 2 × 2) + (3 × 2 × 3) = 46
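The two costs can be checked mechanically. Here is a small Python sketch (the function name and tuple encoding are mine, not from the lecture): a matrix is a (rows, columns) pair, and multiplying a p×q matrix by a q×r matrix costs p·q·r scalar multiplications.

```python
def chain_cost(expr):
    """Return (scalar multiplications, (rows, cols)) for a parenthesized product."""
    if isinstance(expr[0], int):        # a bare matrix: (rows, cols)
        return 0, expr
    left, right = expr                  # a product: (left expr, right expr)
    cl, (p, q) = chain_cost(left)
    cr, (q2, r) = chain_cost(right)
    assert q == q2, "inner dimensions must agree"
    return cl + cr + p * q * r, (p, r)

A, B, C, D = (3, 2), (2, 4), (4, 2), (2, 3)
print(chain_cost(((A, B), (C, D)))[0])   # -> 84
print(chain_cost(((A, (B, C)), D))[0])   # -> 46
```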

### Ordering Matrix Multiplication

Problem:

• Given A1 × A2 × … × An, compute the optimal ordering

Solution:

• Dynamic programming

• Compute cost[i][j]

• the minimum cost to compute Ai × Ai+1 × … × Aj

• Proceed iteratively, increasing the gap = j – i

### Ordering Matrix Multiplication

```
/* initialize */
for i = 1 to n do cost[i][i] = 0   /* why? a single matrix needs no multiplications */

/* dynamic programming */
for gap = 1 to n-1 do {
    for i = 1 to n - gap do {
        j = i + gap;
        c = ∞;
        for k = i to j-1 do
            /* how much would it cost to do (Ai × ... × Ak) × (Ak+1 × ... × Aj)? */
            /* note: A[k].columns = A[k+1].rows */
            c = min(c, cost[i][k] + cost[k+1][j] +
                       A[i].rows * A[k].columns * A[j].columns);
        cost[i][j] = c;
    }
}
```
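The pseudocode translates directly into a runnable version. This is a Python sketch (0-based indexing; `matrix_chain_cost` and `dims` are my names), using the example dimensions from the earlier slides.

```python
def matrix_chain_cost(dims):
    """Minimum scalar multiplications to compute A_0 x ... x A_{n-1}.

    dims[i] = (rows, columns) of matrix A_i.
    """
    n = len(dims)
    INF = float("inf")
    cost = [[0] * n for _ in range(n)]   # cost[i][i] = 0: one matrix, no work
    for gap in range(1, n):              # increasing gap = j - i
        for i in range(n - gap):
            j = i + gap
            c = INF
            for k in range(i, j):
                # split as (A_i .. A_k) * (A_{k+1} .. A_j)
                c = min(c, cost[i][k] + cost[k + 1][j]
                        + dims[i][0] * dims[k][1] * dims[j][1])
            cost[i][j] = c
    return cost[0][n - 1]

dims = [(3, 2), (2, 4), (4, 2), (2, 3)]   # A, B, C, D from the example
print(matrix_chain_cost(dims))            # -> 46, matching (A × (B × C)) × D
```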

### Ordering Matrix Multiplication

• Running time: O(n³)

Important variation:

• Database systems do join reordering

• A very similar algorithm

• Come to CSE 544...

### String Matching

• The problem

• Given a text T[1], T[2], ..., T[n] and a pattern P[1], P[2], ..., P[m]

• Find all positions s such that P “occurs” in T at position s: (T[s], T[s+1], ..., T[s+m-1]) = (P[1], ..., P[m])

• Where do we need this ?

• text editors (e.g. emacs)

• grep

• XML processing


### Naive String Matching

```
/* try every shift s */
for s = 1 to n-m+1 do
    if (T[s], T[s+1], ..., T[s+m-1]) = (P[1], P[2], ..., P[m])
        then print s
```

running time: O(mn)
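A direct Python rendering of the naive matcher (a sketch; the function name is mine). It returns every 0-based shift where the pattern occurs; comparing the m-character slice at each of the O(n) shifts gives the O(mn) bound.

```python
def naive_match(text, pattern):
    """Return every 0-based shift s where pattern occurs in text. O(mn) time."""
    n, m = len(text), len(pattern)
    return [s for s in range(n - m + 1) if text[s:s + m] == pattern]

print(naive_match("abracadabra", "abra"))   # -> [0, 7]
```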

### Knuth-Morris-Pratt String Matching

• Main idea: after a mismatch (a “fail”), reuse the comparisons already made, guided by a table precomputed on P

### Knuth-Morris-Pratt String Matching

• The prefix function: π[q] = the largest k < q s.t. (P[1], P[2], ..., P[k-1]) = (P[q-k+1], P[q-k+2], ..., P[q-1])

Example values for the pattern shown on the slide: π[1] = π[2] = π[3] = 1, π[4] = 2, π[5] = 3, π[6] = 4, π[7] = 1, π[8] = 2

### Knuth-Morris-Pratt String Matching

```
/* compute π */
. . .

/* do the matching */
q = 1;   /* q = position in P of the next character to match */
for i = 1 to n do {
    while (q > 1 and P[q] != T[i])
        q = π[q];            /* mismatch: fall back in P using π */
    if (P[q] = T[i]) q = q+1;
    if (q = m+1) {           /* all of P matched */
        print(i - m + 1);
        q = π[m+1];          /* keep going for further occurrences */
    }
}
```

Time = O(n) (why? q grows by at most 1 per character of T, and every fallback strictly decreases q, so there are at most n fallbacks in total)

### Knuth-Morris-Pratt String Matching

```
/* compute π */
π[1] = π[2] = 1;   /* base cases; π[1] is never used by the matcher */
for q = 3 to m+1 do {
    k = π[q-1];
    while (k > 1 and P[k] != P[q-1])
        k = π[k];
    if (P[k] = P[q-1]) then k = k+1;
    π[q] = k;
}
```

/* do the matching */

. . .

Time = O(m) (why? the same amortized argument: k increases at most once per iteration of q and decreases on every fallback)

Total running time of the KMP algorithm: O(m+n)
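The whole algorithm can be sketched in Python. This version uses the standard 0-based prefix-function formulation (pi[q] = length of the longest proper border of P[1..q+1]) rather than the slides' 1-based variant; the names are mine.

```python
def kmp_match(text, pattern):
    """Return every 0-based shift where pattern occurs in text. O(n + m) time."""
    n, m = len(text), len(pattern)
    if m == 0 or m > n:
        return []
    # pi[q] = length of the longest proper border of pattern[:q+1]
    pi = [0] * m
    k = 0
    for q in range(1, m):
        while k > 0 and pattern[k] != pattern[q]:
            k = pi[k - 1]                 # fall back to a shorter border
        if pattern[k] == pattern[q]:
            k += 1
        pi[q] = k
    # scan the text, reusing work after each mismatch
    matches, q = [], 0
    for i in range(n):
        while q > 0 and pattern[q] != text[i]:
            q = pi[q - 1]
        if pattern[q] == text[i]:
            q += 1
        if q == m:                        # full match ending at position i
            matches.append(i - m + 1)
            q = pi[q - 1]                 # keep searching for more occurrences
    return matches

print(kmp_match("abracadabra", "abra"))   # -> [0, 7]
```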

### Final Review

• Basic math

• logs, exponents, summations

• proof by induction

• asymptotic analysis

• big-oh, theta, omega

• how to estimate running times

• need sums

• need recurrences

### Final Review

• Lists, stacks, queues

• Array vs. pointer implementation

• Trees:

• definitions/terminology (root, parent, child, etc)

• relationship between depth and size of a tree

• depth is between O(log N) and O(N)

### Final Review

• Binary Search Trees

• basic implementations of find, insert, delete

• worst case performance: O(N)

• average case performance: O(log N) (inserts only)

• AVL trees

• balance factor +1, 0, -1

• know single and double rotations to keep it balanced

• all operations are O(log N) worst case time

• Splay trees

• good amortized performance

• single operation may take O(N)

• know the zig-zig, zig-zag, etc

• B-trees: know basic idea behind insert/delete

### Final Review

• Priority Queues

• binary heaps: insert/deleteMin, percolate up/down

• array implementation

• buildHeap takes only O(N)! Used in heapsort

• Binomial queues

• merge is fast: O(log N)

• insert, deleteMin are based on merge
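The O(N) buildHeap claim can be illustrated with a short sketch (an assumed implementation, not code from the slides): percolate down from the last internal node to the root; the work per node shrinks with its height, and the total sums to O(N).

```python
def build_heap(a):
    """Turn list a into a min-heap in place, in O(N) time."""
    n = len(a)
    for i in range(n // 2 - 1, -1, -1):      # last internal node down to the root
        percolate_down(a, i, n)

def percolate_down(a, i, n):
    """Sink a[i] until the min-heap property holds below it."""
    while 2 * i + 1 < n:
        child = 2 * i + 1
        if child + 1 < n and a[child + 1] < a[child]:
            child += 1                       # pick the smaller child
        if a[i] <= a[child]:
            break
        a[i], a[child] = a[child], a[i]
        i = child

a = [9, 4, 7, 1, 2, 6, 3]
build_heap(a)
print(a[0])   # -> 1 (the minimum sits at the root)
```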

### Final Review

• Hashing

• hash functions based on the mod function

• collision resolution strategies

• chaining, linear and quadratic probing, double hashing

• load factor of a hash table

### Final Review

• Sorting

• elementary sorting algorithm: bubble sort, selection sort, insertion sort

• heapsort O(N log N)

• mergesort O(N log N)

• quicksort O(N log N) average

• fastest in practice, but O(N²) worst case performance

• pivot selection – median of the three works well

• know which of these are stable and in-place

• lower bound on sorting

• external memory sort

### Final Review

• Disjoint sets and Union-Find

• up-trees and their array-based implementation

• know how union-by-size and path compression work

• know the running time (not the proof)

### Final Review

• graph algorithms

• topological sort in O(n+m) time using a queue

• Breadth-First-Search (BFS) for unweighted shortest path

• Dijkstra’s shortest path algorithm

• DFS

• minimum spanning trees: Prim, Kruskal

### Final Review

• Graph algorithms (cont’d)

• Euler vs. Hamiltonian circuits

• Know what P, NP and NP-completeness mean

### Final Review

• Algorithm design techniques

• greedy: bin packing

• divide and conquer

• solving various types of recurrence relations for T(N)

• dynamic programming (memoization)

• DP-Fibonacci

• Ordering matrix multiplication

• randomized data structures

• treaps

• primality testing

• string matching

• Backtracking and game trees

### The Final

• Details:

• covers chapters 1-10, 12.5, and some extra material

• closed book, closed notes except:

• you may bring one sheet of notes

• time: 1 hour and 50 minutes

• Monday, 3/17/2003, 2:30 – 4:20, this room

• bring pens/pencils/etc

• sleep well the night before