CSE 326: Data Structures: Advanced Topics

Lecture 26: Wednesday, March 12th, 2003

- Dynamic programming for ordering matrix multiplication
- Very similar to Query Optimization in databases

- String processing
- Final review

- Need to compute A B C D, where (as the cost terms imply) A is 3 × 2, B is 2 × 4, C is 4 × 2, and D is 2 × 3

- One solution: (A B) (C D):

Cost: (3 × 2 × 4) + (4 × 2 × 3) + (3 × 4 × 3) = 84

- Another solution: (A (B C)) D:

Cost: (2 × 4 × 2) + (3 × 2 × 2) + (3 × 2 × 3) = 46
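
The two costs above can be checked mechanically. The following is a minimal Python sketch, not part of the lecture: the helper name `chain_cost`, the nested-tuple encoding of a parenthesization, and the `DIMS` table (dimensions inferred from the cost terms on the slides) are all assumptions made for illustration.

```python
# Dimensions (rows, columns) inferred from the cost terms above; an assumption.
DIMS = {"A": (3, 2), "B": (2, 4), "C": (4, 2), "D": (2, 3)}


def chain_cost(expr, dims=DIMS):
    """Return (rows, columns, scalar multiplications) for a parenthesized product.

    A matrix is named by a string; a product is a pair (left, right).
    """
    if isinstance(expr, str):
        rows, cols = dims[expr]
        return rows, cols, 0
    left, right = expr
    l_rows, l_cols, l_cost = chain_cost(left, dims)
    r_rows, r_cols, r_cost = chain_cost(right, dims)
    if l_cols != r_rows:
        raise ValueError("inner dimensions do not match")
    # Multiplying an (l_rows x l_cols) by an (r_rows x r_cols) matrix
    # takes l_rows * l_cols * r_cols scalar multiplications.
    return l_rows, r_cols, l_cost + r_cost + l_rows * l_cols * r_cols


print(chain_cost((("A", "B"), ("C", "D")))[2])   # (A B) (C D)  -> 84
print(chain_cost((("A", ("B", "C")), "D"))[2])   # (A (B C)) D  -> 46
```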

Problem:

- Given A_1 A_2 . . . A_n, compute the optimal ordering

Solution:

- Dynamic programming
- Compute cost[i][j], the minimum cost to compute A_i A_{i+1} . . . A_j
- Proceed iteratively, increasing the gap = j - i

/* initialize */
for i = 1 to n do cost[i][i] = 0 /* why ? */

/* dynamic programming */
for gap = 1 to n-1 do {
    for i = 1 to n - gap do {
        j = i + gap;
        c = ∞;
        for k = i to j-1 do
            /* how much would it cost to do
               (A_i . . . A_k) (A_{k+1} . . . A_j) ?
               note: A[k].columns = A[k+1].rows */
            c = min(c, cost[i][k] + cost[k+1][j] +
                       A[i].rows * A[k].columns * A[j].columns)
        cost[i][j] = c;
    }
}
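
The pseudocode above can be sketched in Python roughly as follows. This is an illustrative sketch under stated assumptions, not the lecture's code: the function name `matrix_chain_cost` is hypothetical, matrices are represented only by `(rows, columns)` pairs, and indices are 0-based instead of 1-based.

```python
import math


def matrix_chain_cost(dims):
    """Minimum number of scalar multiplications to compute the whole chain.

    dims[i] = (rows, columns) of the i-th matrix (0-based); assumes the list
    is non-empty and that adjacent dimensions are compatible.
    """
    n = len(dims)
    # cost[i][i] = 0: a single matrix needs no multiplication.
    cost = [[0] * n for _ in range(n)]
    for gap in range(1, n):              # gap = j - i, increasing
        for i in range(0, n - gap):
            j = i + gap
            best = math.inf
            for k in range(i, j):        # split as (i..k) (k+1..j)
                c = (cost[i][k] + cost[k + 1][j]
                     + dims[i][0] * dims[k][1] * dims[j][1])
                best = min(best, c)
            cost[i][j] = best
    return cost[0][n - 1]


# Dimensions inferred from the A B C D example earlier in the lecture.
print(matrix_chain_cost([(3, 2), (2, 4), (4, 2), (2, 3)]))   # 46
```

The sketch returns only the optimal cost; recovering the ordering itself would need an extra table recording the best k for each (i, j).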

- Running time: O(n^3)

Important variation:

- Database systems do join reordering
- A very similar algorithm
- Come to CSE 544...

- The problem
- Given a text T[1], T[2], ..., T[n] and a pattern P[1], P[2], ..., P[m]
- Find all positions s such that P “occurs” in T at position s: (T[s], T[s+1], ..., T[s+m-1]) = (P[1], ..., P[m])
- Where do we need this ?
- text editors (e.g. emacs)
- grep
- XML processing

- Example:

/* try every starting position i */
for i = 1 to n-m+1 do
    if (T[i], T[i+1], ..., T[i+m-1]) = (P[1], P[2], ..., P[m])
        then print i

running time: O(mn)
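
As a point of comparison for what follows, here is a minimal Python sketch of the naive matcher. The function name `naive_search` is an assumption for illustration; positions are reported 1-based to agree with the T[1..n], P[1..m] notation used in these slides.

```python
def naive_search(text, pattern):
    """Return all 1-based positions s where pattern occurs in text."""
    n, m = len(text), len(pattern)
    positions = []
    for s in range(1, n - m + 2):        # s = 1 .. n-m+1
        # Compare the m characters starting at position s: O(m) work each time.
        if text[s - 1:s - 1 + m] == pattern:
            positions.append(s)
    return positions


print(naive_search("abababa", "aba"))    # [1, 3, 5]
```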

- main idea: reuse the work, after a failure
- when a comparison fails, reuse what has already been matched; what to reuse is precomputed on P alone

- The Prefix-Function: π[q] = the largest k < q s.t. (P[1], P[2], ..., P[k-1]) = (P[q-k+1], P[q-k+2], ..., P[q-1])

π[1] = 0
π[2] = π[3] = 1
π[4] = 2
π[5] = 3
π[6] = 4
π[7] = 1
π[8] = 2

/* compute π */
. . . .

/* do the matching */
q = 1; /* q = where we are in P: the next position to compare */
for i = 1 to n do {
    while (q > 1 and P[q] != T[i])
        q = π[q];
    if (P[q] = T[i]) {
        q = q+1;
        if (q = m+1) { /* all of P matched, ending at T[i] */
            print(i - m+1);
            q = π[m+1];
        }
    }
}

Time = O(n) (why ?)

/* compute π */
π[1] = 0; /* sentinel: no valid k exists for q = 1 */
for q = 2 to m+1 do {
    k = π[q - 1];
    while (k > 0 and P[k] != P[q - 1])
        k = π[k];
    π[q] = k+1;
}

/* do the matching */
. . .

Time = O(m) (why ?)

Total running time of KMP algorithm: O(m+n)
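
The two KMP phases can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the lecture's code: the names `compute_prefix` and `kmp_search` are hypothetical, index 0 is padded so the arrays stay 1-based like the pseudocode, and the prefix function follows the slides' convention (the position in P to resume from, for q = 2 .. m+1, with index 1 set to 0 as a sentinel). The pattern "ababaca" in the usage lines is an assumption, chosen because it reproduces the example values 1, 1, 2, 3, 4, 1, 2 for q = 2 .. 8.

```python
def compute_prefix(pattern):
    """Return pi[0..m+1]; pi[0] is unused and pi[1] = 0 is a sentinel."""
    m = len(pattern)
    P = " " + pattern                    # shift so P[1] is the first character
    pi = [0] * (m + 2)
    for q in range(2, m + 2):
        k = pi[q - 1]
        # Shrink the candidate until P[k] can extend it, or none is left.
        while k > 0 and P[k] != P[q - 1]:
            k = pi[k]
        pi[q] = k + 1
    return pi


def kmp_search(text, pattern):
    """Return all 1-based positions where pattern occurs in text."""
    n, m = len(text), len(pattern)
    if m == 0:
        return []                        # assumption: empty pattern matches nothing
    P = " " + pattern
    pi = compute_prefix(pattern)
    positions = []
    q = 1                                # next position of P to compare
    for i in range(1, n + 1):
        c = text[i - 1]
        while q > 1 and P[q] != c:
            q = pi[q]                    # reuse the work after a failure
        if P[q] == c:
            q += 1
            if q == m + 1:               # whole pattern matched, ending at i
                positions.append(i - m + 1)
                q = pi[m + 1]
    return positions


print(compute_prefix("ababaca")[1:])     # [0, 1, 1, 2, 3, 4, 1, 2]
print(kmp_search("abababa", "aba"))      # [1, 3, 5]
```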

- Basic math
- logs, exponents, summations
- proof by induction

- asymptotic analysis
- big-oh, theta, omega
- how to estimate running times
- need sums
- need recurrences

- Lists, stacks, queues
- ADT definition
- Array vs. pointer implementation
- variations: headers, doubly linked, etc

- Trees:
- definitions/terminology (root, parent, child, etc)
- relationship between depth and size of a tree
- depth is between O(log N) and O(N)

- Binary Search Trees
- basic implementations of find, insert, delete
- worst case performance: O(N)
- average case performance: O(log N) (inserts only)

- AVL trees
- balance factor +1, 0, -1
- know single and double rotations to keep it balanced
- all operations are O(log N) worst case time

- Splay trees
- good amortized performance
- single operation may take O(N)
- know the zig-zig, zig-zag, etc

- B-trees: know basic idea behind insert/delete

- Priority Queues
- binary heaps: insert/deleteMin, percolate up/down
- array implementation
- buildheap takes only O(N) !! Used in HeapSort

- Binomial queues
- merge is fast: O(log N)
- insert, deleteMin are based on merge

- Hashing
- hash functions based on the mod function
- collision resolution strategies
- chaining, linear and quadratic probing, double hashing

- load factor of a hash table

- Sorting
- elementary sorting algorithms: bubble sort, selection sort, insertion sort
- heapsort O(N log N)
- mergesort O(N log N)
- quicksort O(N log N) average
- fastest in practice, but O(N^2) worst case performance
- pivot selection: median of three works well

- know which of these are stable and in-place
- lower bound on sorting
- bucket sort, radix sort
- external memory sort

- Disjoint sets and Union-Find
- up-trees and their array-based implementation
- know how union-by-size and path compression work
- know the running time (not the proof)
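
As a reminder of how the two heuristics fit together, here is a minimal Python sketch of an array-based up-tree. The class name `DisjointSets` and the convention of storing the negated tree size at each root are assumptions for illustration, not necessarily the representation used in the course.

```python
class DisjointSets:
    def __init__(self, n):
        # up[i] < 0: i is a root and -up[i] is the size of its tree;
        # otherwise up[i] is the parent of i.
        self.up = [-1] * n

    def find(self, x):
        root = x
        while self.up[root] >= 0:
            root = self.up[root]
        # Path compression: point every node on the path directly at the root.
        while x != root:
            parent = self.up[x]
            self.up[x] = root
            x = parent
        return root

    def union(self, a, b):
        """Merge the sets of a and b; return False if already together."""
        ra, rb = self.find(a), self.find(b)
        if ra == rb:
            return False
        # Union-by-size: hang the smaller tree under the larger one.
        if self.up[ra] > self.up[rb]:    # sizes are negated, so ra is smaller
            ra, rb = rb, ra
        self.up[ra] += self.up[rb]
        self.up[rb] = ra
        return True
```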

- graph algorithms
- adjacency matrix vs. adjacency list representation
- topological sort in O(n+m) time using a queue
- Breadth-First-Search (BFS) for unweighted shortest path
- Dijkstra’s shortest path algorithm
- DFS
- minimum spanning trees: Prim, Kruskal
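
The queue-based topological sort mentioned above can be sketched as follows. This is a minimal Python sketch under stated assumptions: the function name `topological_sort`, the vertex numbering 0 .. n-1, and the edge-list input are illustrative choices, and an adjacency list is built so the total work is O(n+m).

```python
from collections import deque


def topological_sort(n, edges):
    """Return a topological order of vertices 0..n-1, given directed edges (u, v)."""
    adj = [[] for _ in range(n)]
    indegree = [0] * n
    for u, v in edges:
        adj[u].append(v)
        indegree[v] += 1
    # Start with every vertex that has no incoming edge.
    queue = deque(v for v in range(n) if indegree[v] == 0)
    order = []
    while queue:
        u = queue.popleft()
        order.append(u)
        for v in adj[u]:
            indegree[v] -= 1             # "remove" the edge u -> v
            if indegree[v] == 0:
                queue.append(v)
    if len(order) != n:
        raise ValueError("graph has a cycle, no topological order exists")
    return order


print(topological_sort(4, [(0, 1), (0, 2), (1, 3), (2, 3)]))   # [0, 1, 2, 3]
```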

- Graph algorithms (cont’d)
- Euler vs. Hamiltonian circuits
- Know what P, NP and NP-completeness mean

- Algorithm design techniques
- greedy: bin packing
- divide and conquer
- solving various types of recurrence relations for T(N)

- dynamic programming (memoization)
- DP-Fibonacci
- Ordering matrix multiplication

- randomized data structures
- treaps
- primality testing

- string matching
- Backtracking and game trees

- Details:
- covers chapters 1-10, 12.5, and some extra material
- closed book, closed notes except:
- you may bring one sheet of notes

- time: 1 hour and 50 minutes
- Monday, 3/17/2003, 2:30 – 4:20, this room
- bring pens/pencils/etc
- sleep well the night before

- I will cover some of the problems on the website
- I will take your questions