1 / 21

Static Dictionaries

Static Dictionaries. Collection of items. Each item is a pair. (key, element) Pairs have different keys. Operations are: initialize/create get (search). Hashing. Perfect hashing (no collisions). Minimal perfect hashing (space = n ). CHD algorithm.

casper
Download Presentation

Static Dictionaries

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Static Dictionaries • Collection of items. • Each item is a pair. • (key, element) • Pairs have different keys. • Operations are: • initialize/create • get (search)

  2. Hashing • Perfect hashing (no collisions). • Minimal perfect hashing (space = n). • CHD algorithm. • O(n) time to construct the perfect or minimal perfect hash function. • O(1) search time. • Bothelo, Belazzougui & Dietzfelbinger. Compress, hash, and displace.  17th European Symposium on Algorithms, 2009.

  3. Search Tree • Hashing not efficient for extended operations such as range search and nearest match. • Will examine a binary search tree structure for static dictionaries. • Each item/key/element has an estimated access frequency (or probability).

  4. a b b c a c Example a < b < c Cost = 0.8 * 1 + 0.1 * 2 + 0.1 * 3 = 1.2 Cost = 0.8 * 2 + 0.1 * 1 + 0.1 * 2 = 1.9

  5. f0 f1 f2 f3 b c a Search Types • Successful. • Search for a key that is in the dictionary. • Terminates at an internal node. • Unsuccessful. • Search for a key that is not in the dictionary. • Terminates at an external/failure node.

  6. f0 f1 f2 f3 b c a Internal And External Nodes • A binary tree with n internal nodes has n + 1 external nodes. • Let s1, s2, …, sn be the internal nodes, in inorder. • key(s1) < key(s2) < … < key(sn). • Let key(s0) = –infinity and key(sn+1) = infinity. • Let f0, f1, …, fn be the external nodes, in inorder. • fi is reached iff key(si)< search key <key(si+1).

  7. f0 f1 f2 f3 b c a Cost Of Binary Search Tree • Let pi = probability for key(si). • Let qi = probability for key(si)< search key <key(si+1). • Sum of ps and qs = 1. • Cost of tree = S0 <= i <= n qi (level(fi) – 1) + S1<= i <= n pi *level(si) • Cost = weighted path length.

  8. Brute Force Algorithm • Generate all binary search trees with n internal nodes. • Compute the weighted path length of each. • Determine tree with minimum weighted path length. • Number of trees to examine is O(4n/n1.5). • Brute force approach is impractical for large n.

  9. a5 a6 a3 a7 a4 f5 f2 f7 f6 f4 f3 Dynamic Programming • Keys are a1< a2 < …<an. • Let Ti j= least cost tree for ai+1, ai+2, …, aj. • T0n= least cost tree for a1, a2, …, an. T2,7 • Ti j includes pi+1, pi+2, …, pj and qi, qi+1, …, qj.

  10. a5 a6 a3 a7 a4 f5 f2 f7 f6 f4 f3 Terminology T2,7 • Ti j= least cost tree for ai+1, ai+2, …, aj. • ci j= cost of Ti j =Si <= u <= j qu (level(fu) – 1) + Si < u <= j pu *level(su). • ri j= root of Ti j. • wi j= weight of Ti j = sum of ps and qs in Ti j = pi+1+ pi+2+ …+ pj + qi + qi+1 + … + qj

  11. Tii fi i = j • Ti j includes pi+1, pi+2, …, pj and qi, qi+1, …, qj. • Ti i includes qi only. • ci i = cost of Ti i = 0. • ri i = root of Ti i = 0. • wi i = weight of Ti i • = sum of ps and qs in Ti i • = qi

  12. a5 ak a6 a3 Ti j a7 L R a4 f5 f2 f7 f6 f4 f3 i < j • Ti j= least cost tree for ai+1, ai+2, …, aj. • Ti j includes pi+1, pi+2, …, pj and qi, qi+1, …, qj. • Let ak, i < k <= j, be in the root of Ti j. • L includes pi+1, pi+2, …, pk-1 and qi, qi+1, …, qk-1. • R includes pk+1, pk+2, …, pj and qk, qk+1, …, qj.

  13. a5 a6 a3 a7 L a4 f5 f2 f7 f6 f4 f3 cost(L) • L includes pi+1, pi+2, …, pk-1 and qi, qi+1, …, qk-1. • cost(L) = weighted path length ofL when viewed as a stand alone binary search tree.

  14. ak Ti j L L R Contribution To cij • ci j =Si <= u <= j qu (level(fu) – 1) + Si < u <= j pu *level(su). • When L is viewed as a subtree of Ti j , the level of each node is 1 more than when L is viewed as a stand alone tree. • So, contribution of L to cij is cost(L) + wi k-1.

  15. a5 ak a6 a3 Ti j a7 L R a4 f5 f2 f7 f6 f4 f3 cij • Contribution of L to cij is cost(L) + wi k-1. • Contribution of R to cij is cost(R) + wkj. • cij = cost(L) + wi k-1 +cost(R) + wkj + pk = cost(L) + cost(R) + wij

  16. a5 ak a6 a3 Ti j a7 L R a4 f5 f2 f7 f6 f4 f3 cij • cij = cost(L) + cost(R) + wij • cost(L) = cik-1 • cost(R) = ckj • cij = cik-1 + ckj + wij • Don’tknow k. • cij = mini < k <= j{cik-1 + ckj}+ wij

  17. ak Ti j L R cij • cij = mini < k <= j{cik-1 + ckj}+ wij • rij = k that minimizes right side.

  18. Computation Of c0n Andr0n • Start with ci i =0, ri i =0, wi i =qi, 0 <= i <= n (zero-key trees). • Use cij = mini < k <= j{cik-1 + ckj}+ wij to compute cii+1, ri i+1, 0 <= i <= n – 1 (one-key trees). • Now use the equation to compute cii+2, ri i+2, 0 <= i <= n – 2 (two-key trees). • Now use the equation to compute cii+3, ri i+3, 0 <= i <= n – 3 (three-key trees). • Continue until c0n andr0n(n-key tree) have been computed.

  19. 1 2 3 4 0 0 1 2 3 4 cij, rij, i <= j Computation Of c0n Andr0n

  20. Complexity • cij = mini < k <= j{cik-1 + ckj}+ wij • O(n) time to compute one cij. • O(n2)cijs to compute. • Total time is O(n3). • May be reduced to O(n2) by using cij = min ri,j-1 < k <= ri+1,j{cik-1 + ckj}+ wij

  21. T0n a10 T09 T10,n Construct T0n • Root is r0n. • Suppose that r0n = 10. • Construct T09 and T10,n recursively. • Time is O(n).

More Related