
Hashing Technology: Past, Now, and Future

Hashing Technology: Past, Now, and Future. Chin-Chen Chang, Ph.D. Chair Professor Dept. of Information Engineering and Computer Science, Feng Chia University, Taiwan. Outlines. 1. Introduction 2. Conventional Methods 3. Collision Resolution Strategies 4. Evaluation 5. Perfect Hashing


Presentation Transcript


  1. Hashing Technology: Past, Now, and Future Chin-Chen Chang, Ph.D. Chair Professor Dept. of Information Engineering and Computer Science, Feng Chia University, Taiwan

  2. Outlines • 1. Introduction • 2. Conventional Methods • 3. Collision Resolution Strategies • 4. Evaluation • 5. Perfect Hashing • 5.1 Ranking • 5.2 Table look up • 5.3 Rehashing • 5.4 Quotients and Remainders • 6. Minimal Perfect Hashing • 7. Conclusions

  3. 1. Introduction • Non-linear Organization • Ex: binary tree • Linear Organization

  4. Introduction (Cont.) • hashing • key-to-address transformation • problem: collision • particular cases: (1) h is a one-to-one mapping and | key space | ≤ | address space |: perfect hashing

  5. Hashing • (2) one-to-one mapping and | key space | = | address space |: minimal perfect hashing • Ex: key set = {2^0, 2^1, …, 2^17}; h(k) = k mod 19 is a perfect hashing function
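
The example on this slide can be checked directly. The short sketch below (variable names are illustrative, not from the slides) hashes the 18 powers of two with k mod 19 and confirms that no two keys share an address.

```python
# Keys 2^0 .. 2^17, hashed with h(k) = k mod 19 as on the slide.
keys = [2 ** i for i in range(18)]
addresses = [k % 19 for k in keys]

# A perfect hash function maps distinct keys to distinct addresses.
is_perfect = len(set(addresses)) == len(keys)
print(is_perfect, sorted(addresses))
```

The addresses fill 1 through 18 exactly once because 2 generates every nonzero residue modulo 19.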

  6. 2. Conventional Methods
     • 2.1 Division: H(x) = (x mod m) + 1
       m even: not good; m prime: good; m odd with no factors less than 20: good
     • 2.2 Midsquare hashing method: square the key x and take the middle digits of x^2 as H(x), with H(x) ≤ m
       Ex: x = 113586, m = 9999; x^2 = 12901779396; H(x) = 1779 (the middle four digits)
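
A minimal sketch of the two methods on this slide. The function names and the `start`/`width` parameters are assumptions made for illustration; the slide only shows which digits of the square are kept.

```python
def division_hash(x, m):
    """Division method from the slide: H(x) = (x mod m) + 1, giving 1..m."""
    return (x % m) + 1

def midsquare_hash(x, start, width):
    """Square the key and keep `width` decimal digits beginning at index `start`.
    (Hypothetical parameters: the slide marks the kept digits graphically.)"""
    digits = str(x * x)
    return int(digits[start:start + width])

square = 113586 * 113586                    # 12901779396
middle = midsquare_hash(113586, 4, 4)       # digits 1 7 7 9
print(square, middle, division_hash(100, 7))
```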

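  7. Conventional Methods (Cont.)
     • 2.3 Folding method: split the key x into equal-width parts and add them. Ex: x = 187 | 249 | 653, m = 999
       Shift folding: 187 + 249 + 653 = 1089
       Boundary folding (outer parts reversed): 781 + 249 + 356 = 1386
       H(x) is taken from the digits of the sum (underlined on the slide).
     • 2.4 Radix transformation: read the base-q digits of x as a base-p number, where p > q and p, q are relatively prime; express the result x' in base q and take part of it, x'', to be H(x).
       Ex: (530476)_10 → (530476)_11 = 5*11^5 + 3*11^4 + 4*11^2 + 7*11^1 + 6*11^0 = (849745)_10; H(530476) = 745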
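
The arithmetic in both examples can be reproduced with a few lines. This is a sketch only: the function names are illustrative, and keeping the last three digits in the radix example is inferred from H(530476) = 745.

```python
def shift_fold(parts):
    """Add the parts of the key as they are."""
    return sum(int(p) for p in parts)

def boundary_fold(parts):
    """Reverse every other part (starting with the first) before adding."""
    return sum(int(p[::-1]) if i % 2 == 0 else int(p)
               for i, p in enumerate(parts))

def radix_hash(x, p=11, keep=1000):
    """Reinterpret the decimal digits of x in base p, then keep the low digits."""
    return int(str(x), p) % keep

parts = ["187", "249", "653"]
print(shift_fold(parts), boundary_fold(parts))   # sums shown on the slide
print(int("530476", 11), radix_hash(530476))
```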

  8. Conventional Methods (Cont.) • 2.5 Algebraic Coding [Knuth 1973] • Construct a polynomial

  9. Conventional Methods (Cont.) • 2.6 Multiplication Hashing [by Knott and Knuth]

  10. 3. Collision Resolution Strategies • 3.1 linear probing (linear open addressing) • 3.2 quadratic probing: examine f(x), (f(x) + i^2) mod b, (f(x) − i^2) mod b, for 1 ≤ i ≤ (b − 1)/2 • Note: if b is a prime of the form 4j+3 (e.g., 7, 127), then every location will be probed.
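
The note about table sizes of the form 4j+3 can be illustrated with a small experiment. The sketch below assumes the usual probe order f(x), f(x) + i^2, f(x) − i^2 (all mod b) for i = 1 … (b − 1)/2; the function name is illustrative.

```python
def quadratic_probe_sequence(home, b):
    """Locations examined by quadratic probing starting from `home`."""
    seq = [home % b]
    for i in range(1, (b - 1) // 2 + 1):
        seq.append((home + i * i) % b)
        seq.append((home - i * i) % b)
    return seq

# b = 7 and b = 127 are primes of the form 4j+3: every slot is reached.
covered_7 = len(set(quadratic_probe_sequence(0, 7)))
covered_127 = len(set(quadratic_probe_sequence(0, 127)))
# b = 13 is a prime of the form 4j+1: only part of the table is reached.
covered_13 = len(set(quadratic_probe_sequence(0, 13)))
print(covered_7, covered_127, covered_13)
```

For b = 13 the positive and negative squares coincide, so the sequence revisits the same 7 slots and can fail to find a free location even when one exists.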

  11. Collision Resolution Strategies (Cont.) • 3.3 random probing: the probe increment is a pseudo-random number, taken mod b • 3.4 rehashing: use a series of hash functions f1, f2, …, fm

  12. 4. Evaluation • (i) fast? • (ii) insertion? • (iii) deletion? • (iv) large memory space? • (v) key value used? • (vi) key access frequency used? Yes • (vii) unsuccessful search? Hard to say, but can be enhanced

  13. Evaluation (Cont.) Explanation:
      • (vi) Keys 133 (freq. = 2), 78 (freq. = 65), and 89 (freq. = 50) collide: store 78, the most frequently accessed key.
      • (vii) Static: no collision, no unsuccessful search. If there is a collision, then search sequentially.
      • Enhancement: static sort. Ex: 16, 23, 45, 78, 103, 151, 179 (mod 11). Search 92: check location 4. ∵ 92 < 103 ∴ stop
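
The static-sort enhancement can be sketched as follows, assuming linear probing and addresses 0 to 10 (the slide gives only the modulus 11 and the sequential search). Keys are inserted in ascending order, so an unsuccessful search can stop as soon as it meets a larger key: that key was inserted later, so the slot was free when the sought key would have been stored.

```python
B = 11  # table size; h(k) = k mod 11 as on the slide

def build_sorted_table(keys):
    """Insert keys in ascending order, resolving collisions by linear probing."""
    table = [None] * B
    for k in sorted(keys):
        i = k % B
        while table[i] is not None:
            i = (i + 1) % B
        table[i] = k
    return table

def search(table, x):
    """Return the slot holding x, or None. Stops early on a larger key."""
    i = x % B
    for _ in range(B):
        y = table[i]
        if y is None or y > x:
            return None          # x cannot lie further along the probe path
        if y == x:
            return i
        i = (i + 1) % B
    return None

table = build_sorted_table([16, 23, 45, 78, 103, 151, 179])
print(table)
print(search(table, 92), search(table, 179))
```

Searching for 92 inspects only location 4, finds 103, and stops, as in the slide's example.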

  14. Dynamic: 78, 45, 23, 103, 179, 151, 16 • Adv. and Disadv. • Disadvantage: (1) keys must be re-ordered for printing; (2) if the hash function is lost, all data are lost. • Advantage: security • Goal: perfect hashing → minimal perfect hashing

  15. 5. Perfect Hashing A record The set of keys • Perfect Hashing • Minimal Perfect Hashing

  16. 5.1 Ranking [Ghosh 1977]
      Keys: k1=11001, k2=10100, k3=10111, k4=11101, k5=10010, k6=11011, k7=10001
      Sorted: k7=10001, k5=10010, k2=10100, k3=10111, k1=11001, k6=11011, k4=11101
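
A minimal sketch of the sorting step shown on this slide. Using the 1-based position in sorted order as the address is an assumption here; the continuation slide that would define the ranking function is not preserved in the transcript.

```python
keys = {"k1": "11001", "k2": "10100", "k3": "10111", "k4": "11101",
        "k5": "10010", "k6": "11011", "k7": "10001"}

# Sort key names by the numeric value of their bit strings.
order = sorted(keys, key=lambda name: int(keys[name], 2))

# Assumed ranking: address = position in sorted order (1-based).
rank = {name: pos for pos, name in enumerate(order, start=1)}
print(order)
print(rank)
```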

  17. Ranking (Cont.)

  18. 5.2 Table look up • [Cichelli 1980] Ex: PASCAL 36 reserved words
      h(k) = k's length + value of (k's first character) + value of (k's last character)
      Letter values: A=11, B=15, C=1, D=0, E=0, F=15, G=3, H=15, I=13, J=0, K=0, L=15, M=15, N=13, O=0, P=15, Q=0, R=14, S=6, T=6, U=14, V=10, W=6, X=0, Y=13, Z=0
      Words in hash order: DO, END, ELSE, CASE, DOWNTO, GOTO, TO, OTHERWISE, TYPE, WHILE, CONST, DIV, AND, SET, OR, OF, MOD, FILE, RECORD, PACKED, NOT, THEN, PROCEDURE, WITH, REPEAT, VAR, IN, ARRAY, IF, NIL, FOR, BEGIN, UNTIL, LABEL, FUNCTION, PROGRAM
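
The table-lookup function for the Pascal reserved words can be checked mechanically. The sketch below uses the letter values from the slide with I = 13, the value under which IN and IF do not collide with other words; the identifier names are illustrative.

```python
from string import ascii_uppercase

# Letter values A..Z (I = 13).
VALUES = [11, 15, 1, 0, 0, 15, 3, 15, 13, 0, 0, 15, 15, 13, 0, 15, 0,
          14, 6, 6, 14, 10, 6, 0, 13, 0]
LETTER = dict(zip(ascii_uppercase, VALUES))

WORDS = ["DO", "END", "ELSE", "CASE", "DOWNTO", "GOTO", "TO", "OTHERWISE",
         "TYPE", "WHILE", "CONST", "DIV", "AND", "SET", "OR", "OF", "MOD",
         "FILE", "RECORD", "PACKED", "NOT", "THEN", "PROCEDURE", "WITH",
         "REPEAT", "VAR", "IN", "ARRAY", "IF", "NIL", "FOR", "BEGIN",
         "UNTIL", "LABEL", "FUNCTION", "PROGRAM"]

def h(word):
    """Length plus the values of the first and last characters."""
    return len(word) + LETTER[word[0]] + LETTER[word[-1]]

hashes = [h(w) for w in WORDS]
print(hashes)
```

The 36 words map onto the 36 consecutive addresses 2 through 37, so the function is a minimal perfect hash for this key set.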

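  19. Table look up (Cont.) Ex: Month abbreviations JAN, FEB, MAR, APR, MAY, JUN, JUL, AUG, SEP, OCT, NOV, DEC
      h(k) = value of (k's 2nd char.) + value of (k's 3rd char.)
      Letter frequency:
        A E U C N P R B G L O T V Y
        3 3 3 2 2 2 2 1 1 1 1 1 1 1
      Get the (2nd, 3rd) character pairs and their summed letter frequencies:
        UN EP EC UG AN EB UL PR CT AY AR OV
        5  5  5  4  5  4  4  4  3  4  5  2
      Sort by summed frequency: UN EP EC AN AR | UG EB UL PR AY | CT | OV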

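  20. Table look up (Cont.) Assign letter values
      Letter values tried: U=0, N=0, E=0, P=1, C=2, A=3, R=1, G=5, B=6, L=7, Y=1, T=7, O=0, V=11
      Pair values: UN=0, EP=1, EC=2, AN=3, AR=4, UG=5, EB=6, UL=7, PR=2, AY=4, CT=9, OV=11
      Slide annotations on the conflicting assignments: "Contradiction! Change to 8"; "Change to 7"; "OV=11 become 10"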

  21. 5.3 Rehashing [Du 1980] • First level rehash Ex:

  22. Rehashing (Cont.) HIT Ex: n=14, m=17, s=1, 3, 7

  23. Rehashing (Cont.) • Second level rehash

  24. Rehashing (Cont.)

  25. 5.4 Quotients and Remainders • (a) Sprugnoli 1977 • (b) Jaeschke 1981
      • (a) Quotient reduction and remainder reduction
      • Quotient reduction Ex: {17, 138, 173, 294, 306, 472, 540, 551, 618}
          key:     17  138  173  294  306  472  540  551  618
          address:  0    2    3    4    5    7    8    9   10
        Disadvantage: key set = {1, 2, 3, 1000, 3000, 9000}, not good
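
The formula for quotient reduction is not preserved in this transcript. As a sketch, assuming the common form h(k) = ⌊(k + s) / N⌋, the parameters s = 25 and N = 64 (chosen here because they reproduce the addresses in the example, not taken from the slide) give the mapping shown.

```python
def quotient_reduction(k, s, n):
    """Assumed form of quotient reduction: h(k) = floor((k + s) / n)."""
    return (k + s) // n

KEYS = [17, 138, 173, 294, 306, 472, 540, 551, 618]
S, N = 25, 64   # assumed parameters that reproduce the slide's addresses

addresses = [quotient_reduction(k, S, N) for k in KEYS]
print(addresses)
```

The nine keys land in distinct addresses within 0 to 10, so the function is perfect but not minimal. A key set with widely varying gaps, such as {1, 2, 3, 1000, 3000, 9000}, needs a divisor small enough to separate 1, 2, and 3, which leaves a very sparse table.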

  26. Quotients and Remainders (Cont.) • Remainder reduction • (b) Reciprocal hashing Ex: key set={10, 3, 14, 11, 6, 0, 21, 9, 1, 7, 20, 4} minimal perfect hashing function

  27. 6. Minimal Perfect Hashing since

  28. Minimal Perfect Hashing (Cont.) • Some questions C C mod P(ki)

  29. Minimal Perfect Hashing (Cont.) • Ans(1): Chinese Remainder Theorem

  30. Minimal Perfect Hashing (Cont.) • Ans(2): Use Prime number Functions • Ans(3): C = Σ b_i M_i i mod M

  31. Minimal Perfect Hashing (Cont.) • Ex: m1=4, m2=5, m3=7, m4=9; compute C from b_i, M_i, and i
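
The example moduli can be worked through with the standard Chinese Remainder Theorem construction. This is a sketch under the assumption that the goal is a constant C with C mod m_i = i for each i, with M the product of the moduli, M_i = M / m_i, and b_i the inverse of M_i modulo m_i; the function name is illustrative.

```python
from math import prod

def crt_constant(moduli):
    """Find C with C mod m_i == i (i = 1..n) for pairwise coprime moduli."""
    M = prod(moduli)
    total = 0
    for i, m in enumerate(moduli, start=1):
        M_i = M // m
        b_i = pow(M_i, -1, m)      # modular inverse (Python 3.8+)
        total += b_i * M_i * i
    return total % M

MODULI = [4, 5, 7, 9]
C = crt_constant(MODULI)
residues = [C % m for m in MODULI]
print(C, residues)
```

With these moduli M = 1260 and the construction yields C = 157, whose residues are 1, 2, 3, 4 in order.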

  32. Applications-12 months English identifiers JANUARY FEBRUARY MARCH APRIL MAY JUNE JULY AUGUST SEPTEMBER OCTOBER NOVEMBER DECEMBER

  33. Applications-12 months English identifiers (Cont.) • Extract (The 2nd char., The 3rd char.) (A, N) (E, B) (A, R) (P, R) (A, Y) (U, N) (U, L) (U, G) (E, P) (C, T) (O, V) (E, C)

  34. Applications-12 months English identifiers (Cont.)

  35. Applications-12 months English identifiers (Cont.) H(P, R) = d(P) + (c(P) mod p(R)) = 8 + (1 mod 61) = 8 +1 = 9

  36. 36 Pascal Reserved Words ARRAY, AND, BEGIN, CASE, CONST, DOWNTO, DO, DIV, END, ELSE, FUNCTION, FILE, FOR, GOTO, IF, IN, LABEL, MOD, NIL, NOT, OTHERWISE, OF, OR, PROCEDURE, PROGRAM, PACKED, REPEAT, RECORD, SET, TYPE, THEN, TO, UNTIL, VAR, WITH, WHILE

  37. AA, AD, BI, CE, CS, DN DO, DV, ED, EE, FC, FE GO, IF, IN, LE, MD, NL NT, OE, OF, OR, PC, PG PK, RE, RO, ST, TE, TN TO, UI, VR, WH, WL

  38. H(W, L) = d(W) + (c(W) mod p(L)) = 34 + (39 mod 37) = 34 + 2 = 36

  39. 7. Conclusions • Small key set: • [Sprugnoli 1977] • [Cichelli 1980] • [Jaeschke 1981] • [Chang 1984] • Large key set: • [Du, Jea and Shieh 1980] • Open problems • Design a perfect hashing function to allow insertion and deletion of keys • Multi-key hashing • Find an efficient algorithm for determining whether or not a minimal perfect hash table exists…[for Cichelli’s]

  40. Hashing Technology: Past, Now, and Future Thank you so much!
