Balanced Families of Perfect Hash Functions and Their Applications

Lecturer: Ofer Rothschild. Based on the paper by Noga Alon and Shai Gutner, Tel Aviv University, 2007. The talk concerns families of perfect hash functions from [n] to [k]. For every S ⊆ [n] with |S| = k, the standard notion requires at least one 1-1 function, while the new (balanced) notion requires about the same number of 1-1 functions.

Presentation Transcript


  1. Balanced Families of Perfect Hash Functions and Their Applications
     Lecturer: Ofer Rothschild
     Noga Alon, Shai Gutner
     Tel Aviv University, 2007

  2. Families of perfect hash functions
     • Functions from [n] to [k]
     • For every S ⊆ [n], |S| = k:
       • Standard notion: at least one 1-1 function
       • New notion: about the same number of 1-1 functions
     Balanced Families of Perfect Hash Functions. Slides: Ofer Rothschild, 2008

  3. Motivation
     • Approximation of counting problems
     • The number of times that the following appear in a graph:
       • Simple cycles of size k
       • Simple paths of size k
       • Some fixed subgraph
     • Also in weighted graphs

  4. Previous results and background
     • k-restriction problems
     • Color-coding [Alon et al. 1995]
     • Computational biology

  5. Previous results: explicit constructions of perfect hash functions
     • [Alon et al. 1995]:
       • Size: $2^{O(k)} \log n$
     • Best known explicit construction:
       • Size: $e^{k} k^{O(\log k)} \log n$
     • Lower bound [Naor et al. 1995]:
       • $\Omega(e^{k} \log n / \sqrt{k})$

  6. Previous results: finding and counting paths and cycles
     • Path: $2^{O(k)}|E|$, $2^{O(k)}|V|$ expected time
     • Cycle: $2^{O(k)}|V||E|$, $2^{O(k)}|V|^{\omega}$ expected time
     • Derandomization: extra $\log|V|$ factor
     • Counting:
       • k ≤ 7: $O(|V|^{\omega})$
       • Exactly: #W[1]-complete
       • Randomized: tractable

  7. Previous results: splitters
     • [Naor et al. 1995]

  8. Results
     • A δ-balanced (n,k)-family of perfect hash functions (1 < δ ≤ 2):
       • Non-constructive upper bound
       • Explicit construction:
         • Size: $2^{O(k \log\log k)} (\delta-1)^{-O(\log k)} \log n$
         • Time: $2^{O(k \log\log k)} (\delta-1)^{-O(\log k)}\, n \log n + (\delta-1)^{-O(k/\log k)}$

  9. Results: applications
     • Counting simple paths of length k-1:
       • $2^{O(k \log\log k)} (\delta-1)^{-O(\log k)} |E| \log|V| + (\delta-1)^{-O(k/\log k)}$
     • Counting simple cycles of length k:
       • $2^{O(k \log\log k)} (\delta-1)^{-O(\log k)} |E||V| \log|V| + (\delta-1)^{-O(k/\log k)}$
     • Polynomial if $k = O(\log n / \log\log\log n)$ and δ is fixed
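
The claim that the running times are polynomial for $k = O(\log n / \log\log\log n)$ and fixed δ can be checked with a short calculation. The following is a sketch, assuming n is large enough that $k \le \log n$; the constant c is a placeholder introduced here and is not from the slides.

```latex
\[
k \le \frac{c \log n}{\log\log\log n},\quad k \le \log n
\;\Longrightarrow\;
k \log\log k \;\le\; \frac{c \log n}{\log\log\log n}\cdot \log\log\log n \;=\; c \log n ,
\]
\[
\text{so}\quad 2^{O(k \log\log k)} = 2^{O(\log n)} = n^{O(1)},
\qquad
(\delta-1)^{-O(k/\log k)} = 2^{O(k/\log k)} \le n^{O(1)} \ \text{ for fixed } \delta .
\]
```

Both terms of the bounds are then polynomial in n, and the remaining factors |E|, |V| and log |V| are polynomial as well.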

  10. Definitions

  11. Hashing
      [Figure: a hash function h maps a huge universe U (here U = [n] := {1, 2, …, n}) into a hash table with cells 0, 1, …, m-1; collisions are marked. Illustrations from Uri Zwick, 2008.]

  12. Hashing with chaining
      [Figure: hash table with cells 0, 1, …, i, …, m-1. Illustrations from Uri Zwick, 2008.]

  13. Perfect hash functions
      • Perfect hashing: no collisions
      [Figure: a set S inside [n]. Illustrations from Uri Zwick, 2008.]

  14. Families of perfect hash functions
      • Usually this array will be [k]
      [Figure: sets S and T inside [n].]

  15. δ-balanced (n,k)-family and δ-balanced (n,k,l)-splitter
      [Figure: functions f1 and f2 applied to a set S inside [n].]

  16. Definitions
      • For a family of functions and a set S:
        • inj(S) = the number of functions that are 1-1 on S
        • split(S) = the number of functions that divide S almost equally

  17. Definition 2.1
      • δ-balanced (n,k)-family:
        • Functions from [n] to [k]
        • The number of 1-1 functions is almost equal for all S ⊆ [n] with |S| = k: there is a constant T > 0 such that T/δ ≤ inj(S) ≤ δT
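
To make Definition 2.1 concrete, here is a minimal brute-force sketch that checks balance for tiny parameters. It reads "almost equal" as T/δ ≤ inj(S) ≤ δT for some constant T > 0, the form used on the path-counting slide. The function names, the 0-based indexing, and the representation of a function as a tuple are assumptions made for illustration only.

```python
from itertools import combinations, product

def inj(family, S):
    """Number of functions in `family` that are one-to-one on the set S."""
    return sum(1 for f in family if len({f[x] for x in S}) == len(S))

def inj_range(family, n, k):
    """Return (min, max) of inj(S) over all k-subsets S of {0, ..., n-1}."""
    counts = [inj(family, S) for S in combinations(range(n), k)]
    return min(counts), max(counts)

def is_delta_balanced(family, n, k, delta):
    """True if some T > 0 satisfies T/delta <= inj(S) <= delta*T for every S."""
    lo, hi = inj_range(family, n, k)
    # Such a T exists exactly when lo > 0 and hi <= delta^2 * lo.
    return lo > 0 and hi <= delta * delta * lo

# The family of ALL functions from [4] to [2] is perfectly balanced:
# every 2-subset has inj(S) = k! * k^(n-k) = 2 * 4 = 8.
all_functions = list(product(range(2), repeat=4))
print(inj_range(all_functions, 4, 2))   # prints (8, 8)
```

The exhaustive check takes time proportional to the number of k-subsets times the family size, so it is only meant for sanity-checking small examples.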

  18. Definition 2.2
      • δ-balanced (n,k,l)-splitter:
        • Functions from [n] to [l]
        • The number of functions that divide S into almost equal sets is almost equal for all S ⊆ [n] with |S| = k
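
A small sketch of split(S) may help here. The exact formula is not preserved in the transcript, so the code below assumes one common reading of "divide S almost equally": every one of the l parts receives either ⌊k/l⌋ or ⌈k/l⌉ elements of S. The names `splits` and `split_count` are hypothetical.

```python
from collections import Counter

def splits(f, S, l):
    """True if f sends floor(k/l) or ceil(k/l) elements of S to each j in [l]."""
    k = len(S)
    sizes = Counter(f[x] for x in S)
    lo, hi = k // l, -(-k // l)          # floor and ceiling of k/l
    return all(lo <= sizes[j] <= hi for j in range(l))

def split_count(family, S, l):
    """split(S): number of functions in `family` that split S almost equally."""
    return sum(1 for f in family if splits(f, S, l))
```

Under this reading, when k ≤ l the condition reduces to f being 1-1 on S, so split(S) coincides with inj(S) in that case.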

  19. Theorems

  20. Lemma 2.3
      • For any k < l, let H be an explicit δ-balanced (n, k, l)-splitter of size N and let G be an explicit γ-balanced (l, k)-family of perfect hash functions of size M. We can use H and G to get an explicit δγ-balanced (n, k)-family of perfect hash functions of size NM.
      • (n,k,l)-splitter * (l,k)-family = (n,k)-family
      • Proof: Compose the functions.
      • Lemma 2.4: A similar lemma for k > l
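
The composition step in the proof of Lemma 2.3 can be sketched as follows. As a stand-in for an explicit splitter and family, the example uses the sets of all functions [n] → [l] and [l] → [k], which are perfectly balanced (δ = γ = 1); this choice, the function names, and the tuple representation are assumptions for illustration.

```python
from itertools import combinations, product

def compose_families(H, G):
    """All compositions g(h(x)), for h in H ([n] -> [l]) and g in G ([l] -> [k])."""
    return [tuple(g[y] for y in h) for h in H for g in G]

def inj(family, S):
    """Number of functions in `family` that are one-to-one on S."""
    return sum(1 for f in family if len({f[x] for x in S}) == len(S))

n, l, k = 5, 3, 2
H = list(product(range(l), repeat=n))   # all 3^5 = 243 functions [n] -> [l]
G = list(product(range(k), repeat=l))   # all 2^3 = 8 functions [l] -> [k]
F = compose_families(H, G)

counts = {inj(F, S) for S in combinations(range(n), k)}
print(len(F), counts)                   # prints 1944 {648}
```

A composed function is 1-1 on S exactly when h is 1-1 on S and g is 1-1 on h(S), so the counts multiply (here 162 · 4 = 648), which is why the balance parameters multiply to δγ and the sizes to NM.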

  21. Probabilistic constructions
      • Theorem 3.3. For any 1 < δ ≤ 2, there exists a δ-balanced (n, k)-family of perfect hash functions of size:
      • Proof plan:
        • $p = k!/k^{k}$ (the probability that a random function is 1-1 on a fixed S)
        • Take M random functions
        • Prove that, with positive probability, these M functions form such a family

  22. Brave Sir Robbin ran away
      • Chernoff: Let Y be the sum of mutually independent indicator random variables, μ = E[Y]. For all 1 ≤ δ ≤ 2, $\Pr[\mu/\delta \le Y \le \delta\mu] > 1 - 2e^{-(\delta-1)^{2}\mu/8}$.
      • Robbins: For every integer n ≥ 1, $\sqrt{2\pi}\, n^{n+1/2} e^{-n+1/(12n+1)} < n! < \sqrt{2\pi}\, n^{n+1/2} e^{-n+1/(12n)}$.
      • E[inj(S)] = pM
      • Therefore, the chance that for at least one set S, the number of 1-1 functions will not be as needed is at most: QED
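
The final step of this proof can be sketched with a union bound over all sets S. The derivation below is a reconstruction from the Chernoff and Robbins statements on this slide, not a transcription of the slide's own formula; the explicit constants are illustrative.

```latex
\[
\Pr\bigl[\exists S:\ \mathrm{inj}(S) \notin [\,pM/\delta,\ \delta pM\,]\bigr]
\;\le\; \binom{n}{k}\cdot 2e^{-(\delta-1)^{2} pM/8}
\;\le\; 2\,n^{k}\, e^{-(\delta-1)^{2} pM/8}
\;<\; 1
\quad\text{whenever}\quad
M > \frac{8\,(k \ln n + \ln 2)}{p\,(\delta-1)^{2}} .
\]
\[
\text{By Robbins, } \frac{1}{p} = \frac{k^{k}}{k!} < \frac{e^{k}}{\sqrt{2\pi k}},
\quad\text{so it suffices to take}\quad
M = O\!\left(\frac{e^{k}\sqrt{k}\,\log n}{(\delta-1)^{2}}\right).
\]
```

With such an M, some choice of M functions satisfies pM/δ ≤ inj(S) ≤ δpM for every S, which is the balance condition with T = pM.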

  23. Probabilistic constructions
      • Similar theorems for splitters

  24. Explicit constructions
      • Theorem 4.1. For any fixed 1 < δ ≤ 2, a δ-balanced (n,k)-family of perfect hash functions of size: can be constructed deterministically within time:

  25. Proof begins
      • $p = k!/k^{k}$

  26. For any choice of M functions $f_1, \dots, f_M$ and a set S:
      • $X_{S,i}$ := 1 if $f_i$ is 1-1 on S, and 0 otherwise
      • $X_S := \sum_{i=1}^{M} X_{S,i}$, the number of functions that are 1-1 on S

  27. What do you expect?
      • Let's show that even if we take M independent random functions, it will usually be OK
      • Later we shall make the construction deterministic

  29. Let's bound
      • $1 + u \le e^{u}$
      • $e^{-u} \le 1 - u + u^{2}/2$ (for u ≥ 0)

  30. Are we there yet?

  31. The deterministic construction
      • This is what we expect at random.
      • That will be our upper bound for the deterministic construction.
      • We shall use a greedy algorithm.
      • We shall find the functions in this order:
          for (i = 1; i <= M; i++) {        // f_1, f_2, …, f_M
              for (j = 1; j <= n; j++) {    // f_i(1), f_i(2), …, f_i(n)
                  find f_i(j)
              }
          }

  32. At every step we shall choose the value that gives the minimal conditional expectation
      • The conditional expectation can be computed at each step in time
      • We start with
      • At every step the conditional expectation does not increase
      • In particular, at the end,
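
The greedy step can be sketched in code for very small n and k. The exact potential Φ is not preserved in this transcript, so the sketch assumes a standard exponential-moment potential, the sum over all S of E[e^{λ(X_S − pM)}] + E[e^{−λ(X_S − pM)}], conditioned on the values fixed so far. The function name `greedy_family`, the parameter `lam` (playing the role of λ), and the 0-based indexing are assumptions made for illustration.

```python
from itertools import combinations
from math import exp, factorial

def greedy_family(n, k, M, lam):
    """Choose f_1(1), ..., f_M(n) one value at a time, minimizing the potential."""
    p = factorial(k) / k**k
    subsets = list(combinations(range(n), k))
    count = {S: 0 for S in subsets}       # X_S over the functions completed so far

    def prob_injective(S, partial):
        """Pr[f is 1-1 on S], given that f agrees with `partial` on its prefix."""
        seen = [partial[x] for x in S if x < len(partial)]
        if len(set(seen)) < len(seen):
            return 0.0                    # a collision has already happened
        free = k - len(seen)
        return factorial(free) / k**free

    def potential(partial, remaining):
        """Conditional expectation of the assumed potential (see lead-in)."""
        total = 0.0
        for S in subsets:
            q = prob_injective(S, partial)
            for t, shift in ((lam, -lam * p * M), (-lam, lam * p * M)):
                grow = exp(t) - 1.0
                total += (exp(shift + t * count[S])
                          * (1.0 + q * grow)                # current function
                          * (1.0 + p * grow) ** remaining)  # untouched functions
        return total

    family, history = [], [potential([], M - 1)]
    for i in range(M):
        partial = []
        for _ in range(n):
            # Try every value in [k] and keep the one with the smallest potential.
            best = min(range(k), key=lambda c: potential(partial + [c], M - i - 1))
            partial.append(best)
            history.append(potential(partial, M - i - 1))
        f = tuple(partial)
        family.append(f)
        for S in subsets:
            if len({f[x] for x in S}) == k:
                count[S] += 1
    return family, count, history

family, count, history = greedy_family(n=6, k=3, M=12, lam=0.3)
print(min(count.values()), max(count.values()))
```

Because the potential before a step equals the average of the potentials over the k candidate values, the minimum never exceeds it, so `history` is non-increasing. The sketch enumerates all k-subsets at every step, which is feasible only for tiny inputs.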

  33. What is this Φ again?
      • In our case
      • In particular, for every S:
      • And with simple manipulations we get to:
        • $pM/\delta \le X_S \le \delta pM$
      QED (Theorem 4.1)

  34. For example

  35. We are only interested in:
      • Candidate values 1, 2 or 5: $E \mathrel{+}= e^{0} = 1$
      • Candidate values 3 or 4: $E \mathrel{+}= \tfrac{1}{5} e^{-\lambda} + \tfrac{4}{5} e^{0}$

  36. Some more theorems
      • The cost of the construction in Theorem 4.1 is too much.
      • We shall use (4.1) only as a part of the construction
      • Theorem 4.2. For any 1 < δ ≤ 2, a δ-balanced $(n, k, \lceil 2k^{2}/(\delta-1) \rceil)$-splitter of size $k^{O(1)} \log n/(\delta-1)^{O(1)}$ can be constructed in time $k^{O(1)}\, n \log n/(\delta-1)^{O(1)}$.
      • (Using error correcting codes)

  37. Some more theorems
      • Theorem 4.3. For any k ≥ l and 1 < δ ≤ 2, a δ-balanced (n, k, l)-splitter of size $2^{O(k \log l - \log(\delta-1))} \log n$ can be constructed in time $2^{O(k \log l - \log(\delta-1))}\, n \log n$.
      • (Using almost k-wise independence)
      • Corollary 4.4. For any fixed c > 0, a $(1 + c^{-k})$-balanced (n, k, 2)-splitter of size $2^{O(k)} \log n$ can be constructed in time $2^{O(k)}\, n \log n$.

  38. Theorem 4.5
      • For 1 < δ ≤ 2, a δ-balanced (n, k)-family of perfect hash functions of size $2^{O(k \log\log k)} \log n/(\delta-1)^{O(\log k)}$ can be constructed in time $2^{O(k \log\log k)}\, n \log n/(\delta-1)^{O(\log k)} + (\delta-1)^{-O(k/\log k)}$.
      • In particular, for any fixed 1 < δ ≤ 2, the size is $2^{O(k \log\log k)} \log n$ and the time is $2^{O(k \log\log k)}\, n \log n$.
      • Proof: Using all the theorems we construct balanced families and splitters, and then we compose them using the lemmas.

  39. Applications

  40. Applications
      • Counting simple paths of length k-1:
        • $2^{O(k \log\log k)} (\delta-1)^{-O(\log k)} |E| \log|V| + (\delta-1)^{-O(k/\log k)}$
      • Counting simple cycles of length k:
        • $2^{O(k \log\log k)} (\delta-1)^{-O(\log k)} |E||V| \log|V| + (\delta-1)^{-O(k/\log k)}$
      • Polynomial if $k = O(\log n / \log\log\log n)$ and δ is fixed

  41. How many paths must a man walk down before you call him a man?
      [Figure: a graph whose vertices are coloured 1, 2, 3. Illustration from Shirly Zilkha, 2008.]
      • Build a δ-balanced (|V|, k)-family of perfect hash functions using Theorem 4.5; these are the colourings
      • Compute the number of colourful paths for every colouring
      • (T/δ) · (number of paths) ≤ Σ over all colourings of (number of colourful paths) ≤ δT · (number of paths)
      • Divide by T
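
The counting procedure on this slide can be sketched for a toy graph. As a stand-in for the family of Theorem 4.5, the example uses the family of all colourings [n] → [k], which is perfectly balanced (δ = 1) with T = k! · k^(n−k); that substitution, the function names, and the adjacency-list representation are assumptions for illustration. Paths are counted as ordered vertex sequences, so each undirected path is counted twice.

```python
from itertools import product
from math import factorial

def count_colourful_paths(adj, colour, k):
    """Ordered paths on k vertices whose vertices receive k distinct colours."""
    n = len(adj)
    # dp[v][mask] = number of colourful paths ending at v that use colour set `mask`
    dp = [{1 << colour[v]: 1} for v in range(n)]
    for _ in range(k - 1):
        new = [dict() for _ in range(n)]
        for u in range(n):
            for mask, cnt in dp[u].items():
                for v in adj[u]:
                    bit = 1 << colour[v]
                    if not mask & bit:            # colour of v not used yet
                        new[v][mask | bit] = new[v].get(mask | bit, 0) + cnt
        dp = new
    return sum(sum(d.values()) for d in dp)

def estimate_paths(adj, k, family, T):
    """Sum the colourful-path counts over all colourings, then divide by T."""
    return sum(count_colourful_paths(adj, f, k) for f in family) / T

# Toy example: the 4-cycle 0-1-2-3-0 and paths on k = 3 vertices.
adj = [[1, 3], [0, 2], [1, 3], [0, 2]]
n, k = 4, 3
family = list(product(range(k), repeat=n))   # all colourings: balanced with delta = 1
T = factorial(k) * k ** (n - k)              # inj(S) for every S in this family
print(estimate_paths(adj, k, family, T))     # prints 8.0
```

With a δ-balanced family in place of the full family, the same sum divided by T lies between (number of paths)/δ and δ · (number of paths), which is the approximation guarantee stated on the slide.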

  42. Concluding remarks

  43. Other counting problems
      • The constant T can be easily computed

  44. What's next?
      • Can we decrease the size of the balanced family to $2^{O(k)} \log n$?
      • What about $k = \Theta(\log n)$?

  45. References
      • Noga Alon and Shai Gutner. Balanced families of perfect hash functions and their applications. In ICALP, volume 4596 of Lecture Notes in Computer Science, pages 435-446. Springer, 2007. http://courses.cs.tau.ac.il/combsem/09a/combsem.html
      • Uri Zwick. A lecture on hashing from the course in Data Structures, Tel Aviv University, 2007. http://www.cs.tau.ac.il/courses/0368-2158/08a/
      • Shirly Zilkha. Presentation on Color Coding, 2008. http://courses.cs.tau.ac.il/combsem/09a/combsem.html