Improving Bloom Filter Configuration for Lazy Transactional Memory

Mark Jeffrey and J. Gregory Steffan, ECE, University of Toronto. November 10, 2011.


Presentation Transcript


  1. Improving Bloom Filter Configuration for Lazy Transactional Memory. Mark Jeffrey and J. Gregory Steffan, ECE, University of Toronto. November 10, 2011

  2. Parallel Programming is Hard. [Figure: threads T1, T2, and T3 issuing reads and writes such as Rd(a), Rd(b), Rd(x), Wr(a), Wr(c).] Tools offload some of the burden of managing data accesses: • Memory Race Replay • Atomicity Violation Survival • Transactional Memory • Speculative Optimizations. Many of these tools use Bloom filters. Mark Jeffrey, Improving Bloom Filter Configuration for Lazy TM

  3. Bloom Filter • Bit-vector-based data structure [1970] • offers fast set operations • in exchange for some imprecision • Recently used to compare memory accesses • with unconventional practices: intersection (bitwise &). We show that these new practices are inefficient, both in theory and empirically!

  4. Bloom Filters in Concurrency Tools. Our propositions will improve parallelism!

  5. Tracking Address-Set Conflicts

  6. Address-Sets. [Figure: threads T1, T2, and T3 issuing Rd(a), Rd(b), Rd(x), Wr(a), Wr(c).] Read Set: memory locations read, e.g. R_T1 = {a, b}. Write Set: memory locations written, e.g. W_T1 = {a}

  7. Burden: Address-Set Conflicts. [Figure: the T1/T2/T3 access trace from slide 6.] Conflicts: • conflicting address accesses are dependent • independence -> parallelism! • address conflicts -> no parallelism. Conflict detection requires: • comparing read and write sets

  8. Lazy Conflict Detection. [Figure: T1 executes Rd(a), Wr(b), Rd(c), giving R1 = {a, c} and W1 = {b}; T2 executes Rd(a), Rd(b).] • Detect conflicts at the end of a transaction • Test address-sets for null-intersections
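The null-intersection test on slide 8 can be sketched with exact sets before any Bloom filter is involved. This is a minimal illustration, not the authors' implementation: `has_conflict` is a hypothetical helper, it only compares a committer's write set against another transaction's read set, and T2's read set is inferred from the two reads shown in the diagram.

```python
def has_conflict(committer_writes: set, other_reads: set) -> bool:
    """True when the committer's write set and the other read set intersect."""
    return len(committer_writes & other_reads) > 0

# Address-sets read off slide 8 (R2 is inferred from T2's two reads).
R1, W1 = {"a", "c"}, {"b"}
R2, W2 = {"a", "b"}, set()

# Checked lazily, once, when a transaction reaches its end.
t1_commit_hits_t2 = has_conflict(W1, R2)  # W1 and R2 share b
t2_commit_hits_t1 = has_conflict(W2, R1)  # W2 is empty: null intersection
```

Both transactions read `a`, but read-read sharing is not a conflict; only T1's write to `b` collides with T2's read.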

  9. Bloom Filters (BF)

  10. Bloom Filter Background. [Figure: an address x is hashed by h() to set a bit in the bit vector.] • A Bloom filter is a compact set representation • bit vector: much smaller than the address space

  11. Bloom Filter Background. Query for an address, y: [Figure: y is hashed by h() and the indexed bit is tested, answering {Yes, No}.]

  12. Bloom Filter False Positives (FPs). [Figure: is y in the set that holds x?] • A large address space is encoded into a bit-vector • the response to a query is actually No or Maybe • False Positive: when “Maybe” is wrong
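A minimal sketch of the single-hash filter from slides 10 to 12, showing how a false positive arises. The 8-bit length `M` and the modulo hash `h` are toy assumptions chosen so the collision is easy to follow; they do not come from the slides.

```python
M = 8  # filter length in bits (hypothetical, tiny on purpose)

def h(addr: int) -> int:
    """Toy hash: maps an address to one bit index."""
    return addr % M

def insert(bits: int, addr: int) -> int:
    """Return the filter with the bit for addr set."""
    return bits | (1 << h(addr))

def query(bits: int, addr: int) -> bool:
    """False means No; True only means Maybe."""
    return bool(bits & (1 << h(addr)))

bf = insert(0, 3)              # the set {3}
present = query(bf, 3)         # Maybe, and correct
absent = query(bf, 4)          # No: bit 4 was never set
false_positive = query(bf, 11) # Maybe, but wrong: 11 also hashes to bit 3
```

A No answer is always exact; only Maybe can be wrong, which is why the filter is safe for conflict detection but can cost parallelism.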

  13. Partitioned Bloom Filter. Insert an address, x: • k hash functions h1(), h2(), …, hk() encode k bit indices to set, one per partition

  14. Partitioned Bloom Filter. Query for an address, y: • h1(), h2(), …, hk() select the k bits to test, answering {Maybe, No} • the probability of False Positives is well understood
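Insert and query on a partitioned filter might look like the sketch below. The two toy hashes in `indices`, the partition size, and the list-of-integers layout are assumptions for illustration. `false_positive_rate` is the textbook estimate for a partitioned filter of m bits, k partitions, and n inserted elements (assuming independent, uniform hashes); it is not copied from the slides.

```python
K = 2      # number of hash functions = number of partitions (toy value)
PART = 8   # bits per partition, so m = K * PART = 16 bits in total

def indices(addr: int) -> tuple:
    """Toy hashes h1, h2: one bit index per partition."""
    return (addr % PART, (addr // PART) % PART)

def insert(parts: list, addr: int) -> list:
    """Set one bit in each partition."""
    return [p | (1 << i) for p, i in zip(parts, indices(addr))]

def query(parts: list, addr: int) -> bool:
    """Maybe (True) only if the indexed bit is set in every partition."""
    return all(p & (1 << i) for p, i in zip(parts, indices(addr)))

def false_positive_rate(m: int, k: int, n: int) -> float:
    """Textbook estimate: (1 - (1 - k/m)^n)^k."""
    return (1.0 - (1.0 - k / m) ** n) ** k

bf = insert([0] * K, 3)   # sets bit 3 of partition 0 and bit 0 of partition 1
hit = query(bf, 3)        # Maybe, and correct
miss = query(bf, 11)      # No: partition 0 matches, partition 1 does not
fp = query(bf, 67)        # Maybe, but wrong: 67 maps to the same two bits
rate = false_positive_rate(16, 2, 1)
```

Address 11 would collide under a single hash of the same partition size; requiring a match in every partition is what filters it out here.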

  15. Unconventional Bloom Filter Null-Intersection Tests

  16. Bloom Filter Null-Intersection Tests. [Figure: addresses a1 … a5 queued for querying against a Bloom filter.] Two existing approaches: • build a Queue of Queries (QoQ) • combine the queries into a distinct Bloom filter, replacing many queries with 1 intersection!
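The Queue of Queries approach can be sketched as follows: one address-set is encoded in a Bloom filter, the other is kept as a plain queue of addresses, and each queued address is queried in turn. The toy hashes, sizes, and function names are assumptions for illustration only.

```python
PART = 8  # bits per partition (toy value); two partitions, so k = 2

def indices(addr):
    """Toy hashes h1, h2: one bit index per partition."""
    return (addr % PART, (addr // PART) % PART)

def make_filter(addrs):
    parts = [0, 0]
    for a in addrs:
        for p, i in enumerate(indices(a)):
            parts[p] |= 1 << i
    return parts

def query(parts, addr):
    return all(parts[p] & (1 << i) for p, i in enumerate(indices(addr)))

def maybe_overlap_qoq(parts, queue):
    """Disjoint (False) only if every queued address answers No."""
    return any(query(parts, a) for a in queue)

A = make_filter({3, 20})
disjoint = maybe_overlap_qoq(A, [5, 9])        # every query answers No
true_overlap = maybe_overlap_qoq(A, [5, 20])   # 20 really is in the set
still_disjoint = maybe_overlap_qoq(A, [12, 16])
false_overlap = maybe_overlap_qoq(A, [19])     # 19 hits one bit of 3 and one of 20
```

The cost is one query per queued address, which is the overhead the single intersection is meant to avoid.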

  17. Partitioned BF Intersection. Do two sets share any elements? [Figure: the two filters are combined with bitwise &, partition by partition, answering {Disjoint, Maybe Overlap}.]
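A sketch of the partitioned intersection test, under the usual reasoning that a shared element sets the same bit in every partition of both filters, so a single all-zero partition in the bitwise AND proves the sets are disjoint. Hashes, sizes, and names are toy assumptions.

```python
PART = 8  # bits per partition (toy value); two partitions, so k = 2

def indices(addr):
    """Toy hashes h1, h2: one bit index per partition."""
    return (addr % PART, (addr // PART) % PART)

def make_filter(addrs):
    parts = [0, 0]
    for a in addrs:
        for p, i in enumerate(indices(a)):
            parts[p] |= 1 << i
    return parts

def maybe_overlap(f, g):
    """Disjoint (False) as soon as one partition of f & g is all zeros."""
    return all((fp & gp) != 0 for fp, gp in zip(f, g))

A = make_filter({3, 20})
disjoint = maybe_overlap(A, make_filter({11}))           # partition 1 of the AND is empty
true_overlap = maybe_overlap(A, make_filter({20, 40}))   # 20 is in both sets
false_overlap = maybe_overlap(A, make_filter({12, 16}))  # no shared element, yet every partition overlaps
```

The last case is a False Set-Overlap: bits contributed by different elements line up across partitions even though no single address is shared.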

  18. Unpartitioned BF Intersection. [Figure: two bit vectors combined with bitwise &, answering {Disjoint, Maybe Overlap}.] Any asserted bit in the result indicates a possible set overlap
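The unpartitioned test can be sketched the same way: all k hash functions index one shared bit vector, and any asserted bit in the AND is reported as Maybe Overlap. The vector length and the two toy hashes are assumptions for illustration.

```python
M = 16  # filter length in bits (toy value)

def indices(addr):
    """Two toy hashes that index the same M-bit vector."""
    return (addr % M, (3 * addr + 5) % M)

def make_filter(addrs):
    bits = 0
    for a in addrs:
        for i in indices(a):
            bits |= 1 << i
    return bits

def maybe_overlap(f, g):
    """Any asserted bit in f & g is reported as Maybe Overlap."""
    return (f & g) != 0

A = make_filter({3})                                 # bits 3 and 14
disjoint = maybe_overlap(A, make_filter({11}))       # bits 6 and 11: no shared bit
false_overlap = maybe_overlap(A, make_filter({14}))  # bits 14 and 15: one shared bit
```

Here a single coinciding bit is enough to report an overlap, whereas the partitioned test needs a coincidence in every partition.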

  19. Imprecision in BF Intersection • The Bloom filter was intended for fast Querying • Recent systems use the filter for Intersection • Imprecision can produce False Set-Overlaps (FSO) • We are the first to study Bloom filter FSOs • Our goal is to understand and improve Bloom filter intersection

  20. Important Questions. When using BFs for testing null-intersection: • How do BF Intersection and QoQ compare? • theoretical study [SPAA ’11] • Can we compromise? • new Bloom filter design • Does theory work in practice? • empirical study

  21. Bloom Filters for Null-Intersection Tests. How do BF Intersection and QoQ compare?

  22. Definitions

  23. Definitions h1() h2() … hk() …

  24. Probability of FSO [SPAA ’11] • Unpartitioned BF Intersection • Partitioned BF Intersection • Queue of BF Queries. [Figure: hash functions h1, h2, …, hk for the two intersection variants; elements b1 … b5 each tested for membership (ϵ?) for the queue.]

  25. Comparing FSOs [SPAA ’11]. For any length m and k > 1 hash functions: • Queue of Queries gives the fewest false conflicts • Partitioned intersection improves on Unpartitioned

  26. Bloom Filters for Null-Intersection Tests. Can we compromise? A new Bloom filter design

  27. Batch-of-Bloom-filters (BoB). [Figure: an address x passes through a pre-hash hpre and then hash functions h1 … hk into a batch of Bloom filters.]

  28. BoB Intersection. [Figure: two batches combined with bitwise &, answering {Disjoint, Maybe Overlap}.] BoB: a compromise between QoQ and Intersection
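One possible reading of the BoB diagrams, sketched under stated assumptions: the pre-hash selects a single small partitioned filter in the batch, the element is inserted only there, and two batches may overlap if any pair of corresponding filters overlaps in every partition. The transcript does not spell these rules out, so the structure, the toy hashes in `place`, and all sizes here are hypothetical.

```python
BATCH = 2  # number of small filters in the batch (toy value)
PART = 4   # bits per partition inside each small filter; k = 2 partitions

def place(addr):
    """Assumed pre-hash picks the filter; toy h1, h2 pick one bit per partition."""
    return addr % BATCH, ((addr // 2) % PART, (addr // 8) % PART)

def make_bob(addrs):
    bob = [[0, 0] for _ in range(BATCH)]
    for a in addrs:
        which, idx = place(a)
        for p, i in enumerate(idx):
            bob[which][p] |= 1 << i
    return bob

def maybe_overlap(x, y):
    """Maybe Overlap if some pair of corresponding filters overlaps in every partition."""
    return any(all((fp & gp) != 0 for fp, gp in zip(f, g))
               for f, g in zip(x, y))

A = make_bob({3, 20})
disjoint = maybe_overlap(A, make_bob({5}))      # same filter as 3, but one partition misses
overlap = maybe_overlap(A, make_bob({7, 20}))   # 20 is in both sets
```

Under this reading, elements that land in different filters of the batch can never combine into a false overlap, which is one way a batch could sit between querying and a single intersection.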

  29. Bloom Filters for Null-Intersection Tests. Does theory work in practice?

  30. Methodology • Augment RingSTM with alternate BF configs [Spear et al. SPAA ’08] • unpartitioned Bloom filter intersection • Stress BF configurations using the STAMP benchmarks • 8-core Intel Xeon with SSE2 ISA • 32-bit Linux 2.6.32-5-686

  31. Performance Results: Labyrinth. [Plots: Execution Time and Aborts.] 21% speedup: QoQ, BoB, and partitioned intersection outperform the baseline

  32. Performance Results: Kmeans-low. [Plots: Execution Time and Aborts.] >25% slowdown: querying overhead counteracts the reduced aborts

  33. Conclusion

  34. Conclusion. Conflict detection often applies Bloom filters • for fast set operations: y ∈ S and S1 ∩ S2 • unconventionally using BFs for null-intersection. Our recommendations (from theory & practice): • strongly consider querying before intersection • in hardware, consider intersecting BoBs • build adaptive systems for application behaviors

  35. Improving Bloom Filter Configuration for Lazy Transactional Memory Thank you! markj@eecg.toronto.edu
