A Power Management Proxy with a New Best-of-N Bloom Filter Design to Reduce False Positives

Miguel Jimeno

Ken Christensen

Department of Computer Science and Engineering

University of South Florida

Tampa, FL 33620

{mjimeno, [email protected]

- Introduction & Background
- Research Problem
- The SmartNIC
- The new Design: Best-of-N Method
- Analysis of Best-of-N Method
- Numerical Results & Experiments Evaluation
- Summary & Future Work

- The internet consumes 2% of all the electricity consumed in the US.[1]
- An average PC consumes 120 W when fully powered-on.[10]
- PCs could add 10% to typical US residential electricity consumption.
- P2P applications make the PC remain “on the net” all the time, even though it is idle 99% of the time.

[1] K. Kawamoto, J. Koomey, B. Nordman, R. Brown, M. Piette, M. Ting, and A. Meier, “Electricity Used by Office Equipment and Network Equipment in the U.S.: Detailed Report and Appendices,” Technical Report LBNL-45917, Energy Analysis Department, Lawrence Berkeley National Laboratory, 2001.

- Can a P2P application be run on a small, low-power microcontroller?
- The PC could then be power managed.
- The microcontroller can’t store a large list of file names.
Bloom Filters:

- Bloom filters are a well known probabilistic data structure for representing a list of file name strings.

- A group of hash functions is used to map elements into an array of bits.

- False negatives are not possible, but there is a probability of generating false positives:

$$\Pr[\text{false positive}] = \left(\frac{s}{m}\right)^{k}$$

where m = size of the Bloom filter in bits, k = number of hash functions used to calculate a Bloom filter, and s = number of bits set.

Figure 1. Bloom filter of size m bits with k = 4 hash functions. Image taken from [9].
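The structure described above can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' implementation: the class name, the method names, the sample file names, and the use of index-salted MD5 to simulate k hash functions are all assumptions made for the example.

```python
import hashlib


class BloomFilter:
    """Minimal Bloom filter with m bits and k hash functions."""

    def __init__(self, m, k):
        self.m = m
        self.k = k
        self.bits = [0] * m

    def _positions(self, item):
        # Assumption: k hash functions are simulated by salting MD5 with the index i.
        for i in range(self.k):
            digest = hashlib.md5(f"{i}:{item}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.m

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos] = 1

    def might_contain(self, item):
        # True may be a false positive; False is always correct.
        return all(self.bits[pos] for pos in self._positions(item))

    def bits_set(self):
        return sum(self.bits)

    def false_positive_estimate(self):
        # (s / m) ** k, where s is the number of bits set.
        return (self.bits_set() / self.m) ** self.k


bf = BloomFilter(m=1024, k=4)
for name in ["song_a.mp3", "song_b.mp3", "song_c.mp3"]:  # hypothetical file names
    bf.add(name)

print(bf.might_contain("song_a.mp3"))   # stored names always test positive
print(bf.bits_set(), bf.false_positive_estimate())
```

Because the estimate depends only on s for fixed m and k, a filter with fewer bits set gives a lower probability of false positive.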

- Bloom filters were first proposed by Bloom [2].
- Kirsch and Mitzenmacher proposed a way to calculate a Bloom filter with less hashing [7].
- Lumetta and Mitzenmacher used the Power of Two Choices to calculate the Bloom filter [8].

[2] B. Bloom, “Space/Time Tradeoffs in Hash Coding with Allowable Errors,” Communications of the ACM, Vol. 13, No. 7, pp. 422-426, 1970.
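The "less hashing" idea of Kirsch and Mitzenmacher derives all k bit positions from two base hashes, using g_i(x) = h1(x) + i * h2(x) mod m. The sketch below illustrates that scheme. The function name and the choice of CRC32 and MD5 as the two base hashes are assumptions for the example, not details taken from the slides.

```python
import hashlib
import zlib


def double_hash_positions(item, m, k):
    """Return k bit positions computed from only two base hashes."""
    data = item.encode("utf-8")
    h1 = zlib.crc32(data)
    # Reduce h2 modulo m and avoid a zero stride, which would map
    # every g_i to the same position (a choice made for this sketch).
    h2 = (int.from_bytes(hashlib.md5(data).digest()[:8], "big") % m) or 1
    return [(h1 + i * h2) % m for i in range(k)]


positions = double_hash_positions("song.mp3", m=1000, k=5)
print(positions)
```

Only two hash computations are needed per element regardless of k, which is where the run-time saving comes from.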

- We investigated new methods for reducing the probability of false positives for a Bloom filter for fixed m and n.
- The target is the implementation of this structure in a power management proxy.

- NICs support protocols up to the MAC layer, but can’t respond to higher-layer packets.
- A PC needs to be fully powered-on in order to respond to packets.
- Applications like P2P file sharing require the PC to be fully powered-on all the time.
- To manage power in PCs running P2P applications:
- We are studying the idea of using a small controller to proxy for a sleeping PC.

- This proxy will be able to maintain P2P TCP connections and respond to query messages.
- We are exploring locating the controller on the NIC, so it’s a “SmartNIC”.
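One way such a proxy might decide how to handle an incoming query is sketched below. The slides do not specify this logic: the policy of waking the PC only when the filter reports a possible match, the function names, the action strings, and the salted MD5 hashing are all assumptions for illustration.

```python
import hashlib


def positions(name, m, k):
    # Hypothetical hashing: MD5 of the name salted with the hash index.
    return [
        int.from_bytes(hashlib.md5(f"{i}:{name}".encode("utf-8")).digest()[:8], "big") % m
        for i in range(k)
    ]


def build_filter(shared_files, m, k):
    bits = bytearray(m)  # one byte per bit, for clarity rather than compactness
    for name in shared_files:
        for pos in positions(name, m, k):
            bits[pos] = 1
    return bits


def handle_query(bits, query, m, k):
    """Return the action a hypothetical proxy takes for one query."""
    if all(bits[pos] for pos in positions(query, m, k)):
        return "wake_pc"  # possible hit (may be a false positive)
    return "ignore"       # definite miss: the PC can stay asleep


bits = build_filter(["track01.mp3", "track02.mp3"], m=256, k=3)
print(handle_query(bits, "track01.mp3", m=256, k=3))
print(handle_query(bits, "unknown.mp3", m=256, k=3))
```

Under this assumed policy, every false positive causes an unnecessary wake-up, which suggests why a lower false-positive probability matters for power management.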

- Best-of-N method: N instances of a Bloom filter are generated and the instance with the least number of bits set to 1 is selected.
- The “winner” hash group is used to test the Bloom filter.
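The selection step can be sketched as follows. This is a minimal illustration under stated assumptions: each of the N hash groups is simulated by salting MD5 with a group index, and the function names and sample file names are hypothetical.

```python
import hashlib


def positions(name, m, k, group):
    # Assumption: hash group `group` is simulated by salting MD5
    # with the group index and the hash index.
    return [
        int.from_bytes(
            hashlib.md5(f"{group}:{i}:{name}".encode("utf-8")).digest()[:8], "big"
        ) % m
        for i in range(k)
    ]


def build_filter(names, m, k, group):
    bits = bytearray(m)
    for name in names:
        for pos in positions(name, m, k, group):
            bits[pos] = 1
    return bits


def best_of_n(names, m, k, n_instances):
    """Build n_instances filters and keep the one with the fewest bits set."""
    best_group, best_bits = None, None
    for group in range(n_instances):
        bits = build_filter(names, m, k, group)
        if best_bits is None or sum(bits) < sum(best_bits):
            best_group, best_bits = group, bits
    return best_group, best_bits


def might_contain(bits, name, m, k, group):
    # Queries must use the winning hash group.
    return all(bits[pos] for pos in positions(name, m, k, group))


names = [f"file_{i}.mp3" for i in range(100)]
group, bits = best_of_n(names, m=1600, k=4, n_instances=8)
print(group, sum(bits))
```

Only the winning filter and the index of its hash group need to be kept; the other N - 1 instances can be discarded after selection.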

- 1) What improvement in Pr[false positive] can be achieved?
- 2) What is the computational cost to generate the filter?

- In order to compute N instances quickly, we developed a new pseudo-hashing method called “RNG hashing”.
- This method, based on a Random Number Generator, generates multiple hashes from one initial “seed” hash.
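A rough sketch of the idea follows: one real hash seeds a random number generator, and the generator's next k outputs serve as the pseudo-hashes. The slides do not give the exact construction, so the function name, the use of CRC32 as the seed hash, Python's built-in generator, and the `instance` offset used to obtain a different hash group are all assumptions.

```python
import random
import zlib


def rng_hash_positions(name, m, k, instance=0):
    """Return k bit positions derived from a single seed hash."""
    # One real hash (the "seed" hash); CRC32 is an assumption of this sketch.
    seed = zlib.crc32(name.encode("utf-8")) + instance
    # Python's Mersenne Twister stands in for the random number generator.
    rng = random.Random(seed)
    # The k pseudo-hashes are the next k draws from the seeded generator.
    return [rng.randrange(m) for _ in range(k)]


print(rng_hash_positions("song.mp3", m=16000, k=11))
```

The same name always yields the same positions, so the filter can be queried later. Only one true hash is computed per element, however many positions are needed.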

- We define S to be the random variable for the number of bits set in a Bloom filter.
- Using order statistics, we can determine the distribution of the minimum of the independent samples S1, S2, …, SN (the instance selected by Best-of-N).
- For order statistics, if f(s) and F(s) are known, then

- For a continuous distribution,

- The mean can be computed as

- Based on heuristic and empirical evidence, the distribution of S appears to be close to normal. Now we have that

- where μ = E[S] and σ = σ[S]. We know that

- We derive

- The probability of false positive for our method is then:

where E[Smin] is computed by substituting above.
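The steps above can be written out with standard results. This is a sketch consistent with the surrounding text, not a transcription of the slide equations. It assumes that each of the kn hash outputs selects a bit uniformly and independently, and the variance line is the standard occupancy result under that assumption.

```latex
% Minimum of N independent samples with density f(s) and CDF F(s)
F_{S_{\min}}(s) = 1 - \bigl[1 - F(s)\bigr]^{N}

% For a continuous distribution, differentiate to get the density
f_{S_{\min}}(s) = N \, f(s) \, \bigl[1 - F(s)\bigr]^{N-1}

% Mean of the minimum
E[S_{\min}] = \int_{-\infty}^{\infty} s \, N \, f(s) \, \bigl[1 - F(s)\bigr]^{N-1} \, ds

% Normal approximation for S
f(s) = \frac{1}{\sigma \sqrt{2\pi}} \exp\!\left( -\frac{(s - \mu)^{2}}{2\sigma^{2}} \right)

% Mean number of bits set after kn uniform bit selections
\mu = E[S] = m \left( 1 - \left( 1 - \frac{1}{m} \right)^{kn} \right)

% Variance of the number of bits set (standard occupancy result)
\sigma^{2} = m \left( 1 - \frac{1}{m} \right)^{kn}
           + m (m - 1) \left( 1 - \frac{2}{m} \right)^{kn}
           - m^{2} \left( 1 - \frac{1}{m} \right)^{2kn}

% Probability of false positive for the selected instance
\Pr[\text{false positive}] = \left( \frac{E[S_{\min}]}{m} \right)^{k}
```

With N = 1 the minimum is S itself, so E[S_min] = μ and the last expression reduces to the single-filter estimate.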

- For a given m and n where k is chosen optimally, we study the probability of false positive as a function of N.

[Figures 5 and 6: probability of false positive as a function of N; plot annotation: 30%]

For Figure 5, n = 1000 and m = 16,000. For Figure 6, n is the same but m = 32,000.
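A numerical evaluation of this kind can be sketched as follows, under stated assumptions: S is treated as normal, its mean and variance come from the standard occupancy formulas for kn uniform bit selections, the optimal k is taken as (m/n) ln 2 rounded to an integer, and the integral is evaluated with a simple trapezoidal rule. The function names and the integration settings are choices made for this example, and the printed values are not claimed to reproduce the figures.

```python
import math


def bits_set_mean_std(m, n, k):
    """Mean and standard deviation of S after k*n uniform bit selections."""
    t = k * n
    p_zero = (1 - 1 / m) ** t  # probability that a given bit stays 0
    mean = m * (1 - p_zero)
    var = m * p_zero + m * (m - 1) * (1 - 2 / m) ** t - (m * p_zero) ** 2
    return mean, math.sqrt(var)


def expected_min_standard_normal(n_samples, steps=4000, lo=-8.0, hi=8.0):
    """E[min of n_samples standard normals] by trapezoidal integration."""
    h = (hi - lo) / steps
    total = 0.0
    for i in range(steps + 1):
        z = lo + i * h
        pdf = math.exp(-0.5 * z * z) / math.sqrt(2 * math.pi)
        tail = 0.5 * math.erfc(z / math.sqrt(2))  # 1 - Phi(z)
        weight = 0.5 if i in (0, steps) else 1.0
        total += weight * z * n_samples * pdf * tail ** (n_samples - 1)
    return total * h


def best_of_n_false_positive(m, n, k, n_samples):
    """Approximate Pr[false positive] as (E[S_min] / m) ** k."""
    mean, std = bits_set_mean_std(m, n, k)
    expected_min = mean + std * expected_min_standard_normal(n_samples)
    return (expected_min / m) ** k


m, n = 16000, 1000
k = round(m / n * math.log(2))  # optimal k = (m/n) ln 2, rounded to an integer
for n_samples in (1, 10, 100, 1000):
    print(n_samples, best_of_n_false_positive(m, n, k, n_samples))
```

The computed probability falls as N grows, but each further tenfold increase in N lowers the expected minimum by less, so the gain per extra instance shrinks.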

- Environment
- Dell OptiPlex GX620 PC (Pentium 4, 3.4 GHz, 2 MBytes cache) with 1 GByte RAM.
- Windows XP, gcc compiler (version 3.4.2 mingw-special, from Dev-C++).
- A list of 25,000 strings of unique music file names was obtained using BearShare 5.2.

- Response Variables
- Probability of false positive for the Bloom filter.
- Execution time to generate a Bloom filter.

- Control variables
- Hashing method used: CRC32, MD5, RNG method, Kirsch method.
- Bloom filter parameters m, n, and k.
- Best-of-N parameter N.
- Number of strings used in the string test set.

- Experiments Description
- False positive experiment 1: vary N, measure the probability of false positive.
- False positive experiment 2: vary N, measure false positives.
- Run-time experiment: collect CPU time for each N.
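An experiment of this shape can be sketched as below. It is an illustration only: the file names are synthetic rather than the BearShare list, the hash groups are simulated with salted MD5 rather than the hashing methods listed above, and the function names and parameter values are assumptions.

```python
import hashlib
import time


def positions(name, m, k, group):
    # Assumption: salted MD5 simulates the k hashes of hash group `group`.
    return [
        int.from_bytes(
            hashlib.md5(f"{group}:{i}:{name}".encode("utf-8")).digest()[:8], "big"
        ) % m
        for i in range(k)
    ]


def build_filter(names, m, k, group):
    bits = bytearray(m)
    for name in names:
        for pos in positions(name, m, k, group):
            bits[pos] = 1
    return bits


def best_of_n(names, m, k, n_instances):
    candidates = [(g, build_filter(names, m, k, g)) for g in range(n_instances)]
    return min(candidates, key=lambda c: sum(c[1]))  # fewest bits set wins


def false_positive_rate(bits, probes, m, k, group):
    # Probes are known not to be stored, so every hit is a false positive.
    hits = sum(all(bits[p] for p in positions(name, m, k, group)) for name in probes)
    return hits / len(probes)


m, k = 16000, 11
stored = [f"stored_{i}.mp3" for i in range(1000)]   # synthetic names
probes = [f"probe_{i}.mp3" for i in range(2000)]    # disjoint from stored

results = []
for n_instances in (1, 2, 4):
    start = time.process_time()
    group, bits = best_of_n(stored, m, k, n_instances)
    cpu_seconds = time.process_time() - start
    rate = false_positive_rate(bits, probes, m, k, group)
    results.append((n_instances, sum(bits), rate, cpu_seconds))
    print(n_instances, sum(bits), rate, cpu_seconds)
```

`time.process_time()` reports CPU time rather than wall-clock time, which matches the response variable named above.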

- The experimental results for probability of false positive perfectly agree with the analysis.
- CPU time for the RNG method was as good as for the Kirsch method, and better than for CRC32.

- Two Improvements to Bloom filters
- A new Best-of-N method that reduces the probability of false positive by generating N instances of a Bloom filter and selecting the best one.
- A new RNG hashing method that generates pseudo hashes given a single seed hash.

- Bloom filters could be implemented in a power management proxy for P2P applications.
- Savings of up to 85 Mill. could be obtained if 25% of PCs running P2P applications use SmartNICs.

- A. Broder and M. Mitzenmacher, “Network Applications of Bloom Filters: A Survey,” Internet Mathematics, Vol. 1, No. 4, pp. 485-509, 2005.
- Energy Information Administration, “U.S Household Electricity Report,” July 2005. Available: http://www.eia.doe.gov/emeu/reps/enduse/er01_us.html.
- L. Fan, P. Cao, and J. Almeida, “Bloom Filters - The Math,” 2000. Available: http://www.cs.wisc.edu/~cao/papers/summary-cache/node8.html.
- A. Kirsch and M. Mitzenmacher, “Less Hashing, Same Performance: Building a Better Bloom Filter,” Technical Report TR-02-5, Computer Science Group, Harvard University, 2005.
- S. Lumetta and M. Mitzenmacher, “Using the Power of Two Choices to Improve Bloom Filters,” unpublished, 2006. Available: http://www.eecs.harvard.edu/~michaelm/postscripts/bftwo.ps.
- A. Pagh, R. Pagh, and S. Rao, “An Optimal Bloom Filter Replacement,” Proceedings of the 16th Annual ACM-SIAM Symposium on Discrete Algorithms, pp. 823-829, 2005.
- http://www.cs.wisc.edu/~cao/papers/summary-cache/node8.html
- US Department of Energy, Energy Efficiency and Renewable Energy, “Estimating Appliance and Home Electronic Energy Use,” 2005. Available: http://www.eere.energy.gov/consumer/your_home/appliances/index.cfm/mytopic=10040.

I’ll be happy to answer any questions.