A Power Management Proxy with a New Best-of-N Bloom Filter Design to Reduce False Positives

Miguel Jimeno

Ken Christensen

Department of Computer Science and Engineering

University of South Florida

Tampa, FL 33620

{mjimeno, [email protected]


Outline

  • Introduction & Background

  • Research Problem

  • The SmartNIC

  • The New Design: Best-of-N Method

  • Analysis of Best-of-N Method

  • Numerical Results & Experiments Evaluation

  • Summary & Future Work


Introduction

  • The Internet consumes 2% of all the electricity consumed in the US [1].

  • An average PC consumes 120 W when fully powered on [10].

  • PCs could add 10% to typical US residential electricity consumption.

  • P2P applications keep the PC “on the net” all the time, even though these PCs are idle 99% of the time.

[1]K. Kawamoto, J. Koomey, B. Nordman, R. Brown, M. Piette, M. Ting, and A. Meier, “Electricity Used by Office Equipment and Network Equipment in the U.S.: Detailed Report and Appendices,” Technical Report LBNL-45917, Energy Analysis Department, Lawrence Berkeley National Laboratory, 2001.


Introduction

  • Can a P2P application be run on a small, low-power microcontroller?

  • The PC could then be power managed.

  • The microcontroller cannot store a large list of file names.

    Bloom Filters:

  • Bloom filters are a well-known probabilistic data structure, used here to represent a list of file name strings.


Introduction

Bloom Filters:

  • A group of hash functions is used to map elements into an array of bits.

  • False negatives are not possible, but there is a probability of generating false positives:

$$\Pr[\text{false positive}] = \left(\frac{s}{m}\right)^{k}$$

where m = size of the Bloom filter in bits, k = number of hash functions used to calculate a Bloom filter, and s = number of bits set.

Figure 1. Bloom filter of size m bits, and k = 4 hash functions. Image taken from [9].
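
As a minimal sketch of the structure described on this slide (not the authors' implementation), the Python below keeps an m-bit array and sets k positions per inserted string. Salted SHA-256 stands in for the k hash functions, which is an assumption, and the class and method names are illustrative. The false positive estimate is (s/m)^k, where s is the number of bits currently set.

```python
import hashlib


class BloomFilter:
    """Minimal Bloom filter over an m-bit array with k hash functions."""

    def __init__(self, m, k):
        self.m = m
        self.k = k
        self.bits = [0] * m

    def _positions(self, item):
        # Assumption: the i-th hash function is SHA-256 of the item salted with i.
        for i in range(self.k):
            digest = hashlib.sha256(f"{i}:{item}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.m

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos] = 1

    def __contains__(self, item):
        # True for every inserted item; may also be True for a non-member.
        return all(self.bits[pos] for pos in self._positions(item))

    def bits_set(self):
        return sum(self.bits)

    def false_positive_probability(self):
        # (s / m) ** k: all k probed bits of a non-member happen to be set.
        return (self.bits_set() / self.m) ** self.k


bf = BloomFilter(m=1024, k=4)
for name in ["song_a.mp3", "song_b.mp3", "song_c.mp3"]:
    bf.add(name)
```

A membership test is then `"song_a.mp3" in bf`, and `bf.false_positive_probability()` gives the estimate for the current filter contents.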


Background

  • Bloom filters were first proposed by Bloom [2].

  • Kirsch et al. proposed a way to compute a Bloom filter with less hashing [7].

  • Lumetta et al. used the power of two choices to compute the Bloom filter [8].

[2] B. Bloom, “Space/Time Tradeoffs in Hash Coding with Allowable Errors,” Communications of the ACM, Vol. 13, No. 7, pp. 422-426, 1970.
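
The "less hashing" idea of Kirsch and Mitzenmacher derives all k positions from two base hashes as g_i(x) = h1(x) + i·h2(x) mod m. A minimal sketch follows; taking h1 and h2 from the two halves of one SHA-256 digest is an assumption made for brevity, and the function name is illustrative.

```python
import hashlib


def double_hash_positions(item, m, k):
    """Return k bit positions g_i = (h1 + i * h2) mod m for i = 0..k-1."""
    digest = hashlib.sha256(item.encode("utf-8")).digest()
    h1 = int.from_bytes(digest[:8], "big")     # first base hash (assumed source)
    h2 = int.from_bytes(digest[8:16], "big")   # second base hash (assumed source)
    return [(h1 + i * h2) % m for i in range(k)]


example_positions = double_hash_positions("song_a.mp3", m=16000, k=11)
```

Only one digest is computed per string, however large k is, which is where the saving in hashing work comes from.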


Research Problem

  • We investigated new methods for reducing the probability of false positives of a Bloom filter for fixed m and n, where n is the number of elements stored in the filter.

  • The target is the implementation of this structure in a power management proxy.


The SmartNIC

  • NICs implement protocols up to the MAC layer and cannot respond to higher-layer packets.

  • A PC needs to be fully powered on in order to respond to such packets.

  • Applications like P2P file sharing require the PC to be fully powered on all the time.

  • To manage power in PCs running P2P applications:

    • We are studying the idea of using a small controller to proxy for a sleeping PC.


The SmartNIC

  • This proxy will be able to maintain P2P TCP connections and respond to query messages.

  • We are exploring locating the controller on the NIC, so it’s a “SmartNIC”.


The New Design: Best-of-N method

  • Best-of-N method: N instances of a Bloom filter are generated, each with a different group of hash functions, and the instance with the fewest bits set to 1 is selected.

  • The “winner” hash group is then used to test the Bloom filter.

  • Two questions follow:

    • 1) What improvement in Pr[false positive] can be achieved?

    • 2) What is the computational cost to generate the filter?
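
The selection step can be sketched as follows. This is an illustration under stated assumptions rather than the authors' code: each hash group is a salted SHA-256 family indexed by a hypothetical `group` number, and all function names are invented for the example.

```python
import hashlib


def positions(item, m, k, group):
    """k bit positions for item under hash group `group` (salted SHA-256; an assumption)."""
    out = []
    for i in range(k):
        digest = hashlib.sha256(f"{group}:{i}:{item}".encode("utf-8")).digest()
        out.append(int.from_bytes(digest[:8], "big") % m)
    return out


def build_filter(items, m, k, group):
    bits = bytearray(m)  # one byte per bit, for clarity rather than compactness
    for item in items:
        for pos in positions(item, m, k, group):
            bits[pos] = 1
    return bits


def best_of_n(items, m, k, n_instances):
    """Build n_instances filters and return (winning group, its bit array)."""
    best_group, best_bits = None, None
    for group in range(n_instances):
        bits = build_filter(items, m, k, group)
        # Keep the instance with the fewest bits set to 1.
        if best_bits is None or sum(bits) < sum(best_bits):
            best_group, best_bits = group, bits
    return best_group, best_bits


def might_contain(item, bits, k, group):
    # Queries must use the same ("winner") hash group that built the filter.
    return all(bits[pos] for pos in positions(item, len(bits), k, group))


files = [f"track_{i}.mp3" for i in range(200)]
group, bits = best_of_n(files, m=2048, k=5, n_instances=8)
```

Only the winning bit array and its group number need to be kept; the other N - 1 instances can be discarded after selection.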


The New Design: Best-of-N method

  • In order to compute N instances quickly, we developed a new pseudo-hashing method called “RNG hashing”.

  • This method, based on a Random Number Generator, generates multiple hashes from one initial “seed” hash.
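
The slides do not say which generator or seed hash is used, so the following is only a sketch of the general idea: one seed hash per string initialises a random number generator, and successive outputs serve as the k bit positions. CRC32 as the seed hash and a linear congruential generator with the Numerical Recipes constants are both assumptions, as is the function name.

```python
import zlib

LCG_A = 1664525       # multiplier (Numerical Recipes constants; an assumption here)
LCG_C = 1013904223    # increment
LCG_MOD = 2 ** 32


def rng_hash_positions(item, m, k):
    """Derive k bit positions from one CRC32 seed hash via an LCG."""
    state = zlib.crc32(item.encode("utf-8"))  # the single "seed" hash
    out = []
    for _ in range(k):
        state = (LCG_A * state + LCG_C) % LCG_MOD
        out.append((state >> 8) % m)  # drop the weakest low-order bits of the LCG
    return out


seeded_positions = rng_hash_positions("song_a.mp3", m=16000, k=11)
```

Each additional position costs one multiply-add rather than a full hash, which is what would make generating N instances cheap. Further hash groups could presumably be drawn by continuing the same stream, though that detail is not given in the source.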


Analysis of Best-of-N Method

  • We define S to be the random variable for the number of bits set in a Bloom filter.

  • Using order statistics we can determine the distribution of the minimum value of the independent samples S1, S2, …, SN (selected as Best-of-N).

  • For order statistics, if f(s) and F(s) are known, then
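
For reference, the textbook order-statistics result for the minimum of N independent, identically distributed samples with density f(s) and distribution function F(s) is given below. It is stated here as the standard identity, on the assumption that this is the result the slide refers to.

```latex
\begin{align*}
F_{S_{\min}}(s) &= 1 - \left[1 - F(s)\right]^{N} \\
f_{S_{\min}}(s) &= N \, f(s) \left[1 - F(s)\right]^{N-1}
\end{align*}
```

The second line follows from the first by differentiating with respect to s.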


Analysis of Best-of-N Method

  • For a continuous distribution,

  • The mean can be computed as

  • Based on heuristic and empirical evidence, the distribution of S appears to be close to normal. Now we have that

  • where μ=E[S] and σ= σ[S]. We know that


Analysis of Best-of-N Method

  • We derive

  • The probability of false positive for our method is then:

where E[Smin] is computed by substituting above.
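
One plausible reconstruction of the chain of results is sketched below. It assumes that each of the kn hash probes is independent and uniform over the m bits (the classical occupancy model) and that S is approximately normal; the exact expressions on the slides may differ.

```latex
\begin{align*}
\mu = E[S] &= m\left[1 - \left(1 - \tfrac{1}{m}\right)^{kn}\right] \\
\sigma^{2} = \operatorname{Var}[S] &= m(m-1)\left(1 - \tfrac{2}{m}\right)^{kn}
    + m\left(1 - \tfrac{1}{m}\right)^{kn}
    - m^{2}\left(1 - \tfrac{1}{m}\right)^{2kn} \\
f(s) &= \frac{1}{\sigma\sqrt{2\pi}} \exp\!\left(-\frac{(s-\mu)^{2}}{2\sigma^{2}}\right),
    \qquad F(s) = \Phi\!\left(\frac{s-\mu}{\sigma}\right) \\
E[S_{\min}] &= \int_{-\infty}^{\infty} s \, N f(s) \left[1 - F(s)\right]^{N-1} ds \\
\Pr[\text{false positive}] &\approx \left(\frac{E[S_{\min}]}{m}\right)^{k}
\end{align*}
```

Here Φ is the standard normal distribution function. For N = 1 the integral reduces to μ, and the last line becomes the ordinary Bloom filter estimate (E[S]/m)^k.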


Numerical Results

  • For a given m and n where k is chosen optimally, we study the probability of false positive as a function of N.

[Figure: probability of false positive as a function of N; the surviving annotation reads “30%”.]


Numerical Results

For Figure 5, n = 1000 and m = 16,000. For Figure 6, n = 1000 and m = 32,000.
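
A rough way to explore the same trend numerically is sketched below. It is not the authors' computation: it assumes the occupancy model for the mean and variance of S (independent, uniform hash probes), a normal approximation for S, and plain trapezoidal integration for E[S_min]. All function names are illustrative.

```python
import math


def set_bits_moments(m, n, k):
    """Mean and standard deviation of S, the number of bits set (occupancy model)."""
    r = k * n                      # total hash probes, assumed independent and uniform
    p1 = (1 - 1 / m) ** r          # Pr[a given bit is still 0]
    p2 = (1 - 2 / m) ** r          # Pr[two given bits are both still 0]
    mean = m * (1 - p1)
    var = m * (m - 1) * p2 + m * p1 - (m * p1) ** 2
    return mean, math.sqrt(var)


def expected_min_bits(mu, sigma, n_instances, steps=4000):
    """E[S_min] of N i.i.d. normal samples, integrated over mu +/- 8 sigma."""
    lo, hi = mu - 8 * sigma, mu + 8 * sigma
    h = (hi - lo) / steps
    total = 0.0
    for i in range(steps + 1):
        s = lo + i * h
        z = (s - mu) / sigma
        pdf = math.exp(-0.5 * z * z) / (sigma * math.sqrt(2 * math.pi))
        cdf = 0.5 * (1 + math.erf(z / math.sqrt(2)))
        # Density of the minimum: N f(s) [1 - F(s)]^(N-1)
        density_min = n_instances * pdf * (1 - cdf) ** (n_instances - 1)
        weight = 0.5 if i in (0, steps) else 1.0  # trapezoid end points
        total += weight * s * density_min * h
    return total


def false_positive_probability(m, n, k, n_instances):
    mu, sigma = set_bits_moments(m, n, k)
    return (expected_min_bits(mu, sigma, n_instances) / m) ** k


m, n = 16000, 1000
k = round(m / n * math.log(2))  # near-optimal number of hash functions
for n_instances in (1, 10, 100, 1000):
    print(n_instances, false_positive_probability(m, n, k, n_instances))
```

Changing `m` to 32000 gives the second parameter set quoted above.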


Experiments Evaluation

  • Environment

    • Dell OptiPlex GX620 PC (Pentium 4, 3.4 GHz, 2 MBytes cache) with 1 GByte RAM.

    • Windows XP, gcc compiler (version 3.4.2 mingw-special, from Dev-C++).

    • A list of 25,000 unique music file name strings was obtained using BearShare 5.2.

  • Response Variables

    • Probability of false positive for the Bloom filter.

    • Execution time to generate a Bloom filter.


Experiments Evaluation

  • Control variables

    • Hashing method used.

      • CRC32, MD5, RNG method, Kirsch method

    • Bloom filter parameters m, n, and k.

    • Best-of-N parameter N.

    • Number of strings used in the string test set.

  • Experiments Description

    • False Positive Exp 1: Vary N, measure the probability of false positive.

    • False Positive Exp 2: Vary N, measure false positives.

    • Run-time experiment: Collect CPU time for each N.
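
The two response variables could be measured along the following lines. This is a sketch only: salted SHA-256 stands in for the hashing methods actually compared, the stored and probe strings are synthetic placeholders for the real file name list, and the function names are hypothetical.

```python
import hashlib
import time


def positions(item, m, k):
    # Assumption: salted SHA-256 stands in for the hashing methods under test.
    return [
        int.from_bytes(hashlib.sha256(f"{i}:{item}".encode("utf-8")).digest()[:8], "big") % m
        for i in range(k)
    ]


def build(items, m, k):
    bits = bytearray(m)
    for item in items:
        for pos in positions(item, m, k):
            bits[pos] = 1
    return bits


def measured_false_positive_rate(bits, k, non_members):
    """Fraction of strings known to be absent that the filter nonetheless reports."""
    m = len(bits)
    hits = sum(all(bits[p] for p in positions(s, m, k)) for s in non_members)
    return hits / len(non_members)


members = [f"stored_{i}.mp3" for i in range(1000)]
probes = [f"absent_{i}.mp3" for i in range(5000)]  # synthetic stand-in for the test set

start = time.process_time()          # CPU time, not wall-clock time
bits = build(members, m=16000, k=11)
cpu_seconds = time.process_time() - start

rate = measured_false_positive_rate(bits, 11, probes)
predicted = (sum(bits) / len(bits)) ** 11  # (s / m) ** k for this filter
```

Comparing `rate` with `predicted` mirrors the comparison of experiment against analysis; repeating the build for each N would give the run-time curve.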


Experiments Evaluation

  • The experimental results for probability of false positive perfectly agree with the analysis.

  • CPU time for the RNG method was as good as for the Kirsch method, and better than for CRC32.

[Figure label: Kirsch and RNG]


Summary & Future Work

  • Two Improvements to Bloom filters

    • A new Best-of-N method that reduces the probability of false positive by generating N instances of a Bloom filter and selecting the best one.

    • A new RNG hashing method that generates pseudo hashes given a single seed hash.

  • Bloom filters could be implemented in a power management proxy for P2P applications.

  • Savings of up to 85 million could be obtained if 25% of PCs running P2P applications used SmartNICs.


References

  • A. Broder and M. Mitzenmacher, “Network Applications of Bloom Filters: A Survey,” Internet Mathematics, Vol. 1, No. 4, pp. 485-509, 2005.

  • Energy Information Administration, “U.S Household Electricity Report,” July 2005. Available: http://www.eia.doe.gov/emeu/reps/enduse/er01_us.html.

  • L. Fan, P. Cao, and J. Almeida, “Bloom Filters - The Math,” 2000. Available: http://www.cs.wisc.edu/~cao/papers/summary-cache/node8.html.

  • A. Kirsch and M. Mitzenmacher, “Less Hashing, Same Performance: Building a Better Bloom Filter,” Technical Report TR-02-5, Computer Science Group, Harvard University, 2005.

  • S. Lumetta and M. Mitzenmacher, “Using the Power of Two Choices to Improve Bloom Filters,” unpublished, 2006. Available: http://www.eecs.harvard.edu/~michaelm/postscripts/bftwo.ps.

  • A. Pagh, R. Pagh, and S. Rao, “An Optimal Bloom Filter Replacement,” Proceedings of the 16th Annual ACM-SIAM Symposium on Discrete Algorithms, pp. 823-829, 2005.

  • http://www.cs.wisc.edu/~cao/papers/summary-cache/node8.html

  • US Department of Energy, Energy Efficiency and Renewable Energy, “Estimating Appliance and Home Electronic Energy Use,” 2005. Available: http://www.eere.energy.gov/consumer/your_home/appliances/index.cfm/mytopic=10040.


Thanks!

I’ll be happy to answer any questions.

