
On Designing Fast Nonuniformly Distributed IP Address Lookup Hashing Algorithms

Authors: Christopher J. Martinez, Devang K. Pandya, and Wei-Ming Lin. Publisher: IEEE/ACM Transactions on Networking, 2009. Presenter: Yuen-Shuo Li. Date: 2013/01/09.


Presentation Transcript


  1. On Designing Fast Nonuniformly Distributed IP Address Lookup Hashing Algorithms • Authors: Christopher J. Martinez, Devang K. Pandya, and Wei-Ming Lin • Publisher: IEEE/ACM Transactions on Networking, 2009 • Presenter: Yuen-Shuo Li • Date: 2013/01/09

  2. Outline • Introduction • Proposed Hashing Algorithm • Simulation Results • Implementation

  3. Introduction(1/4) Hashing has been widely used for fast IP address lookup, but the performance of known hashing schemes is far from optimal because actual IP addresses are nonuniformly distributed.

  4. Introduction(2/4) • There exists a set of well-established hash algorithms, such as MD4, MD5, SHA-1, and SHA-2, that are used in cryptography. • These algorithms rely on a series of addition, bit-rotation, and logic operations over many cycles, which makes them too slow for IP address lookup.

  5. Introduction(3/4) • CRC-based hash functions have proven to be an excellent means of hashing, but they have some potential shortcomings. • Compared with a simple XOR folding hash algorithm, which can be implemented as a fast parallel circuit, a CRC-based hash function requires a sequential circuit and takes much longer to determine the hash value; it cannot be implemented in parallel.

  6. Introduction(4/4) • The goal of this paper is to develop a universal hashing methodology applicable to nonuniformly distributed data sets. • The proposed designs allow standard XOR folding hashing to produce significantly improved performance. The result is a new hash function that improves on XOR folding hashing by giving a more balanced distribution.

  7. Proposed Hashing Algorithm(1/13) The hashing process hashes each n-bit entry into an m-bit hash value.

  8. Proposed Hashing Algorithm(2/13) Intuitively, using the bits with smaller d values for hashing would lead to a probabilistically better hash distribution. d: the difference between the number of 0’s and 1’s in a bit position. Example: for the four 3-bit entries 110, 100, 010, 101, the three bit positions have d = 2, 0, 2.
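The d value of each bit position can be computed with a short script. This is a minimal sketch, not the authors' code: the function name `d_values` is hypothetical, entries are assumed to be equal-length strings of '0' and '1', and the sample data reads the slide's flattened figure row by row as the entries 110, 100, 010, 101.

```python
def d_values(entries):
    """Return |#zeros - #ones| for every bit position across all entries."""
    n = len(entries[0])
    result = []
    for pos in range(n):
        ones = sum(1 for e in entries if e[pos] == "1")
        zeros = len(entries) - ones
        result.append(abs(zeros - ones))
    return result


entries = ["110", "100", "010", "101"]  # assumed reading of the slide's example
print(d_values(entries))  # [2, 0, 2]
```

A position with d = 0 splits the entries evenly between 0 and 1, which is why it is the most useful single bit for hashing.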

  9. Proposed Hashing Algorithm(3/13) Employ a simple preprocessing step that rearranges the bit vectors in increasing order of their d values.

  10. Proposed Hashing Algorithm(4/13) Bit-extraction hashing simply extracts m bits from the n-bit entry as its hash value. EXT extracts the m bits from the original entry; d-EXT extracts them after the bits have been sorted by d.
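The difference between the two extraction schemes can be sketched as follows. The slide does not say which m bits plain EXT takes, so this sketch assumes the leading m bits; d-EXT is assumed to take the m positions with the smallest d. All function names and the sample entries are hypothetical.

```python
def d_values(entries):
    """|#zeros - #ones| per bit position."""
    total = len(entries)
    return [abs(total - 2 * sum(e[p] == "1" for e in entries))
            for p in range(len(entries[0]))]


def sorted_positions(entries):
    """Bit positions ordered by increasing d (ties keep their original order)."""
    d = d_values(entries)
    return sorted(range(len(d)), key=lambda p: d[p])


def ext(entry, m):
    """EXT (assumed): use the leading m bits of the unsorted entry."""
    return entry[:m]


def d_ext(entry, order, m):
    """d-EXT (assumed): use the m bit positions with the smallest d."""
    return "".join(entry[p] for p in order[:m])


entries = ["1100", "1010", "1001", "1111"]   # bit 0 is always 1, so d = 4 there
order = sorted_positions(entries)
print(order)                                  # [1, 2, 3, 0]
print([ext(e, 2) for e in entries])           # ['11', '10', '10', '11']
print([d_ext(e, order, 2) for e in entries])  # ['10', '01', '00', '11']
```

In this toy data set EXT wastes one hash bit on a constant position and uses only two bins, while d-EXT spreads the four entries over four bins.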

  11. Proposed Hashing Algorithm(5/13) n = 32, m varied. • MSL: the largest number of entries that are mapped into any hash bin. • ASL: the average maximum number of matching steps needed for any given record to match.
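Both metrics can be computed from the list of hash values. This is a sketch under stated assumptions: MSL follows the slide's definition directly, while the ASL formula assumes each bin is searched as a linear chain (the i-th record in a bin costs i steps) and every record is equally likely to be looked up. The function name `msl_asl` is hypothetical.

```python
from collections import Counter


def msl_asl(hash_values):
    """Return (MSL, ASL) for a list of hash values, one per stored entry."""
    counts = Counter(hash_values)
    msl = max(counts.values())
    # A bin holding k entries costs 1 + 2 + ... + k = k(k+1)/2 steps in total.
    total_steps = sum(k * (k + 1) / 2 for k in counts.values())
    asl = total_steps / len(hash_values)
    return msl, asl


msl, asl = msl_asl([0, 0, 0, 1, 2, 2])
print(msl)  # 3
print(asl)  # 10/6, about 1.67
```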

  12. Proposed Hashing Algorithm(6/13) Group-XOR is a commonly used hashing technique that reduces the n-bit key to an m-bit hash result by XORing every n/m key bits into one final hash bit.
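One common way to realize group-XOR is to cut the key into m-bit segments and XOR the segments together, so that each hash bit is the XOR of about n/m key bits. Which key bits end up in the same group is an assumption here, as is the zero padding of the last segment when m does not divide n; `group_xor` is a hypothetical name.

```python
def group_xor(key, n, m):
    """Fold an n-bit integer key into m bits by XORing successive m-bit segments."""
    mask = (1 << m) - 1
    h = 0
    remaining = n
    while remaining > 0:
        h ^= key & mask   # XOR in the lowest m bits
        key >>= m         # move on to the next segment
        remaining -= m
    return h


print(bin(group_xor(0b10110110, 8, 4)))   # 0b1101  (0110 XOR 1011)
print(hex(group_xor(0xC0A80101, 32, 16))) # 0xc1a9  (192.168.1.1 folded to 16 bits)
```

Because every hash bit depends only on a fixed set of key bits, all m output bits can be computed at the same time, which is the parallelism the introduction contrasts with CRC.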

  13. Proposed Hashing Algorithm(7/13) The goal of this paper is to use the extracted information from the preprocessing (d values) to facilitate a better hash design with the XOR operator.

  14. Proposed Hashing Algorithm(8/13) In order not to degrade the hash performance, every intended XOR operation to be taken between two bits should lead to a value such that .
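The condition itself is not reproduced in this transcript. As background only, the sketch below shows what happens to d when two statistically independent bit positions are XORed; independence is an assumption made for the illustration, and the data is constructed by hand so that it holds exactly.

```python
def d_value(bits):
    """|#zeros - #ones| for one bit position given as a list of 0/1 ints."""
    ones = sum(bits)
    return abs(len(bits) - 2 * ones)


# 16 entries; each column is 1 with frequency 3/4, and the joint counts
# (9, 3, 3, 1) make the two columns exactly independent.
pairs = [(1, 1)] * 9 + [(1, 0)] * 3 + [(0, 1)] * 3 + [(0, 0)] * 1
a = [p[0] for p in pairs]
b = [p[1] for p in pairs]
x = [u ^ v for u, v in pairs]

print(d_value(a), d_value(b), d_value(x))  # 8 8 4
```

Here the XORed column has d = 8 * 8 / 16 = 4, smaller than either input. With correlated columns the result can be worse, which is why the pairing of bits matters.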

  15. Proposed Hashing Algorithm(9/13) • Bit vectors with smaller d values are XORed with larger d-value bits in order to have a better chance for further reduction. • Bit vectors in the middle range are XORed together to provide the most reductions available.

  16. Proposed Hashing Algorithm(10/13) Two straightforward ways to exploit the d-value-sorted sequence when performing XOR hashing on the preprocessed database are d-IOX and d-SOX.

  17. Proposed Hashing Algorithm(11/13) The traditional group-XOR process can easily have a detrimental effect, while both d-IOX and d-SOX avoid XORing two bits that • both have small d values (the worst possible XORing) • both have large d values (the XORing leading to minimal gain).

  18. Proposed Hashing Algorithm(12/13) • Natural-Fold XOR (d-NFX) folds the sorted bit sequence from both ends, XORing the matching pairs of bits. • Natural-Fold with Duplication XOR (d-NFD) duplicates the middle subsegments to patch up the missing portion for uniformity.

  19. Proposed Hashing Algorithm(13/13) d-NFD may lead to overduplication or underduplication of the center subsegments. A simple method is adopted: truncate the bits that overshoot, or duplicate more than once.
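The fold from both ends can be sketched for the simplest case only, n = 2m, where every sorted position has exactly one partner. The slides do not spell out the general case or the d-NFD duplication rule, so neither is attempted here; `natural_fold_xor` is a hypothetical name, and `order` is assumed to list the bit positions in increasing order of d.

```python
def natural_fold_xor(entry, order):
    """Fold a sorted bit sequence from both ends (sketch for n = 2m only).

    entry: string of '0'/'1'; order: bit positions sorted by increasing d.
    Hash bit i pairs the i-th smallest-d bit with the i-th largest-d bit.
    """
    n = len(order)
    if n % 2 != 0:
        raise ValueError("this sketch only handles an even number of bits")
    return "".join(
        str(int(entry[order[i]]) ^ int(entry[order[n - 1 - i]]))
        for i in range(n // 2)
    )


order = [1, 2, 3, 0]  # example ordering: position 0 has the largest d
print(natural_fold_xor("1010", order))  # 11
print(natural_fold_xor("1100", order))  # 00
```

With this pairing the smallest-d bits meet the largest-d bits and the mid-range bits meet each other.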

  20. Simulation Results(1/12) The data set used for our simulation is randomly generated such that the value for each bit position is uniformly distributed. It contains 16384 (2^14) entries.

  21. Simulation Results(2/12) • The simulation results for n = 32 and are given in Fig. 12 in terms of MSL and ASL by taking an average of results from 1000 runs. RS hash MSL: the largest number of entries that are mapped into any hash bin. ASL: the average maximum number of matching steps needed for any given record to match.

  22. Simulation Results(3/12) RS Hash(additional)

  23. Simulation Results(4/12) A summary of the performance gain in MSL of each of the three proposed techniques and the two reference techniques over group-XOR.

  24. Simulation Results(5/12) • RS hash: a multiplicative hash algorithm that requires two multiplications and one addition for every 8 bits of the hash key to generate a hash value. • CRC-32 hash: requires 32 iterations to generate the final hash value for a given hash key, and needs additional control logic to maintain the sequential process.

  25. Simulation Results(6/12) The average d value of each final hash bit for m = 14.

  26. Simulation Results(7/12) A collection of real IP addresses gathered from three different sources: • general IP traffic addresses; • ad/spam IP addresses; • P2P IP addresses.

  27. Simulation Results(8/12) Performance comparison in terms of MSL and ASL on general IP traffic addresses.

  28. Simulation Results(9/12) Performance comparison in terms of MSL and ASL on AD/SPAM IP traffic addresses.

  29. Simulation Results(10/12) Performance comparison in terms of MSL and ASL on P2P IP traffic addresses.

  30. Simulation Results(11/12) To further analyze the potential performance difference between the d-value XOR folding algorithms and the well-established CRC and RS hashing algorithms, a χ² analysis is conducted.

  31. Simulation Results(12/12) The χ² analysis.
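The slides do not show how the analysis is set up. A common choice, assumed here, is Pearson's chi-square statistic comparing the observed number of entries in each hash bin with the count expected under a perfectly uniform hash; `chi_square` is a hypothetical name.

```python
from collections import Counter


def chi_square(hash_values, num_bins):
    """Pearson chi-square statistic of bin occupancy against a uniform spread."""
    counts = Counter(hash_values)
    expected = len(hash_values) / num_bins
    return sum((counts.get(b, 0) - expected) ** 2 / expected
               for b in range(num_bins))


print(chi_square([0, 1, 2, 3], 4))  # 0.0  (perfectly even)
print(chi_square([0, 0, 0, 0], 4))  # 12.0 (everything in one bin)
```

A smaller statistic means the hash values are closer to uniform, so it gives a single number for comparing hash functions on the same data set.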

  32. Implementation(1/2) The mapping from the original bit position to the sorted position and then through the d-SOX hashing.

  33. Implementation(2/2) d-NFD
