On Designing Fast Nonuniformly Distributed IP Address Lookup Hashing Algorithms

Authors: Christopher J. Martinez, Devang K. Pandya, and Wei-Ming Lin • Publisher: IEEE/ACM Transactions on Networking, 2009 • Presenter: Yuen-Shuo Li • Date: 2013/01/09

Outline • Introduction • Proposed Hashing Algorithm • Simulation Results • Implementation

Introduction(1/4) Hashing has been widely used for fast IP address lookup, but the performance of known hashing schemes is far from optimal because actual IP addresses are nonuniformly distributed.

Introduction(2/4) • There exists a set of well-established hash algorithms, such as MD4, MD5, SHA-1, and SHA-2, which are used in cryptography. • These algorithms rely on a series of additions, bit rotations, and logic operations over many cycles. Too slow!

Introduction(3/4) • CRC-based hash functions have proven to be an excellent means of hashing, but they have some potential shortcomings. • Compared to a simple XOR folding hash algorithm, which can be implemented in a fast parallel circuit, a CRC-based hash function requires a sequential circuit and takes much longer to determine the hash value. Can't be implemented in parallel!

Introduction(4/4) • The goal of this paper is to develop a universal hashing methodology applicable to nonuniformly distributed data sets. • The proposed designs allow standard XOR folding hashing to deliver significantly improved performance. A new hash function (improved XOR folding hashing): balance!

Proposed Hashing Algorithm(1/13) The hashing process hashes each n-bit entry into an m-bit hash value. [Figure: n bits → hash → m bits]

Proposed Hashing Algorithm(2/13) Intuitively, using the bits with smaller d values for hashing would lead to a probabilistically better hash distribution. d: the difference between the number of 0's and 1's in a bit position, counted over all entries. Example (n = 3): for the entries 110, 100, 010, 101, the three bit positions have d = 2, 0, 2.
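The d values can be computed with a single pass over each bit position. The following is a minimal sketch, not the paper's code; the function name `d_values`, the MSB-first position numbering, and the use of the absolute difference are assumptions made for illustration. The four 3-bit entries are chosen so that the result is d = 2, 0, 2, as on the slide.

```python
def d_values(entries, n):
    """Return the d value of each of the n bit positions (position 0 = MSB).

    d is taken here as |#ones - #zeros| in that position over all entries
    (the absolute value is an assumption of this sketch).
    """
    total = len(entries)
    result = []
    for pos in range(n):
        shift = n - 1 - pos
        ones = sum((e >> shift) & 1 for e in entries)
        zeros = total - ones
        result.append(abs(ones - zeros))
    return result


entries = [0b110, 0b100, 0b010, 0b101]
print(d_values(entries, 3))  # [2, 0, 2]
```

A position with d = 0 splits the entries exactly in half, which is why it is the most attractive single bit to hash on.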

Proposed Hashing Algorithm(3/13) Employ a simple preprocessing step that rearranges the n-bit vectors according to their d values, sorted in increasing order.

Proposed Hashing Algorithm(4/13) A bit-extraction hash simply extracts m bits from the n-bit entry as its hash value. [Figure: EXT extracts m bits from the n-bit entry; d-EXT first sorts the n bits by d, then extracts m bits]
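The sort-then-extract idea can be sketched as follows. This is an illustrative sketch under stated assumptions: the helper names (`d_values`, `d_ext_positions`, `extract`) are hypothetical, positions are numbered MSB first, ties in d are broken by original position, and the m positions with the smallest d are the ones extracted.

```python
def d_values(entries, n):
    """|#ones - #zeros| for each bit position (position 0 = MSB)."""
    out = []
    for pos in range(n):
        ones = sum((e >> (n - 1 - pos)) & 1 for e in entries)
        out.append(abs(2 * ones - len(entries)))
    return out


def d_ext_positions(entries, n, m):
    """Preprocessing: pick the m bit positions with the smallest d values."""
    d = d_values(entries, n)
    order = sorted(range(n), key=lambda pos: d[pos])  # stable sort keeps ties in order
    return order[:m]


def extract(entry, positions, n):
    """Hash one entry by concatenating the bits at the chosen positions."""
    h = 0
    for pos in positions:
        h = (h << 1) | ((entry >> (n - 1 - pos)) & 1)
    return h


entries = [0b110, 0b100, 0b010, 0b101]
positions = d_ext_positions(entries, 3, 1)
print(positions)                                   # [1]
print([extract(e, positions, 3) for e in entries])  # [1, 0, 1, 0]
```

With m = 1 the sketch selects the middle bit (d = 0), and the four entries fall evenly into the two bins.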

Proposed Hashing Algorithm(5/13) n = 32, m varied. • MSL: the largest number of entries that are mapped into any hash bin. • ASL: the average maximum number of matching steps needed for any given record to match.
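Both metrics can be computed from the list of hash values alone. The sketch below is one possible reading, not the paper's definition: it assumes each bin is searched linearly, so the k-th record in a bin costs k matching steps, and ASL is the mean of that cost over all records. The function name `msl_asl` is hypothetical.

```python
from collections import Counter


def msl_asl(hash_values):
    """Return (MSL, ASL) for a list of hash values, one per stored record."""
    bins = Counter(hash_values)
    # MSL: size of the fullest bin.
    msl = max(bins.values())
    # ASL (assumed model): a bin holding k records costs 1 + 2 + ... + k steps
    # in total, averaged over every record in the table.
    total_steps = sum(k * (k + 1) // 2 for k in bins.values())
    asl = total_steps / len(hash_values)
    return msl, asl


print(msl_asl([0, 0, 0, 1]))  # (3, 1.75)
print(msl_asl([0, 1, 2, 3]))  # (1, 1.0)
```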

Proposed Hashing Algorithm(6/13) Group-XOR is a commonly used hashing technique that reduces the n-bit key to an m-bit hash result by XORing every n/m key bits into one final hash bit. [Figure: the n-bit key is split into m-bit groups, which are XORed together into an m-bit result]
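A minimal sketch of group-XOR (XOR folding) on integer keys follows. The function name `group_xor` is hypothetical, and treating a final group shorter than m bits as zero-padded when n is not a multiple of m is an assumption of this sketch.

```python
def group_xor(key, n, m):
    """Fold an n-bit key into m bits by XORing its m-bit groups together."""
    mask = (1 << m) - 1
    h = 0
    for shift in range(0, n, m):
        # Each group contributes one bit to every hash bit position.
        h ^= (key >> shift) & mask
    return h


print(bin(group_xor(0b10110110, 8, 4)))   # 0b1101  (0110 XOR 1011)
print(hex(group_xor(0xC0A80001, 32, 16)))  # 0xc0a9  (0xC0A8 XOR 0x0001)
```

Because every hash bit depends only on a fixed set of key bits, all m outputs can be computed at the same time, which is what makes this scheme attractive for a parallel circuit.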

Proposed Hashing Algorithm(7/13) The goal of this paper is to use the extracted information from the preprocessing (d values) to facilitate a better hash design with the XOR operator.

Proposed Hashing Algorithm(8/13) In order not to degrade the hash performance, every intended XOR operation between two bits a and b should lead to a d value such that d(a XOR b) ≤ min(d(a), d(b)).
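Whether a candidate XOR pairing helps or hurts can be checked by comparing the d value of the XORed bit with the d values of its two inputs. The sketch below is illustrative only; the helper names are hypothetical, d is taken as |#ones - #zeros| over all entries, and positions are numbered MSB first.

```python
def bit_column(entries, n, pos):
    """Bits found at position pos (0 = MSB) across all entries."""
    return [(e >> (n - 1 - pos)) & 1 for e in entries]


def d_of(column):
    """|#ones - #zeros| of a column of bits."""
    ones = sum(column)
    return abs(2 * ones - len(column))


def d_after_xor(entries, n, pos_a, pos_b):
    """d value of the hash bit obtained by XORing positions pos_a and pos_b."""
    a = bit_column(entries, n, pos_a)
    b = bit_column(entries, n, pos_b)
    return d_of([x ^ y for x, y in zip(a, b)])


entries = [0b110, 0b100, 0b010, 0b101]  # per-position d values: 2, 0, 2
print(d_after_xor(entries, 3, 0, 2))  # 0: XORing the two d=2 bits improves on both
print(d_after_xor(entries, 3, 0, 1))  # 2: worse than using the d=0 bit alone
```

The second call shows how a careless pairing can degrade a bit that was already perfectly balanced.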

Proposed Hashing Algorithm(9/13) • Bit vectors with smaller d values are XORed with larger d-value bits in order to have a better chance for further reduction. • Bit vectors in the middle range are XORed together to provide the most reductions available.

Proposed Hashing Algorithm(10/13) Two straightforward ways to exploit the benefit from the d-value-based sorted sequence are to perform XOR hashing on the preprocessed database.

Proposed Hashing Algorithm(11/13) The traditional group-XOR process may easily have a detrimental effect, while both d-IOX and d-SOX avoid XORing two bits that • both have small d values (the worst possible XORing) • both have large d values (the XORing leading to minimal gain).

Proposed Hashing Algorithm(12/13) • Natural-Fold XOR (d-NFX) folds the sorted bit sequence from both ends, matching pairs of bits accordingly. • Natural-Fold with Duplication XOR (d-NFD) duplicates the middle subsegments to patch up the missing portion for uniformity.

Proposed Hashing Algorithm(13/13) d-NFD may lead to overduplication or underduplication of the center subsegments. A simple remedy is adopted: truncate the overshot bits, or duplicate more than once.

Simulation Results(1/12) The data set used for our simulation is randomly generated such that the value for each bit position is uniformly distributed. 16384 (2^14) entries

Simulation Results(2/12) • The simulation results for n = 32 are given in Fig. 12 in terms of MSL and ASL, averaged over 1000 runs.

Simulation Results(3/12) RS Hash(additional)

Simulation Results(4/12) A summary of the performance gain in MSL of each of the three proposed techniques and the two reference techniques over group-XOR.

Simulation Results(5/12) • RS Hash: RS is a multiplicative hash algorithm that requires two multiplications and one addition for every 8 bits of the hash key to generate a hash value. • CRC-32 Hash: CRC-32 requires 32 iterations to generate the final hash value for a given hash key, and needs additional control logic to properly maintain the sequential process.
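The per-byte cost described for RS can be seen in the widely circulated form of the RS (Robert Sedgewick) string hash, sketched below. Whether the paper uses exactly these constants and a 32-bit word is an assumption of this sketch; the point is only to show the two multiplications and one addition per 8 bits of key, and the sequential dependence between steps.

```python
def rs_hash(data: bytes) -> int:
    """Commonly circulated RS hash, truncated to 32 bits (assumed width)."""
    a = 63689    # running multiplier
    b = 378551   # constant used to update the multiplier
    h = 0
    for byte in data:
        h = (h * a + byte) & 0xFFFFFFFF  # multiplication 1, plus the addition
        a = (a * b) & 0xFFFFFFFF         # multiplication 2
    return h


# Hash the four bytes of an example IPv4 address (192.168.0.1).
value = rs_hash(bytes([192, 168, 0, 1]))
print(0 <= value < 2**32)  # True
```

Each iteration needs the previous value of `h`, so the four bytes of an address cannot be processed independently, unlike the bitwise XOR schemes.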

Simulation Results(6/12) The average d value of each final hash bit for m = 14.

Simulation Results(7/12) A collection of real IP addresses gathered from three different sources: • general IP traffic addresses; • ad/spam IP addresses; • P2P IP addresses.

Simulation Results(8/12) Performance comparison in terms of MSL and ASL on general IP traffic addresses.

Simulation Results(9/12) Performance comparison in terms of MSL and ASL on AD/SPAM IP traffic addresses.

Simulation Results(10/12) Performance comparison in terms of MSL and ASL on P2P IP traffic addresses.

Simulation Results(11/12) To further analyze the potential performance difference between the d-value XOR folding algorithms and the well-established CRC and RS hashing algorithms, a χ² (chi-square) analysis is conducted.
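Assuming the analysis meant here is the standard chi-square goodness-of-fit statistic applied to bin occupancy, it can be sketched as follows. The function name `chi_square` and the choice of a perfectly uniform table as the expected distribution are assumptions; the slides do not give the exact procedure used in the paper.

```python
from collections import Counter


def chi_square(hash_values, num_bins):
    """Sum over bins of (observed - expected)^2 / expected, expected = uniform."""
    expected = len(hash_values) / num_bins
    counts = Counter(hash_values)
    return sum(
        (counts.get(b, 0) - expected) ** 2 / expected
        for b in range(num_bins)
    )


print(chi_square([0, 1, 2, 3], 4))  # 0.0  (perfectly even)
print(chi_square([0, 0, 0, 0], 4))  # 12.0 (everything in one bin)
```

A smaller statistic means the hash spreads records closer to evenly across the bins.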

Simulation Results(12/12) The χ² analysis

Implementation(1/2) The mapping from the original bit position to the sorted position and then through the d-SOX hashing.

Implementation(2/2) d-NFD
