This paper introduces the SAIL framework for improving IP lookup performance by leveraging on-chip memory and an ideal IP lookup algorithm. It presents a pivot-pushing technique and optimizations for lookup and update operations. The evaluation shows the effectiveness of the SAIL framework compared to other techniques.
Guarantee IP Lookup Performance with FIB Explosion
Authors: Tong Yang, Gaogang Xie, YanBiao Li, Qiaobin Fu, Alex X. Liu, Qi Li, Laurent Mathy
Publisher/Conf.: SIGCOMM '14, Proceedings of the 2014 ACM Conference on SIGCOMM, pages 39-50
Presenter: 林鈺航
Date: 2019/2/20
Department of Computer Science and Information Engineering, National Cheng Kung University, Taiwan R.O.C.
Motivation
• On-chip vs. off-chip memory: on-chip memory is about 10 times faster, but limited in size.
• As FIBs keep growing, we want, for almost all packets:
  • Constant yet fast lookup speed: low time complexity
  • Constant yet small footprint for the FIB: on-chip memory
• On-chip memory + ideal IP lookup algorithm
SAIL Framework
• Observation: almost all packets hit prefixes of length 0~24
• Two splittings for a given IP address
  • Splitting the lookup process
    • finding its longest matching prefix length
    • finding the next hop
  • Splitting the prefix length
    • length ≤ 24
    • length ≥ 25

Prefix length    Finding prefix length    Finding next hop
0~24             On-chip                  Off-chip
25~32            Off-chip                 Off-chip
[Figure: splitting the trie into bitmap arrays and next hop arrays. Levels 0~24 hold the short prefixes and levels 25~32 hold the long prefixes; the bit maps for levels 0~24 are kept on-chip.]
How to avoid searching both short and long prefixes?
Pivot Pushing & Lookup
[Figure: pivot push on an example trie, pivot level 4]
• Lookup 001010 with pivot level 4: B4[001010 >> 2] = 1 and N4[2] = 0, so the match is a long prefix
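The lookup on this slide can be sketched as a scan over the per-level bit maps, starting at the pivot level and moving toward the root. This is a minimal sketch, not the authors' implementation: the 6-bit address width and pivot level 4 come from the slide, treating a 0 entry in N4 as "continue in the long-prefix tables" follows the slide's example, and every table entry other than B4[2] and N4[2] is made up for illustration.

```python
W = 6       # toy address width (the slide looks up the 6-bit address 001010)
PIVOT = 4   # pivot level from the slide

def lookup_short(a, B, N):
    """Scan bit maps from the pivot level down to level 0.

    Returns ("hop", next_hop), ("long", None) when the pivot entry says the
    match lies among the long prefixes, or ("miss", None) if no bit is set.
    """
    for level in range(PIVOT, -1, -1):
        idx = a >> (W - level)          # first `level` bits of the address
        if B[level][idx]:
            hop = N[level][idx]
            if level == PIVOT and hop == 0:
                return ("long", None)   # 0 marks an internal pivot node (assumption from the slide)
            return ("hop", hop)
    return ("miss", None)

# One bit map and one next hop array per level; level l has 2**l entries.
B = [[0] * (1 << l) for l in range(PIVOT + 1)]
N = [[0] * (1 << l) for l in range(PIVOT + 1)]
B[4][0b0010] = 1          # from the slide: B4[001010 >> 2] = 1
N[4][0b0010] = 0          # from the slide: N4[2] = 0
B[2][0b11] = 1            # hypothetical prefix 11* ...
N[2][0b11] = 5            # ... with hypothetical next hop 5

result_long = lookup_short(0b001010, B, N)
result_hop = lookup_short(0b110000, B, N)
result_miss = lookup_short(0b010000, B, N)
```

With a pivot at level 24, the same downward scan would touch levels 24 through 0, which matches the 25 worst-case on-chip accesses quoted for SAIL_B later in the deck.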
Level 25~32
• Let the number of internal nodes on level 24 be n. We can push all solid nodes on levels 25~31 to level 32. Afterwards, the number of nodes on level 32 is 256 ∗ n, because each internal node on level 24 has a complete subtree with 256 leaf nodes; each such subtree is called a chunk.
• For each leaf node on level 24, its corresponding entry in bit map B24 is 1 and its corresponding entry in next hop array N24 is the next hop of this node.
• For each internal node on level 24, its corresponding entry in bit map B24 is 1 and its corresponding entry in next hop array N24 is the chunk ID in N32; multiplying the chunk ID by 256 and adding the last 8 bits of the given IP address locates the next hop in N32.
• To distinguish these two cases, we let the next hop be a positive number and the chunk ID be a negative number whose absolute value is the real chunk ID.
• With the pivot pushing technique, looking up an IP address a is simple:
  • if B24[a >> 8] = 0, the longest matching prefix length is within [0, 23], and we further test whether B23[a >> 9] = 1;
  • if B24[a >> 8] = 1 ∧ N24[a >> 8] > 0, the longest matching prefix length is 24 and the next hop is N24[a >> 8];
  • if B24[a >> 8] = 1 ∧ N24[a >> 8] < 0, the longest matching prefix length is longer than 24 and the next hop is N32[(|N24[a >> 8]| − 1) ∗ 256 + (a & 255)]. The −1 suggests that chunk IDs start at 1.
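The three cases above can be sketched in a few lines. This is an illustrative sketch under stated assumptions, not the paper's code: B24 and N24 stand for the level-24 bit map and next hop array and N32 for the level-32 next hop array (the names used on the SAIL_L slide), the sparse dictionaries replace the real 2^24-entry arrays, and all table contents are hypothetical.

```python
def lookup_24_32(a, B24, N24, N32):
    """Resolve a 32-bit address `a` at levels 24 and 32.

    Returns ("shorter", None) when the match is shorter than 24 bits,
    ("len24", hop) for a 24-bit match, or ("long", hop) for a longer match.
    """
    i = a >> 8                          # first 24 bits of the address
    if not B24.get(i, 0):
        return ("shorter", None)        # caller goes on to test level 23
    entry = N24[i]
    if entry > 0:
        return ("len24", entry)         # positive entry: a real next hop
    chunk_id = -entry                   # negative entry: encoded chunk ID
    return ("long", N32[(chunk_id - 1) * 256 + (a & 255)])

# Hypothetical tables: 10.0.0.0/24 is an internal node (chunk 1),
# 10.0.1.0/24 is a leaf with next hop 7.
B24 = {0x0A0000: 1, 0x0A0001: 1}
N24 = {0x0A0000: -1, 0x0A0001: 7}
N32 = [3] * 256                         # chunk 1: default next hop 3 ...
N32[5] = 9                              # ... except 10.0.0.5, next hop 9

r_long_specific = lookup_24_32(0x0A000005, B24, N24, N32)
r_long_default = lookup_24_32(0x0A000006, B24, N24, N32)
r_len24 = lookup_24_32(0x0A000142, B24, N24, N32)
r_shorter = lookup_24_32(0x0B000000, B24, N24, N32)
```

Encoding the chunk ID as a negative number lets one array entry serve both purposes, so no extra flag bit or second table is needed to tell a next hop from a chunk pointer.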
Update of SAIL_B
• Insert 10*: B2[10] = 1
• Delete 111*: B3[111] = 0
• Changing 001*, or inserting 0010*: only the off-chip tables need to be updated
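The two on-chip updates on this slide each flip a single bit. The sketch below shows only that bit-map side, assuming one bit list per level; the helper names are hypothetical, and it leaves out the off-chip next hop arrays and any special handling at the pivot level.

```python
def make_bitmaps(max_level):
    """One bit list per trie level: level l has 2**l entries."""
    return [[0] * (1 << level) for level in range(max_level + 1)]

def set_prefix_bit(B, prefix, present):
    """Set or clear the bit for a binary prefix string such as "10".

    Returns (level, index) of the single bit that was written.
    """
    level = len(prefix)
    index = int(prefix, 2) if prefix else 0
    B[level][index] = 1 if present else 0
    return level, index

B = make_bitmaps(4)
B[3][0b111] = 1                                # assume 111* was already in the FIB
insert_at = set_prefix_bit(B, "10", True)      # insert 10*  -> B2[10] = 1
delete_at = set_prefix_bit(B, "111", False)    # delete 111* -> B3[111] = 0
```

Each call writes exactly one bit, which is consistent with the one on-chip memory access per update listed for SAIL_B.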
SAIL_U
• Prefixes are pushed to levels 6, 12, 18, and 24.
• One update affects at most 2^6 = 64 bits in the bitmap array, so at most one on-chip memory access is still enough for each update.
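The 2^6 = 64 bound can be checked with a little arithmetic. The sketch below assumes that a prefix of length l is pushed to the nearest level in {6, 12, 18, 24} that is at least l, so it expands into 2^(target − l) bits; the function names are hypothetical.

```python
PUSH_LEVELS = (6, 12, 18, 24)

def pushed_level(length):
    """Nearest push level at or below which a prefix of this length lands."""
    for level in PUSH_LEVELS:
        if length <= level:
            return level
    raise ValueError("prefixes longer than 24 bits are handled off-chip")

def affected_bits(length):
    """Number of bit-map entries one update to a prefix of this length touches."""
    return 1 << (pushed_level(length) - length)

worst_case = max(affected_bits(length) for length in range(25))
```

Under this assumption the worst case is a length-0 prefix pushed to level 6, which touches 64 bits; a prefix already on a push level touches a single bit.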
SAIL_L (1/2)
[Flowchart: at level 16, if B16 == 1 (Y), read N16; otherwise (N) go to level 24. At level 24, if B24 == 1 (Y), read N24; otherwise (N) read N32 at level 32.]
Optimization
• SAIL_B
  • Lookup: 25 on-chip memory accesses in worst case
  • Update: 1 on-chip memory access
• Lookup Oriented Optimization (SAIL_L)
  • Lookup: 2 on-chip memory accesses in worst case
  • Update: unbounded, low average update complexity
• Update Oriented Optimization (SAIL_U)
  • Lookup: 4 on-chip memory accesses in worst case
  • Update: 1 on-chip memory access
• Extension: SAIL for Multiple FIBs (SAIL_M)
SAILs in worst case Worst case: 2 off-chip memory accesses for lookup
Implementations
• FPGA: Xilinx ISE 13.2 IDE; Xilinx Virtex 7 device; on-chip memory is 8.26MB
  • SAIL_B, SAIL_U, and SAIL_L
• Intel CPU: Core(TM) i7-3520M 2.9 GHz; 64KB L1, 512KB L2, 4MB L3; DRAM 8GB
  • SAIL_L and SAIL_M
• GPU: NVIDIA GPU (Tesla C2075, 1147 MHz, 5376 MB device memory, 448 CUDA cores), Intel CPU (Xeon E5-2630, 2.30 GHz, 6 cores)
  • SAIL_L
• Many-core: TLR4-03680, 36 cores, each with 256K L2 cache
  • SAIL_L
Evaluation
• FIBs
  • Real FIB from a tier-1 router in China
  • 18 real FIBs from www.ripe.net
• Traces
  • Real packet traces from the same tier-1 router
  • Randomly generated packet traces
  • Packet traces generated according to the FIBs
• Compared with
  • PBF [SIGCOMM '03]
  • LC-trie [used in the Linux kernel]
  • Tree Bitmap
  • Lulea [SIGCOMM '97 best paper]