Performance Evaluation of Packet Classification on FPGA-based TCAM Emulation Architectures. GLOBECOM (Global Communications Conference), 2012. Presenter: NTHU 101062607 李若萍. Outline: Introduction, Related Work, TCAM Emulation, RAM-based TCAM Architecture, Performance Evaluation, Conclusion.
Introduction
• Packet fields are used as keys to determine the best matching rule and apply the corresponding action.
  • Exact matching
  • Prefix matching
  • Range matching
• How to find the best matching rule?
  • Each rule is assigned a cost.
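The three match types and the cost-based choice can be illustrated with a minimal Python sketch. The helper names (`exact`, `prefix`, `in_range`, `classify`), the 32-bit field width, and the convention that the lowest cost wins are assumptions made for illustration; the slides only state that each rule is assigned a cost.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Sequence


def exact(value: int) -> Callable[[int], bool]:
    """Exact matching: the field must equal the stored value."""
    return lambda x: x == value


def prefix(value: int, length: int, width: int = 32) -> Callable[[int], bool]:
    """Prefix matching: only the top `length` bits of a `width`-bit field count."""
    shift = width - length
    return lambda x: (x >> shift) == (value >> shift)


def in_range(lo: int, hi: int) -> Callable[[int], bool]:
    """Range matching: the field must lie in [lo, hi]."""
    return lambda x: lo <= x <= hi


@dataclass
class Rule:
    matchers: List[Callable[[int], bool]]  # one matcher per packet field
    cost: int                              # assumption: lower cost = better match
    action: str


def classify(packet: Sequence[int], rules: List[Rule]) -> Optional[str]:
    """Return the action of the lowest-cost rule whose fields all match."""
    hits = [r for r in rules
            if all(match(field) for match, field in zip(r.matchers, packet))]
    return min(hits, key=lambda r: r.cost).action if hits else None


# Hypothetical rule set over (source address, destination port, protocol).
rules = [
    Rule([prefix(0x0A000000, 8), in_range(1024, 65535), exact(6)], 1, "drop"),
    Rule([prefix(0, 0), in_range(0, 65535), exact(6)], 5, "forward"),
]
print(classify((0x0A010203, 8080, 6), rules))
print(classify((0xC0A80001, 80, 6), rules))
```

The first packet matches both rules and takes the cheaper one; the second matches only the catch-all rule.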
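Introduction (cont.)
• CAM (Content-Addressable Memory): each cell pairs an SRAM data bit with a comparison against the key bit.
  Match = (key ≠ Data)
  Match line = !Match
• TCAM (Ternary Content-Addressable Memory): each cell adds a mask SRAM bit.
  Match = (key ≠ Data) & Mask
  Match line = !Match
• In both diagrams the signal labelled "Match" is asserted on a bit mismatch, so the match line stays high only when no cell reports one.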
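The two cell equations can be checked with a small word-level Python sketch. It assumes, following the TCAM equation, that a mask bit of 1 means "compare this bit" and 0 means "don't care"; the function names are hypothetical.

```python
def tcam_match_line(key: int, data: int, mask: int) -> bool:
    """Word-level form of the TCAM cell equation.

    mismatch = (key != data) & mask, evaluated per bit with XOR;
    the match line is high only when no masked bit mismatches.
    """
    mismatch = (key ^ data) & mask
    return mismatch == 0


def cam_match_line(key: int, data: int, width: int) -> bool:
    """A binary CAM behaves like a TCAM whose mask bits are all 1."""
    return tcam_match_line(key, data, (1 << width) - 1)


# Entry 10xx: data = 0b1000, mask = 0b1100 (low two bits are don't care).
print(tcam_match_line(0b1010, 0b1000, 0b1100))
print(tcam_match_line(0b0010, 0b1000, 0b1100))
```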
Introduction (cont.)
• The TCAM stores N rules at memory addresses 1 to N and compares the key against all of them.
• The compare results (for example X 0 1 1) feed a priority encoder.
• The selected memory address is used as an index into a RAM to find the corresponding action.
• Drawbacks of TCAMs (Ternary Content Addressable Memories):
  • Capacity constraints
  • Storage inefficiency
  • High power consumption
  • Limited scalability
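The lookup path on this slide (compare, priority-encode, index the action RAM) can be sketched in Python as follows. The entry format, the 0-based addresses, and the rule that the lowest matching address wins are assumptions for illustration.

```python
from typing import List, Optional, Tuple

Entry = Tuple[int, int]  # (data, mask); mask bit 1 = compare, 0 = don't care


def priority_encode(results: List[bool]) -> Optional[int]:
    """Return the lowest address whose match line is high, or None."""
    for address, hit in enumerate(results):
        if hit:
            return address
    return None


def tcam_lookup(key: int, entries: List[Entry],
                actions: List[str]) -> Optional[str]:
    """Compare the key with every entry, then index the action RAM."""
    results = [((key ^ data) & mask) == 0 for data, mask in entries]
    address = priority_encode(results)
    return None if address is None else actions[address]


# Hypothetical 4-bit entries: 1010, 10xx, xxxx.
entries = [(0b1010, 0b1111), (0b1000, 0b1100), (0b0000, 0b0000)]
actions = ["drop", "mirror", "forward"]
print(tcam_lookup(0b1011, entries, actions))
```

In hardware all entries are compared at once; the list comprehension only models the resulting vector of match lines.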
Introduction (cont.)
• Purpose: investigate the performance and trade-offs of TCAM emulation in FPGAs (Field-Programmable Gate Arrays), not ASICs (Application-Specific Integrated Circuits).
• We consider the impact of encoding different key ranges in rules, for configurations that differ in search key length and number of rules.
Related Work
• Hardware-assisted packet classification
  • Decision tree
    • The hierarchically split rule pattern constrains incremental updates.
  • Decomposition
    • The cross-producting stage is a known issue.
  • Exhaustive search
    • Predictable memory requirements.
TCAM Emulation
(Figure: native TCAM compared with emulated TCAM)
RAM-based TCAM Architecture
• The m-bit key (here m = 10) is split into sub-keys of w bits; each sub-key addresses one RAM block of 2^w entries.
• w = m = 10 (full address expansion): m/w = 1 RAM block, block size = 2^w = 2^10 (0 ~ 2^10-1)
• w = m-1 = 9: m/w = 10/9 = 1 RAM block, block size = 2^w = 2^9 (0 ~ 2^9-1)
• w = m-2 = 8: m/w = 10/8 = 1 RAM block, block size = 2^w = 2^8 (0 ~ 2^8-1)
• w = 2: m/w = 10/2 = 5 RAM blocks, block size = 2^w = 2^2 = 4 (0 ~ 3)
• w = 1: m/w = 10/1 = 10 RAM blocks, block size = 2^w = 2 (0 ~ 1)
• BRAM demand: (m/w) * 2^w bits
• BRAM modes = depth * width
• (Figure labels: full address expansion, native TCAM)
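The block count and the (m/w) * 2^w expression can be tabulated with a short Python sketch. The function name is hypothetical, and the sketch only accepts values of w that divide m evenly, so it covers the w = 1, 2 and 10 cases from the slide and leaves the others aside.

```python
def emulated_tcam_memory(m: int, w: int) -> dict:
    """Evaluate the slide's (m/w) * 2^w expression for an m-bit key.

    Assumption: w divides m evenly, so every sub-key is exactly w bits.
    """
    if m % w != 0:
        raise ValueError("this sketch assumes w divides m")
    blocks = m // w   # number of RAM blocks
    depth = 1 << w    # entries per block, addresses 0 .. 2^w - 1
    return {"blocks": blocks, "depth": depth, "bits": blocks * depth}


for w in (1, 2, 5, 10):
    print(w, emulated_tcam_memory(10, w))
```

Small sub-keys need many shallow blocks, while w = m needs a single block whose depth grows as 2^m.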
RAM-based TCAM Architecture (cont.)
• Example: 16-bit key, n = 64, m = 16
• Full address expansion: 2^16*64
• w = 6: m/w = 16/6 = 2 RAM blocks, block size = 2^w = 2^6 = 64
• BRAM configuration shown: 2^8*32*4
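How a set of RAM blocks can stand in for a TCAM is sketched below in Python. This is an illustrative model, not the design evaluated in the talk: each block stores one n-bit vector per sub-key value, a lookup ANDs the vectors read from all blocks, and the lowest surviving rule index wins. The sketch assumes w divides m and uses w = 8 for a 16-bit key; all names are hypothetical.

```python
from typing import List, Optional, Tuple

Rule = Tuple[int, int]  # (data, mask); mask bit 1 = compare, 0 = don't care


def build_blocks(rules: List[Rule], m: int, w: int) -> List[List[int]]:
    """Precompute one RAM block per w-bit sub-key.

    blocks[b][addr] is an n-bit vector; bit r is set when rule r accepts
    the value `addr` in sub-key position b.
    """
    assert m % w == 0, "sketch assumes w divides m"
    chunk = (1 << w) - 1
    blocks = []
    for b in range(m // w):
        shift = b * w
        block = []
        for addr in range(1 << w):
            vec = 0
            for r, (data, mask) in enumerate(rules):
                d = (data >> shift) & chunk
                k = (mask >> shift) & chunk
                if (addr ^ d) & k == 0:
                    vec |= 1 << r
            block.append(vec)
        blocks.append(block)
    return blocks


def lookup(blocks: List[List[int]], key: int, w: int, n: int) -> Optional[int]:
    """Read one vector per block, AND them, and pick the lowest rule index."""
    chunk = (1 << w) - 1
    vec = (1 << n) - 1
    for b, block in enumerate(blocks):
        vec &= block[(key >> (b * w)) & chunk]
    if vec == 0:
        return None
    return (vec & -vec).bit_length() - 1  # index of the lowest set bit


# Hypothetical rules for a 16-bit key, most specific first.
rules = [(0xABCD, 0xFFFF), (0xAB00, 0xFF00), (0x0000, 0x0000)]
blocks = build_blocks(rules, m=16, w=8)
print(lookup(blocks, 0xAB12, w=8, n=len(rules)))
```

Each block here has 2^8 entries of n bits, so the two blocks together hold (m/w) * 2^w * n bits for n rules.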
Performance Evaluation
• Resource Utilization
  • A TCAM bit typically demands 16 transistors, while a RAM bit demands only 6.
  • TCAM => w*m*16
  • TCAM emulation => (m/w)*(2^w)*6
Performance Evaluation (cont.)
• Memory demand of the emulated TCAM: (m/w)*(2^w) bits
Performance Evaluation (cont.)
• Classification Throughput
  • A crucial factor when evaluating emulated TCAM performance on an FPGA is the actual classification throughput, measured in packets per second (pps).
Performance Evaluation (cont.)
• Range Impact
  • We assess the impact of supporting different ranges on memory requirements and classification rate.
Conclusion
• Classification rates above 300 Mpps for both large keys and large rule sets can be achieved with only a few megabits of RAM when range intervals are at most medium-sized (512-2048).
• Supporting both large ranges and large rule sets tends to demand considerable memory, which also penalizes the resulting classification rate.
Thank you! The End.