
A PACKET CLASSIFIER USING LUT CASCADES BASED ON EVMDDS (K)

Authors: Hiroki Nakahara, Tsutomu Sasao, Munehiro Matsuura. Publisher: FPL 2013. Presenter: Pei-Hua Huang. Date: 2013/11/06.

Presentation Transcript


  1. A PACKET CLASSIFIER USING LUT CASCADES BASED ON EVMDDS (K) • Authors: Hiroki Nakahara, Tsutomu Sasao, Munehiro Matsuura • Publisher: FPL 2013 • Presenter: Pei-Hua Huang • Date: 2013/11/06

  2. INTRODUCTION • With the rapid increase of traffic, core routers dissipate the major part of the total network power • With the rapid growth of the Internet, packet classifiers have become the bottleneck in network traffic management • Contributions • The design uses the available FPGA resources effectively • The throughput is more than 300 Gbps

  3. 5-tuple Packet Classification • A packet classification table consists of a set of rules. Each rule has five input fields: source address (SA), destination address (DA), source port (SP), destination port (DP), and protocol number (PRT) • Example 2.1: SA = 0000, DA = 1010, SP = 8, DP = 8, and PRT = TCP
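
To make the 5-tuple lookup concrete, here is a minimal sketch of matching a header against a rule table by linear search. The slide does not spell out the matching semantics, so several things here are assumptions: each field of a rule is a closed interval, the first matching rule wins, and 0 means "no rule matched". The names `classify`, `Rule`, and the toy catch-all rule are hypothetical.

```python
from typing import List, Tuple

Interval = Tuple[int, int]  # closed interval [lo, hi] (assumed rule encoding)
Rule = Tuple[Interval, Interval, Interval, Interval, Interval]  # SA, DA, SP, DP, PRT

def classify(rules: List[Rule], header: Tuple[int, int, int, int, int]) -> int:
    """Return the 1-based number of the first matching rule, or 0 if none matches."""
    for number, rule in enumerate(rules, start=1):
        if all(lo <= value <= hi for (lo, hi), value in zip(rule, header)):
            return number
    return 0

TCP = 6  # IANA protocol number for TCP

rules = [
    # Example 2.1 as an exact-match rule: SA = 0000, DA = 1010, SP = 8, DP = 8, PRT = TCP
    ((0b0000, 0b0000), (0b1010, 0b1010), (8, 8), (8, 8), (TCP, TCP)),
    # Hypothetical catch-all over toy 4-bit addresses
    ((0, 15), (0, 15), (0, 65535), (0, 65535), (0, 255)),
]

print(classify(rules, (0b0000, 0b1010, 8, 8, TCP)))
```

A linear search like this is only a reference model for checking results; the hardware described in the slides replaces it with table lookups.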

  4. Decomposition of Packet Classification Table by Cartesian Product Method • Let p be the number of rules. Since $|X_{SA}| = |X_{DA}| = 32$, $|X_{SP}| = |X_{DP}| = 16$, and $|X_{PRT}| = 8$, the direct memory realization requires $2^{104} \lceil \log_2(p+1) \rceil$ bits, which is too large to implement.
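
The size of the direct realization can be checked with a few lines of arithmetic. This sketch assumes the memory has one word per 104-bit input combination (32 + 32 + 16 + 16 + 8) and that each word stores a rule number in ⌈log2(p+1)⌉ bits; the function name `direct_memory_bits` is hypothetical.

```python
def direct_memory_bits(num_rules: int, input_bits: int = 32 + 32 + 16 + 16 + 8) -> int:
    """Bits needed for a single memory addressed by all header bits."""
    # For p >= 0, p.bit_length() equals ceil(log2(p + 1)) and avoids float rounding.
    word_width = num_rules.bit_length()
    return (1 << input_bits) * word_width

# 9,816 is the ACL rule count used later in the experiments.
bits = direct_memory_bits(9816)
print(f"{bits:.3e} bits")
```

Even for a single rule the table would need 2^104 bits, which is why the slides turn to decomposition.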

  5. Decomposition of Packet Classification Table by Cartesian Product Method • An entry of a rule can be represented by an interval function [16]:

  6. Vectorized interval function • For each value of H(X), we assign a segment • A field function F(X) generates a unique integer index $I_i$ corresponding to the i-th segment $[C_i, D_i]$ satisfying $C_i \le X \le D_i$
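
A field function can be sketched in software as a binary search over segment boundaries. This is only an illustration, not the hardware realization from the paper; it assumes the segments are closed, sorted, and non-overlapping, and the names `make_field_function` and `field_function` are hypothetical.

```python
from bisect import bisect_right
from typing import Callable, List, Tuple

def make_field_function(segments: List[Tuple[int, int]]) -> Callable[[int], int]:
    """Build F(X) for sorted, non-overlapping closed segments [C_i, D_i]."""
    starts = [c for c, _ in segments]

    def field_function(x: int) -> int:
        # Index of the last segment whose start C_i is <= x.
        i = bisect_right(starts, x) - 1
        if i < 0 or x > segments[i][1]:
            raise ValueError(f"{x} is not covered by any segment")
        return i

    return field_function

F = make_field_function([(0, 3), (4, 9), (10, 15)])
print(F(0), F(4), F(15))
```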

  7. Cartesian product function $G : Y \to Z$, where $Y = I_1 \times I_2 \times \cdots \times I_k$ is the set of Cartesian products of indices generated by the field functions
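
The Cartesian product function can be pictured as a table keyed by one segment index per field. The sketch below builds such a table for a toy two-field example. It assumes segments are cut at rule boundaries, so each segment lies entirely inside or entirely outside every rule interval, and that the lowest-numbered matching rule wins with 0 for "no match". All names and the sample data are hypothetical.

```python
from itertools import product
from typing import Dict, List, Tuple

Interval = Tuple[int, int]

def build_cartesian_table(field_segments: List[List[Interval]],
                          rules: List[List[Interval]]) -> Dict[Tuple[int, ...], int]:
    """Map every tuple of segment indices to a rule number (0 = no match)."""
    table = {}
    for indices in product(*(range(len(segs)) for segs in field_segments)):
        table[indices] = 0
        for number, rule in enumerate(rules, start=1):
            # A rule matches when each chosen segment lies inside the rule's interval.
            if all(lo <= segs[i][0] and segs[i][1] <= hi
                   for segs, i, (lo, hi) in zip(field_segments, indices, rule)):
                table[indices] = number
                break
    return table

field_segments = [[(0, 3), (4, 7)], [(0, 1), (2, 7)]]
rules = [[(4, 7), (0, 1)], [(0, 7), (2, 7)]]
G = build_cartesian_table(field_segments, rules)
print(G)
```

The table has one entry per element of Y, so its size is the product of the segment counts of all fields, which is why reducing the number of segments matters.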

  8. LUT CASCADE BASED ON MTMDD (K)

  9. LUT CASCADE BASED ON MTMDD (K)

  10. LUT CASCADE BASED ON MTMDD (K) • The amount of memory for $\mathrm{LUT}_i$ based on an MTMDD(k) is $r_i \cdot 2^{k + r_{i+1}}$ bits • The total amount of memory for an LUT cascade is $M = \sum_i r_i \cdot 2^{k + r_{i+1}}$ • Example 3.3: The amount of memory for the LUT cascade is $2^2 \times 2 + 2^4 \times 3 = 56$ bits
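
The memory total can be evaluated mechanically once the rail widths are known. This sketch assumes the per-LUT cost is r_i · 2^(k + r_{i+1}), with r_i taken as the output width of LUT_i and r_{i+1} as the number of rail inputs it receives; that indexing is an interpretation of the slide, and `cascade_memory_bits` is a hypothetical name.

```python
from typing import List

def cascade_memory_bits(k: int, rails: List[int]) -> int:
    """Sum r_i * 2**(k + r_{i+1}) over the cascade.

    rails = [r_1, ..., r_u, r_{u+1}], where the final entry is the rail input
    of the first LUT in the data path (0 when it has no rail input).
    """
    return sum(rails[i] * 2 ** (k + rails[i + 1]) for i in range(len(rails) - 1))

# Two LUTs with k = 2: one with 4 address bits and 3 output bits,
# one with 2 address bits and 2 output bits.
print(cascade_memory_bits(2, [3, 2, 0]))
```

With k = 2 and rails [3, 2, 0], the two summands are the 2^4 × 3 and 2^2 × 2 terms of Example 3.3.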

  11. Partition of Rules by Greedy Algorithm • To reduce the number of segments, we partition rules into subrules • Let [x, y] be an entry for a field. Then, y−x is the size of the interval

  12. LUT CASCADE BASED ON AN EVMDD (K) • To reduce the amount of memory for an LUT cascade, we introduce an LUT cascade based on an edge-valued multivalued decision diagram (EVMDD (k))

  13. LUT CASCADE BASED ON AN EVMDD (K)

  14. LUT CASCADE BASED ON AN EVMDD (K)

  15. Let $|X| = n$ be the number of inputs, and $k = |X_i|$. The LUT cascade has $u = \lceil n/k \rceil$ LUTs • Increasing k increases the amount of memory but decreases the number of adders
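
The relation u = ⌈n/k⌉ corresponds to cutting the n-bit input into k-bit groups, one per LUT. The sketch below shows that split; zero-padding on the most significant side when k does not divide n is an assumption, and `split_into_groups` is a hypothetical helper.

```python
from typing import List

def split_into_groups(x: int, n: int, k: int) -> List[int]:
    """Split an n-bit value into u = ceil(n / k) groups of k bits, MSB first."""
    u = -(-n // k)        # integer ceiling of n / k
    padded_bits = u * k   # x is treated as zero-extended to u * k bits
    mask = (1 << k) - 1
    return [(x >> (padded_bits - k * (i + 1))) & mask for i in range(u)]

print(split_into_groups(0b101101, n=6, k=2))
print(len(split_into_groups(0, n=32, k=5)))
```

A larger k gives fewer, wider groups and therefore fewer stages, at the cost of larger tables per stage.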

  16. EXPERIMENTAL RESULTS • Implementation setup • Virtex-7 VC707 evaluation board (FPGA: Xilinx XC7VX485T-2FFG, 75,900 Slices, 1,030 36 Kb BRAMs, and 2,800 DSP48E blocks) • Xilinx PlanAhead version 14.4 is used for synthesis • $\mathrm{LUT}_i$ size • >= 36 Kb: implemented by 36 Kb BRAMs • < 36 Kb: implemented by distributed RAMs using Slices • 9,816 ACL rules generated by ClassBench [21] are partitioned into two: Subrule 1 (9,600 rules) and Subrule 2 (216 rules)

  17. EXPERIMENTAL RESULTS • The packet classifier is realized by three different methods: • A single memory • LUT cascades based on MTMDDs (k) • LUT cascades based on EVMDDs (k)

  18. EXPERIMENTAL RESULTS

  19. EXPERIMENTAL RESULTS

  20. Consumes 2,024 Slices (6.7%), 37 BRAMs (3.6%), and 105 DSP48E blocks (3.8%) • The system throughput is 0.54 (GHz) × 2 (ports) × 320 (bits) = 345.60 Gbps for the minimum packet size (40 bytes)
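
The throughput figure can be reproduced with a short calculation. This sketch assumes one lookup per port per clock cycle and reads the 0.54 clock figure as 0.54 GHz (540 MHz), which is what the stated 345.60 Gbps requires; `throughput_gbps` is a hypothetical helper.

```python
def throughput_gbps(clock_mhz: float, ports: int, packet_bytes: int) -> float:
    """Throughput in Gbps when each port classifies one packet per clock cycle."""
    bits_per_packet = packet_bytes * 8
    return clock_mhz * 1e6 * ports * bits_per_packet / 1e9

# 540 MHz clock, dual-port memories, 40-byte minimum packets
print(throughput_gbps(540, 2, 40))
```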

  21. EXPERIMENTAL RESULTS
