
SSA: A Power and Memory Efficient Scheme to Multi-Match Packet Classification

Fang Yu (1), T. V. Lakshman (2), Marti Austin Motoyama (1), Randy H. Katz (1). (1) EECS Department, UC Berkeley; (2) Bell Laboratories, Lucent Technologies.


Presentation Transcript


  1. SSA: A Power and Memory Efficient Scheme to Multi-Match Packet Classification • Fang Yu (1), T. V. Lakshman (2), Marti Austin Motoyama (1), Randy H. Katz (1) • (1) EECS Department, UC Berkeley; (2) Bell Laboratories, Lucent Technologies

  2. Outline • Introduction to multi-match classification • Multi-match classification using TCAM • May consume a large amount of TCAM memory • May consume high power • Set Splitting Algorithm (SSA) • A memory and power efficient scheme for multi-match classification • Simulation results • Conclusions

  3. Packet Classification • Single-Match classification • Assumption: all the filters are associated with priorities • Only the highest priority match matters • E.g., longest prefix match • [Figure: packet header and packet payload] • Multi-Match classification • Report all matching results • No priority among filters • Intrusion detection system: identify all the related rules • Also required by accounting applications

  4. Ternary-CAM (TCAM) • Fully associative memory: compare input string with all the entries in parallel • For multiple matches, report the index of the first match • Each cell takes one of three logic states • '0', '1', and '?' (don't care) • [Figure: TCAM array labeled with cell, entry, and width]
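The first-match behavior described on this slide can be sketched in software. This is only an illustrative model, assuming entries are written as strings over '0', '1', and '?'; the function names `ternary_match`, `tcam_lookup`, and `all_matches` are hypothetical, and a real TCAM compares all entries in parallel in hardware rather than looping.

```python
from typing import List, Optional, Sequence

def ternary_match(entry: str, key: str) -> bool:
    """True if every cell of the entry is '?' (don't care) or equals the key bit."""
    return len(entry) == len(key) and all(e in ("?", k) for e, k in zip(entry, key))

def tcam_lookup(entries: Sequence[str], key: str) -> Optional[int]:
    """Model of a TCAM lookup: index of the first matching entry, or None."""
    for index, entry in enumerate(entries):
        if ternary_match(entry, key):
            return index
    return None

def all_matches(entries: Sequence[str], key: str) -> List[int]:
    """What multi-match classification wants: every matching index."""
    return [i for i, entry in enumerate(entries) if ternary_match(entry, key)]

# Hypothetical 4-bit entries, for illustration only.
entries = ["10??", "1?0?", "????"]
first = tcam_lookup(entries, "1001")
every = all_matches(entries, "1001")
```

The gap between `first` and `every` is the problem the rest of the talk addresses: the hardware reports one index, while the application needs the whole list.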

  5. Challenges of Multi-match Classification using TCAM • Memory efficient • 9 Mbit to 18 Mbit TCAMs are priced at $200-$300 • Power efficient • Easy update • High speed • TCAM is fast (e.g., 4 ns); however, TCAM only returns the first match result • We want all the matching results within a few cycles • What about returning a bit vector of the matching results? • Processing the bit vector can take time if the bit vector is long • Not efficient, since the vector is sparse in most cases

  6. Previous Solution: Geometric Intersection-based Solution [Hot Interconnects '04] • Add additional intersection filters • High speed • Return all the matching results within one cycle • Memory efficient • Create ~10N intersection filters for the Snort rule set • May create O(N^F) intersection filters in the worst case • Energy efficient • Easily updatable

  7. Previous Solution: MUD [SIGCOMM '05] • Encode the index of the entry and include the encoded value in each TCAM entry as a discriminator field • Search the TCAM with the initial discriminator field set to all don't cares • After finding a matching result at index j, search again with discriminator field value '> j'

  8. Previous Solution: MUD (Cont.) • High speed • 1 + d + (k-2)*(d-1) = O(dk) TCAM lookups to get k matching results • d is the logarithm of the number of entries in the TCAM (d = log2 N) • Decreased to 1 + d*(k-1)/r with DIRPE, where r is smaller than d • Memory efficient • Energy efficient • All the entries in the TCAM are accessed on every lookup, so power consumption is high • Easily updatable • Our goal: find a memory and power efficient solution
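To get a feel for the lookup counts quoted on this slide, the two formulas can be evaluated directly. This is a small sketch, assuming d is rounded up to an integer and k is at least 2; the function names and the sample values of N, k, and r are hypothetical and not taken from the talk.

```python
import math

def mud_lookups(num_entries: int, k: int) -> int:
    """1 + d + (k-2)*(d-1) lookups for k matches, with d = log2(num_entries)."""
    d = math.ceil(math.log2(num_entries))
    return 1 + d + (k - 2) * (d - 1)

def mud_lookups_dirpe(num_entries: int, k: int, r: int) -> float:
    """1 + d*(k-1)/r lookups with DIRPE, as quoted on the slide."""
    d = math.ceil(math.log2(num_entries))
    return 1 + d * (k - 1) / r

# Hypothetical example: 1024 TCAM entries (d = 10), k = 4 matches, r = 5.
plain = mud_lookups(1024, 4)
with_dirpe = mud_lookups_dirpe(1024, 4, 5)
```

Every one of these lookups searches the whole TCAM, which is why the lookup count feeds directly into the power argument.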

  9. Observation • Split filters into two sets to reduce intersections • Report the union of results from all sets • No need to include the intersections of filters from different sets • Decreases the number of filters in TCAM, which decreases power consumption • Increases the number of TCAM accesses • [Figure: regions matching F1, matching FN, and matching both F1 and FN. Original: N filters + O(N^2) intersections, 1 TCAM lookup. Two sets: N filters + 1 intersection, 2 TCAM lookups]

  10. Problem Definition • Given a set of filters F = {F1, F2, ..., FN} • The filters create a set of intersections I = {I1, I2, ..., IM} • e.g., I1 = intersection of (F1, F5, F6) • How to divide the filters into several sets such that • Residual intersection set I': intersections of filters in the same set • N + |I'| < TCAM size • The number of sets (TCAM accesses) is minimal • NP-hard problem!

  11. Split Rules into Two Sets • Still an NP-hard problem (known as maximum set splitting or maximum hypergraph cut) • Best known approximation algorithms • Yield a performance ratio of 0.72 to the optimum solution • Require quadratic programming, which is slow when the number of filters is large • Our SSA algorithm • Removes at least half of the intersections • O(NM) complexity, where N is the total number of filters and M is the total number of intersections

  12. Maximum Satisfiability Problem • A set of literals {F1, ¬F1, F2, ¬F2, ..., FN, ¬FN} • A set of clauses, each clause being a subset of the literals • E.g., C1 = {F1, F5, F6} • Goal: find an assignment of F that satisfies the maximum number of clauses

  13. Johnson's Algorithm for the Maximum Satisfiability Problem • Assign each clause c a weight of 2^(-|c|) • E.g., the weight of C1 = {F1, F5, F6} is 2^(-3) • Let Fi be any literal that has not been assigned a value yet • If the total weight of the clauses containing Fi is higher than that of the clauses containing ¬Fi • Assign Fi a true value and remove all clauses containing Fi • Multiply the weight of all the clauses containing ¬Fi by 2 • Otherwise • Assign Fi a false value and remove all clauses containing ¬Fi • Multiply the weight of all the clauses containing Fi by 2
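The greedy procedure on this slide can be sketched as follows. This is one possible reading rather than the authors' code: literals are encoded as signed integers (+i for Fi, -i for its negation), variables are processed in index order, ties go to false, and a clause whose literals have all become false is simply dropped. The names `johnson_maxsat` and `count_satisfied` are hypothetical.

```python
from fractions import Fraction
from typing import Dict, Iterable, List, Set

def johnson_maxsat(num_vars: int, clauses: Iterable[Set[int]]) -> Dict[int, bool]:
    """Greedy MAX-SAT assignment; +i means Fi, -i means not Fi (encoding is an assumption)."""
    # Each active clause carries its weight 2^(-|c|) and its unassigned literals.
    active = [[Fraction(1, 2 ** len(c)), set(c)] for c in clauses]
    assignment: Dict[int, bool] = {}
    for var in range(1, num_vars + 1):
        pos = sum(w for w, lits in active if var in lits)
        neg = sum(w for w, lits in active if -var in lits)
        true_lit = var if pos > neg else -var   # ties go to false, as on the slide
        assignment[var] = true_lit > 0
        survivors = []
        for w, lits in active:
            if true_lit in lits:
                continue                        # clause satisfied: remove it
            if -true_lit in lits:
                lits = lits - {-true_lit}       # this literal is now false
                w *= 2                          # so the clause weight doubles
                if not lits:
                    continue                    # nothing left: clause is unsatisfied
            survivors.append([w, lits])
        active = survivors
    return assignment

def count_satisfied(clauses: List[Set[int]], assignment: Dict[int, bool]) -> int:
    """Number of clauses with at least one true literal."""
    return sum(any(assignment[abs(l)] == (l > 0) for l in c) for c in clauses)

# All four 2-literal clauses over F1, F2: no assignment can satisfy more than 3.
clauses = [{1, 2}, {-1, 2}, {1, -2}, {-1, -2}]
assignment = johnson_maxsat(2, clauses)
satisfied = count_satisfied(clauses, assignment)
```

Exact fractions are used for the weights so that ties are detected exactly rather than through floating-point comparison.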

  14. Johnson's Theorem • If all the clauses have at least k literals • Johnson's algorithm satisfies at least a fraction (2^k - 1)/2^k of the total clauses • e.g., for k = 2, at least ¾ of the clauses are satisfied • It has been proved that (2^k - 1)/2^k is the best approximation bound for k > 2

  15. Filter Set Split Algorithm (SSA) • Convert the set splitting problem into a maximum satisfiability problem • Each filter corresponds to a literal • For any intersection (e.g., I1 = intersection of F1, F5, and F6), add two clauses • C = {F1, F5, F6} and C' = {¬F1, ¬F5, ¬F6} • The total number of clauses is 2M, where M is the number of intersections • Run Johnson's algorithm and assign each filter Fi either a true value (put in set one) or a false value (put in set two)
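The conversion described on this slide can be sketched end to end. This is a minimal illustration under stated assumptions, not the authors' implementation: filters are numbered 1..N, each intersection is given as the set of filter numbers involved (at least two), and the helper names `johnson` and `ssa_split` as well as the sample intersections are hypothetical. The greedy step processes filters in index order and breaks ties toward false.

```python
from fractions import Fraction
from typing import Dict, List, Set, Tuple

def johnson(num_vars: int, clauses: List[Set[int]]) -> Dict[int, bool]:
    """Johnson's greedy heuristic; +i stands for Fi, -i for its negation."""
    active = [[Fraction(1, 2 ** len(c)), set(c)] for c in clauses]
    assignment: Dict[int, bool] = {}
    for var in range(1, num_vars + 1):
        pos = sum(w for w, lits in active if var in lits)
        neg = sum(w for w, lits in active if -var in lits)
        true_lit = var if pos > neg else -var
        assignment[var] = true_lit > 0
        survivors = []
        for w, lits in active:
            if true_lit in lits:
                continue                       # satisfied clause is removed
            if -true_lit in lits:
                lits, w = lits - {-true_lit}, w * 2
                if not lits:
                    continue                   # clause can no longer be satisfied
            survivors.append([w, lits])
        active = survivors
    return assignment

def ssa_split(num_filters: int,
              intersections: List[Set[int]]) -> Tuple[Set[int], Set[int], List[Set[int]]]:
    """Split filters 1..num_filters into two sets; return both sets and the
    residual intersections (those whose filters all landed in one set)."""
    clauses: List[Set[int]] = []
    for inter in intersections:
        clauses.append(set(inter))             # C  = {F1, F5, F6}
        clauses.append({-f for f in inter})    # C' = {not F1, not F5, not F6}
    assignment = johnson(num_filters, clauses)
    set_one = {f for f, value in assignment.items() if value}
    set_two = {f for f, value in assignment.items() if not value}
    residual = [i for i in intersections if i <= set_one or i <= set_two]
    return set_one, set_two, residual

# Hypothetical intersections among six filters, for illustration only.
intersections = [{1, 5, 6}, {1, 2}, {2, 3}, {3, 4}]
set_one, set_two, residual = ssa_split(6, intersections)
```

Only the residual intersections still need entries in TCAM; to use more than two sets, the same split can be applied again to each resulting set.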

  16. Filter Set Split Algorithm (SSA) (cont.) • According to Johnson's theorem • At least ¾ of the clauses are satisfied: 2M * 3/4 = 1.5M • Each intersection contributes two clauses, so at least 0.5M of the intersections have both clauses satisfied • Suppose that for the intersection of F1, F5, and F6, both C = {F1, F5, F6} and C' = {¬F1, ¬F5, ¬F6} are satisfied • At least one of F1, F5, F6 is true and at least one is false • F1, F5, F6 are split into different sets, so this intersection does not need to be present in TCAM • At least 50% of the intersections are removed!
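The step from 1.5M satisfied clauses to 0.5M removed intersections is a counting argument, spelled out here. The symbols x (number of intersections with both clauses satisfied) and S (number of satisfied clauses) are introduced only for this sketch.

```latex
% x = number of intersections whose two clauses are both satisfied
% S = total number of satisfied clauses, out of 2M
\begin{align*}
S &\le 2x + (M - x) = M + x
  && \text{(the other $M - x$ intersections contribute at most one clause each)} \\
S &\ge \tfrac{3}{4}\cdot 2M = 1.5M
  && \text{(Johnson's theorem with $k = 2$)} \\
\Rightarrow\quad x &\ge 1.5M - M = 0.5M .
\end{align*}
```

The bound with k = 2 applies because every intersection involves at least two filters, so every clause has at least two literals.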

  17. Review of the SSA Scheme • High speed • Deterministic lookup rate. E.g., if filters are split into two sets, only 2 TCAM lookups per packet are needed. • Sets are logically independent, so lookups can be parallelized • Memory efficient • Guarantees the removal of at least 50% of the intersections each time the filter set is split into two sets • Energy efficient • Low memory requirement • Access each filter only once per packet • Easily updatable • A new filter can be inserted into the set in which it creates the fewest intersections
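The update rule in the last bullet can be sketched as a simple greedy choice. This is a rough illustration under loud assumptions: filters are modeled as ternary strings over '0', '1', and '?', and the number of existing filters that overlap the new one is used as a stand-in for the number of intersections it would create. The slide does not say how the count is computed, and the names `overlaps`, `choose_set`, and `insert_filter` are hypothetical.

```python
from typing import List

def overlaps(a: str, b: str) -> bool:
    """Two ternary filters overlap if no position has conflicting fixed bits."""
    return len(a) == len(b) and all(
        x == "?" or y == "?" or x == y for x, y in zip(a, b)
    )

def choose_set(filter_sets: List[List[str]], new_filter: str) -> int:
    """Index of the set in which the new filter overlaps the fewest filters."""
    costs = [sum(overlaps(new_filter, f) for f in s) for s in filter_sets]
    return costs.index(min(costs))   # ties go to the lowest-numbered set

def insert_filter(filter_sets: List[List[str]], new_filter: str) -> int:
    """Insert the filter into the cheapest set and return that set's index."""
    chosen = choose_set(filter_sets, new_filter)
    filter_sets[chosen].append(new_filter)
    return chosen

# Hypothetical 4-bit filters, for illustration only.
filter_sets = [["10??", "1?1?"], ["0???", "??00"]]
chosen = insert_filter(filter_sets, "01??")
```

Because the sets are looked up independently, the insert touches only one set and the other sets are left as they are.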

  18. Simulation Setup • Tests on the Snort rule header sets • Compare SSA with two TCAM-based solutions: • MUD • Geometric Intersection-based solution • Compare SSA with two representative software-based solutions: • HiCuts • EGT-PC • Evaluation metrics • Memory consumption • Lookup rate • Power consumption • Update cost

  19. Memory Usage • [Figure: total number of extra intersection filters in TCAMs] • [Figure: total number of TCAM entries used]

  20. Classification Speed • MUD • One packet may match up to 12 unique filters and requires a maximum of 20 TCAM lookups • Common packets such as HTTP packets match 4 unique filters and may require 5-9 TCAM lookups. A Napster packet requires 9 to 15 TCAM lookups • Geometric Intersection-based solution • 1 TCAM lookup per packet • SSA-2 • 2 TCAM lookups per packet • SSA-4 • 4 TCAM lookups per packet • If the average packet size is 402.7 bytes, SSA-4 operates at a 201.35 Gbps classification rate • In the worst case, if every packet is 40 bytes, SSA-4 achieves a 20 Gbps rate
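The two SSA-4 rates on this slide can be reproduced with a short calculation. This sketch assumes a 4 ns TCAM lookup (the example figure given on the Challenges slide) and four lookups performed one after another; the slide does not state either assumption, and the function name is hypothetical.

```python
def classification_rate_gbps(packet_bytes: float,
                             lookups_per_packet: int,
                             lookup_ns: float = 4.0) -> float:
    """Bits classified per nanosecond, which is numerically equal to Gbps."""
    bits_per_packet = packet_bytes * 8
    time_per_packet_ns = lookups_per_packet * lookup_ns   # sequential lookups
    return bits_per_packet / time_per_packet_ns

average_case = classification_rate_gbps(402.7, 4)   # about 201.35 Gbps
worst_case = classification_rate_gbps(40, 4)        # 20 Gbps
```

Under the same assumptions, halving the number of lookups per packet doubles the rate.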

  21. Update Cost • Update cost in terms of newly inserted filters

  22. Power Consumption • Energy used by a TCAM is linear in • The number of entries searched in parallel • The number of TCAM accesses per packet • Metric: total TCAM entries accessed per packet
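The metric on this slide is a sum over the lookups performed for one packet. The sketch below only shows how that sum is formed; the entry counts are made-up placeholders, not results from the talk, and the function name is hypothetical.

```python
from typing import Sequence

def entries_accessed_per_packet(entries_per_lookup: Sequence[int]) -> int:
    """Total TCAM entries searched for one packet, summed over its lookups."""
    return sum(entries_per_lookup)

# Hypothetical numbers, for illustration only (not measured data):
# a scheme that searches one 1000-entry TCAM five times per packet ...
repeated_full_search = entries_accessed_per_packet([1000] * 5)
# ... versus a two-set split that searches each set once.
two_set_split = entries_accessed_per_packet([600, 500])
```

A scheme that searches every entry on every lookup pays the full TCAM size multiplied by the lookup count, while a split scheme pays each set's size once.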

  23. Conclusions • SSA is a memory and power efficient solution to the multi-match classification problem • O(NM) complexity • Guaranteed to remove at least 50% of the intersections each time the filter set is split • Compared to MUD • Uses a similar amount of TCAM memory • Yields a 75% to 95% reduction in power consumption • Compared to the Geometric Intersection-based Solution • Uses 90% less TCAM memory and power • Requires one additional TCAM lookup per packet
