Space-Time Tradeoffs in Software-based Deep Packet Inspection. Author : Anat Bremler-Barr, Yotam Harchol, and David Hay Published in Proc. IEEE HPSR 2011. Goal. Software based DPI AC based (Exact Matching) Reduced memory size Fit in CPU cache Worst case throughput. Aho-Corasick.
Space-Time Tradeoffs in Software-based Deep Packet Inspection
Authors: Anat Bremler-Barr, Yotam Harchol, and David Hay
Published in Proc. IEEE HPSR 2011
Goal
• Software-based DPI
• AC-based (exact matching)
• Reduced memory size
• Fit in CPU cache
• Worst-case throughput
Aho-Corasick
Given a state s:
• Depth(s): Depth(S4) = 2, Depth(S13) = 3
• Label(s): Label(S4) = BD, Label(S13) = BCA, Label(S12) = CDBCAB
• Forward transitions lead to deeper states
• Failure transitions (failure transitions to S0 are omitted from the figure)
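The definitions above can be made concrete with a minimal Python sketch of an Aho-Corasick automaton. The pattern set used here (BD, BCA, CDBCAB) is an assumption taken from the labels on the slide, not the slide's actual pattern set, so the state numbers will not match S4, S12, or S13. The names `build_ac` and `search` are hypothetical.

```python
from collections import deque

def build_ac(patterns):
    """Build an Aho-Corasick automaton as parallel lists indexed by state id."""
    goto = [{}]       # forward transitions: goto[s][symbol] -> deeper state
    fail = [0]        # failure transition of each state
    depth = [0]       # Depth(s): number of symbols on the path from the root
    label = [""]      # Label(s): the symbols on that path
    accept = [set()]  # patterns recognized on entering each state

    # Phase 1: insert every pattern into the trie.
    for p in patterns:
        s = 0
        for ch in p:
            if ch not in goto[s]:
                goto[s][ch] = len(goto)
                goto.append({})
                fail.append(0)
                depth.append(depth[s] + 1)
                label.append(label[s] + ch)
                accept.append(set())
            s = goto[s][ch]
        accept[s].add(p)

    # Phase 2: breadth-first pass. fail[t] is the state whose label is the
    # longest proper suffix of Label(t) that is also a label in the trie.
    queue = deque(goto[0].values())
    while queue:
        s = queue.popleft()
        for ch, t in goto[s].items():
            f = fail[s]
            while f and ch not in goto[f]:
                f = fail[f]
            fail[t] = goto[f].get(ch, 0)
            accept[t] |= accept[fail[t]]
            queue.append(t)
    return goto, fail, depth, label, accept

def search(text, automaton):
    """Return (start_index, pattern) for every match in text."""
    goto, fail, _, _, accept = automaton
    s, hits = 0, []
    for i, ch in enumerate(text):
        while s and ch not in goto[s]:
            s = fail[s]            # failure transitions do not consume input
        s = goto[s].get(ch, 0)     # forward transition, or stay at the root
        for p in sorted(accept[s]):
            hits.append((i - len(p) + 1, p))
    return hits

automaton = build_ac(["BD", "BCA", "CDBCAB"])
print(search("CDBCAB", automaton))
```

In this sketch the state labeled CDBCA fails to the state labeled BCA, which is why BCA is reported while CDBCAB is still being matched.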
State Structure (1/3): Lookup Table Format
• The lookup-table format is used for states with more than 64 forward transitions.
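State Structure (2/3): Linear Format
[Figure: three states in linear format; the state in parentheses is the failure state]
S4 (S0): no forward transitions
S5 (S7): D -> S6
S2 (S0): C -> S5, D -> S4, E -> S3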
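State Structure (3/3): Bitmap Format
[Figure: two states in bitmap format; the state in parentheses is the failure state]
S5 (S7): D -> S6; bitmap 00010; pointers S6, S7
S2 (S0): C -> S5, D -> S4, E -> S3; bitmap 00111; pointers S5, S4, S3, S0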
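To illustrate how a next state might be looked up in the linear and bitmap formats, here is a small Python sketch using the states S2 and S5 from the figures. It assumes a five-symbol alphabet A to E (inferred from the 5-bit bitmaps) and keeps bitmaps as strings for readability; the function names are hypothetical, and the real encoding packs these fields into memory words.

```python
ALPHABET = "ABCDE"  # assumption: one bitmap bit per symbol, leftmost bit = A

def linear_next(pairs, failure, symbol):
    """Linear format: scan the (symbol, next_state) pairs in order."""
    for sym, nxt in pairs:
        if sym == symbol:
            return nxt
    return failure  # no forward transition on this symbol

def bitmap_next(bitmap, pointers, failure, symbol):
    """Bitmap format: a set bit means the symbol has a forward transition.

    Pointers are stored only for set bits, so the pointer index is the
    number of set bits to the left of the symbol's bit.
    """
    idx = ALPHABET.index(symbol)
    if bitmap[idx] != "1":
        return failure
    rank = bitmap[:idx].count("1")
    return pointers[rank]

# State S2 (failure S0) in both formats.
print(linear_next([("C", "S5"), ("D", "S4"), ("E", "S3")], "S0", "D"))
print(bitmap_next("00111", ["S5", "S4", "S3"], "S0", "D"))
```

The linear format costs a scan proportional to the number of transitions, while the bitmap format pays a fixed bitmap per state in exchange for a constant-time index computation.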
Path-Compression (1/3)
• One-way branch states are compressed into a single state.
• Problems:
  • Incoming failure transitions
  • Outgoing failure transitions
• Solutions:
  • No incoming failure transition into a compressed state is allowed
  • A compressed state holds multiple outgoing failure-transition fields, one per symbol of the path
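Path-Compression (2/3)
[Figure: a one-way chain before and after compression]
Before: Sa -A-> Sb -B-> Sc -C-> Sd
  Sa: A, Sb; failure *, Sx
  Sb: B, Sc; failure *, Sy
  Sc: C, Sd; failure *, Sz
After: Sa -ABC-> Sd
  Sa: path ABC; length 3, next state Sd; failures A, Sx; B, Sy; C, Sz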
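The compressed state in the figure can be sketched as a small Python structure. This is an illustration under assumptions, not the authors' memory layout: `CompressedState`, `compress_chain`, and `traverse` are hypothetical names, and states are represented by their names as strings.

```python
from dataclasses import dataclass
from typing import List

@dataclass
class CompressedState:
    path: str            # symbols of the merged one-way chain, e.g. "ABC"
    next_state: str      # state reached once the whole path has matched
    failures: List[str]  # failures[i]: failure target used when path[i] mismatches

def compress_chain(chain, next_state):
    """Merge a one-way chain given as (expected_symbol, failure_target) pairs."""
    return CompressedState(
        path="".join(sym for sym, _ in chain),
        next_state=next_state,
        failures=[f for _, f in chain],
    )

def traverse(state, text, pos):
    """Match the path against text starting at pos.

    Returns (state_after, new_pos). On a mismatch at the i-th path symbol the
    i-th failure field is followed and the mismatching symbol is not consumed.
    If the input ends inside the path, the compressed state itself is returned.
    """
    for i, expected in enumerate(state.path):
        if pos + i >= len(text):
            return state, len(text)
        if text[pos + i] != expected:
            return state.failures[i], pos + i
    return state.next_state, pos + len(state.path)

sa = compress_chain([("A", "Sx"), ("B", "Sy"), ("C", "Sz")], "Sd")
print(traverse(sa, "ABC", 0))   # whole path matches
print(traverse(sa, "ABD", 0))   # mismatch on the third symbol
```

Keeping one failure field per path symbol is what lets a single compressed state stand in for Sa, Sb, and Sc without losing their individual failure transitions.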
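Path-Compression (3/3): Tuck et al. (INFOCOM 2004)
[Figure: before and after path compression in the scheme of Tuck et al.
Before: the chain Sa -A-> Sb -B-> Sc -C-> Sd with failure transitions to Sx, Sy, Sz, and the chain Si -T-> Sj -A-> Sk with failure transitions involving Sb, Sp, Sq.
After: Sa is compressed to path ABC (3, Sd) with failures A, Sx; B, Sy; C, Sz, and Si is compressed to path TA (2, Sk); one pointer in the compressed version is marked "???".]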
Aho-Corasick Path Compression: Before and After
[Figure: the automaton before and after path compression, traversed on the texts CDBCAB and CDBCAA]
Leaves-Compression
• Trie leaves contain only a failure transition.
• One bit is added to each forward transition to indicate that it leads to an accepting state.
• The process can be applied recursively.
[Figure: the chain Sa -A-> Sb -B-> Sc, where Sc is a leaf]
Original: Sa: A, Sb; Sb: B, Sc; Sc: *, Sx
1st process: Sa: A, Sb, 0; Sb: B, Sx, 1
2nd process: Sa: AB, Sx, 1
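One pass of the leaf elimination described above might look like the following Python sketch. It covers only the first process in the figure; the recursive second process, which also merges symbols into a path, is not modeled. The function name `compress_leaves` and the dictionary-based representation are assumptions made for illustration.

```python
def compress_leaves(goto, fail, accepting):
    """Remove leaf states and redirect their incoming transitions.

    goto:      {state: {symbol: target}} forward transitions
    fail:      {state: failure target}
    accepting: set of accepting states
    Returns {state: {symbol: (target, accept_bit)}} without the leaves.
    A transition into a leaf is redirected to the leaf's failure target, and
    its accept bit records that the removed leaf was an accepting state.
    """
    leaves = {s for s, edges in goto.items() if not edges}
    compressed = {}
    for s, edges in goto.items():
        if s in leaves:
            continue
        compressed[s] = {}
        for sym, target in edges.items():
            bit = 1 if target in accepting else 0
            new_target = fail[target] if target in leaves else target
            compressed[s][sym] = (new_target, bit)
    return compressed

# The chain from the figure: Sc is a leaf whose failure transition goes to Sx.
goto = {"Sa": {"A": "Sb"}, "Sb": {"B": "Sc"}, "Sc": {}}
fail = {"Sa": "S0", "Sb": "S0", "Sc": "Sx"}
print(compress_leaves(goto, fail, accepting={"Sc"}))
```

The result matches the first process on the slide: Sa keeps (A, Sb, 0) and Sb now holds (B, Sx, 1).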
Use both techniques
• Add one bit for every symbol of a compressed path.
[Figure: states S0, Sa, Sb, Sc, Sd, Sp, Sq; a compressed path with per-symbol bits B, 0; C, 1; D, 1]
• Set the bit of the i-th symbol when either:
  (1) the transition on the first i symbols of the path leads to an accepting state, or
  (2) the failure transition of the pre-compressed state reached after the first i symbols of the path leads to a leaf.
Pointer Compression
• Many transitions go to states whose depth is small.
• 31% of the failure transitions go to depth-1 states.
• An additional 35% of the failure transitions go to depth-2 states.
Variable-Size Pointers
• Two pointer lengths: 2 bits and 2 + log2|S| bits
• 00: go to state S0
• 01: go to a depth-1 state (the state reached from S0 on the current symbol)
• 10: go to a depth-2 state (the state reached from S0 on the last symbol followed by the current symbol); only a few symbol pairs are valid, so the state is found by hashing
• 11: go to the state given by the regular log2|S|-bit pointer that follows
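The encoding above can be sketched in Python as follows. Bit strings stand in for packed bits, the lookup tables `depth1` and `depth2` are hypothetical (a plain dict plays the role of the hash table for depth-2 states), and resolving depth-1 and depth-2 states from the current and last symbols follows the slide's wording rather than a verified implementation.

```python
def encode_pointer(target, depth, num_states):
    """Encode a pointer to state `target` of the given depth as a bit string."""
    if depth == 0:
        return "00"   # S0
    if depth == 1:
        return "01"   # recovered from the current symbol
    if depth == 2:
        return "10"   # recovered from the last and current symbols
    # Regular pointer: 2-bit tag followed by enough bits to name any state.
    width = max(1, (num_states - 1).bit_length())
    return "11" + format(target, "0{}b".format(width))

def resolve_pointer(code, current, last, depth1, depth2):
    """Decode a pointer back to a state id.

    depth1: {symbol: state} for depth-1 states
    depth2: {(last_symbol, current_symbol): state} for valid depth-2 pairs
    """
    tag = code[:2]
    if tag == "00":
        return 0
    if tag == "01":
        return depth1[current]
    if tag == "10":
        return depth2[(last, current)]
    return int(code[2:], 2)

# Hypothetical automaton with 16 states.
depth1 = {"B": 1}
depth2 = {("B", "D"): 4}
code = encode_pointer(13, 3, 16)
print(code, resolve_pointer(code, "A", "C", depth1, depth2))
```

With the figures quoted on the previous slide, at least 66% of failure transitions (31% to depth 1 plus 35% to depth 2) would need only the 2-bit form under this scheme.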
Huffman Coding
• Huffman coding allocates short codes to frequent symbols and long codes to infrequent ones.
• A lookup table provides the symbol-to-Huffman-code conversion.
• This idea is not used.
Evaluation Environment
Two environments:
• Core 2 Duo, 2.53 GHz (2 cores), 32 KB L1, 3 MB L2
• Core i7, 2.93 GHz (4 cores), 32 KB L1, 256 KB L2, 8 MB L3
Evaluation
Pattern sets:
• Snort
• ClamAV (partial)
Traffic:
• DARPA (real-life)
• Exhaustive traversal
• Failure-path traversal (worst case)