
Memory Efficient Regular Expression Search Using State Merging

Michela Becchi (Washington University in St. Louis) and Srihari Cadambi (NEC Laboratories America).


Presentation Transcript


  1. Memory Efficient Regular Expression Search Using State Merging
     Michela Becchi, Washington University in St. Louis
     Srihari Cadambi, NEC Laboratories America

  2. Context
     [Figure: incoming packets enter a matching engine holding a RegEx set (FTP.OPEN.*, www.spyware, Host=.*HTTP), which separates safe packets from malicious packets.]
     • Regular expression matching is a critical operation in networking
        • Intrusion detection
        • Context-based billing
        • Peer-to-peer traffic detection and prioritization
        • Application-level filtering
     • Challenge: perform regular expression matching at line rate
        • Processing time
        • Memory requirement (occupancy and bandwidth)

  3. Background
     • Two algorithmic solutions
        • Non-deterministic finite automata (NFAs)
           • High time complexity
           • Compact representation
        • Deterministic finite automata (DFAs)
           • Low time complexity
           • Potentially exponential number of states with respect to NFAs
     • Multiple implementation approaches
        • FPGA [Sidhu FCCM 2001, Clark 2003]
        • Software [Paxson 1998, Roesch 1999, Tuck 2004]
        • Custom hardware [Kumar 2006]
     • Problem: given a DFA, how to represent it compactly without violating the processing time bound
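To make the "potentially exponential number of states" bullet concrete, here is a small Python sketch that is not part of the slides. It runs subset construction on a textbook NFA for strings over {a, b} whose n-th symbol from the end is `a`. That NFA has n + 1 states. The function name `dfa_state_count` and the choice of example language are assumptions made purely for illustration.

```python
def dfa_state_count(n):
    """Count DFA states reachable by subset construction from an (n + 1)-state NFA.

    Assumed NFA: state 0 loops on 'a' and 'b' and also moves to state 1 on 'a';
    states 1..n-1 advance on any symbol; state n is accepting with no exits.
    """
    def step(states, ch):
        nxt = set()
        for s in states:
            if s == 0:
                nxt.add(0)
                if ch == "a":
                    nxt.add(1)
            elif s < n:
                nxt.add(s + 1)
        return frozenset(nxt)

    start = frozenset({0})
    seen = {start}
    work = [start]
    while work:
        cur = work.pop()
        for ch in "ab":
            nxt = step(cur, ch)
            if nxt not in seen:
                seen.add(nxt)
                work.append(nxt)
    return len(seen)


if __name__ == "__main__":
    for n in (2, 4, 8):
        print(n + 1, "NFA states ->", dfa_state_count(n), "DFA states")
```

For this family every subset of {1, ..., n} is reachable together with state 0, so the DFA needs 2^n states while the NFA needs only n + 1. An NFA simulation, in turn, has to update a whole set of active states per input character, which is the time cost the slide refers to.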

  4. In this paper
     • New method to compact a DFA, called state merging
     • Data structure to support state merging
     • Algorithm to perform state merging
     • Evaluation on real security rule-sets (from the Bro and Snort NIDS)

  5. Outline
     • The idea
     • The algorithm
     • The data structure
     • Experimental evaluation

  6. State Merging: the idea
     pattern: ((a[b-e][g-i])|(f[g-h]j))k+
     Input text: acjk
     [Figure: the DFA for the pattern (states 0 to 6) next to versions in which states 3 and 4 are merged into 3_4. One merged version is annotated "Non-equivalent Automata!"; the other carries labels on transitions entering the merged state (.0, .1) and conditions on transitions leaving it (/0, /1, /0,1).]
     • Common outgoing transitions are compressed
     • Input labels keep 1-step history information
     • Outgoing conditional transitions ensure functional equivalence
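The three bullets above can be sketched in Python as follows. This is an illustration under stated assumptions, not the authors' implementation: the DFA in `build_dfa` is my reconstruction from the slide (every state returns to state 1 on `a` and to state 2 on `f`, and any other unlisted character falls back to state 0), and the names `merge_states` and `run_merged` are hypothetical. Each transition into the merged state records a label (0 if it targeted state 3, 1 if it targeted state 4), and each transition out of the merged state lists the labels under which it may fire.

```python
ACCEPTING = {6}


def build_dfa():
    """Assumed DFA for ((a[b-e][g-i])|(f[g-h]j))k+; unlisted chars go to state 0."""
    dfa = {s: {"a": 1, "f": 2} for s in range(7)}
    dfa[1].update({c: 3 for c in "bcde"})
    dfa[2].update({c: 4 for c in "gh"})
    dfa[3].update({c: 5 for c in "ghi"})
    dfa[4]["j"] = 5
    dfa[5]["k"] = 6
    dfa[6]["k"] = 6
    return dfa


def run_dfa(dfa, text, start=0):
    """Return True if the plain DFA reaches an accepting state on `text`."""
    state = start
    for ch in text:
        state = dfa[state].get(ch, start)
        if state in ACCEPTING:
            return True
    return False


def merge_states(dfa, s0, s1, merged):
    """Merge s0 and s1. Each entry is (allowed_labels, target, label_on_entry)."""
    both = frozenset({0, 1})

    def entry(target, allowed):
        if target == s0:
            return (allowed, merged, 0)  # the ".0" label on the slide
        if target == s1:
            return (allowed, merged, 1)  # the ".1" label on the slide
        return (allowed, target, 0)      # label is irrelevant for unmerged targets

    table = {s: {c: [entry(t, both)] for c, t in trans.items()}
             for s, trans in dfa.items() if s not in (s0, s1)}
    table[merged] = {}
    for c in set(dfa[s0]) | set(dfa[s1]):
        t0, t1 = dfa[s0].get(c), dfa[s1].get(c)
        if t0 == t1:
            table[merged][c] = [entry(t0, both)]  # common transition, "/0,1"
        else:
            table[merged][c] = [entry(t, frozenset({lab}))  # conditional, "/0" or "/1"
                                for lab, t in ((0, t0), (1, t1)) if t is not None]
    return table


def run_merged(table, text, use_labels=True, start=0):
    """Run the merged automaton; use_labels=False mimics merging without labels."""
    state, label = start, 0
    for ch in text:
        nxt = (start, 0)  # default transition
        for allowed, target, new_label in table[state].get(ch, []):
            if not use_labels or label in allowed:
                nxt = (target, new_label)
                break
        state, label = nxt
        if state in ACCEPTING:
            return True
    return False


if __name__ == "__main__":
    dfa = build_dfa()
    merged = merge_states(dfa, 3, 4, "3_4")
    for text in ("acgk", "fhjk", "acjk"):
        print(text, run_dfa(dfa, text), run_merged(merged, text),
              run_merged(merged, text, use_labels=False))
```

On the slide's input `acjk`, the original DFA and the labeled automaton both reject, while the variant that ignores labels accepts, because `j` is only valid when the merged state was entered as state 4.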

  7. State Merging – selecting the states
     pattern: ((a[b-e][g-i])|(f[g-h]j))k+
     [Figure: the DFA for the pattern (states 0 to 6) and its space reduction graph over the same states.]
     • Bold edge has weight 3
     • Remaining edges have weight 2
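A possible way to compute such a graph is sketched below in Python. The slide does not define the edge weight, so this sketch assumes it is the number of distinct non-default targets two states have in common, which is my reading of the later remark about states sharing "one more target". The DFA is my reconstruction from the pattern, so the weights it produces may not match the figure exactly, and all function names are hypothetical.

```python
from itertools import combinations


def build_dfa():
    """Assumed DFA for ((a[b-e][g-i])|(f[g-h]j))k+; unlisted chars go to state 0."""
    dfa = {s: {"a": 1, "f": 2} for s in range(7)}
    dfa[1].update({c: 3 for c in "bcde"})
    dfa[2].update({c: 4 for c in "gh"})
    dfa[3].update({c: 5 for c in "ghi"})
    dfa[4]["j"] = 5
    dfa[5]["k"] = 6
    dfa[6]["k"] = 6
    return dfa


def target_sets(dfa):
    """Distinct non-default targets of every state."""
    return {s: set(trans.values()) for s, trans in dfa.items()}


def space_reduction_graph(tsets):
    """Assumed weight: number of targets shared by each pair of states."""
    return {frozenset((s, t)): len(tsets[s] & tsets[t])
            for s, t in combinations(tsets, 2)}


def merge_targets(tsets, s0, s1, merged):
    """Target sets after merging s0 and s1 into a state named `merged`."""
    def rename(t):
        return merged if t in (s0, s1) else t

    out = {s: {rename(t) for t in ts}
           for s, ts in tsets.items() if s not in (s0, s1)}
    out[merged] = {rename(t) for t in tsets[s0] | tsets[s1]}
    return out


if __name__ == "__main__":
    tsets = target_sets(build_dfa())
    before = space_reduction_graph(tsets)
    after = space_reduction_graph(merge_targets(tsets, 3, 4, "3_4"))
    print("weight(3, 4) before merging:", before[frozenset((3, 4))])
    print("weight(1, 2) before merging:", before[frozenset((1, 2))])
    print("weight(1, 2) after merging 3 and 4:", after[frozenset((1, 2))])
```

Under these assumptions the weight between states 1 and 2 grows once states 3 and 4 are merged, because both now point at the same merged state.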

  8. State Merging – selecting the states (cont’d)
     [Figure: the DFA after merging states 3 and 4 into 3_4, with labeled transitions ([b-e].0, [g-h].1, [g-i]/0, j/1, a/0,1, f/0,1), and the updated space reduction graph over states 0, 1, 2, 3_4, 5 and 6.]
     • States 1 and 2 now have one more target in common: the merged state 3_4
     • State merging can create new merging opportunities

  9. State Merging – selecting the states (cont’d)
     [Figure: the DFA after also merging states 1 and 2, leaving states 0, 1_2, 3_4, 5 and 6, with labeled transitions such as a.0, f.1, [b-e].0/0, [g-h].1/1, [g-i]/0 and j/1.]
     • Key point: labels can be reused
     • State merging stops when label overhead exceeds potential saving
     • Old and new DFA are functionally equivalent

  10. Outline
      • The idea
      • The algorithm
      • The data structure
      • Experimental evaluation

  11. A data structure to support state merging
      pattern: ((a[b-e][g-i])|(f[g-h]j))k+
      [Figure: the DFA for the pattern and the encoding of one of its states.]
      • Bitmap (256 bits): 0 … 0 1 1 1 1 1 1 0 … 0
      • Pointer indirection: 0 1 1 1 1 2 (one entry per 1 in the bitmap, log2(distinct targets) bits each)
      • Pointer indirection + label: log2(distinct targets) + log2(labels) bits per entry
      • Transition table: one 32-bit entry per distinct target; this is the potential saving through state merging
      Properties:
      • Bitmap: no replication of frequent transitions
      • Pointer indirection: no pointer replication within a state
      • Character-transition target decoupling
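The layout described on this slide can be sketched in Python as below. This is a minimal illustration under assumptions, not the authors' code: the class name `CompressedState`, the rank-by-popcount lookup, and the treatment of state 0 as the default target left out of the bitmap are my reading of the slide. The example encodes state 1 of a DFA reconstructed from the pattern (`a` to 1, `[b-e]` to 3, `f` to 2).

```python
import math


class CompressedState:
    """One DFA state as bitmap + pointer indirection + transition table (sketch)."""

    def __init__(self, transitions, default=0):
        self.default = default
        self.bitmap = 0        # 256-bit bitmap: bit c set if char c is non-default
        self.indirection = []  # one small index per set bit, in character order
        self.targets = []      # transition table: each distinct target stored once
        for ch in sorted(transitions):
            target = transitions[ch]
            if target == default:
                continue
            self.bitmap |= 1 << ord(ch)
            if target not in self.targets:
                self.targets.append(target)
            self.indirection.append(self.targets.index(target))

    def next_state(self, ch):
        bit = 1 << ord(ch)
        if not self.bitmap & bit:
            return self.default
        # Rank of this character = number of set bits below it in the bitmap.
        rank = bin(self.bitmap & (bit - 1)).count("1")
        return self.targets[self.indirection[rank]]

    def size_bits(self, pointer_bits=32):
        n = len(self.targets)
        index_bits = max(1, math.ceil(math.log2(n))) if n else 0
        return 256 + len(self.indirection) * index_bits + pointer_bits * n


if __name__ == "__main__":
    state1 = CompressedState({"a": 1, "b": 3, "c": 3, "d": 3, "e": 3, "f": 2})
    print("pointer indirection:", state1.indirection)
    print("transition table:", state1.targets)
    print("next state on 'c':", state1.next_state("c"))
    print("size in bits:", state1.size_bits())
```

With this encoding a state pays 32 bits only once per distinct target, so two states that share targets can share transition-table entries, which is where merging saves memory.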

  12. Data structure after state merging
      [Figure: the merged DFA (states 0, 1_2, 3_4, 5, 6). A merged state keeps two bitmaps (Bitmap 0 and Bitmap 1), each with its own "pointer indirection + label" array, and both point into one combined transition table.]
      • Saving: combined transition table
      • Overhead: labels
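The trade-off between saving and overhead can be estimated with a rough cost model, sketched here in Python. The accounting is an assumption of mine, not taken from the slides: both bitmaps are kept unchanged, every pointer-indirection entry of the merged state grows by log2(labels) bits, index widths follow the size of the combined table, and each distinct target costs one 32-bit entry. The function names `bits` and `merge_gain_bits` are hypothetical.

```python
import math


def bits(n):
    """Bits needed to index n distinct values (at least 1); n must be positive."""
    return max(1, math.ceil(math.log2(n)))


def merge_gain_bits(targets_a, entries_a, targets_b, entries_b,
                    n_labels=2, pointer_bits=32):
    """Estimated bits saved by merging two states (negative means a loss).

    targets_*: distinct non-default targets of each state.
    entries_*: number of pointer-indirection entries (1s in the bitmap).
    """
    ta, tb = set(targets_a), set(targets_b)
    union = ta | tb
    before = (entries_a * bits(len(ta)) + entries_b * bits(len(tb))
              + pointer_bits * (len(ta) + len(tb)))
    after = ((entries_a + entries_b) * (bits(len(union)) + bits(n_labels))
             + pointer_bits * len(union))
    return before - after


if __name__ == "__main__":
    # States 3 and 4 of the example DFA as I reconstruct it:
    # both reach targets {1, 2, 5}, with 5 and 3 non-default characters.
    print("merge 3 and 4:", merge_gain_bits({1, 2, 5}, 5, {1, 2, 5}, 3))
    # Two states with no target in common: labels cost more than is saved.
    print("no shared targets:", merge_gain_bits({1}, 1, {2}, 1))
```

A greedy merging loop could use such an estimate as its stopping test, merging a pair only while the estimated gain stays positive.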

  13. Outline
      • The idea
      • The algorithm
      • The data structure
      • Experimental evaluation

  14. State reduction: 20x

  15. Transition reduction: 1000x

  16. Memory requirement: 25x reduction

  17. Summary
      • Regular expression matching: critical operation in many networking applications
      • Two classical solutions: NFAs and DFAs
         • NFAs are slow; DFAs are fast but impractical
      • In this paper, we present a new method to compact a DFA, called state merging
         • Data structure and fast algorithm to support state merging
         • Evaluation on real security rule-sets (from the Bro and Snort NIDS)
            • 1000x reduction in number of transitions
            • 20x reduction in number of states
            • 25x memory reduction

  18. Questions?

  19. Experimental evaluation

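  20. State Merging: the Idea
      [Figure: generic example. States S1 and S2 have outgoing transitions on characters ci, cj, ck and cl, cm, cn to states Sx, Sy, Sw and Sz. They are merged into S1,2, which is entered with labels (c1.0, c2.1) and left through conditional transitions (ci/0, cl/1; cj/0, cm/1; ck/0; cn/1).]
      • Common outgoing transitions are compressed
      • Input labels keep 1-step history information
      • Outgoing conditional transitions ensure functional equivalence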

  21. A data structure to support state merging
      pattern: ((a[b-e][g-i])|(f[g-h]j))k+
      [Figure: the DFA for the pattern and the encoding of one of its states.]
      • Bitmap (256 bits): 0 … 0 1 1 1 1 1 1 0 … 0
      • Pointer indirection: 0 1 1 1 1 2 (one entry per 1 in the bitmap, log2(distinct targets) bits each)
      • Pointer indirection + label: log2(distinct targets) + log2(labels) bits per entry
      • Transition table: one 32-bit entry per distinct target; this is the potential saving through state merging
      Properties:
      • Bitmap: no replication of frequent transitions
      • Pointer indirection: no pointer replication within a state
      • Character-transition target decoupling
