StriD2FA: Scalable Regular Expression Matching for Deep Packet Inspection

Authors: Xiaofei Wang, Junchen Jiang, Yi Tang, Bin Liu, and Xiaojun Wang. Publisher: 2011 IEEE International Conference on Communications. Presenter: Ching-Hsuan Shih. Date: 2014/06/11.

Presentation Transcript


  1. StriD2FA: Scalable Regular Expression Matching for Deep Packet Inspection • Authors: Xiaofei Wang, Junchen Jiang, Yi Tang, Bin Liu, and Xiaojun Wang • Publisher: 2011 IEEE International Conference on Communications • Presenter: Ching-Hsuan Shih • Date: 2014/06/11 • Department of Computer Science and Information Engineering, National Cheng Kung University, Taiwan R.O.C.

  2. Outline • Introduction • Related Work • System Design Principles and Challenges • Building StriD2FAs from Regex • Optimization of False Positive • Evaluation National Cheng Kung University CSIE Computer & Internet Architecture Lab

  3. Introduction (1/2) • Signature-based deep packet inspection has taken root as a dominant security mechanism in networking devices and computer systems. • Regular expressions are more expressive than simple patterns of strings and therefore able to describe a wider variety of payload signatures.

  4. Introduction (2/2) • A novel length-based matching (LBM) scheme is presented for accelerating regex matching. LBM uses a DFA-like matcher called Stride-DFA (StriD2FA). • LBM can cause false positives.

  5. Related Work • Dharmapurikar et al. presented a scheme [7] that uses Bloom filters to process multiple characters per clock cycle. • A recent method [4] introduces sampling techniques to accelerate regex matching, but it does not support all kinds of regex.

  6. System Design Principles and Challenges (1/5) A. Converting the input stream into a stride length (SL) stream • Any SL sent to a StriD2FA belongs to the finite alphabet Σ = {1, …, w}, where w is the window size.
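
The conversion can be sketched in a few lines of Python. This is only an illustration, not the authors' implementation: the function name `stride_lengths` is hypothetical, and two conventions are assumed because they reproduce the SL stream shown in the worked example of part B (tag 'a', window size 3): counting starts at the first tag in the stream, and a trailing partial stride is dropped.

```python
def stride_lengths(data: str, tag: str, w: int) -> list[int]:
    """Convert a byte stream into a stride-length (SL) stream.

    Assumptions (not stated on the slides): counting starts at the first
    occurrence of `tag`, and a trailing partial stride is discarded.
    """
    start = data.find(tag)
    if start < 0:
        return []  # no tag at all: nothing to emit under these assumptions
    strides = []
    dist = 0
    for ch in data[start + 1:]:
        dist += 1
        # Emit when the next tag is reached, or when the window of w
        # characters is exhausted without seeing a tag.
        if ch == tag or dist == w:
            strides.append(dist)
            dist = 0
    return strides


# Byte stream from the worked example, tag 'a', window size 3.
example = stride_lengths("abcababbabccacabc", tag="a", w=3)
```

Every emitted value lies in {1, …, w}, so the matcher that consumes the SL stream has at most w outgoing transitions per state.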

  7. System Design Principles and Challenges (2/5) B. An Example of StriD2FA • Suppose the regex rule is ".*abba.{2}caca". • Here 'a' is chosen as the tag and the window size is 3. • Fa(.*abba) = (1 | 2 | 3)+ 3 • Fa(.{2}caca) = 312 | 132 | 222 | 1122 • Finally, the converted regex is Fa(.*abba.{2}caca) = (1 | 2 | 3)+ 3 (312 | 132 | 222 | 1122), where the alphabet set is {1, 2, 3}.
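
The four alternatives of Fa(.{2}caca) can be checked by brute force. The sketch below is an illustration under assumptions: it enumerates both wildcard characters over a small hypothetical alphabet {a, b, c}, prepends the final 'a' of "abba" as the starting tag, and uses a hypothetical helper `strides` that counts from the first tag and drops a trailing partial stride.

```python
from itertools import product


def strides(data, tag, w):
    """SL stream of `data`, counting from the first tag (assumed convention)."""
    out, dist = [], 0
    for ch in data[data.index(tag) + 1:]:
        dist += 1
        if ch == tag or dist == w:
            out.append(dist)
            dist = 0
    return out


def stride_alternatives(alphabet="abc", tag="a", w=3):
    """Enumerate the SL strings produced by tag + .{2} + 'caca'."""
    results = set()
    for x, y in product(alphabet, repeat=2):
        text = tag + x + y + "caca"  # leading tag = last 'a' of "abba"
        results.add("".join(str(s) for s in strides(text, tag, w)))
    return results


alternatives = stride_alternatives()
```

The four cases correspond to whether each of the two wildcard positions holds the tag: neither (312), the first (132), the second (222), or both (1122).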

  8. System Design Principles and Challenges (3/5) • Given a byte stream T = "abcababbabccacabc". • It is first converted into the SL stream Fa(T) = 323312. • If the SL stream is matched by the StriD2FA, the input stream is then sent to the verification module, which makes an accurate match using a traditional method (e.g., the reversed DFA in [4]).
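
The two-stage flow (SL prefilter, then exact verification) can be sketched as follows. This is a sketch under assumptions: Python's `re` module stands in for both the StriD2FA and the verification module (the slides mention a reversed DFA, which is not implemented here), the function name `inspect_stream` is hypothetical, and the second input is a made-up stream chosen to show a false positive.

```python
import re

# Stride-level regex from the example: (1|2|3)+ 3 (312|132|222|1122).
STRIDE_RE = re.compile(r"[123]+3(?:312|132|222|1122)")
# Original rule; a plain regex search stands in for the verification module.
EXACT_RE = re.compile(r"abba.{2}caca")


def inspect_stream(byte_stream: str, sl_stream: str) -> bool:
    """Return True only if the prefilter fires and verification confirms."""
    if STRIDE_RE.search(sl_stream) is None:
        return False  # prefilter rejects: no verification needed
    return EXACT_RE.search(byte_stream) is not None


# Example from the slide: T and its SL stream Fa(T).
true_match = inspect_stream("abcababbabccacabc", "323312")

# Hypothetical stream with SL stream 23312: the prefilter fires, but the
# text contains no 'b', so verification rejects it (a false positive).
prefilter_hit = STRIDE_RE.search("23312") is not None
false_positive = inspect_stream("acaccacccaca", "23312")
```

Only streams that pass the cheap SL prefilter pay for the exact match, which is why the false positive rate matters for overall speed.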

  9. System Design Principles and Challenges (4/5) C. Benefits of LBM • Increased speed: According to the statistics in Section VI, the average SLs of some characters are larger than 100. • Small memory consumption: • First, the number of states is generally smaller than in a traditional DFA (e.g., the StriD2FA has 5 fewer states than the traditional DFA in Figure 2). • Second, the fanout of each state is bounded by the window size.

  10. System Design Principles and Challenges (5/5) D. Challenges • Regex conversion: Section IV describes a formal method to efficiently construct a StriD2FA from any regex. • False positive rate

  11. Building StriD2FAs from Regex (1/2) • Compile the regex to a standard DFA. • Restructure the DFA by classifying all the transitions. • All labels are removed from the transitions, and each transition is marked according to whether its character is the tag (solid transition if so, dashed transition otherwise).

  12. Building StriD2FAs from Regex (2/2) • Transform the restructured DFA into a non-deterministic StriD2FA using a depth-first search (DFS). • If a solid transition (pointing to state q') is reachable from q in L steps, where L ≤ w, add a transition labeled L from q to q'. • Otherwise (i.e., there is an all-dashed-transition path of length w to state q'), add a transition labeled w from q to q'. • Determinize to obtain the final StriD2FA (similar to determinization of a traditional NFA).
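
The DFS step can be sketched as below. This is a minimal illustration, not the authors' code: the DFA is assumed to be given as a dictionary `{state: {char: next_state}}`, the function name `build_stride_nfa` is hypothetical, and the small DFA for ".*abba" over the alphabet {a, b} was written by hand for this example. Determinization is left out.

```python
def build_stride_nfa(dfa, tag, w):
    """Return {(state, stride): set of next states} for a character DFA."""
    nfa = {}

    def dfs(origin, state, depth):
        for ch, target in dfa[state].items():
            steps = depth + 1
            if ch == tag:
                # Solid transition reached in `steps` <= w steps.
                nfa.setdefault((origin, steps), set()).add(target)
            elif steps == w:
                # All-dashed path of length w: label the edge with w.
                nfa.setdefault((origin, w), set()).add(target)
            else:
                # Dashed transition, window not exhausted: keep searching.
                dfs(origin, target, steps)

    for q in dfa:
        dfs(q, q, 0)
    return nfa


# Hand-written DFA for ".*abba" over {a, b}; state i means "the last i
# characters match the first i characters of 'abba'"; state 4 is accepting.
ABBA_DFA = {
    0: {"a": 1, "b": 0},
    1: {"a": 1, "b": 2},
    2: {"a": 1, "b": 3},
    3: {"a": 4, "b": 0},
    4: {"a": 1, "b": 2},
}

nfa = build_stride_nfa(ABBA_DFA, tag="a", w=3)
```

In this example, stride 3 from state 1 leads to both state 4 (via "bba") and state 0 (via "bbb"), which shows why the result is non-deterministic before the final determinization step, and why a stride equal to w cannot distinguish a tag from a window timeout.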

  13. Optimization of False Positive • Choosing "frequent" characters in a rule as tags can greatly reduce the false positive rate. • The frequency Freq(c, r) of a character c in a regex r is the number of occurrences of c in r divided by the sum of the lengths of all fixed substrings in r. • $\mathrm{Freq}(c, r) = \dfrac{\text{number of occurrences of } c \text{ in } r}{\sum_{s \in S_{c,r}} |s|}$, where $S_{c,r}$ is the set of fixed substrings in regex rule r and |s| is the length of string s. • The reasons for using Freq(c, r) to select tags: • It is simple to calculate. • With a higher Freq(c, r), the probability of a false positive is lower because a larger part of the regex rule is checked by the chosen set of tags.
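
As a rough illustration of the tag selection, the sketch below computes Freq(c, r) for the rule ".*abba.{2}caca" from the earlier example. It assumes the fixed substrings ("abba" and "caca") are supplied by hand rather than extracted from the regex, counts occurrences only inside those substrings, and uses a hypothetical function name `char_frequencies`.

```python
from collections import Counter


def char_frequencies(fixed_substrings):
    """Map each character to its occurrences over the total fixed length."""
    total = sum(len(s) for s in fixed_substrings)
    counts = Counter("".join(fixed_substrings))
    return {c: n / total for c, n in counts.items()}


# Fixed substrings of ".*abba.{2}caca", listed by hand (assumption).
freqs = char_frequencies(["abba", "caca"])
best_tag = max(freqs, key=freqs.get)
```

Here 'a' accounts for 4 of the 8 fixed characters, against 2 each for 'b' and 'c', which is consistent with 'a' being chosen as the tag in the example.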

  14. Evaluation (1/2)

  15. Evaluation (2/2)
