SI-DFA: Sub-expression Integrated Deterministic Finite Automata for Deep Packet Inspection

SI-DFA: Sub-expression Integrated Deterministic Finite Automata for Deep Packet Inspection. Authors: Ayesha Khalid, Rajat Sen†, Anupam Chattopadhyay. Publisher: High Performance Switching and Routing (HPSR), 2013. Presenter: Pei-Hua Huang. Date: 2014/05/14.


Presentation Transcript


  1. SI-DFA: Sub-expression Integrated Deterministic Finite Automata for Deep Packet Inspection
     Authors: Ayesha Khalid, Rajat Sen†, Anupam Chattopadhyay
     Publisher: High Performance Switching and Routing (HPSR), 2013
     Presenter: Pei-Hua Huang
     Date: 2014/05/14
     Department of Computer Science and Information Engineering, National Cheng Kung University, Taiwan R.O.C.

  2. INTRODUCTION
     There is a space-time trade-off: NFAs are compact but slow, while DFAs are fast but space-hungry.
     An ideal finite automaton should thus have the processing speed of a DFA and the space requirements of an NFA.
     Computer & Internet Architecture Lab, CSIE, National Cheng Kung University

  3. STATE-EXPLOSION
     • Exponential state blowup (or state explosion) occurs in the DFA when the regex corresponding to the NFA contains any of the following constructs:
     • Counting constraints
       1) .{n,m} : wildcard repeated between n and m times
       2) .{n,} : wildcard repeated at least n times
       3) .{n} : wildcard repeated exactly n times
     • Kleene star (.*) conditions
       • unbounded wildcard repetitions
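
The blowup caused by a counting constraint can be observed on a toy case. The sketch below is an illustration, not taken from the slides: it assumes the pattern `.*a.{n}` over a two-symbol alphabet, hand-builds its NFA (n + 2 states), and counts the DFA states reached by subset construction. The function name `dfa_state_count` is hypothetical.

```python
def dfa_state_count(n, alphabet="ab"):
    """Count DFA states reachable by subset construction for .*a.{n}."""
    # Assumed NFA: state 0 is the start state and loops on every symbol;
    # reading 'a' in state 0 also enters state 1; states 1..n advance on
    # any symbol; state n + 1 is the accepting state.
    def step(subset, ch):
        nxt = {0}                      # the .* self-loop keeps state 0 alive
        if ch == "a":
            nxt.add(1)
        for s in subset:
            if 1 <= s <= n:
                nxt.add(s + 1)         # the wildcard counter advances
        return frozenset(nxt)

    start = frozenset({0})
    seen = {start}
    frontier = [start]
    while frontier:
        cur = frontier.pop()
        for ch in alphabet:
            nxt = step(cur, ch)
            if nxt not in seen:
                seen.add(nxt)
                frontier.append(nxt)
    return len(seen)

for n in range(1, 6):
    print(f"n={n}: NFA states={n + 2}, DFA states={dfa_state_count(n)}")
```

For this toy pattern the DFA has to remember which of the last n + 1 symbols were `a`, so the count doubles with each increment of n while the NFA grows by a single state.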

  4. SUB-EXPRESSION INTEGRATED DFA (SI-DFA)
     • Break each expression into parts at the blowup conditions and merge the parts into one integrated DFA:
       • break regexes into parts, called sub-expressions, using Kleene star conditions as delimiters
       • create a merged DFA for all the sub-expressions. The accepting states of this DFA are labeled as Final Accepting States (FAS) or Sub-expression Accepting States (SAS)
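
The splitting step can be sketched in a few lines. This is a minimal illustration under stated assumptions: the only delimiter handled is an unescaped `.*`, and the last sub-expression of each regex is taken to be the one labeled FAS while earlier ones are labeled SAS, which is an interpretation of the slide rather than something it states. The name `split_regexes` is hypothetical, and the merged DFA itself is not built here.

```python
def split_regexes(regexes, delimiter=".*"):
    """Split each regex at Kleene-star delimiters and label the parts."""
    parts = []
    for rid, regex in enumerate(regexes):
        subs = regex.split(delimiter)
        for pos, sub in enumerate(subs):
            # Assumption: only the last part completes the whole regex (FAS);
            # every earlier part is an intermediate acceptance (SAS).
            kind = "FAS" if pos == len(subs) - 1 else "SAS"
            parts.append((rid, pos, sub, kind))
    return parts

for entry in split_regexes(["ab.*cd", "lm"]):
    print(entry)
```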

  5. SUB-EXPRESSION INTEGRATED DFA (SI-DFA)
     A regex is accepted if its constituent sub-expressions are accepted in the right order.
     A link bit is associated with every sub-expression; the addresses of the link bits are specified in an Association Table.

  6. SUB-EXPRESSION INTEGRATED DFA (SI-DFA)
     Example: consider the regexes ab.*cd and lm and the traffic trace cdablmcd.
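
The example trace can be walked through with a small simulation of the link-bit idea. This is a sketch under assumptions, not the authors' implementation: sub-expression acceptance is checked with Python's `re` module on each prefix instead of a merged DFA, the association table is modeled as a plain list of sub-expressions per regex, and `si_dfa_scan` is a hypothetical name.

```python
import re

def si_dfa_scan(regexes, trace):
    """Naive link-bit matcher: returns (regex, end position) pairs."""
    # Assumed association table: per regex, its ordered sub-expressions.
    table = [r.split(".*") for r in regexes]
    # One link bit per non-final sub-expression, initially cleared.
    link = [[False] * (len(subs) - 1) for subs in table]
    matches = []
    for end in range(1, len(trace) + 1):
        prefix = trace[:end]
        for rid, subs in enumerate(table):
            # Go from last to first so that a bit set at this position
            # is not consumed at the same position.
            for pos in reversed(range(len(subs))):
                accepted = re.search(r"(?:%s)\Z" % subs[pos], prefix) is not None
                enabled = pos == 0 or link[rid][pos - 1]
                if accepted and enabled:
                    if pos == len(subs) - 1:
                        matches.append((regexes[rid], end))   # final acceptance
                    else:
                        link[rid][pos] = True                 # remember progress
    return matches

print(si_dfa_scan(["ab.*cd", "lm"], "cdablmcd"))
# prints [('lm', 6), ('ab.*cd', 8)]
```

The leading `cd` is ignored because the link bit for `ab` is still cleared at that point; only the `cd` that ends at position 8 completes `ab.*cd`.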

  7. Cases not Conforming with SI-DFA
     • Pseudo wildcard repetitions
       • a forbidden character table is constructed, in which an occurrence of the forbidden character x invalidates the link bit corresponding to sub-expression ab
       • forbidden characters that occur in a subsequent sub-expression cannot be handled by SI-DFA
       • Ex. RE = ab[^x]*cxd, input = abmcxd
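
The failure in this example can be reproduced with a short sketch. It assumes literal sub-expressions and models the forbidden character table as a simple set lookup that clears the link bit; `scan_with_forbidden` is a hypothetical helper, and Python's `re` module serves as the reference matcher.

```python
import re

def scan_with_forbidden(first, forbidden, second, text):
    """Link-bit check for first[^forbidden]*second; True if a match is reported."""
    link = False
    for end in range(1, len(text) + 1):
        if text[end - 1] in forbidden:
            link = False                     # forbidden character invalidates the link bit
        if link and text[:end].endswith(second):
            return True                      # second part accepted while the bit is set
        if text[:end].endswith(first):
            link = True                      # first part accepted: set the link bit
    return False

text = "abmcxd"
print(scan_with_forbidden("ab", "x", "cxd", text))   # link-bit scheme
print(re.search(r"ab[^x]*cxd", text) is not None)    # reference regex engine
```

On `abmcxd` the link-bit scheme reports no match while the reference engine does: the `x` that belongs to `cxd` clears the link bit before `cxd` is fully read, which is the false negative the slide describes.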

  8. Cases not Conforming with SI-DFA
     • Subsequent sub-expressions overlap
       • SI-DFA should start matching a sub-expression only after the preceding sub-expression has already been accepted
       • Ex. RE = ab.*bc, input = abc

  9. Cases not Conforming with SI-DFA
     • Complete containment in subsequent sub-expressions
       • SI-DFA will generate an erroneous result if a sub-expression in a regex is completely contained in the sub-expression that follows it
       • Ex. RE = a.*b.d, input = bad
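
The overlap and containment examples can both be checked against a reference regex engine. The sketch below is a simplification for a single regex: it only tracks which sub-expression is expected next and ignores where each accepted sub-expression started, which is the property that makes these cases fail. `naive_link_bit_match` is a hypothetical name.

```python
import re

def naive_link_bit_match(subs, text):
    """Report a match once all sub-expressions have been accepted in order."""
    needed = 0                                   # index of the next expected sub-expression
    for end in range(1, len(text) + 1):
        # Accepted if some occurrence of the sub-expression ends at this position.
        if re.search(r"(?:%s)\Z" % subs[needed], text[:end]):
            needed += 1
            if needed == len(subs):
                return True
    return False

for regex, text in [("ab.*bc", "abc"), ("a.*b.d", "bad")]:
    naive = naive_link_bit_match(regex.split(".*"), text)
    exact = re.search(regex, text) is not None
    print(regex, text, "naive:", naive, "reference:", exact)
```

Both inputs are reported as matches by the naive check and rejected by the reference engine: in `abc` the `b` is shared between `ab` and `bc`, and in `bad` the `a` sits inside the span matched by `b.d`.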

  10. Exact-match removal in .+ Cases
      A 'dot-plus' condition, e.g., ab.+cd, is one that matches ab.*cd but does not match abcd.
      It is handled by first building a union automaton of L1 and L2 and then converting the accepting state due to L2 into a non-accepting state, where L1 = {ab, cd}, L2 = {abcd}, and L3 = L1 − L2.

  11. PERFORMANCE EVALUATION
      Developed in C++.
      The testing platform is an AMD Phenom 1055T processor with 8 GB of RAM, running the Linux operating system.
      Rule-sets are extracted from Bro 2.0 [19], Snort [20], and linux [21] rules.

  12. PERFORMANCE EVALUATION

  13. PERFORMANCE EVALUATION
