
Memory-Efficient Regular Expression Search Using State Merging

Memory-Efficient Regular Expression Search Using State Merging. Authors: Michela Becchi, Srihari Cadambi. Publisher: INFOCOM 2007, 26th IEEE International Conference on Computer Communications, IEEE. Presenter: Ching-Hsuan Shih. Date: 2014/04/09.


Presentation Transcript


  1. Memory-Efficient Regular Expression Search Using State Merging • Authors: Michela Becchi, Srihari Cadambi • Publisher: INFOCOM 2007, 26th IEEE International Conference on Computer Communications, IEEE • Presenter: Ching-Hsuan Shih • Date: 2014/04/09 • Department of Computer Science and Information Engineering, National Cheng Kung University, Taiwan R.O.C.

  2. Outline • Introduction • Related Work • State Merging: A Motivational Example • State Merging in DFAs • Bitmap-based Data Structures for DFAs • Experimental Results National Cheng Kung University CSIE Computer & Internet Architecture Lab

  3. Introduction (1/2) • Network Intrusion Detection System (NIDS) • A device or software application that monitors a network for malicious activity. • Most IDSs inspect network packets, system logs, or network flows. • Regular Expression • Current rule-sets such as Snort, Bro, and many others are replacing strings with the more powerful and expressive regular expressions.

  4. Introduction (2/2) • The classical method to perform regular expression search is to use a deterministic finite automaton (DFA). • The main problem with DFAs is prohibitive memory usage: • The number of states in a DFA scales poorly with the size of the regular expressions it represents and with the number of wildcards they contain. • The paper proposes a novel technique that allows non-equivalent states in a DFA to be merged, using a scheme in which the transitions in the DFA are labeled.
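As a rough illustration of the memory problem described on this slide, the following Python sketch runs a table-driven DFA search. It is not taken from the paper: the three-state automaton (which looks for "ab" anywhere in the input), the function names, and the dense one-column-per-byte layout are all assumptions chosen for clarity.

```python
ALPHABET_SIZE = 256  # assumption: one table column per possible byte value


def build_table(num_states, edges, default=0):
    """Build a dense table where table[state][byte] is the next state.

    Any (state, byte) pair not listed in `edges` falls back to `default`.
    """
    table = [[default] * ALPHABET_SIZE for _ in range(num_states)]
    for (src, ch), dst in edges.items():
        table[src][ord(ch)] = dst
    return table


def search(table, accepting, data):
    """Return True as soon as the DFA enters an accepting state."""
    state = 0
    for byte in data:
        state = table[state][byte]  # exactly one lookup per input byte
        if state in accepting:
            return True
    return False


# Hypothetical DFA matching "ab" anywhere in the input:
# 0 -a-> 1, 1 -a-> 1, 1 -b-> 2 (accepting); everything else returns to 0.
EDGES = {(0, "a"): 1, (1, "a"): 1, (1, "b"): 2}
TABLE = build_table(3, EDGES)
ACCEPTING = {2}

found = search(TABLE, ACCEPTING, b"xxaab")
table_entries = sum(len(row) for row in TABLE)
```

Each input byte costs exactly one table lookup, which is why DFAs are fast. The price is that even this three-state automaton stores 3 × 256 entries, and every additional state adds a full 256-entry row.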

  5. Related Work • Delayed DFA (D2FA) [6]: • It identifies two (or more) states that transition to the same set of destinations on the same input characters. • D2FA achieves memory compaction by removing these duplicated transitions and replacing them with a default transition, but this happens at the expense of latency. • States with a default transition may require more than one transition per input character. • In [14]: • The authors propose increasing the speed of regular expression search by expanding the alphabet. • They process two characters (bytes) for every state transition in the DFA. • This produces an exponential increase in memory usage.
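The latency cost of default transitions can be sketched as follows. This is a simplified illustration, not the D2FA construction from [6]: the three-character alphabet, the dictionary layout, and the `step` helper are assumptions. Each state keeps only the transitions that differ from those of its default state.

```python
# Assumed toy automaton over the alphabet {"a", "b", "c"}.
# Full (uncompressed) transitions would be:
#   state 0: a->1, b->0, c->0
#   state 1: a->1, b->2, c->0
#   state 2: a->1, b->0, c->0
# States 1 and 2 share most transitions with state 0, so they store only
# the differences and point to state 0 through a default transition.
STATES = {
    0: {"labeled": {"a": 1, "b": 0, "c": 0}, "default": None},
    1: {"labeled": {"b": 2}, "default": 0},
    2: {"labeled": {}, "default": 0},
}


def step(states, state, ch):
    """Return (next_state, lookups) for one input character.

    Follows default transitions until a state with a labeled transition
    on `ch` is found. Assumes `ch` belongs to the alphabet, so the chain
    always ends at a state that stores a transition for it.
    """
    lookups = 1
    while ch not in states[state]["labeled"]:
        state = states[state]["default"]
        lookups += 1
    return states[state]["labeled"][ch], lookups


stored = sum(len(s["labeled"]) for s in STATES.values())
direct = step(STATES, 1, "b")    # labeled transition, single lookup
delayed = step(STATES, 1, "a")   # falls through the default transition
```

Here the number of stored transitions drops from 9 to 4, but a character that is resolved through a default transition needs two lookups instead of one.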

  6. State Merging: A Motivational Example (1/4)

  7. State Merging: A Motivational Example (2/4) • The merged state is represented as 3_4 • The transition [g-i]/0, j/1 indicates that the same next state, in this case state 5, is reached from state 3_4 upon receiving input characters g, h, i with label 0 or input character j with label 1.

  8. State Merging: A Motivational Example (3/4)

  9. State Merging: A Motivational Example (4/4) • The merged state is represented as 1_2 • The transition a.0/0,1 from state 3_4 to state 1_2 means: • The transition carries with it a label 0 that tells its destination state, 1_2, that the transition is meant for the underlying original state 1. • The transition is taken when its source state 3_4 has received label 0 or 1.
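One way to read the labeled transitions in this example is as a lookup keyed by (merged state, incoming label, input character) that yields the next state together with the label to carry forward. The Python sketch below encodes only the transitions spelled out on slides 7 and 9; the tuple layout, the `step` helper, and the placeholder label 0 carried into the unmerged state 5 are assumptions, and the rest of the automaton is not reproduced here.

```python
# Key:   (current merged state, label received by that state, input char)
# Value: (next state, label carried to the next state)
TRANSITIONS = {}

# Slide 7: "[g-i]/0, j/1" from 3_4 to 5.
for ch in "ghi":
    TRANSITIONS[("3_4", 0, ch)] = ("5", 0)   # label 0 here is a placeholder
TRANSITIONS[("3_4", 1, "j")] = ("5", 0)      # (state 5 is not merged)

# Slide 9: "a.0/0,1" from 3_4 to 1_2. Taken on 'a' for source label 0 or 1,
# carrying destination label 0 (original state 1 inside 1_2).
for src_label in (0, 1):
    TRANSITIONS[("3_4", src_label, "a")] = ("1_2", 0)


def step(state, label, ch):
    """Return (next_state, next_label), or None if this sketch has no entry."""
    return TRANSITIONS.get((state, label, ch))


as_state_3 = step("3_4", 0, "h")   # acting as original state 3
as_state_4 = step("3_4", 1, "j")   # acting as original state 4
to_merged = step("3_4", 1, "a")
```

The label is what keeps the merged state unambiguous: `step("3_4", 1, "g")` has no entry in this sketch, because the [g-i] transition to state 5 is only valid when 3_4 was entered with label 0.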

  10. State Merging in DFAs (1/3) • A. Labels • For every transition connecting two merged states, we define source labels and destination labels, e.g. c.ld/l0, l1… • B. Legality of State Merging

  11. State Merging in DFAs (2/3) • C. Merging and Labeling Algorithm

  12. State Merging in DFAs (3/3)

  13. Bitmap-based Data Structure for DFAs (1/3) • Basic:

  14. Bitmap-based Data Structure for DFAs (2/3) • Bitmap-based:

  15. Bitmap-based Data Structure for DFAs (3/3) • Bitmap-based merged data structure:

  16. Experimental Results (1/2) • Note that the Snort rule-sets have lower percentages of distinct next-state transitions than the Bro rule-sets. This is due to the large number of character ranges (both in the form [c1-c2] and \d, \D, \w, \W, \s, \S) and to the fact that Snort regular expressions are not case sensitive.
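To see why character ranges reduce the number of distinct next-state entries, consider a generic bitmap compression of a single transition-table row. The figures for the paper's data structures are not reproduced in this transcript, so the sketch below is an assumption and not necessarily the layout the authors use: a 256-bit bitmap marks the characters where the next state differs from that of the previous character, and only those next states are stored.

```python
def compress_row(row):
    """Compress one 256-entry row into (bitmap, targets).

    Bit `ch` of the bitmap is set when row[ch] starts a new run, i.e. it
    differs from row[ch - 1]. Only the first next state of each run is kept.
    """
    bitmap = 0
    targets = []
    prev = None  # assumes state ids are never None, so bit 0 is always set
    for ch, nxt in enumerate(row):
        if nxt != prev:
            bitmap |= 1 << ch
            targets.append(nxt)
            prev = nxt
    return bitmap, targets


def lookup(bitmap, targets, ch):
    """Find the next state for byte `ch` by counting set bits up to `ch`."""
    mask = (1 << (ch + 1)) - 1
    index = bin(bitmap & mask).count("1") - 1
    return targets[index]


# Hypothetical row: every byte leads to state 0 except the range [g-i],
# which leads to state 5.
ROW = [0] * 256
for ch in range(ord("g"), ord("i") + 1):
    ROW[ch] = 5

BITMAP, TARGETS = compress_row(ROW)
```

The range [g-i] contributes a single stored entry instead of three, so this row compresses from 256 entries to three runs plus the bitmap. Rule-sets rich in ranges such as \d or \w benefit in the same way, which is consistent with the observation on this slide.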

  17. Experimental Results (2/2) • The width of the transition table is set to 32 bits.
