Memory Efficient Regular Expression Search Using State Merging

Michela Becchi Washington University in St. Louis

Srihari Cadambi NEC Laboratories America


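Context

[Figure: incoming packets are checked by a matching engine against a regular-expression set (FTP.OPEN.*, www.spyware, Host=.*HTTP) and separated into safe packets and malicious packets. Example payload labels in the figure: Safe pay1, Safe pay2, Hosxyz, blaBLAb, xHost=, FTP.OPEN.]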

  • Regular expression matching is a critical operation in networking

    • Intrusion detection

    • Context based billing

    • Peer-to-peer traffic detection and prioritization

    • Application level filtering

  • Challenge: perform regular expression matching at line rate

    • Processing time

    • Memory requirement (occupancy and bandwidth)


Background

  • Two algorithmic solutions

    • Nondeterministic finite automata (NFAs)

      • High time complexity

      • Compact representation

    • Deterministic finite automata (DFAs)

      • Low time complexity

      • Potentially exponential number of states with respect to NFAs

  • Multiple implementation approaches

    • FPGA [Sidhu FCCM 2001, Clark 2003]

    • Software [Paxson 1998, Roesch 1999, Tuck 2004]

    • Custom hardware [Kumar 2006]

  • Problem: given a DFA, how can it be represented compactly without violating the processing-time bound?
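
The state blow-up mentioned above can be reproduced with a textbook example that is not taken from the slides: the pattern (a|b)*a(a|b){n-1} needs n+1 NFA states, while the subset construction yields 2^n DFA states. The sketch below is illustrative only; the function names are hypothetical.

```python
def build_nfa(n):
    """NFA for (a|b)*a(a|b){n-1}, an illustrative pattern (not from the slides).

    State 0 loops on every character and guesses the n-th-from-last 'a';
    states 1..n count the characters that follow. State n is accepting.
    """
    delta = {(0, "a"): {0, 1}, (0, "b"): {0}}
    for i in range(1, n):
        delta[(i, "a")] = {i + 1}
        delta[(i, "b")] = {i + 1}
    return delta


def subset_construction(delta, start=0, alphabet="ab"):
    """Return the set of reachable DFA states (each one a set of NFA states)."""
    start_set = frozenset([start])
    seen = {start_set}
    work = [start_set]
    while work:
        current = work.pop()
        for ch in alphabet:
            nxt = frozenset(t for s in current for t in delta.get((s, ch), ()))
            if nxt not in seen:
                seen.add(nxt)
                work.append(nxt)
    return seen


def dfa_state_count(n):
    return len(subset_construction(build_nfa(n)))


for n in range(1, 7):
    print(n + 1, "NFA states ->", dfa_state_count(n), "DFA states")
```

The DFA processes one table lookup per input character, which is why it is attractive at line rate despite the larger state count.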


In this paper

  • New method to compact a DFA called state merging

  • Data structure to support state merging

  • Algorithm to perform state merging

  • Evaluation on real security rule-sets (from Bro and Snort NIDS)


Outline

  • The idea

  • The algorithm

  • The data structure

  • Experimental evaluation


State Merging: the idea

pattern: ((a[b-e][g-i])|(f[g-h]j))k+

[Figure: the DFA for the pattern (states 0 to 6) next to a compacted version in which states 3 and 4 are merged into a single state 3_4. Transitions entering 3_4 carry a label ([b-e].0 from state 1, [g-h].1 from state 2), and transitions leaving it are conditional on that label ([g-i]/0, j/1, a/0,1, f/0,1). Callout in the figure: "Non-equivalent automata!"]

Input text: acjk

  • common outgoing transitions are compressed

  • input labels keep 1-step history information

  • conditional outgoing transitions ensure functional equivalence
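
A minimal sketch of why the labels matter, using the slide's pattern and input text. It assumes the example DFA searches unanchored (every state goes to 1 on a and to 2 on f, and to 0 otherwise) and that state 6 is accepting. The function names are hypothetical, and the code models only the matching semantics, not the memory layout.

```python
from itertools import product


def char_range(lo, hi):
    return [chr(c) for c in range(ord(lo), ord(hi) + 1)]


def build_original():
    """Example DFA for ((a[b-e][g-i])|(f[g-h]j))k+; missing entries go to state 0."""
    specific = {
        0: {},
        1: {c: 3 for c in char_range("b", "e")},
        2: {c: 4 for c in char_range("g", "h")},
        3: {c: 5 for c in char_range("g", "i")},
        4: {"j": 5},
        5: {"k": 6},
        6: {"k": 6},
    }
    return {s: {"a": 1, "f": 2, **extra} for s, extra in specific.items()}


def naive_merge(dfa, s1=3, s2=4, merged="3_4"):
    """Merge two states by simply pooling their outgoing transitions (no labels)."""
    def rename(s):
        return merged if s in (s1, s2) else s
    out = {}
    for state, row in dfa.items():
        out.setdefault(rename(state), {}).update({c: rename(t) for c, t in row.items()})
    return out


def labeled_merge(dfa, s1=3, s2=4, merged="3_4"):
    """Merge two states but keep a 1-step history label on the incoming transitions."""
    label_of = {s1: 0, s2: 1}
    def tag(s):
        return (merged, label_of[s]) if s in label_of else (s, None)
    out = {}
    for state, row in dfa.items():
        for ch, target in row.items():
            # Outgoing transitions of the merged state are keyed by the stored label.
            out.setdefault(tag(state), {})[ch] = tag(target)
    return out


def run(dfa, text, start, default, name_of, accepting=6):
    state = start
    for ch in text:
        state = dfa[state].get(ch, default)
        if name_of(state) == accepting:
            return True
    return False


def run_plain(dfa, text):
    return run(dfa, text, 0, 0, lambda s: s)


def run_labeled(dfa, text):
    return run(dfa, text, (0, None), (0, None), lambda s: s[0])


def agree_on_all_inputs(original, labeled, alphabet="acfgijk", max_len=5):
    """Exhaustively compare both automata on all short strings over a small alphabet."""
    return all(
        run_plain(original, "".join(w)) == run_labeled(labeled, "".join(w))
        for n in range(max_len + 1)
        for w in product(alphabet, repeat=n)
    )


original = build_original()
print(run_plain(original, "acjk"))                   # original DFA: no match
print(run_plain(naive_merge(original), "acjk"))      # unlabeled merge: false match
print(run_labeled(labeled_merge(original), "acjk"))  # labeled merge: no match
```

On the input acjk the unlabeled merge follows a, c into 3_4 and then accepts j, which only the f[g-h]j branch should allow; the label recorded on the way in rules that out.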


State Merging: selecting the states

pattern: ((a[b-e][g-i])|(f[g-h]j))k+

[Figure: the DFA for the pattern (states 0 to 6) and its space reduction graph, a weighted graph whose nodes are the DFA states 0 to 6.]

  • bold edge has weight 3

  • remaining edges have weight 2


State Merging: selecting the states (cont'd)

[Figure: the DFA after merging states 3 and 4 into 3_4 (labeled transitions [b-e].0, [g-h].1, [g-i]/0, j/1, a/0,1, f/0,1) and the updated space reduction graph over states 0, 1, 2, 3_4, 5, 6.]

States 1 and 2 now have one more target in common: merged state 3_4!

State merging can create new merging opportunities.
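
A small sketch of how one merge can raise the weight of another pair. The weight definition used here (number of shared non-default transition targets, with state 0 as the default target) is an assumption chosen because it reproduces the values quoted for pairs (3, 4) and (1, 2); the slide's own graph reports a single weight-3 edge, so it may apply further rules that are not recoverable from the transcript. All names are hypothetical.

```python
def build_targets():
    """Non-default targets of each state in the example DFA (default target: state 0)."""
    return {
        0: {1, 2},
        1: {1, 2, 3},   # a -> 1, f -> 2, [b-e] -> 3
        2: {1, 2, 4},   # a -> 1, f -> 2, [g-h] -> 4
        3: {1, 2, 5},   # a -> 1, f -> 2, [g-i] -> 5
        4: {1, 2, 5},   # a -> 1, f -> 2, j -> 5
        5: {1, 2, 6},
        6: {1, 2, 6},
    }


def weight(targets, s, t):
    """Assumed edge weight: how many distinct targets the two states share."""
    return len(targets[s] & targets[t])


def merge(targets, s, t, merged):
    """Replace states s and t by one merged state, renaming them everywhere."""
    def rename(x):
        return merged if x in (s, t) else x
    out = {}
    for state, tgs in targets.items():
        out.setdefault(rename(state), set()).update(rename(x) for x in tgs)
    return out


before = build_targets()
after = merge(before, 3, 4, "3_4")
print(weight(before, 3, 4), weight(before, 1, 2), weight(after, 1, 2))
```

After the merge, the formerly distinct targets 3 and 4 of states 1 and 2 collapse into the single target 3_4, which is the extra shared target the slide points out.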


State Merging: selecting the states (cont'd)

[Figure: the DFA after also merging states 1 and 2 into 1_2, leaving states 0, 1_2, 3_4, 5, 6. Transition labels shown: a.0, f.1, a.0/0,1, f.1/0,1, [b-e].0/0, [g-h].1/1, [g-i]/0, j/1, k.]

  • Key point: Labels can be reused

  • State merging stops when label overhead exceeds potential saving

  • The old and new DFAs are functionally equivalent



A data structure to support state merging

pattern: ((a[b-e][g-i])|(f[g-h]j))k+

[Figure: per-state memory layout for the example DFA. Each state has a 256-bit bitmap, a pointer indirection array with one entry per 1 in the bitmap, and a transition table with one 32-bit entry per distinct target. Each indirection entry is log2(distinct targets) bits wide, or log2(distinct targets) + log2(labels) bits when a label is stored with it. For the state with transitions a, [b-e], f, the bitmap has six 1s, the indirection entries are 0 1 1 1 1 2, and the transition table has 3 entries. The transition tables are annotated as the potential saving through state merging.]

  • Bitmap:

    • No replication of frequent transitions

  • Pointer indirection:

    • No pointer replication within a state

    • Character-transition target decoupling
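
The lookup path through this layout can be sketched as follows. This is a rough model under stated assumptions, not the authors' implementation: it assumes a character whose bitmap bit is 0 falls back to a default target (state 0 here), that the rank of a set bit indexes the indirection array, and that pointers are 32 bits as in the figure. The class and method names are hypothetical.

```python
from math import ceil, log2


class CompressedState:
    """Bitmap + pointer indirection + transition table for one DFA state (sketch)."""

    def __init__(self, transitions, default=0):
        # transitions maps a character to its non-default target state.
        self.default = default      # assumed target for characters with bit 0
        self.bitmap = 0             # 256-bit integer, bit i set for byte value i
        self.indirection = []       # one small index per 1 in the bitmap
        self.table = []             # distinct targets, stored once each
        for byte in sorted(ord(c) for c in transitions):
            target = transitions[chr(byte)]
            if target not in self.table:
                self.table.append(target)
            self.bitmap |= 1 << byte
            self.indirection.append(self.table.index(target))

    def next_state(self, ch):
        byte = ord(ch)
        if not (self.bitmap >> byte) & 1:
            return self.default
        # Rank = number of 1s below this bit = position in the indirection array.
        rank = bin(self.bitmap & ((1 << byte) - 1)).count("1")
        return self.table[self.indirection[rank]]

    def size_bits(self, pointer_bits=32):
        index_bits = ceil(log2(max(2, len(self.table))))
        return 256 + len(self.indirection) * index_bits + len(self.table) * pointer_bits


# State 1 of the example DFA: a -> 1, [b-e] -> 3, f -> 2.
state1 = CompressedState({"a": 1, "b": 3, "c": 3, "d": 3, "e": 3, "f": 2})
print(state1.indirection, state1.table, state1.size_bits())
```

Because characters index the small indirection entries rather than full pointers, the four transitions on [b-e] share one 32-bit table entry, which is the decoupling the last bullet refers to.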



Data structure after state merging

[Figure: the merged DFA (states 0, 1_2, 3_4, 5, 6) and the memory layout of merged state 1_2. The merged state keeps one bitmap per label (Bitmap 0 with 1s for a to f, Bitmap 1 with 1s for a, f, g, h), one "pointer indirection + label" array per bitmap, and a single combined transition table holding the targets 1_2 and 3_4.]

  • Saving: combined transition table

  • Overhead: labels
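
A hedged sketch of how a merged state might be laid out and queried: one bitmap and one indirection array per label, with all labels sharing one combined transition table. The exact encoding is not recoverable from the slide, so the structure below (including the default target for unset bits and the `(target, next_label)` return value) is an assumption, and all names are hypothetical.

```python
class MergedState:
    """Merged DFA state: per-label bitmaps, shared transition table (sketch)."""

    def __init__(self, per_label_transitions, default=(0, None)):
        # per_label_transitions[label][char] = (target, label to carry into target)
        self.default = default
        self.table = []         # combined table of distinct targets
        self.bitmaps = {}       # label -> 256-bit bitmap
        self.indirection = {}   # label -> list of (table index, next label)
        for label, transitions in per_label_transitions.items():
            bitmap, entries = 0, []
            for byte in sorted(ord(c) for c in transitions):
                target, next_label = transitions[chr(byte)]
                if target not in self.table:
                    self.table.append(target)
                bitmap |= 1 << byte
                entries.append((self.table.index(target), next_label))
            self.bitmaps[label] = bitmap
            self.indirection[label] = entries

    def next_state(self, label, ch):
        byte = ord(ch)
        bitmap = self.bitmaps[label]  # the incoming label selects the bitmap
        if not (bitmap >> byte) & 1:
            return self.default
        rank = bin(bitmap & ((1 << byte) - 1)).count("1")
        index, next_label = self.indirection[label][rank]
        return self.table[index], next_label


# Merged state 1_2: label 0 behaves like old state 1, label 1 like old state 2.
state_1_2 = MergedState({
    0: {"a": ("1_2", 0), "f": ("1_2", 1),
        "b": ("3_4", 0), "c": ("3_4", 0), "d": ("3_4", 0), "e": ("3_4", 0)},
    1: {"a": ("1_2", 0), "f": ("1_2", 1),
        "g": ("3_4", 1), "h": ("3_4", 1)},
})
print(state_1_2.table)
print(state_1_2.next_state(0, "c"), state_1_2.next_state(1, "g"))
```

In this model the combined table holds two pointers where the two unmerged states would each hold three, while every indirection entry grows by the bits needed for a label, which mirrors the saving and overhead listed on the slide.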



State reduction

[Chart not recoverable; highlighted result: 20x]


Transition reduction

[Chart not recoverable; highlighted result: 1000x]


Memory requirement

[Chart not recoverable; highlighted result: 25x]


Summary

  • Regular expression matching: critical operation in many networking applications

  • Two classical solutions: NFAs and DFAs

    • NFAs are slow; DFAs are fast but can need impractically many states

  • In this paper, we present a new method to compact a DFA called state merging

    • Data structure and fast algorithm to support state merging

    • Evaluation on real security rule-sets (from Bro and Snort NIDS)

      • 1000x reduction in number of transitions

      • 20x reduction in number of states

      • 25x memory reduction


Questions?

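State Merging: the idea

[Figure: generic schema. Before merging, state S1 is entered on c1 and has transitions ci to Sx, cj to Sy, ck to Sw; state S2 is entered on c2 and has transitions cl to Sx, cm to Sy, cn to Sz. After merging into S1,2, the incoming transitions become c1.0 and c2.1, and the outgoing transitions become ci/0, cl/1 to Sx; cj/0, cm/1 to Sy; ck/0 to Sw; cn/1 to Sz.]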

  • common outgoing transitions are compressed

  • input labels keep 1-step history information

  • conditional outgoing transitions ensure functional equivalence
