### Memory Efficient Regular Expression Search Using State Merging

Michela Becchi Washington University in St. Louis

Srihari Cadambi NEC Laboratories America

[Figure: incoming packets are checked against patterns such as FTP.OPEN.*, www.spyware and Host=.*HTTP, and sorted into safe packets and malicious packets.]

Context

- Regular expression matching is a critical operation in networking
  - Intrusion detection
  - Context-based billing
  - Peer-to-peer traffic detection and prioritization
  - Application-level filtering
- Challenge: perform regular expression matching at line rate
  - Processing time
  - Memory requirement (occupancy and bandwidth)


Background

- Two algorithmic solutions
  - Nondeterministic finite automata (NFAs)
    - High time complexity
    - Compact representation
  - Deterministic finite automata (DFAs)
    - Low time complexity
    - Potentially exponential number of states with respect to NFAs
- Multiple implementation approaches
  - FPGA [Sidhu FCCM 2001, Clark 2003]
  - Software [Paxson 1998, Roesch 1999, Tuck 2004]
  - Custom hardware [Kumar 2006]
- Problem: given a DFA, how to represent it compactly without violating the processing time bound


In this paper

- New method to compact a DFA, called state merging
  - Data structure to support state merging
  - Algorithm to perform state merging
- Evaluation on real security rule-sets (from Bro and Snort NIDS)


Automata!

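State Merging: the idea

pattern: ((a[b-e][g-i])|(f[g-h]j))k+

[Figure: the DFA for the pattern (states 0 to 6) shown next to the DFA obtained by merging states 3 and 4 into state 3_4. In the merged DFA the transitions entering 3_4 carry labels ([b-e].0, [g-h].1) and the transitions leaving it are conditional on the label ([g-i]/0, j/1, a/0,1, f/0,1).]

Input text: acjk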

- common outgoing transitions are compressed
- input labels keep 1-step history information
- outgoing conditional transitions ensure functional equivalence
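The three points above can be tried out on the example pattern. The following is a minimal Python sketch, not the authors' implementation: the transition functions are a reading of the slide's figure (every state returns to 1 on `a` and to 2 on `f`, all other unlisted characters go to state 0), and the names `original_step`, `merged_step`, `matches_original`, `matches_merged` and `agree` are hypothetical. The merged automaton stores a one-bit label when it enters state 3_4 and consults it on the way out.

```python
import itertools
import re

PATTERN = re.compile(r"((a[b-e][g-i])|(f[g-h]j))k+")
ACCEPTING = 6


def original_step(state, ch):
    """One step of the 7-state DFA (states 0..6), as read off the figure."""
    if ch == "a":
        return 1
    if ch == "f":
        return 2
    if state == 1 and ch in "bcde":
        return 3
    if state == 2 and ch in "gh":
        return 4
    if state == 3 and ch in "ghi":
        return 5
    if state == 4 and ch == "j":
        return 5
    if state in (5, 6) and ch == "k":
        return 6
    return 0


def merged_step(state, label, ch):
    """One step of the DFA with states 3 and 4 merged into "3_4".

    Returns (next_state, next_label). The label matters only inside "3_4":
    0 means it was entered from state 1, 1 means it was entered from state 2.
    """
    if ch == "a":
        return 1, None
    if ch == "f":
        return 2, None
    if state == 1 and ch in "bcde":
        return "3_4", 0          # labeled transition [b-e].0
    if state == 2 and ch in "gh":
        return "3_4", 1          # labeled transition [g-h].1
    if state == "3_4" and label == 0 and ch in "ghi":
        return 5, None           # conditional transition [g-i]/0
    if state == "3_4" and label == 1 and ch == "j":
        return 5, None           # conditional transition j/1
    if state in (5, 6) and ch == "k":
        return 6, None
    return 0, None


def matches_original(text):
    state, hit = 0, False
    for ch in text:
        state = original_step(state, ch)
        hit = hit or state == ACCEPTING
    return hit


def matches_merged(text):
    state, label, hit = 0, None, False
    for ch in text:
        state, label = merged_step(state, label, ch)
        hit = hit or state == ACCEPTING
    return hit


def agree(alphabet="abfgijkx", max_len=4):
    """Check both automata against Python's re on every short string."""
    for n in range(max_len + 1):
        for chars in itertools.product(alphabet, repeat=n):
            text = "".join(chars)
            expected = PATTERN.search(text) is not None
            if not (matches_original(text) == matches_merged(text) == expected):
                return False
    return True
```

On the slide's input `acjk`, both automata reject: the merged one enters 3_4 with label 0, so the `j/1` transition is not available. Without the label, `j` would wrongly lead to state 5.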


State Merging – selecting the states

pattern: ((a[b-e][g-i])|(f[g-h]j))k+

[Figure: the DFA for the pattern (states 0 to 6) and the corresponding space reduction graph over the same states.]

- bold edge has weight 3
- remaining edges have weight 2

State Merging – selecting the states (cont’d)

[Figure: the DFA after merging states 3 and 4 into state 3_4 (states 0, 1, 2, 3_4, 5, 6), with its updated space reduction graph.]

States 1 and 2 now have one more target in common: the merged state 3_4.

State merging can create new merging opportunities.
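The effect can be checked by counting shared targets before and after the merge. This is a small Python sketch under stated assumptions: the transition rows are a reading of the example figure, unlisted characters go to the default state 0, and the default is counted as a target (the slides do not say whether the graph weights count it). The helper names `distinct_targets`, `common_targets` and `rename_targets` are hypothetical.

```python
DEFAULT = 0  # assumed target for every character not listed in a row

# Example DFA as read off the figure: state -> {character: target}.
DFA = {
    0: {"a": 1, "f": 2},
    1: {"a": 1, "f": 2, **dict.fromkeys("bcde", 3)},
    2: {"a": 1, "f": 2, **dict.fromkeys("gh", 4)},
    3: {"a": 1, "f": 2, **dict.fromkeys("ghi", 5)},
    4: {"a": 1, "f": 2, "j": 5},
    5: {"a": 1, "f": 2, "k": 6},
    6: {"a": 1, "f": 2, "k": 6},
}


def distinct_targets(dfa, state):
    """Distinct states reachable from `state` in one step."""
    return set(dfa[state].values()) | {DEFAULT}


def common_targets(dfa, s, t):
    """Targets that states `s` and `t` share."""
    return distinct_targets(dfa, s) & distinct_targets(dfa, t)


def rename_targets(dfa, old_states, new_name):
    """Redirect every transition into `old_states` to `new_name`.

    Only the edges are rewritten; the merged state's own row is not built,
    which is enough for counting the targets of the remaining states.
    """
    return {
        state: {ch: (new_name if tgt in old_states else tgt)
                for ch, tgt in row.items()}
        for state, row in dfa.items()
    }


before = common_targets(DFA, 1, 2)
after = common_targets(rename_targets(DFA, {3, 4}, "3_4"), 1, 2)
print(len(before), len(after))
```

Under these assumptions the shared set grows by exactly one element, the merged state, which is why a merge can make a later merge of states 1 and 2 more attractive.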

State Merging – selecting the states (cont’d)

[Figure: the DFA after also merging states 1 and 2 into state 1_2 (states 0, 1_2, 3_4, 5, 6). Transitions shown include a.0, f.1, [b-e].0/0, [g-h].1/1, [g-i]/0, j/1 and k.]

- Key point: Labels can be reused
- State merging stops when label overhead exceeds potential saving
- Old and new DFA are functionally equivalent

A data structure to support state merging

pattern: ((a[b-e][g-i])|(f[g-h]j))k+

[Figure: per-state layout for the example DFA. Each state holds a 256-bit bitmap, a pointer indirection array with one entry per 1 in the bitmap, and a transition table with one 32-bit entry per distinct target. A pointer indirection entry takes log2(distinct targets) bits, or log2(distinct targets)+log2(labels) bits when it also stores a label. The transition table is marked as the potential saving through state merging.]

- Bitmap:
  - No replication of frequent transitions
- Pointer indirection:
  - No pointer replication within a state
  - Character-transition target decoupling
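One way to read this layout is sketched below in Python for state 1 of the example (a goes to 1, [b-e] to 3, f to 2). It rests on assumptions the slides do not confirm: a bitmap bit is set for every byte whose target differs from the state's most frequent ("default") target, the default is stored alongside the state, and index widths are rounded up to whole bits. The names `encode_state`, `lookup` and `size_in_bits` are hypothetical.

```python
from math import ceil, log2


def encode_state(row, default):
    """Encode one state's transitions as (bitmap, indirection, table).

    row: dict mapping byte value -> target; unlisted bytes go to `default`.
    bitmap: 256-bit integer, bit b set when byte b has a non-default target.
    indirection: one index into `table` per set bit, in byte order.
    table: distinct non-default targets, each stored once.
    """
    bitmap, indirection, table = 0, [], []
    for byte in range(256):
        target = row.get(byte, default)
        if target == default:
            continue
        bitmap |= 1 << byte
        if target not in table:
            table.append(target)
        indirection.append(table.index(target))
    return bitmap, indirection, table


def lookup(encoded, default, byte):
    """Next state for `byte`: bitmap test, rank, indirection, table."""
    bitmap, indirection, table = encoded
    if not (bitmap >> byte) & 1:
        return default
    rank = bin(bitmap & ((1 << byte) - 1)).count("1")  # set bits below `byte`
    return table[indirection[rank]]


def size_in_bits(encoded, labels=1, target_bits=32):
    """Bitmap + indirection entries + table, using the widths on the slide."""
    _, indirection, table = encoded
    index_bits = ceil(log2(max(len(table), 1)))
    label_bits = ceil(log2(labels))
    return 256 + len(indirection) * (index_bits + label_bits) + len(table) * target_bits


# State 1 of the example DFA; every other byte goes to the default state 0.
STATE_1 = {ord("a"): 1, ord("f"): 2, **{ord(c): 3 for c in "bcde"}}
encoded = encode_state(STATE_1, default=0)
print(encoded[1], encoded[2], lookup(encoded, 0, ord("c")))
```

For this state the sketch produces six indirection entries over three distinct targets, so each character in [b-e] costs a small index rather than a full pointer. This is the decoupling of characters from transition targets that the bullets describe.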

Data structure after state merging

[Figure: merged states 1_2 and 3_4 in the new layout. Each merged state keeps one bitmap per original state (Bitmap 0, Bitmap 1) and one pointer indirection + label array per original state, and both share a single combined transition table.]

- Saving: combined transition table
- Overhead: labels

Summary

- Regular expression matching: critical operation in many networking applications
- Two classical solutions: NFAs and DFAs
  - NFAs slow, DFAs fast but impractical
- In this paper, we present a new method to compact a DFA called state merging
  - Data structure and fast algorithm to support state merging
- Evaluation on real security rule-sets (from Bro and Snort NIDS)
  - 1000x reduction in number of transitions
  - 20x reduction in number of states
  - 25x memory reduction


Experimental evaluation


State Merging: the Idea

[Figure: generic example in which states S1 and S2 are merged into state S1,2. Incoming transitions are labeled (c1.0, c2.1) and outgoing transitions are conditional on the label (ci/0, cl/1, ck/0, cn/1).]

- common outgoing transitions are compressed
- input labels keep 1-step history information
- outgoing conditional transitions ensure functional equivalence
