
Memory Efficient Regular Expression Search Using State Merging

Michela Becchi Washington University in St. Louis

Srihari Cadambi NEC Laboratories America

Context

[Figure: incoming packets pass through a matching engine holding a RegEx set (FTP.OPEN.*, www.spyware, Host=.*HTTP), which separates safe packets from malicious packets.]
  • Regular expression matching is a critical operation in networking
    • Intrusion detection
    • Context-based billing
    • Peer-to-peer traffic detection and prioritization
    • Application-level filtering
  • Challenge: perform regular expression matching at line rate
    • Processing time
    • Memory requirement (occupancy and bandwidth)

Background
  • Two algorithmic solutions
    • Non-deterministic finite automata (NFAs)
      • High time complexity
      • Compact representation
    • Deterministic finite automata (DFAs)
      • Low time complexity
      • Potentially exponential number of states with respect to NFAs
  • Multiple implementation approaches
    • FPGA [Sidhu FCCM 2001, Clark 2003]
    • Software [Paxson 1998, Roesch 1999, Tuck 2004]
    • Custom hardware [Kumar 2006]
  • Problem: given a DFA, how to compactly represent it without violating the processing time bound
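
The NFA/DFA trade-off listed above can be made concrete with a small experiment. The sketch below is an illustration only and is not taken from the slides: it hand-builds an NFA for the textbook pattern (a|b)*a(a|b){n}, simulates it by tracking a set of active states per input character, and counts the DFA states produced by subset construction. The pattern and the names `build_nfa`, `nfa_accepts` and `count_dfa_states` are assumptions chosen for the example.

```python
def build_nfa(n):
    """NFA for (a|b)*a(a|b){n}: states 0..n+1, state n+1 accepts.

    Returns a dict mapping (state, char) -> set of next states.
    """
    delta = {(0, "a"): {0, 1},  # stay in the loop, or guess this is "the" a
             (0, "b"): {0}}
    for s in range(1, n + 1):
        for c in "ab":
            delta[(s, c)] = {s + 1}
    return delta


def nfa_accepts(delta, accept, text):
    """Simulate the NFA: several state lookups per input character."""
    active = {0}
    for c in text:
        active = set().union(*(delta.get((s, c), set()) for s in active))
    return accept in active


def count_dfa_states(delta, alphabet="ab"):
    """Subset construction; returns the number of reachable DFA states."""
    start = frozenset({0})
    seen = {start}
    todo = [start]
    while todo:
        subset = todo.pop()
        for c in alphabet:
            nxt = frozenset(t for s in subset for t in delta.get((s, c), ()))
            if nxt not in seen:
                seen.add(nxt)
                todo.append(nxt)
    return len(seen)


if __name__ == "__main__":
    for n in range(1, 6):
        print(n + 2, "NFA states ->", count_dfa_states(build_nfa(n)), "DFA states")
```

For this pattern the NFA has n + 2 states, while the DFA must remember the last n + 1 characters and therefore needs 2^(n+1) states. The DFA does one lookup per character; the NFA simulation touches every active state.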

In this paper
  • New method to compact a DFA called state merging
  • Data structure to support state merging
  • Algorithm to perform state merging
  • Evaluation on real security rule-sets (from Bro and Snort NIDS)

Outline
  • The idea
  • The algorithm
  • The data structure
  • Experimental evaluation


State Merging: the idea

pattern: ((a[b-e][g-i])|(f[g-h]j))k+

[Figure: the DFA for the pattern (states 0 to 6) beside a version in which states 3 and 4 are merged into state 3_4. Transitions entering 3_4 carry labels ([b-e].0, [g-h].1) and transitions leaving it carry conditions (a/0,1, f/0,1, [g-i]/0, j/1). Callout: "Non-equivalent automata!"]

Input text: acjk

  • common outgoing transitions are compressed
  • input labels keep 1-step history information
  • outgoing conditional transitions ensure functional equivalence
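
The role of the labels can be checked on the slide's own input, acjk. The sketch below hand-codes three transition functions for the example pattern: the original 7-state search DFA, a naive merge of states 3 and 4 without labels, and the labeled merge. The transition structure is read off the figure residue and should be treated as an assumption, as should every function name; the label attached to transitions into unmerged states is arbitrary here (0), since nothing tests it.

```python
ACCEPT = 6


def step_original(state, c):
    """Original search DFA for ((a[b-e][g-i])|(f[g-h]j))k+ (default target: 0)."""
    if c == "a":
        return 1
    if c == "f":
        return 2
    if state == 1 and "b" <= c <= "e":
        return 3
    if state == 2 and "g" <= c <= "h":
        return 4
    if state == 3 and "g" <= c <= "i":
        return 5
    if state == 4 and c == "j":
        return 5
    if state in (5, 6) and c == "k":
        return 6
    return 0


def step_naive(state, c):
    """States 3 and 4 collapsed into "3_4" with no labels: NOT equivalent."""
    if c == "a":
        return 1
    if c == "f":
        return 2
    if state == 1 and "b" <= c <= "e":
        return "3_4"
    if state == 2 and "g" <= c <= "h":
        return "3_4"
    if state == "3_4" and ("g" <= c <= "i" or c == "j"):
        return 5
    if state in (5, 6) and c == "k":
        return 6
    return 0


def step_labeled(state, label, c):
    """Labeled merge: returns (next_state, label carried into it)."""
    if c == "a":
        return 1, 0
    if c == "f":
        return 2, 0
    if state == 1 and "b" <= c <= "e":
        return "3_4", 0          # [b-e].0
    if state == 2 and "g" <= c <= "h":
        return "3_4", 1          # [g-h].1
    if state == "3_4" and label == 0 and "g" <= c <= "i":
        return 5, 0              # [g-i]/0
    if state == "3_4" and label == 1 and c == "j":
        return 5, 0              # j/1
    if state in (5, 6) and c == "k":
        return 6, 0
    return 0, 0


def found(step, text):
    """True if the unlabeled automaton reaches the accepting state."""
    state = 0
    for c in text:
        state = step(state, c)
        if state == ACCEPT:
            return True
    return False


def found_labeled(text):
    """True if the labeled automaton reaches the accepting state."""
    state, label = 0, 0
    for c in text:
        state, label = step_labeled(state, label, c)
        if state == ACCEPT:
            return True
    return False
```

On acjk the original DFA and the labeled merge both report no match, while the naive merge wrongly follows j out of 3_4 and accepts. Inputs such as acgk and fgjk are accepted by all three.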

State Merging – selecting the states

pattern: ((a[b-e][g-i])|(f[g-h]j))k+

[Figure: the DFA for the pattern (states 0 to 6) and its space reduction graph, whose nodes are the DFA states 0 to 6.]

  • bold edge has weight 3
  • remaining edges have weight 2


State Merging – selecting the states (cont’d)

[Figure: DFA after merging states 3 and 4 into 3_4 ([b-e].0 and [g-h].1 into 3_4; a/0,1, f/0,1, [g-i]/0 and j/1 out of it), with the updated space reduction graph over states 0, 1, 2, 3_4, 5 and 6.]

States 1 and 2 now have one more target in common: the merged state 3_4!

State merging can create new merging opportunities.
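
How one merge opens up another can be seen by counting shared targets before and after merging states 3 and 4. The sketch below assumes that an edge weight in the space reduction graph is the number of distinct non-default target states two states have in common; the slides do not spell out the definition, so this is a guess. Under this reading states 5 and 6 would also score 3, so the actual definition may differ, for example in how the accepting state is treated. The helper names are hypothetical, and labels are left out because only the target sets matter here.

```python
def build_dfa():
    """Non-default transitions (char -> target) of the example search DFA."""
    dfa = {s: {"a": 1, "f": 2} for s in range(7)}
    for c in "bcde":
        dfa[1][c] = 3
    for c in "gh":
        dfa[2][c] = 4
    for c in "ghi":
        dfa[3][c] = 5
    dfa[4]["j"] = 5
    dfa[5]["k"] = 6
    dfa[6]["k"] = 6
    return dfa


def common_targets(dfa, s, t):
    """Distinct non-default targets shared by states s and t."""
    return set(dfa[s].values()) & set(dfa[t].values())


def merge(dfa, s, t, name):
    """Copy of dfa with s and t replaced by one state `name` (labels omitted)."""
    def rename(x):
        return name if x in (s, t) else x

    merged = {}
    for state, trans in dfa.items():
        row = merged.setdefault(rename(state), {})
        for c, target in trans.items():
            row[c] = rename(target)
    return merged


if __name__ == "__main__":
    dfa = build_dfa()
    print(len(common_targets(dfa, 3, 4)))      # weight of the 3-4 edge
    print(len(common_targets(dfa, 1, 2)))      # before merging 3 and 4
    after = merge(dfa, 3, 4, "3_4")
    print(len(common_targets(after, 1, 2)))    # after merging 3 and 4
```

Before the merge, states 1 and 2 share only targets 1 and 2. Afterwards both point at 3_4, so the shared set grows to three.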


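State Merging – selecting the states (cont’d)

[Figure: DFA after states 1 and 2 are also merged, leaving states 0, 1_2, 3_4, 5 and 6. Transition labels shown: a.0 and f.1 into 1_2; [b-e].0/0 and [g-h].1/1 from 1_2 to 3_4; a.0/0,1 and f.1/0,1 from 3_4 back to 1_2; [g-i]/0 and j/1 from 3_4 to 5; k into 6.]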

  • Key point: Labels can be reused
  • State merging stops when label overhead exceeds potential saving
  • The old and new DFAs are functionally equivalent
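
The equivalence claim can be spot-checked by brute force. The sketch below hand-codes the 5-state merged automaton (0, 1_2, 3_4, 5, 6) as read from the slide's transition labels, which is an assumption about the figure, and compares it against Python's `re.search` on every string up to a small length. The same two label values, 0 and 1, serve both merged states, which is the label reuse noted above. `step`, `merged_search` and `first_mismatch` are hypothetical names.

```python
import re
from itertools import product

PATTERN = re.compile(r"((a[b-e][g-i])|(f[g-h]j))k+")


def step(state, label, c):
    """Merged DFA transition: returns (next_state, label carried into it)."""
    if c == "a":
        return "1_2", 0                      # a.0 from every state
    if c == "f":
        return "1_2", 1                      # f.1 from every state
    if state == "1_2" and label == 0 and "b" <= c <= "e":
        return "3_4", 0                      # [b-e].0/0
    if state == "1_2" and label == 1 and "g" <= c <= "h":
        return "3_4", 1                      # [g-h].1/1
    if state == "3_4" and label == 0 and "g" <= c <= "i":
        return "5", 0                        # [g-i]/0
    if state == "3_4" and label == 1 and c == "j":
        return "5", 0                        # j/1
    if state in ("5", "6") and c == "k":
        return "6", 0
    return "0", 0


def merged_search(text):
    """True if the merged DFA reaches accepting state 6 anywhere in text."""
    state, label = "0", 0
    for c in text:
        state, label = step(state, label, c)
        if state == "6":
            return True
    return False


def first_mismatch(alphabet="abefghijkx", max_len=4):
    """First string on which the merged DFA and re.search disagree, else None."""
    for n in range(max_len + 1):
        for chars in product(alphabet, repeat=n):
            text = "".join(chars)
            if merged_search(text) != bool(PATTERN.search(text)):
                return text
    return None


if __name__ == "__main__":
    print(first_mismatch())
```

An exhaustive check over short strings is not a proof, but it exercises every transition of this small automaton, including the range boundaries e, h and i.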


A data structure to support state merging

pattern: ((a[b-e][g-i])|(f[g-h]j))k+

[Figure: encoding of DFA state 1 (a to 1, [b-e] to 3, f to 2) in three parts. Bitmap: 256 bits, 0 … 0 1 1 1 1 1 1 0 … 0. Pointer indirection: one entry per 1 in the bitmap, here 0 1 1 1 1 2, each log2(distinct targets) bits wide, or log2(distinct targets)+log2(labels) bits for "Pointer Indirection + Label". Transition table: one 32-bit entry per distinct target, here 1, 3, 2; this table is marked as the potential saving through state merging.]

  • Bitmap:
    • No replication of frequent transitions
  • Pointer indirection:
    • No pointer replication within a state
    • Character-transition target decoupling
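
A lookup in this layout might work as sketched below. The sketch assumes that a cleared bit means "take the default transition to state 0" and that the rank of a set bit (the number of set bits before it) selects the pointer indirection entry; the slides show the three parts but not the lookup itself, so both points are assumptions. The Python integer standing in for the 256-bit bitmap and all names are hypothetical.

```python
from math import ceil, log2

# State 1 of the example DFA: a -> 1, [b-e] -> 3, f -> 2 (everything else -> 0).
STATE_1 = {"a": 1, "b": 3, "c": 3, "d": 3, "e": 3, "f": 2}


def encode_state(transitions):
    """Encode char -> target as (bitmap, indirection, table)."""
    bitmap = 0
    indirection = []   # one entry per set bit, in character order
    table = []         # distinct targets only
    for code in sorted(ord(c) for c in transitions):
        target = transitions[chr(code)]
        if target not in table:
            table.append(target)
        bitmap |= 1 << code
        indirection.append(table.index(target))
    return bitmap, indirection, table


def lookup(encoded, c, default=0):
    """Next state for character c (character codes below 256 assumed)."""
    bitmap, indirection, table = encoded
    code = ord(c)
    if not (bitmap >> code) & 1:
        return default
    rank = bin(bitmap & ((1 << code) - 1)).count("1")  # set bits before c
    return table[indirection[rank]]


def size_bits(encoded, pointer_bits=32):
    """Bits used: bitmap + indirection entries + transition table."""
    _, indirection, table = encoded
    index_bits = max(1, ceil(log2(max(len(table), 1))))
    return 256 + len(indirection) * index_bits + len(table) * pointer_bits


if __name__ == "__main__":
    enc = encode_state(STATE_1)
    print(enc[1], enc[2])
    print(lookup(enc, "c"), lookup(enc, "z"))
    print(size_bits(enc), "bits vs", 256 * 32, "for a full table")
```

For state 1 this yields the indirection entries 0 1 1 1 1 2 and the table 1, 3, 2 shown on the slide. The size estimate ignores alignment and any per-state header.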


Data structure after state merging

[Figure: encoding of merged state 1_2. Bitmap 0: 0 … 0 1 1 1 1 1 1 0 0 0 … 0. Bitmap 1: 0 … 0 1 0 0 0 0 1 1 1 0 … 0. Each bitmap has its own "Pointer Indirection + Label" array, and both index one combined transition table holding 1_2 and 3_4. Shown next to the fully merged DFA (states 0, 1_2, 3_4, 5, 6).]

Saving: combined transition table

Overhead: labels
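
The merged-state layout can be sketched in the same style. The sketch assumes one bitmap and one "pointer indirection + label" array per incoming label, with all arrays indexing a single combined transition table; that is how the figure reads, but the lookup procedure and the default transition to state 0 are assumptions, and the names are hypothetical.

```python
# Merged state 1_2, one transition map per incoming label.
# Each entry maps char -> (target state, label carried into the target).
LABEL_0 = {"a": ("1_2", 0), "b": ("3_4", 0), "c": ("3_4", 0),
           "d": ("3_4", 0), "e": ("3_4", 0), "f": ("1_2", 1)}
LABEL_1 = {"a": ("1_2", 0), "f": ("1_2", 1),
           "g": ("3_4", 1), "h": ("3_4", 1)}


def encode_merged(per_label):
    """Return (rows, table): per-label (bitmap, entries) and the shared table."""
    table = []   # combined transition table, shared by all labels
    rows = []
    for transitions in per_label:
        bitmap, entries = 0, []
        for code in sorted(ord(c) for c in transitions):
            target, next_label = transitions[chr(code)]
            if target not in table:
                table.append(target)
            bitmap |= 1 << code
            entries.append((table.index(target), next_label))
        rows.append((bitmap, entries))
    return rows, table


def lookup_merged(encoded, label, c, default=("0", 0)):
    """Next (state, label) for character c, using the bitmap chosen by label."""
    rows, table = encoded
    bitmap, entries = rows[label]
    code = ord(c)
    if not (bitmap >> code) & 1:
        return default
    rank = bin(bitmap & ((1 << code) - 1)).count("1")  # set bits before c
    index, next_label = entries[rank]
    return table[index], next_label


if __name__ == "__main__":
    enc = encode_merged([LABEL_0, LABEL_1])
    print(enc[1])                        # combined transition table
    print(lookup_merged(enc, 0, "g"))    # default: g is not valid under label 0
    print(lookup_merged(enc, 1, "g"))    # valid under label 1
```

The two original states needed five table entries between them (1, 3, 2 and 1, 2, 4); the merged state needs two. The price is a second bitmap and one label per indirection entry.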

State reduction

20x

Transition reduction

1000x

Memory requirement

25x

Summary
  • Regular expression matching: critical operation in many networking applications
  • Two classical solutions: NFAs and DFAs
    • NFAs slow, DFAs fast but impractical
  • In this paper, we present a new method to compact a DFA called state merging
    • Data structure and fast algorithm to support state merging
    • Evaluation on real security rule-sets (from Bro and Snort NIDS)
      • 1000x reduction in number of transitions
      • 20x reduction in number of states
      • 25x memory reduction


Questions?


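State Merging: the Idea

[Figure: generic case. States S1 and S2, entered on characters c1 and c2, are merged into state S1,2, entered on c1.0 and c2.1. The outgoing transitions of S1 (ci, cj, ck) and S2 (cl, cm, cn) become conditional transitions of S1,2: ci/0, cl/1; cj/0, cm/1; ck/0; cn/1. Targets: Sx, Sy, Sw, Sz.]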

  • common outgoing transitions are compressed
  • input labels keep 1-step history information
  • outgoing conditional transitions ensure functional equivalence
