A Memory-Efficient Parallel String Matching for Intrusion Detection Systems

A Memory-Efficient Parallel String Matching for Intrusion Detection Systems • HyunJin Kim, Hyejeong Hong, Hong-Sik Kim, and Sungho Kang, Member, IEEE

Outline • INTRODUCTION • PROPOSED PARALLEL STRING MATCHING • Architecture of String Matcher • Gray Code-Based Sorting • Bit Position Grouping • PERFORMANCE EVALUATION

INTRODUCTION • The DFA-based string matcher improves both regularity and scalability with lower time complexity [1]. • However, the memory requirements are proportional to the numbers of states and input symbols.

INTRODUCTION • To reduce the memory requirements of DFA-based string matching, bit-split string matching using the Aho-Corasick algorithm [2] was proposed in [3]. • Bit-split string matching partitions the target patterns into subgroups using a lexicographically sorted list of the target patterns.

INTRODUCTION • Due to the biased bit transitions for each bit position group, the memory usage between FSM tiles in a string matcher could be unbalanced.

PROPOSED PARALLEL STRING MATCHING • The architecture of the string matcher is based on the string matching engine in [3], which is summarized as follows: • In a string matcher, each homogeneous FSM tile takes 𝑛 bits of one character (one byte) as input per cycle. • In each state of an FSM tile, pattern identifications are stored as a partial match vector (PMV), where the 𝑖-th bit indicates whether the 𝑖-th pattern is matched in that state.
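The PMV mechanism can be illustrated with a minimal Python sketch. In the bit-split scheme of [3], a pattern is reported as matched only when every FSM tile sets that pattern's bit, so the PMVs of all tiles are ANDed together. The function names `combine_pmvs` and `matched_patterns`, and the use of Python integers as bit vectors with bit 𝑖 standing for pattern 𝑖, are assumptions made for illustration only.

```python
def combine_pmvs(pmvs):
    """AND together the PMVs reported by all FSM tiles in one cycle.

    Each PMV is an int used as a bit vector (assumption: bit i <-> pattern i).
    """
    full = pmvs[0]
    for vector in pmvs[1:]:
        full &= vector
    return full


def matched_patterns(pmvs, num_patterns):
    """Return the indices of patterns whose bit is set in every tile's PMV."""
    full = combine_pmvs(pmvs)
    return [i for i in range(num_patterns) if (full >> i) & 1]


# Hypothetical PMVs from four tiles for a set of four patterns.
tile_pmvs = [0b1011, 0b0011, 0b0110, 0b1010]
print(matched_patterns(tile_pmvs, 4))
```

Only pattern 1 has its bit set in all four example vectors, so it is the only one reported.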

Architecture of String Matcher • Each state in an FSM tile has 2^𝑛 next-state pointers, one for each value of the 𝑛-bit input. Therefore, the memory size of a string matcher is given by: • The main difference between the proposed string matcher and the string matching engine in [3] is that the input bits of each FSM tile are selected from the eight bits of one character using eight 8:1 multiplexers, which supports bit position grouping.
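A software analogue of the multiplexer-based bit selection might look like the sketch below. It assumes bit positions are numbered 1 (LSB) to 8 (MSB) and that the first position listed in a group becomes the high-order bit of the tile input; the helper name `select_bits` is hypothetical.

```python
def select_bits(byte, group):
    """Pick the bits of `byte` named in `group` and pack them into one value.

    Positions are counted from the LSB, starting at 1 (assumption).
    The first position in `group` becomes the most significant output bit.
    """
    value = 0
    for pos in group:
        value = (value << 1) | ((byte >> (pos - 1)) & 1)
    return value


# The fixed grouping with two input bits per tile.
fixed_groups = [(8, 7), (6, 5), (4, 3), (2, 1)]

# 'h' is 0x68 = 0b01101000, so the four tiles see 01, 10, 10, 00.
print([select_bits(ord("h"), g) for g in fixed_groups])
```

With 𝑛 = 2 each tile input takes one of 2^2 = 4 values, which is why each state needs four next-state pointers.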

Gray Code-Based Sorting • Target patterns are sorted based on their BRGC (binary-reflected Gray code) values to reduce bit transitions between successive patterns. • When the character code values in the prefixes of the target patterns are not evenly distributed, the effectiveness of the Gray code-based sorting is limited.
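One possible reading of BRGC-based sorting is sketched below: each character code is treated as a Gray codeword and patterns are ordered by the position of that codeword in the Gray sequence. This interpretation, the names `gray_rank` and `gray_sort`, and the Latin-1 byte encoding are assumptions; the slides do not specify the exact sort key.

```python
def gray_rank(code):
    """Position of `code` in the binary-reflected Gray code sequence.

    This is the standard Gray-to-binary decode: XOR of all right shifts.
    """
    rank = 0
    while code:
        rank ^= code
        code >>= 1
    return rank


def gray_sort(patterns):
    """Sort patterns character by character using Gray-sequence positions."""
    return sorted(patterns, key=lambda p: [gray_rank(b) for b in p.encode("latin-1")])


# In Gray order, neighbouring byte values differ in exactly one bit.
order = sorted(range(256), key=gray_rank)
print(order[:4])
print(gray_sort(["he", "has", "his", "hers"]))
```

Because neighbouring codes in this order differ in a single bit, successive patterns tend to share more bit values than under plain lexicographic order, provided the character codes are spread reasonably evenly.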

Bit Position Grouping • Let us assume that a string matcher has four FSM tiles with two input bits. In addition, “he,” “has,” “his,” and “hers” are assumed to be the patterns to be mapped. • For all string matchers in [3], a set of bit position groups for four FSM tiles is fixed as {(8, 7), (6, 5), (4, 3), (2, 1)}, where the number represents a bit position of one character from the LSB.


Bit Position Grouping • After grouping the MSB positions with other bits, an optimal set of bit position groups can be {(8, 4), (7, 3), (6, 5), (2, 1)}.
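The effect of regrouping can be checked on the four example patterns with a small sketch. It counts, for each tile, the number of trie states (the root plus every distinct prefix of the tile's bit-sliced patterns), which equals the number of Aho-Corasick states for that tile. The helper names and the LSB-based numbering (position 1 = LSB) are assumptions, and the counts are those of this sketch rather than figures from the paper.

```python
def select_bits(byte, group):
    """Pack the bits of `byte` at the 1-based, LSB-first positions in `group`."""
    value = 0
    for pos in group:
        value = (value << 1) | ((byte >> (pos - 1)) & 1)
    return value


def tile_states(patterns, group):
    """Count trie states for one FSM tile: root plus all distinct prefixes."""
    prefixes = {()}  # the empty tuple stands for the root state
    for pattern in patterns:
        symbols = tuple(select_bits(b, group) for b in pattern.encode("latin-1"))
        for k in range(1, len(symbols) + 1):
            prefixes.add(symbols[:k])
    return len(prefixes)


patterns = ["he", "has", "his", "hers"]
fixed = [(8, 7), (6, 5), (4, 3), (2, 1)]
regrouped = [(8, 4), (7, 3), (6, 5), (2, 1)]

print([tile_states(patterns, g) for g in fixed])      # states per tile, fixed set
print([tile_states(patterns, g) for g in regrouped])  # states per tile, regrouped set
```

In this sketch the total number of states is the same for both sets, but the largest tile shrinks from 9 to 7 states. Since every homogeneous tile must be sized for the largest one, a more even spread leaves fewer unused states.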


Bit Position Grouping • The bit position grouping for a single string matcher has a constant time complexity of O(1). • When all 𝑇 target patterns are to be mapped onto multiple string matchers, the time complexity can be O(𝑇). • The time complexity of pattern sorting can be O(𝑇 log₂ 𝑇).

Bit Position Grouping • However, due to the large constant factor of the bit position grouping complexity, if the number of target patterns 𝑇 is not sufficiently large, the pattern sorting will not be dominant.
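One way to see why the grouping cost per string matcher is constant but not small: with eight bit positions and two bits per tile, the number of distinct candidate groupings is fixed at 7 × 5 × 3 × 1 = 105, independent of 𝑇. The sketch below enumerates them. Whether the proposed algorithm searches this space exhaustively is not stated in the slides; the enumeration and the name `pairings` are illustrative assumptions.

```python
def pairings(positions):
    """Yield every way to split `positions` into unordered pairs."""
    if not positions:
        yield ()
        return
    first, rest = positions[0], positions[1:]
    for i, partner in enumerate(rest):
        remaining = rest[:i] + rest[i + 1:]
        for tail in pairings(remaining):
            yield ((first, partner),) + tail


all_groupings = list(pairings((8, 7, 6, 5, 4, 3, 2, 1)))
print(len(all_groupings))
```

Both the fixed set {(8, 7), (6, 5), (4, 3), (2, 1)} and the regrouped set {(8, 4), (7, 3), (6, 5), (2, 1)} appear among these candidates.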

PERFORMANCE EVALUATION • Target patterns were extracted from Snort v2.8 rules [4]. • Considering design analysis in [3], an FSM tile was assumed to take two bits of one character as an input.


PERFORMANCE EVALUATION • In Table I, the number of adopted string matchers was reduced on average by 4.44%, in comparison with the existing bit-split string matching in [3].


PERFORMANCE EVALUATION • Across all Snort rule sets, a total rule set of 7766 unique patterns was obtained, with an average pattern length of 18.6 characters. • The total number of unused states across all FSM tiles was reduced by 13.46% on average.

PERFORMANCE EVALUATION • When a string matcher did not adopt the fixed set of bit position groups, the proposed algorithm mapped more target patterns onto the string matcher than the method in [3].

PERFORMANCE EVALUATION • In Table III, the ratio of the string matchers that did not adopt the fixed set of bit position groups was up to 33.33%.

PERFORMANCE EVALUATION • Considering the performance enhancements, the proposed parallel string matching is useful for reducing memory costs without losing regularity and scalability of the string matching.
