
Scalable Pattern-Matching via Dynamic Differentiated Distributed Detection (D4)

Author: Kai Zheng, Hongbin Lu. Publisher: GLOBECOM 2008. Presenter: Han-Chen Chen. Date: 2009/12/23.


Presentation Transcript


  1. Scalable Pattern-Matching via Dynamic Differentiated Distributed Detection (D4) Author: Kai Zheng, Hongbin Lu Publisher: GLOBECOM 2008 Presenter: Han-Chen Chen Date: 2009/12/23

  2. Introduction • Due to the imbalance of network flow sizes, the traditional flow-based data-parallel processing/programming model cannot fully exert a multicore platform's computing power, resulting in poor performance scalability. • The pattern set is pre-partitioned so that multiple candidate PM methods can handle the subsets; a Detection Mode is selected specifically for each incoming flow at run time.

  3. Primitive idea of Distributed Detection • Reallocating/balancing the workload via D2, compared with traditional flow-based load balancing.

  4. Overhead of Distributed Detection • Overhead comes from the OS/system, owing to the increased number of memory references needed to address the data structures of the subsets. • The higher the mode used, the higher the overhead that may be required.

  5. Architecture of Differentiated Distributed Detection • The Task-info Queue stores the information denoting which flow to inspect and which pattern set/subset to detect against.

  6. Methods of Differentiated Distributed Detection • Aho-Corasick (AC) algorithm: AC always consumes much more memory and has relatively lower average performance, especially when dealing with huge pattern sets. • Modified Wu-Manber (MWM) algorithm: much lower memory requirement, but it is not handy and its performance becomes non-deterministic when dealing with short patterns (since the Bad-Character shifts are bounded by the minimum pattern length of the set) and when hash collisions occur heavily.
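To make the AC side of this comparison concrete, here is a minimal Aho-Corasick sketch (my own illustration, not the paper's implementation): a dict-per-node trie with BFS-built failure links. The per-node transition tables are exactly what makes AC's memory footprint grow with the pattern set.

```python
from collections import deque

def build_ac(patterns):
    # One transition dict per trie node; out[i] lists patterns ending at node i.
    goto, fail, out = [{}], [0], [[]]
    for pat in patterns:
        node = 0
        for ch in pat:
            if ch not in goto[node]:
                goto.append({}); fail.append(0); out.append([])
                goto[node][ch] = len(goto) - 1
            node = goto[node][ch]
        out[node].append(pat)
    # BFS to set failure links (longest proper suffix that is also a trie prefix).
    queue = deque(goto[0].values())
    while queue:
        u = queue.popleft()
        for ch, v in goto[u].items():
            queue.append(v)
            f = fail[u]
            while f and ch not in goto[f]:
                f = fail[f]
            fail[v] = goto[f].get(ch, 0)
            out[v] += out[fail[v]]   # inherit matches reachable via the fail link
    return goto, fail, out

def ac_search(text, patterns):
    # Single pass over the text; reports (start_offset, pattern) for every hit.
    goto, fail, out = build_ac(patterns)
    node, hits = 0, []
    for i, ch in enumerate(text):
        while node and ch not in goto[node]:
            node = fail[node]
        node = goto[node].get(ch, 0)
        for pat in out[node]:
            hits.append((i - len(pat) + 1, pat))
    return hits
```

Note that, unlike MWM, the scan never skips characters, which is why AC's throughput is deterministic but its average speed on long texts is lower.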

  7. Wu-Manber Algorithm • Based on the basic idea of the Boyer-Moore algorithm. It maintains a SHIFT table, a HASH table, and a PREFIX table. • We impose a requirement that all patterns have the same length. • Check B characters at a time. • Each string of size B is mapped (using a hash function) to an integer used as an index into the SHIFT table. • We use the exact same integer to index into another table, called HASH. The i'th entry of the HASH table, HASH[i], contains a pointer to a list of patterns whose last B characters hash into i. • Because suffixes such as 'ion' or 'ing' are very common in English text, we also map the first B' characters of all patterns into the PREFIX table. • It is much less common for different patterns to share both the same prefix and the same suffix.

  8. Wu-Manber Algorithm: SHIFT table example • Pattern set: working, talking; input string: abcding; B = 3, B' = 2 • i = hash["ing"]; if (Shift[i] > 0) shift by Shift[i]; else { compute the hash value k of the prefix "ab"; look in the i'th HASH-table bucket for patterns whose prefix hash equals k; check whether those patterns actually match; }
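The table construction and scan loop described above can be sketched as follows. This is a simplified illustration, not the paper's code: the "hash" of a B-gram is the B-gram itself, patterns are truncated to the minimum pattern length m (which must be at least B), and the prefix check is done with direct string comparison.

```python
B = 3        # block size indexing SHIFT/HASH
B_PRIME = 2  # prefix length for the PREFIX check

def build_tables(patterns):
    m = min(len(p) for p in patterns)       # patterns truncated to length m
    shift = {}                              # B-gram -> shift distance
    buckets = {}                            # last-B-gram -> [(prefix, pattern)]
    for p in patterns:
        q = p[:m]
        for j in range(B - 1, m):
            gram = q[j - B + 1 : j + 1]
            # shift is how far the window can jump when this gram is seen
            shift[gram] = min(shift.get(gram, m - B + 1), m - 1 - j)
        buckets.setdefault(q[m - B:], []).append((q[:B_PRIME], p))
    return m, shift, buckets

def search(text, patterns):
    m, shift, buckets = build_tables(patterns)
    default = m - B + 1                     # shift for grams absent from SHIFT
    hits, pos = [], m - 1
    while pos < len(text):
        gram = text[pos - B + 1 : pos + 1]
        s = shift.get(gram, default)
        if s > 0:
            pos += s                        # Bad-Character style skip
            continue
        start = pos - m + 1
        prefix = text[start : start + B_PRIME]
        for pref, pat in buckets.get(gram, []):
            if pref == prefix and text.startswith(pat, start):
                hits.append((start, pat))   # verified full match
        pos += 1
    return hits
```

The skip distance is bounded by m - B + 1, which is why the slide on slide 6 notes that short patterns (small m) make MWM slow and non-deterministic.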

  9. Pseudo-code of the prototyped PSP algorithm (Figure: patterns from the original set are hashed through a temp bucket into intermediate sets IS1 … ISNint and an AC-handled subset, yielding pattern subsets PS1 … PSm-1, PSm.)

  10. Step 2 example • Pattern 1: talking, Pattern 2: working • k = Hash["ing"] = 15, Nint = 5 • When "talking" is processed, its hash key is k = 15 • I[15] = (I[15] + 1) % 5 = 1 • Add "talking" to IS1 • When "working" is processed, its hash key is k = 15 • I[15] = (I[15] + 1) % 5 = 2 • Add "working" to IS2
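The bookkeeping in this step can be sketched as a per-bucket round-robin counter: patterns whose last-B-gram hashes collide are spread across the Nint intermediate sets. This is a hypothetical reconstruction from the slide's arithmetic; `hash_fn`, B = 3, and mapping a remainder of 0 to IS_Nint are my assumptions.

```python
def partition_step2(patterns, n_int, hash_fn, B=3):
    # I[k] rotates through 1..n_int for each hash bucket k;
    # a pattern with key k goes to intermediate set IS_{I[k]}.
    I = {}                                      # per-bucket counters, start at 0
    IS = {i: [] for i in range(1, n_int + 1)}   # IS1 .. ISNint
    for pat in patterns:
        k = hash_fn(pat[-B:])                   # hash of the last B characters
        I[k] = (I.get(k, 0) + 1) % n_int
        idx = I[k] if I[k] != 0 else n_int      # assumed: remainder 0 wraps to ISNint
        IS[idx].append(pat)
    return IS
```

With n_int = 5 and a hash that sends "ing" to 15, this reproduces the slide: "talking" lands in IS1 and "working" in IS2.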

  11. Implementation of Mode Selector & Scheduler • It tends to be unworthwhile to apply D2 to small flows, since small flows are easy to schedule and less likely to incur "out-of-balance" issues. (Small flows: tens of KBs.) • The system may not always be ready for D2, even for large flows. D2 only provides a way to gear up CPU utilization; if the system is already very busy and would remain busy for a while, applying D2 would merely tire the system out. • The MSS should also take into account the characteristics of the system, or try to "adapt" to it; e.g., a pre-test on the system (using certain sample traces) may be necessary when determining the parameters for dynamic mode selection.
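The guidelines above might be captured by a heuristic like the following. This is purely a sketch, not the paper's MSS: the thresholds, the `cpu_utilization` input, and the size-to-mode scaling are all illustrative assumptions (the paper instead calibrates such parameters via a pre-test on sample traces).

```python
SMALL_FLOW_BYTES = 64 * 1024   # assumed cutoff for "tens of KBs"
BUSY_CPU = 0.85                # assumed threshold for "system already very busy"

def select_mode(flow_bytes, cpu_utilization, num_pme_threads):
    if flow_bytes <= SMALL_FLOW_BYTES:
        return 1               # small flows: applying D2 is not worthwhile
    if cpu_utilization >= BUSY_CPU:
        return 1               # system busy: D2 would merely tire it out
    # grow the mode with flow size, capped by the number of PME threads
    mode = 1 + flow_bytes // (4 * SMALL_FLOW_BYTES)
    return int(min(mode, num_pme_threads))
```

The point of the sketch is the decision order: flow size first, then system headroom, and only then how many PME threads to split the flow across.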

  12. Schematic of Mode Selector & Scheduler

  13. Performance • Throughput scalability comparison among different MWM-based parallel PM schemes: • The straightforward per-flow-based load-balancing scheme (i.e., the non-D2 scheme using Mode 1 only). • The brute-force D2 scheme, in which the Detection Mode equals the number of PME threads used. • The Dynamic D2 scheme, in which Detection Modes are selected at runtime. • D4, which is similar to the Dynamic D2 scheme except that patterns no larger than 9 bytes are processed by the AC algorithm when Mode > 1.

  14. Thanks for listening
