
Scalable Pattern-Matching via Dynamic Differentiated Distributed Detection (D4)

Author: Kai Zheng, Hongbin Lu. Publisher: GLOBECOM 2008. Presenter: Han-Chen Chen. Date: 2009/12/23.


Presentation Transcript


  1. Scalable Pattern-Matching via Dynamic Differentiated Distributed Detection (D4) Author: Kai Zheng, Hongbin Lu Publisher: GLOBECOM 2008 Presenter: Han-Chen Chen Date: 2009/12/23

  2. Introduction • Due to the imbalance of network flow sizes, the traditional flow-based data-parallel processing/programming model cannot fully exert a multicore platform's computing power, resulting in poor performance scalability. • The pattern set is pre-partitioned so that multiple candidate PM methods can handle the subsets; a Detection Mode is selected specifically for each incoming flow at run time.

  3. Primitive idea of Distributed Detection • Reallocating/balancing the workload via D2, compared with traditional flow-based load balancing.

  4. Overhead of Distributed Detection • Overhead comes from the OS/system, owing to the increased number of memory references needed to address the data structures of the subsets. • The higher the mode used, the higher the overhead that may be required.

  5. Architecture of Differentiated Distributed Detection • The Task-info Queue stores the information denoting which flow to inspect and which pattern set/subset to detect against.

  6. Methods of Differentiated Distributed Detection • Aho-Corasick (AC) algorithm: AC always consumes much more memory and has relatively lower average performance, especially when dealing with huge pattern sets. • Modified Wu-Manber (MWM) algorithm: much lower memory requirement, but it is not handy and its performance becomes non-deterministic when dealing with short patterns (since the Bad-Character shifts are bounded by the minimum pattern length of the set) and when hash collisions occur heavily.
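To make the AC side of this comparison concrete, here is a minimal Aho-Corasick sketch (my own illustration, not the paper's implementation): a dict-per-node trie with BFS-built failure links. The per-node transition tables are exactly what makes AC's memory footprint grow with the pattern set.

```python
from collections import deque

def build_ac(patterns):
    # One transition dict per trie node; out[i] lists patterns ending at node i.
    goto, fail, out = [{}], [0], [[]]
    for pat in patterns:
        node = 0
        for ch in pat:
            if ch not in goto[node]:
                goto.append({}); fail.append(0); out.append([])
                goto[node][ch] = len(goto) - 1
            node = goto[node][ch]
        out[node].append(pat)
    # BFS to set failure links (longest proper suffix that is also a trie prefix).
    queue = deque(goto[0].values())
    while queue:
        u = queue.popleft()
        for ch, v in goto[u].items():
            queue.append(v)
            f = fail[u]
            while f and ch not in goto[f]:
                f = fail[f]
            fail[v] = goto[f].get(ch, 0)
            out[v] += out[fail[v]]   # inherit matches reachable via the fail link
    return goto, fail, out

def ac_search(text, patterns):
    # Single pass over the text; reports (start_offset, pattern) for every hit.
    goto, fail, out = build_ac(patterns)
    node, hits = 0, []
    for i, ch in enumerate(text):
        while node and ch not in goto[node]:
            node = fail[node]
        node = goto[node].get(ch, 0)
        for pat in out[node]:
            hits.append((i - len(pat) + 1, pat))
    return hits
```

Note that, unlike MWM, the scan never skips characters, which is why AC's throughput is deterministic but its average speed on long texts is lower.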

  7. Wu-Manber Algorithm • Based on the basic idea of the Boyer-Moore algorithm. It maintains a SHIFT table, a HASH table, and a PREFIX table. • We impose a requirement that all patterns have the same length. • Check B characters at a time. • Each string of size B is mapped (using a hash function) to an integer used as an index into the SHIFT table. • We use the exact same integer to index into another table, called HASH. The i'th entry of the HASH table, HASH[i], contains a pointer to a list of patterns whose last B characters hash into i. • Because suffixes such as 'ion' or 'ing' are very common in English text, we also map the first B' characters of all patterns into the PREFIX table. • It is much less common for different patterns to share both the same prefix and the same suffix.

  8. Wu-Manber Algorithm: SHIFT table example • Pattern set: working, talking; input string: abcding; B = 3, B' = 2 • i = hash["ing"]; if (Shift[i] > 0) shift by Shift[i]; else { compute the hash value k of the prefix "ab"; look in the i'th HASH-table bucket for patterns whose prefix hash equals k; check whether those patterns actually match; }
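The table construction and scan loop described above can be sketched as follows. This is a simplified illustration, not the paper's code: the "hash" of a B-gram is the B-gram itself, patterns are truncated to the minimum pattern length m (which must be at least B), and the prefix check is done with direct string comparison.

```python
B = 3        # block size indexing SHIFT/HASH
B_PRIME = 2  # prefix length for the PREFIX check

def build_tables(patterns):
    m = min(len(p) for p in patterns)       # patterns truncated to length m
    shift = {}                              # B-gram -> shift distance
    buckets = {}                            # last-B-gram -> [(prefix, pattern)]
    for p in patterns:
        q = p[:m]
        for j in range(B - 1, m):
            gram = q[j - B + 1 : j + 1]
            # shift is how far the window can jump when this gram is seen
            shift[gram] = min(shift.get(gram, m - B + 1), m - 1 - j)
        buckets.setdefault(q[m - B:], []).append((q[:B_PRIME], p))
    return m, shift, buckets

def search(text, patterns):
    m, shift, buckets = build_tables(patterns)
    default = m - B + 1                     # shift for grams absent from SHIFT
    hits, pos = [], m - 1
    while pos < len(text):
        gram = text[pos - B + 1 : pos + 1]
        s = shift.get(gram, default)
        if s > 0:
            pos += s                        # Bad-Character style skip
            continue
        start = pos - m + 1
        prefix = text[start : start + B_PRIME]
        for pref, pat in buckets.get(gram, []):
            if pref == prefix and text.startswith(pat, start):
                hits.append((start, pat))   # verified full match
        pos += 1
    return hits
```

The skip distance is bounded by m - B + 1, which is why the slide on slide 6 notes that short patterns (small m) make MWM slow and non-deterministic.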

  9. Pseudo-code of the prototyped PSP algorithm (Figure: patterns from the original set are hashed through a temp bucket into intermediate sets IS1 … ISNint and an AC-handled subset, yielding pattern subsets PS1 … PSm-1, PSm.)

  10. Step 2 example • Pattern 1: talking, Pattern 2: working • k = Hash["ing"] = 15, Nint = 5 • When "talking" is processed, its hash key is k = 15 • I[15] = (I[15] + 1) % 5 = 1 • Add "talking" to IS1 • When "working" is processed, its hash key is k = 15 • I[15] = (I[15] + 1) % 5 = 2 • Add "working" to IS2
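The bookkeeping in this step can be sketched as a per-bucket round-robin counter: patterns whose last-B-gram hashes collide are spread across the Nint intermediate sets. This is a hypothetical reconstruction from the slide's arithmetic; `hash_fn`, B = 3, and mapping a remainder of 0 to IS_Nint are my assumptions.

```python
def partition_step2(patterns, n_int, hash_fn, B=3):
    # I[k] rotates through 1..n_int for each hash bucket k;
    # a pattern with key k goes to intermediate set IS_{I[k]}.
    I = {}                                      # per-bucket counters, start at 0
    IS = {i: [] for i in range(1, n_int + 1)}   # IS1 .. ISNint
    for pat in patterns:
        k = hash_fn(pat[-B:])                   # hash of the last B characters
        I[k] = (I.get(k, 0) + 1) % n_int
        idx = I[k] if I[k] != 0 else n_int      # assumed: remainder 0 wraps to ISNint
        IS[idx].append(pat)
    return IS
```

With n_int = 5 and a hash that sends "ing" to 15, this reproduces the slide: "talking" lands in IS1 and "working" in IS2.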

  11. Implementation of Mode Selector & Scheduler • It tends to be unworthwhile to apply D2 to small flows, since small flows are easy to schedule and less likely to incur "out-of-balance" issues. (Small flows: tens of KBs.) • The system may not always be ready for D2, even for large flows. D2 only provides a way to gear up CPU utilization; if the system is already very busy and would remain busy for a while, applying D2 would merely tire the system out. • The MSS should also take into account the characteristics of the system, or try to "adapt" to it; e.g., a pre-test on the system (using certain sample traces) may be necessary when determining the parameters for dynamic mode selection.
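The guidelines above might be captured by a heuristic like the following. This is purely a sketch, not the paper's MSS: the thresholds, the `cpu_utilization` input, and the size-to-mode scaling are all illustrative assumptions (the paper instead calibrates such parameters via a pre-test on sample traces).

```python
SMALL_FLOW_BYTES = 64 * 1024   # assumed cutoff for "tens of KBs"
BUSY_CPU = 0.85                # assumed threshold for "system already very busy"

def select_mode(flow_bytes, cpu_utilization, num_pme_threads):
    if flow_bytes <= SMALL_FLOW_BYTES:
        return 1               # small flows: applying D2 is not worthwhile
    if cpu_utilization >= BUSY_CPU:
        return 1               # system busy: D2 would merely tire it out
    # grow the mode with flow size, capped by the number of PME threads
    mode = 1 + flow_bytes // (4 * SMALL_FLOW_BYTES)
    return int(min(mode, num_pme_threads))
```

The point of the sketch is the decision order: flow size first, then system headroom, and only then how many PME threads to split the flow across.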

  12. Schematic of Mode Selector & Scheduler

  13. Performance • Throughput scalability comparison among different MWM-based parallel PM schemes: • The straightforward per-flow-based load-balancing scheme (i.e., the non-D2 scheme using Mode 1 only). • The brute-force D2 scheme, in which the Detection Mode equals the number of PME threads used. • The Dynamic D2 scheme, in which Detection Modes are selected at runtime. • D4, which is similar to the Dynamic D2 scheme except that patterns no larger than 9 bytes are processed by the AC algorithm when Mode > 1.

  14. Thanks for listening
