Parallel Event Processing for Content-Based Publish/Subscribe Systems

Parallel Event Processing for Content-Based Publish/Subscribe Systems. Amer Farroukh, Department of Electrical and Computer Engineering, University of Toronto. Joint work with Elias Ferzli, Naweed Tajuddin, and Hans-Arno Jacobsen.


Presentation Transcript


  1. Parallel Event Processing for Content-Based Publish/Subscribe Systems • Amer Farroukh, Department of Electrical and Computer Engineering, University of Toronto • Joint work with Elias Ferzli, Naweed Tajuddin, and Hans-Arno Jacobsen • DEBS 2009

  2. Motivation • Event processing is ubiquitous in enterprise-scale applications (fraud detection, data analysis) • Network security monitoring and analysis tools require gigabit-per-second speeds (application-layer firewalls) • Selective dissemination of information for Internet-scale applications (RSS, XML, XPath) • These systems need to support thousands of users and process millions of events • Achieving scalability and high performance under excessive load is a challenging problem • The matching engine is the most computation-intensive function in event processing

  3. How to support high data-processing rates? • Choose an existing, powerful matching algorithm • Leverage chip multi-processors • Increase throughput or reduce matching time • Evaluate multi-threading vs. software transactional memory

  4. Outline • Related work • Matching algorithm • Parallelization techniques • Implementation and results

  5. Sequential Matching Algorithms • Single phase: A_TREAT [E.H., 1992] • Predicates are compiled into a test network • Subscriptions may appear in one or several leaves • Poor locality, space-consuming, hard to maintain • Two phase: SIFT [T.Y., 2000] • Predicates are evaluated in the first phase • Subscriptions are matched in the second phase • Predicates and subscriptions are indexed • Algorithm used: Filtering Algorithms [F.F., 2001]

  6. Matching Algorithm [Figure: event E (P1, P2) is checked against the Price, Color, and Quantity predicates, filling a bit vector in Phase 1; Phase 2 scans the clusters (C1 to C3) under access predicates Ap1 to Ap5; subscriptions shown: S1, S5, S9.]

  7. Multiple Events Independent Processing [Figure: Thread 1 and Thread 2 each process a separate event (E1, E2; each with P1, P2) against the Price, Color, and Quantity predicates and the clusters (C1 to C3) under access predicates Ap1 to Ap5; subscriptions shown: S1, S2, S3, S7, S8, S9.]
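The idea behind multiple events independent processing can be sketched as follows: each worker thread takes a whole event and matches it sequentially, so threads share only read-only subscription data and need no synchronization. This is a minimal sketch, not the implementation from the talk. The names `SUBSCRIPTIONS`, `match_event`, and `match_events_independently` are hypothetical, the two subscriptions are borrowed from the Predicate Tables slide, and a plain scan over subscriptions stands in for the two-phase algorithm. Note also that CPython threads do not run CPU-bound code in parallel on standard builds, so the sketch shows only how the work is divided.

```python
import operator
from concurrent.futures import ThreadPoolExecutor

# Hypothetical subscription store: read-only once built, so threads can share it.
SUBSCRIPTIONS = {
    "S1": [("quantity", operator.eq, 2), ("price", operator.lt, 30)],
    "S2": [("quantity", operator.gt, 4), ("price", operator.eq, 20)],
}


def match_event(event):
    """Match one event sequentially; every variable here is private to the caller."""
    matched = []
    for sub_id, predicates in SUBSCRIPTIONS.items():
        if all(attr in event and compare(event[attr], const)
               for attr, compare, const in predicates):
            matched.append(sub_id)
    return matched


def match_events_independently(events, num_threads=2):
    """Give each event to one worker thread; results come back in input order."""
    with ThreadPoolExecutor(max_workers=num_threads) as pool:
        return list(pool.map(match_event, events))


events = [
    {"quantity": 2, "price": 10},
    {"quantity": 5, "price": 20},
    {"quantity": 1, "price": 1},
]
results = match_events_independently(events)
```

Because no thread ever writes to shared state, adding threads in this scheme raises throughput without changing the time spent on any single event.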

  8. Single Event Collaborative Processing [Figure: Thread 1 and Thread 2 jointly process a single event E (P1, P2) against the Price, Color, and Quantity predicates and the clusters (C1 to C3) under access predicates Ap1 to Ap5; subscriptions shown: S1, S2, S8.]
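One possible reading of single event collaborative processing is sketched below: several threads split the predicate evaluation for one event, write into a shared bit vector, wait at a barrier, and then split the subscription checks. This is an illustration under stated assumptions, not the implementation from the talk. The names `PREDICATES`, `SUBSCRIPTIONS`, and `match_event_collaboratively`, the round-robin partitioning, and the single coarse lock are all hypothetical choices.

```python
import operator
import threading

# Hypothetical predicate list; the index of each entry is its bit position.
PREDICATES = [
    ("quantity", operator.eq, 2),
    ("price", operator.lt, 30),
    ("quantity", operator.gt, 4),
    ("price", operator.eq, 20),
]
# Each subscription lists the bit positions that must all be set.
SUBSCRIPTIONS = {"S1": [0, 1], "S2": [2, 3]}


def match_event_collaboratively(event, num_threads=2):
    bit_vector = [0] * len(PREDICATES)   # shared by all threads
    matched = []                         # shared result list
    lock = threading.Lock()              # guards both shared structures
    barrier = threading.Barrier(num_threads)
    sub_items = list(SUBSCRIPTIONS.items())

    def worker(tid):
        # Phase 1: each thread evaluates every num_threads-th predicate.
        for i in range(tid, len(PREDICATES), num_threads):
            attr, compare, const = PREDICATES[i]
            if attr in event and compare(event[attr], const):
                with lock:
                    bit_vector[i] = 1
        # No thread starts phase 2 until the whole bit vector is filled in.
        barrier.wait()
        # Phase 2: each thread checks every num_threads-th subscription.
        for sub_id, predicate_ids in sub_items[tid::num_threads]:
            if all(bit_vector[p] for p in predicate_ids):
                with lock:
                    matched.append(sub_id)

    threads = [threading.Thread(target=worker, args=(t,)) for t in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return sorted(matched)


result = match_event_collaboratively({"quantity": 2, "price": 20})
```

Unlike the independent scheme, this one aims to shorten the matching time of a single event, at the cost of synchronizing on the shared bit vector.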

  9. Multiple Events Collaborative Processing [Figure: two thread groups (Group 1, Group 2; threads T1 to T4) process events E1 and E2 (each with P1, P2) against the Price, Color, and Quantity predicates and the clusters (C1 to C3) under access predicates Ap1 to Ap5; subscriptions shown: S1, S2, S3, S4, S7, S9.]

  10. Implementation Setup • Synchronization: static, locks, software transactional memory (STM) • Machine: 2.33 GHz quad-core Xeon processors, 32 KB L1 cache and 4 MB L2 cache • Workload

  11. Multiple Events Independent Processing Analysis • Linear throughput and constant average matching time

  12. Single Event Collaborative Processing Analysis • Lock implementation is best • Bit vector size limits scalability

  13. Multiple Events Collaborative Processing Analysis • Threads can be allocated based on system requirements and load

  14. Conclusions • A parallel matching engine is a promising solution • Over 1600 events/s with 6M subscriptions • Matching time vs. throughput • Lock-based implementation is more efficient • HTM is a potential candidate for enhancing speed and potential ease of implementation

  15. THANK YOU!

  16. Predicate Tables (Phase 1) • S1: quantity = 2, price < 30 • S2: quantity > 4, price = 20 [Figure: predicate tables with entries labeled 1, 3, 4, 2.]
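As a rough illustration of Phase 1, the sketch below evaluates the four predicates of S1 and S2 against an event and records each outcome in a bit vector. The bit positions, the `PREDICATES` list, and the function `evaluate_predicates` are assumptions made for illustration. A linear scan stands in for the indexed predicate tables of the talk, whose layout is not recoverable from the slide.

```python
import operator

# Hypothetical bit assignment; the slide's own predicate numbering is not recoverable.
PREDICATES = [
    ("quantity", operator.eq, 2),   # bit 0, from S1
    ("price", operator.lt, 30),     # bit 1, from S1
    ("quantity", operator.gt, 4),   # bit 2, from S2
    ("price", operator.eq, 20),     # bit 3, from S2
]


def evaluate_predicates(event):
    """Phase 1: return a list where entry i is 1 iff predicate i holds for the event."""
    bits = []
    for attribute, compare, constant in PREDICATES:
        value = event.get(attribute)
        # A predicate on an attribute the event does not carry is not satisfied.
        bits.append(1 if value is not None and compare(value, constant) else 0)
    return bits


bit_vector = evaluate_predicates({"quantity": 2, "price": 20})
```

For the event `{"quantity": 2, "price": 20}` the first, second, and fourth predicates hold, so both of S1's bits are set while only one of S2's is.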

  17. Subscription Clusters (Phase 2) [Figure: access predicates Ap1, Ap2, ..., ApN each lead to a cluster of subscriptions; labels shown: subscriptions S1 to S5 and predicates P1 to P4.]
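Phase 2 can be sketched in the same spirit: subscriptions are grouped into clusters under an access predicate, and a cluster is scanned only when the bit of its access predicate is set. The cluster contents, the bit numbering, and the name `match_subscriptions` are assumptions for illustration (here S1 is "quantity = 2, price < 30" on bits 0 and 1, and S2 is "quantity > 4, price = 20" on bits 2 and 3). The talk's actual cluster layout may differ.

```python
# Hypothetical clusters: access predicate bit -> [(subscription id, remaining predicate bits)].
CLUSTERS = {
    0: [("S1", [1])],   # reached via "quantity = 2"; S1 also needs bit 1
    2: [("S2", [3])],   # reached via "quantity > 4"; S2 also needs bit 3
}


def match_subscriptions(bit_vector, clusters):
    """Phase 2: scan only clusters whose access predicate is satisfied."""
    matched = []
    for access_predicate, subscriptions in clusters.items():
        if not bit_vector[access_predicate]:
            continue  # the whole cluster is skipped without touching its subscriptions
        for sub_id, predicate_ids in subscriptions:
            if all(bit_vector[p] for p in predicate_ids):
                matched.append(sub_id)
    return matched


matches = match_subscriptions([1, 1, 0, 1], CLUSTERS)
```

Skipping a cluster on a single bit test is what keeps the second phase cheap when most access predicates are not satisfied.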

  18. Time Profiling

  19. Block Size

  20. Subscriptions Effect [Figure panels: SE-CP, ME-IP]
