1 / 35

Scalable Trigger Processing

Scalable Trigger Processing. Discussion of publication by Eric N. Hanson et al Int Conf Data Engineering 1999 CS561. Motivation. Triggers popular for: Integrity constraint checking Alerting, logging, etc. Commercial database systems Limited triggering capabilities

haruko
Download Presentation

Scalable Trigger Processing

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Scalable Trigger Processing Discussion of publication by Eric N. Hanson et al Int Conf Data Engineering 1999 CS561

  2. Motivation • Triggers popular for: • Integrity constraint checking • Alerting, logging, etc. • Commercial database systems • Limited triggering capabilities • 1 trigger/update-type on table; or at best 100. • But : Current technology doesn’t scale well • And, internet and web-based applications may need millions of triggers.

  3. An Example Trigger • Example “stock ticker notification”: • Stock holding: 100*IBM • Query: Inform an agent whenever the price of the stock holding crosses $10,000 Create Trigger stock-watch from quotes q on update(q.price) when q.name=‘IBM’ and 100*q.price > 10,000 do raise event ThresholdCrossed(100*q.price). • Note: We may need 1,000 or millions of such triggers • Web interface may allow users to create such triggers

  4. What Next? • Problem description • TriggerMan system architecture • Predicate index • Trigger processing

  5. Problem Definition • Given: Relational DB, Trigger statements, Data Stream • Find: Triggers corresponding to each stream item • Objective: Scalable trigger processing system • Assumptions: • Number of distinct structures of trigger expressions is relatively small • All trigger expression structures small enough to fit in main memory

  6. The Problem, once more. • Requires millions of triggers (on huge data). • Steps for trigger processing • Event monitoring • Condition evaluation • Executing triggered action • Response time for database operations critical !

  7. Related Work ECA Model (not scalable) Indexing Parallel Processing [Gupt89,Hell98] AI [Forg82,Mira87] (smaller rule set) Range Predicates, Marking-Based [Hans96b, Ston90] (large memory, complicated storage)

  8. Overall Driving Idea • If large number of triggers are created, then many have the same format. • Triggers share same expression signature except that parameters substituted. • Group predicates from trigger conditions based on expression signatures into equivalence classes • Store them in efficient main memory data structures

  9. TriggerMan System

  10. Components • TriggerMan Datablade (lives inside Informix) • Data Sources • Local/remote tables/streams; must capture updates and transmit to TriggerMan (place in a queue) • TriggerMan Client applications • Create /drop triggers, etc. • TriggerMan Driver • Periodically involve TmanTest() fn to perform condition testing and action execution. • TriggerMan console • Direct user interaction interface for trigger creation, system shutdown, etc.

  11. TriggerMan Syntax • Trigger syntax create trigger <triggerName> [in setName] [optionalFlags] from fromList [on eventSpec] [when condition] [group by attributeList] [having groupCondition] do action

  12. Example : Salary Increases Update Fred’s salary when Bob’s salary is updated create trigger updateFred from emp on update (emp.salary) when emp.name = ’Bob’ do execSQL ’update emp set salary=:NEW.emp.salary where emp.name=’’Fred’’’

  13. Example : Real Estate Database “If new house added which is in neighborhood that salesperson Iris reprensents then notify her” House (hno,address,price,nno,spno) Salesperson (spno,name,phone) Represents (spno,nno) Neighborhood (nno,name,location) create trigger IrisHouseAlert on insert to house from salesperson s, house h, represents r when s.name = ‘Iris’ and s.spno=r.spno and r.nno=h.nno do raise event NewHouseInIrisNeighborhood(h.hno, h.address)

  14. Trigger Condition Structure • Expression signature • Expression signature consists of • Data source ID • Operation code, e.g. insert, delete, etc. • Generalized Expression (parameterized) FROM: Data src: emp ON: Event : update WHEN: boolean exp. = Emp.name CONSTANT

  15. Condition structure (contd) • Steps to obtain canonical representation of WHEN clause • Translate expression to CNF • Group each conjunct by data source they refer to • Selection Predicate will be of form : (C11 OR C12 OR ..) AND ... AND (Ck1 OR …), where each Cij refers to same tuple variable. • Each conjunct refers to zero, one, or more data sources • Group conjuncts by set of sources they refer to • If one data source, then selection predicate • If two data sources, then JOIN predicate

  16. Triggers for stock ticker notification • Create triggerT1fromstock whenstock.ticker = ‘GOOG’ and stock.value < 500 donotify_person(P1) • Create triggerT2fromstock whenstock.ticker = ‘MSFT’ and stock.value < 30 donotify_person(P2) • Create triggerT3fromstock whenstock.ticker = ‘ORCL’ and stock.value < 20 donotify_person(P3) • Create triggerT4 from stock whenstock.ticker = ‘GOOG’ donotify_person(P4)

  17. Expression Signature • Idea: Common structures in condition of triggers Expression Signature: • E1: stock.ticker = const1 and stock.value < const2 • Expression Signature: • E2: stock.ticker = const3 • Expression signature defines equivalence class of all instantiations of expression with different constants T1: stock.ticker = ‘GOOG’ and stock.value < 500 T2: stock.ticker = ‘MSFT’ and stock.value < 30 T3: stock.ticker = ‘ORCL’ and stock.value < 20 T4: stock.ticker = ‘GOOG’

  18. What to do now • Only a few distinct expression signatures, build data structures to represent them explicitly (in memory) • Create constant tables that store all different constants, and link them to their expression signature

  19. Main Structures • A-treat Network • Network for trigger condition testing • For a trigger to fire, all conditions must be true • Expression Signature • Common structure in a trigger • E1: stock.ticker = const1 and stock.value < const2 • Constant Tables • Constants for each expression signature

  20. Root stock.value < const2 stock.ticker = const1 predicates Node 1 Node 2 alpha-node alpha-node A-Treat Network to represent a trigger • For each trigger condition • stock.ticker = const1 and stock.value < const2

  21. Condition Testing • A-Treat network is a discrimination network for trigger condition testing. • For a predicate to be satisfied, all its conjuncts should be true. • This is checked using A-Treat network.

  22. A-Treat network (Hanson 1992) Define rule SalesClerk If emp.sal>30,000 And emp.dno=dept.dno And dept.name=“sales” And emp.jno=job.jno And job.title=“clerk” Then Action

  23. Expression Signature Table E1: stock.ticker = const1 and stock.value < const2 E2: stock.ticker = const3

  24. Constant Tables • Tables of constants in trigger conditions Const_e1 Const_e2 T1: stock.ticker = ‘GOOG’ and stock.value < 500 T2: stock.ticker = ‘MSFT’ and stock.value < 30 T3: stock.ticker = ‘ORCL’ and stock.value < 20 T4: stock.ticker = ‘GOOG’

  25. Tables • Primary tables • trigger_set (tsID, name, comments, creation_date, isEnabled) • Trigger (triggerID, tsID, name, comments, trigger_text, creation_date, isEnabled, …) • Trigger cache in main memory for recently accessed triggers.

  26. Predicate Index • Tables • expression_signature(sigID, dataSrcID, signatureDesc, constTableName, constantSetSize, constantSetOrganization) • const_tableN(exprID, triggerID, nextNetworkNode, const1, … constK, restOfPredicate) • Root of predicate index linked to data source predicate indices • Each data source contains an expression signature list • Each expression signature links to its constant table. • Index expressions on most selective conjunct (rest on fly).

  27. Predicate Index hash(src-ID) Goal: Given an update, identify all predicates that match it.

  28. Processing Trigger Definition • Parse the trigger and validate it • Convert the when clause to conjunctive normal form • Group the conjuncts by the distinct sets of tuple variables they refer to • Form a trigger condition graph, that is, undirected graph with node for each tuple variable and edge for join predicates. • Build A-Treat network

  29. Processing trigger definition (2) • For each selection predicate • If predicate with same signature not seen before • Add signature of predicate to list • And, add signature to expression_signature table • If signature has a constant placeholder in it, create a constant table for the signature. • Add constants • Else if predicate has constants, add a row to the constant table for the expression

  30. Alternate Organizations • Storage for the expression signature’s equivalence class: • Main memory lists • Main memory index • Non-indexed database table • Indexed database table • For each expression signature, choose a structure depending on number of triggers. efficiency Scalability

  31. Processing update descriptors • On getting an update descriptor (token) (data src ID, operator code, old/new tuple) • Locate data source predicate index from root of predicate index. • For each expression signature, find constant matching the token using index. • Check additional predicate clauses against the token. • When all predicate clauses of a trigger have matched, pin the trigger in main memory • Bring in A-treat network representing that trigger to process aremaining part of trigger, like join, etc. • If trigger condition is satisfied, execute action.

  32. Processing an Update Update Stock (ticker=GOOG, value=495) Root Index of stock.ticker=const1 Other source Predicate index… E1 E2 E1: stock.ticker = const1 and stock.value < const2 const_e1 const_e2 const_e1

  33. Concurrency • Better scalability even on single processor

  34. Concurrency • Identified elements that can be parallelized • Token-level • Multiple tokens processed in parallel • Condition-level • Multiple selection conditions tested concurrently • Rule-action-level • Multiple rule actions fired at the same time • Data-level • Set of data values in the network processed in parallel

  35. Conclusion : Overall Key Points • If a large number of triggers are created, many of them have almost the same format • Group triggers with same structure together into expression signature equivalence classes • Number of distinct signatures is small enough to fit into main memory (index) • Develop a selection predicate index structures • Architecture to build a scalable trigger system.

More Related