Scalable Trigger Processing
E N D
Presentation Transcript
Scalable Trigger Processing Eric N. Hanson et al ICDE 1999 Presented by Shyamshankar D
Motivation • Use of Triggers • Integrity constraint checking • Alerting etc. • Commercial database systems • Limited triggering capabilities • 1-100 triggers/table. • Doesn’t scale to internet and World wide web requirements
An Example • A Continuous query • Stock holding: 100*IBM • Query: Inform me whenever the price of the stocks crosses 10,000$ Create Trigger stock-watch from quotes q on update(q.price) when q.name=‘IBM’ and 100*q.price > 10,000 do raise event ThresholdCrossed(100*q.price).
What Next? • The problem • The key concept • TriggerMan system architecture • Predicate Index • Trigger processing • Concurrency
The Problem • Requires millions of triggers. • Steps for trigger processing • Event monitoring • Condition evaluation • Executing the trigger action • Response time for the database operations.
The Key Concept • If large number of triggers are created, then most of them have the same format. • Triggers share the same expression signature except that one constant has been substituted by another. • Group predicates from trigger conditions based on expression signatures and store these in an efficient main memory data structure.
The TriggerMan System • Trigger system built on Informix database • Could also be implemented in a trigger system in a DBMS server. • Trigger syntax create trigger <triggerName> [in setName] [optionalFlags] from fromList [on eventSpec] [when condition] [group by attributeList] [having groupCondition] do action
Example 1 Update Fred’s salary when Bob’s salary is updated create trigger updateFred from emp on update(emp.salary) when emp.name = ’Bob’ do execSQL ’update emp set salary=:NEW.emp.salary where emp.name=’’Fred’’’
Example 2 house(hno,address,price,nno,spno) salesperson(spno,name,phone) represents(spno,nno) neighborhood(nno,name,location) create trigger IrisHouseAlert on insert to house from salesperson s, house h, represents r when s.name = ‘Iris’ and s.spno=r.spno and r.nno=h.nno do raise event NewHouseInIrisNeighborhood(h.hno, h.address)
Components • TriggerMan Datablade • Data Sources • Local/remote tables/streams • TriggerMan Client applications • Create drop triggers etc. • TriggerMan Driver • Condition testing and action TmanTest() fn. • TriggerMan console • Direct user interaction interface
Trigger Condition structure • Expression signature • Expression signature consists of • Data source ID • Operation code • Generalized Expression Data src: emp Event : update = Emp.name CONSTANT
Condition structure (contd) • Steps to obtain canonical representation of WHEN clause • Translate expression to CNF • Group each conjunct by data source they refer. • Predicate will be of form (C11 OR C12 OR ..) AND ... (Ck1 OR …), where each Cij refers to a tuple variable. • Very large number of predicates created only if different triggers contain distinct CONSTANT values. • If expression has m constants replace xth constant by CONSTANTX
The Tables • Primary tables • trigger_set (tsID, name, comments, creation_date, isEnabled) • Trigger (triggerID, tsID, name, comments, trigger_text, creation_date, isEnabled, …) • Trigger cache in main memory for recently accessed triggers.
Predicate Index • Tables • expression_signature(sigID, dataSrcID, signatureDesc, constTableName, constantSetSize, constantSetOrganization) • const_tableN(exprID, triggerID, nextNetworkNode, const1, … constK, restOfPredicate) • Root of predicate index linked to set of data source predicate index • Each data source contains an expression signature list • For each expression signature containing constants, a constant table. • Index expressions on most selective conjunct.
Condition Testing • For Predicate to be satisfied all conjuncts should be true. • This is checked using an A-Treat network. • A-Treat network is a discrimination network for trigger condition testing.
A-Treat network(Hans 92) Define rule SalesClerk If emp.sal>30,000 And emp.dno=dept.dno And dept.name=“sales” And emp.jno=job.jno And job.title=“clerk” Then Action
Processing trigger definition • Parse the trigger and validate it • Convert the when clause to conjunctive normal form and group the conjuncts by the distinct sets of tuple variables they refer to • Form a trigger condition graph. Undirected graph with node for each tuple variable and edge for join predicates. • Build the A-Treat network
Processing trigger definition(contd) • For each selection predicate over a alpha node • If predicate with same signature, not seen before • Add signature of predicate to list and to expression_signature table • If signature has a constant placeholder in it, create a constant table for the signature. • Else if predicate has constants, add a row to the constant table for the expression
Alternate Organization strategies • Storage for the expression signature’s equivalence class: • Main memory lists • Main memory index • Non-indexed database table • Indexed database table • For each expression signature use a structure depending on number of triggers. efficiency Scalability
Processing update descriptors • On getting an update descriptor/token (data src ID, oper code, old/new tuple) • Locate data source predicate index from root of predicate index. • For each expression signature, find constant matching the token. • Check additional predicate clauses against the token. • When all predicate clauses of a trigger have matched, pin the trigger in main memory • If trigger condition is satisfied, execute action.
Some optimizations For I = 1 to N Create trigger Ti from R when R.a=100 do … • List of trigger ids for R.a=100 • Constant sets and trigger Ids for equality conditions stored as • Lists • Clustered constant index • All entries of [const1,..constk] stored together • Enables fast retrieval of triggers ids together
Concurrent processing • Different types of concurrency • Token level concurrency • Multiple tokens in parallel • Condition level concurrency • Test multiple selection condns against token concurrently • Rule action concurrency • Process multiple actions fired at same time. • Data level concurrency • Current implementation supports token level concurrency
Concurrency N=[NUM_CPUS*TMAN_CONCURRENCY_LEVEL] • N driver processes • Each driver process calls TmanTest() after T time • T=future work • Balance switching overhead and avoid long executions
TmanTest() while(total execution time of this invocation of TmanTest < THRESHOLD and work is left in the task queue) { Get a task from the task queue and execute it. Yield the processor so other Informix tasks can use it } if task queue is empty return TASK_QUEUE_EMPTY return TASKS_REMAINING
Concurrency(contd) • Task can be • Process a token to check matching rule • Run a rule action • Process a token against a set of conditions • Process a token to run a set of rule actions triggered by token.
Concurrency For k=1 to M Create trigger T_K from R when R.company = "IBM" do raise event notify_user("userK", …. ) • Partition trigger sets into N sets.
Conclusion • An architecture to build a truly scalable trigger systems. • Future work • Scalability for temporal conditions • Aggregates