
Performance Analysis of Temporal Queries

Performance Analysis of Temporal Queries. (Information Sciences #49, 1989) by Ilsoo Ahn , AT&T Bell Laboratories, Columbus, Ohio and Richard Snodgrass Dept. of Computer Science , University of Arizona Communicated by Ahmed Elmagarmid ~ * ~


Presentation Transcript


  1. Performance Analysis of Temporal Queries (Information Sciences #49, 1989) by Ilsoo Ahn, AT&T Bell Laboratories, Columbus, Ohio and Richard Snodgrass Dept. of Computer Science, University of Arizona Communicated by Ahmed Elmagarmid ~ * ~ Presented by Barry Klein for CS-599, 10/26/2000

  2. Abstract Temporal databases that maintain history data add historical queries and rollback operations to conventional db’s. This paper proposes a model for analyzing the performance of temporal queries over a range of access methods.

  3. Abstract, continued • Model: 4 transformations through a series of formal expressions common to all phases of query processing. • Input: Temporal Query + DB schema • Output: Estimated I/O cost for it. • Validation: Compare estimated cost from model with actual cost from a prototype.

  4. Introduction Factors affecting performance of a Temporal DBMS: • Access methods available • Query-processing strategies • Size and composition of the data

  5. Introduction, continued Methods for describing TDBMS effectiveness: • Empirical approach – actual performance is measured. Advantage: Results are reliable. • Analytical approach – develop a math model of the performance, which can predict performance in controlled context. Advantage: less effort, but results are questionable.

  6. Introduction, continued Three orthogonal types of time: • Valid time, transaction time, user-defined time. 4 categories of DBs, defined in terms of support for valid/transaction time: • Snapshot – conventional, no temporal support. • Rollback – supports transaction time. • Historical – supports valid time (real-world history). • Temporal – supports both valid and txn time.
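The four categories follow directly from which of the two time dimensions a database supports. A minimal sketch of that classification (the function name `database_kind` is hypothetical, introduced only for illustration):

```python
def database_kind(valid_time: bool, transaction_time: bool) -> str:
    """Classify a database by the kinds of time it supports.

    User-defined time is left out on purpose: the slide defines the
    four categories only in terms of valid and transaction time.
    """
    if valid_time and transaction_time:
        return "temporal"
    if valid_time:
        return "historical"
    if transaction_time:
        return "rollback"
    return "snapshot"


for vt in (False, True):
    for tt in (False, True):
        print(f"valid={vt!s:5} transaction={tt!s:5} -> {database_kind(vt, tt)}")
```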

  7. Introduction, continued TQuel—non-procedural language based on tuple calculus—is chosen here to express historical queries and rollback operations: • Augments retrieve statement with when predicate – temporal relations among tuples. • Valid clause specifies how implicit time attributes are computed for result tuples. • Rollback operations implemented with as of clause (in either rollback or temporal db’s).

  8. Introduction, continued Temporal relations used with the added constructs: Precede, Overlap, Extend, Begin of, End of. TQuel adds valid and when clauses to the statements: Append, Delete, Replace. The Create statement is supported for temporal relations.

  9. The New Model Performance analysis based on these givens: • A set of temporal queries • Some query-processing/optimization strategy • File structure(s) to implement the TDB • A set of parameters characterizing the storage devices.

  10. The New Model, continued Assumptions and decisions for this model: • Disk I/O traffic is used as measurement key: ~proportional to performance; • Inputs must be flexible; • Resulting estimate must be accurate.

  11. The New Model, continued The 4 transformations of the model use: • Algebraic expressions; • File-primitive expressions; • Access-path expressions.

  12. The Algebraic Expression Since TQuel is non-procedural, the algebraic expression is defined first: • Algebraic operators • Conventional: select, project, join, union, difference • Temporal: when, as of • Auxiliary: temporary, sort, reformat

  13. Conventional Algebraic Operators • Select – takes a relation and a predicate specifying the constraint that result tuples must satisfy. • Project – parameters are a relation and a set of attributes to be extracted from the relation. • Join – performs a theta-join of 2 relations, given as the first 2 parameters; 3rd parm is join method, 4th is combining-method predicate. • Union – set addition on 2 relations. • Difference – set subtraction on 2 relations.

  14. Temporal Algebraic Operators • When – performs temporal selection on a relation according to a temporal predicate on the values of valid time attributes. • AsOf – similar, but compares 2 time constants with transaction-time attribute values. • Valid – performs temporal projection on the values of the valid time attributes. (It might perform similarly to Project.)

  15. Auxiliary Algebraic Operators Operations that don’t change the query result but affect the query cost. • Temporary – create or access a temporary relation for the result of its parameter’s operation. • Sort – tuples in the relation given as 1st parm are sorted, with remaining parms as key sort attributes. • Reformat – changes the structure of the relation given as 1st parm to the form given as 2nd parm, with remaining parms as key sort attributes.

  16. TQuel Algebraic xform’s:Example 1 range of h is relation_h retriev (h, id, h.seq) where h.id = 500 is mapped to: {L1: Select (h, h.id=500); Project (L1, h.id, h.seq) } Selects id=500 from rel_h, then extracts attribs id & seq from L1, the result of the previous operation. The “;” forces sequential execution.

  17. Example 1, continued The same expression can be mapped instead to: {[ L1: Select (h, h.id=500); Project (L1, h.id, h.seq) ]} The “[]” eliminates need for temporary file for intermediate results.
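The difference between the two mappings of Example 1 can be sketched with in-memory rows. This is only an illustration under assumptions: relations are modelled as lists of dicts, the sample rows are invented, and the helper names `select` and `project` are hypothetical stand-ins for the algebraic operators, not the prototype's code.

```python
from typing import Callable, Iterable, Iterator

Row = dict


def select(rows: Iterable[Row], predicate: Callable[[Row], bool]) -> Iterator[Row]:
    """Yield only the rows that satisfy the predicate."""
    return (r for r in rows if predicate(r))


def project(rows: Iterable[Row], attrs: list) -> Iterator[Row]:
    """Yield rows reduced to the requested attributes."""
    return ({a: r[a] for a in attrs} for r in rows)


# Invented sample data standing in for relation_h.
relation_h = [
    {"id": 500, "seq": 1, "name": "a"},
    {"id": 501, "seq": 2, "name": "b"},
    {"id": 500, "seq": 3, "name": "c"},
]

# "{L1: Select(...); Project(L1, ...)}": the ";" forces sequential
# execution, so L1 is fully materialized (a temporary) before Project runs.
l1 = list(select(relation_h, lambda r: r["id"] == 500))
sequential = list(project(l1, ["id", "seq"]))

# "{[ ... ]}": the two operators run together, so each selected row
# flows straight into Project and no intermediate result is stored.
pipelined = list(project(select(relation_h, lambda r: r["id"] == 500), ["id", "seq"]))

print(sequential == pipelined)
```

Both forms return the same tuples; what changes is whether an intermediate result has to be written and read back, which is exactly what the cost model needs to account for.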

  18. TQuel Algebraic xform’s:Example 2 {L1: Join (h, I, TS, h.id= i.amount and h overlap i); Project (L1, h.id, h.seq) } L2: when (L1, i overlap “now”); Project (L2, h.id,i.id, i.amount) } Specifies Join using tuple substi- tution (TS) of rel’s h & i. range of h is relation_h range of i is relation_i retriev (h.id, h.id, id.amount) where h.id = id.amount when h overlap i and i overlap “now” is mapped to 2 different algebraic expressions:

  19. Example 2, continued The original expression: range of h is relation_h range of i is relation_i retrieve (h.id, i.id, i.amount) where h.id = i.amount when h overlap i and i overlap “now” is also mapped to: {[ L1: When (i, i overlap “now”); L2: Project (L1, i.id, i.amount, i.valid_from, i.valid_to) ]; L3: Temporary (L2); [ L4: Join (h, L3, TS, h.id = i.amount and h overlap i); Project (L4, h.id, i.amount) ]} Equivalent to the previous example, but performs much more efficiently.
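Why the second mapping of Example 2 is cheaper can be seen by counting join-predicate evaluations in a small sketch. Everything here is an assumption made for illustration: the sample rows, the integer timestamps, the half-open reading of valid-time intervals, and the helper names. The nested loop is only a stand-in for tuple substitution.

```python
NOW = 100  # assumed integer timestamp standing in for "now"


def overlaps(a_from, a_to, b_from, b_to):
    """True if two half-open intervals [from, to) share any instant."""
    return a_from < b_to and b_from < a_to


def overlaps_instant(t_from, t_to, instant):
    """True if the half-open interval [from, to) contains the instant."""
    return t_from <= instant < t_to


def join_ts(outer, inner, predicate):
    """Nested-loop join; returns matching pairs and the number of
    predicate evaluations performed."""
    pairs, evaluations = [], 0
    for h in outer:
        for i in inner:
            evaluations += 1
            if predicate(h, i):
                pairs.append((h, i))
    return pairs, evaluations


def join_predicate(h, i):
    """h.id = i.amount and h overlap i"""
    return h["id"] == i["amount"] and overlaps(
        h["valid_from"], h["valid_to"], i["valid_from"], i["valid_to"]
    )


# Invented sample data.
relation_h = [
    {"id": 10, "valid_from": 0, "valid_to": 50},
    {"id": 20, "valid_from": 40, "valid_to": 200},
]
relation_i = [
    {"id": 1, "amount": 10, "valid_from": 0, "valid_to": 60},
    {"id": 2, "amount": 20, "valid_from": 90, "valid_to": 150},
    {"id": 3, "amount": 20, "valid_from": 10, "valid_to": 30},
]

# Plan 1: join first, then apply the "i overlap now" restriction.
pairs, evals_plan1 = join_ts(relation_h, relation_i, join_predicate)
plan1 = [
    (h["id"], i["id"], i["amount"])
    for h, i in pairs
    if overlaps_instant(i["valid_from"], i["valid_to"], NOW)
]

# Plan 2: restrict i to current tuples first (the temporary), then join.
current_i = [
    i for i in relation_i if overlaps_instant(i["valid_from"], i["valid_to"], NOW)
]
pairs, evals_plan2 = join_ts(relation_h, current_i, join_predicate)
plan2 = [(h["id"], i["id"], i["amount"]) for h, i in pairs]

print(plan1 == plan2, evals_plan1, evals_plan2)
```

With these rows both plans return the same result, but the second evaluates the join predicate against a much smaller inner relation.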

  20. Xform to File Primitive Expression The 2 primitives, Read and Write, take parms: • Access method - Heap, Hash, Isam or Btree; • File size • Length of overflow chain An FPE combines primitives to repeat or execute together to perform an algebraic operation. The simple example FPE-1: Read (Hash, 0) specifies one hashed access with no overflow records.

  21. File Primitive Expression, example 2 FPE-2: Read (Heap, 128) + ( Read (Heap, 19) * 2 - 1 + Write (Heap, 19) * 3 - 1 ) + Read (Heap, 19) + Read (Hash, 0) * 1024 This indicates one Read from the 128-block heap, 2 Reads from 19 blocks, 3 Writes to the 19-block heap, and a hashed access on a file with no overflow records, iterated 1024 times.
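A rough sketch of how an FPE could be totalled into a block count. The per-primitive costs are assumptions, not the paper's rules: a heap access is taken to touch every block of the file, and a hashed access to touch one bucket block plus its overflow chain. The list `fpe_2_prose` follows the slide's prose description of FPE-2 and deliberately ignores the "- 1" terms and the extra Read (Heap, 19) term in the expression itself, whose exact semantics the slide does not explain.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Primitive:
    op: str         # "Read" or "Write"
    method: str     # "Heap" or "Hash"
    param: int      # Heap: file size in blocks; Hash: overflow-chain length
    times: int = 1  # how many times the primitive is repeated


def blocks_touched(p: Primitive) -> int:
    """Assumed cost rule: a heap access scans the whole file; a hashed
    access touches one bucket block plus its overflow blocks."""
    if p.method == "Heap":
        per_call = p.param
    elif p.method == "Hash":
        per_call = 1 + p.param
    else:
        raise ValueError(f"no cost rule sketched for {p.method}")
    return per_call * p.times


def total_blocks(fpe: list) -> int:
    """Sum the block accesses of every primitive in the expression."""
    return sum(blocks_touched(p) for p in fpe)


# FPE-1: one hashed access with no overflow records.
fpe_1 = [Primitive("Read", "Hash", 0)]

# FPE-2 as described in prose on the slide (see the caveat above).
fpe_2_prose = [
    Primitive("Read", "Heap", 128),
    Primitive("Read", "Heap", 19, times=2),
    Primitive("Write", "Heap", 19, times=3),
    Primitive("Read", "Hash", 0, times=1024),
]

print(total_blocks(fpe_1), total_blocks(fpe_2_prose))
```

Under these assumed rules the hashed lookups dominate the total for FPE-2, which is the kind of observation the FPE form is meant to make easy.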

  22. Characteristics of DB Relations Transforming algebraic expressions to FPEs requires: • Relation names • Temporal type • Storage structures • Attribute counts, names, formats, lengths • Key attributes • Tuple lengths and counts • Selectivity & distribution of attribute values • Data volatility • Update count (particularly for TDB)

  23. Steps of Transformation • For each algebraic operator, substitute file primitive(s) with the particular DB parameters. • Omit any algebraic operation that can be performed simultaneously with another operation. • Identify basic constructs in temporal queries. • Transform the subset of algebraic expressions (composed of these constructs) to FPEs.

  24. Access Path Expression APE: the path through the storage structure which satisfies an FPE access request. Node: physically contiguous record(s) involved in the access. Access (read or write) of a tuple: traverses node(s). Access path: a set of nodes connected (in)directly; also, set of chains. Chain: a group of nodes.

  25. Access Path Expression Modes • Guided if there’s a random-access location mechanism: • H: address is computed by a hash function; • P: there’s a pointer to the address; • A: component follows adjacently; • S: component shares starting address with its parent; • M: the component is in main memory. • Searched otherwise: • O: file is ordered, enabling log search; • U: unordered - requires sequential search.

  26. APE Subcomponent Parameters • f = number of records in a file • b= number of records in a block • r= number of bytes in a record • n= number of records to be accessed.

  27. Inverted & Multi-List File structures

  28. APE for Inverted Files Read (Inverted, 3): (P 3 (P 1 (S 1)) (P 1 (S 1)) (P 1 (S 1))) The head of the path is located by a pointer; it contains a key value and 3 chains, each of which is also located by a ptr; each has one node, which shares the same address with the chain, and contains one record. Since the 3 chains are identical, the expression abbreviates to: (P 3 (P 1 (S 1)))

  29. APE for Multilist Files Read (Multilist, 3): (P 1 (P 3 (S 1) (P 1) (P 1))) The head of the path is located by a pointer; it contains 1 chain, which is also located by a ptr and has 3 nodes, each of which contains one record. The first node has the same address as the chain, and the next nodes are reached via pointers. Since the 2nd and 3rd nodes are identical, the expression abbreviates to: (P 1 (P 3 (S 1) (P 1)))

  30. Transform FPE → Access Cost • Parse the APE and determine the access cost in terms of the random and the sequential access counts. • The avg access count for each component is estimated according to the component-location mode (see above). • The total access count for an APE = Σ over all its components, each multiplied by the corresponding value of count. • Ex: the APE (H 1 (P 28 (S 1) (P 1))) has a random access count of 1 + 28 × (0 + 1) = 29.

  31. Access-Time Calculations The time elapsed to access disk blocks requires modeling the characteristics of storage devices. Some of the criteria are: • Type of media • Fixed or moving heads • R/w or write-once • Seek time and transfer rate • Number and size of cylinders, tracks and sectors • Block size of DBMS vs page size of op system.
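Once random and sequential access counts are known, a simple device model turns them into elapsed time. The sketch below is a textbook-style approximation, not the paper's model: it assumes every random access pays an average seek plus rotational latency before transferring one block, and every sequential access pays only the transfer. The `Disk` fields and the sample figures are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Disk:
    avg_seek_ms: float
    avg_rotational_latency_ms: float
    transfer_ms_per_block: float


def access_time_ms(disk: Disk, random_accesses: int, sequential_accesses: int) -> float:
    """Estimate elapsed time from access counts.

    Random access: seek + rotational latency + one block transfer.
    Sequential access: one block transfer only (head already positioned).
    """
    positioning = disk.avg_seek_ms + disk.avg_rotational_latency_ms
    random_cost = random_accesses * (positioning + disk.transfer_ms_per_block)
    sequential_cost = sequential_accesses * disk.transfer_ms_per_block
    return random_cost + sequential_cost


# Hypothetical device parameters, chosen only to make the arithmetic easy.
disk = Disk(avg_seek_ms=30.0, avg_rotational_latency_ms=8.0, transfer_ms_per_block=2.0)

# 29 random accesses, as in the APE example's access count.
print(access_time_ms(disk, random_accesses=29, sequential_accesses=0))

# A 128-block scan: one random access to reach the file, then 127 sequential.
print(access_time_ms(disk, random_accesses=1, sequential_accesses=127))
```

A fuller model would also account for the listed criteria this sketch ignores, such as fixed versus moving heads and the DBMS block size relative to the operating-system page size.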

  32. Performance Analysis Summary The steps are: • Examine TQuel query to decide processing strategy • Transform it into an algebraic expression • Break down in terms of characteristics of DB/rel’s • Transform into FPE, and then into APE • Analyze for characteristics of storage devices • Compute I/O costs • Select and execute a validation method
