
Context


karsen

Presentation Transcript


  1. Context: Tailoring the DBMS
     • To support particular applications
       • Beyond alphanumerical data
       • Beyond retrieve + process
     • To support particular hardware
       • New storage devices
     • To incorporate novel techniques
       • New join implementations

  2. Extensibility
     • Language extensions
       • Abstract data types (ADT)
       • User-defined functions (UDF)
     • Data management extensions
       • New access methods
       • New storage methods
     • Query processing extensions
       • New join methods
       • New optimization techniques

  3. Starburst Contributions
     • Revisited internal data structures
       • Query graph model
       • Query execution plan: low-level operators and STARs
     • Mechanisms for extensibility
       • Rules for query rewrite and plan optimization

  4. Predator Contributions
     • Enhanced abstract data types
       • Encapsulation principle applied to storage, optimization and evaluation
     • Type-centric DBMS design

  5. Outline
     • Introduction
     • Starburst
       • Language extensions
       • Data management extensions
       • Query processing extensions
     • Predator
       • E-ADT processing
     • Summary

  6. Starburst – Language Extensions
     • User-defined functions (1)
       • Scalar functions
         • In: one or more field values from a single tuple
         • Out: a single value
       • Aggregate functions
         • In: one or more field values from several tuples
         • Out: a single value

  7. Starburst – Language Extensions
     • User-defined functions (2)
       • Set predicate functions
         • In: a simple predicate and a subquery (which defines the range for the predicate)
         • Out: a boolean value
       • Table functions
         • In: one or several table expressions as well as field values
         • Out: a relation
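The four kinds of user-defined function on slides 6 and 7 differ mainly in the shape of their inputs and outputs. The sketch below illustrates those shapes in Python. It is not Starburst's registration interface: the function names, the `Row` alias and the sample data are all made up for the example.

```python
from typing import Callable, Iterable, Iterator

Row = dict  # hypothetical: a tuple modelled as a field-name -> value mapping


def full_name(first: str, last: str) -> str:
    """Scalar function: field values from a single tuple in, one value out."""
    return f"{first} {last}"


def value_range(values: Iterable[float]) -> float:
    """Aggregate function: field values from several tuples in, one value out."""
    vals = list(values)
    return max(vals) - min(vals)


def majority(pred: Callable[[Row], bool], subquery: Iterable[Row]) -> bool:
    """Set predicate function: a simple predicate plus the rows of a subquery in,
    a boolean out. True when more than half of the subquery rows satisfy pred."""
    rows = list(subquery)
    return 2 * sum(1 for r in rows if pred(r)) > len(rows)


def sample_every(table: Iterable[Row], n: int) -> Iterator[Row]:
    """Table function: a table expression plus a field value in, a relation out."""
    for i, row in enumerate(table):
        if i % n == 0:
            yield row


# Made-up sample relation used only to exercise the functions above.
emp = [
    {"first": "Ann", "last": "Lee", "salary": 50.0},
    {"first": "Bob", "last": "Ray", "salary": 70.0},
    {"first": "Cy", "last": "Orr", "salary": 65.0},
]
```

For instance, `majority(lambda r: r["salary"] > 60, emp)` plays the role of a set predicate in a WHERE clause, while `sample_every(emp, 2)` could appear wherever a table expression is allowed.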

  8. Starburst – Language Extensions
     • Abstract data types
       • Considered useful for:
         • Type checking
         • Structuring of users' data
       • Add-on to the system design

  9. Starburst – Data Management Extensions
     • Uniform record structure:
       • Header + offset directory + data area
       • Advantages:
         • Support for nested records
         • Treatment of null values and variable-length fields
       • Disadvantages:
         • Overhead per record due to the offset directory
     • Core system services
       • Logging, recovery manager, predicate evaluator, event queues, lock manager, interface to OS services, debugging, tracing, error reporting
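The header + offset directory + data area layout can be made concrete with a small encoder. This is a minimal sketch under assumed conventions, not Starburst's actual on-disk format: the 2-byte field count, the (offset, length) directory entries and the `NULL` marker are all choices made for the example.

```python
import struct

NULL = 0xFFFF  # hypothetical length value marking a null field


def encode_record(fields):
    """Pack a list of bytes-or-None values as header + offset directory + data."""
    header = struct.pack("<H", len(fields))  # header: number of fields
    directory = b""
    data = b""
    for value in fields:
        if value is None:
            directory += struct.pack("<HH", 0, NULL)
        else:
            # Directory entry: offset into the data area, then field length.
            directory += struct.pack("<HH", len(data), len(value))
            data += value
    return header + directory + data


def read_field(record, i):
    """Return field i without scanning the fields before it."""
    (count,) = struct.unpack_from("<H", record, 0)
    if not 0 <= i < count:
        raise IndexError(i)
    offset, length = struct.unpack_from("<HH", record, 2 + 4 * i)
    if length == NULL:
        return None
    start = 2 + 4 * count + offset  # skip header and directory
    return record[start:start + length]


rec = encode_record([b"Ann", None, b"database"])
nested = encode_record([rec, b"x"])  # a field may itself hold an encoded record
```

The sketch shows both sides of the trade-off on the slide: nulls, variable-length fields and nested records need no special cases, but every field costs four directory bytes even when it is null.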

  10. Starburst – Data Management Extensions
     • Storage methods [associated with a relation]
       • Run-time methods for accessing relations: scan, fetch, insert, update, delete, destroy
         • Implementation: the run-time methods are registered in vector lists
       • Compile-time cost estimates
     • Attachments [associated with a relation]
       • Access methods, integrity constraints and trigger extensions

  11. Starburst – Data Management Extensions
     • Advantages
       • New storage methods and attachments can be added without modifying existing code
     • Limitations
       • Attachments are only called after storage methods
       • Attachments are called in a fixed order
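The registration scheme from slides 10 and 11 can be sketched as a dispatch table. Everything here is hypothetical (class names, the `on_insert` hook, the choice of attachments) and only the insert path is shown. The point is that the relation code indexes into a vector of storage methods and then calls its attachments in a fixed order, so extensions are added by appending to the vector rather than by editing existing code.

```python
class HeapStorage:
    """Hypothetical storage method that keeps tuples in a Python list."""

    def __init__(self):
        self.rows = []

    def insert(self, row):
        self.rows.append(row)
        return len(self.rows) - 1  # record id

    def fetch(self, rid):
        return self.rows[rid]

    def scan(self):
        return iter(self.rows)


class HashIndex:
    """Hypothetical attachment: an access method on one field."""

    def __init__(self, field):
        self.field = field
        self.entries = {}

    def on_insert(self, rid, row):
        self.entries.setdefault(row[self.field], []).append(rid)

    def lookup(self, key):
        return self.entries.get(key, [])


class AuditLog:
    """Hypothetical attachment: a trigger that records every insert."""

    def __init__(self):
        self.events = []

    def on_insert(self, rid, row):
        self.events.append(("insert", rid))


# Vector list: storage-method id -> implementation. New methods are appended.
STORAGE_METHODS = [HeapStorage]


class Relation:
    def __init__(self, storage_id, attachments):
        self.storage = STORAGE_METHODS[storage_id]()
        self.attachments = list(attachments)

    def insert(self, row):
        rid = self.storage.insert(row)   # storage method runs first
        for att in self.attachments:     # then attachments, in fixed order
            att.on_insert(rid, row)
        return rid
```

The two limitations on slide 11 are visible in `Relation.insert`: an attachment cannot run before the storage method, and the call order is whatever order the list fixes.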

  12. Starburst – Query Processing Extensions: Internal Representation of Queries
     • Query graph model
       • Beyond parse trees for the low-level plan operators
       • Used for query rewrite
     • Query execution plan
       • Operator-based representation
       • Strategy alternative rules (STARs) to represent execution plans
       • Used for query plan generation

  13. Query Graph Model
     • Boxes
       • Stored relations
       • Derived relations
     • Vertices
       • Setformer iterators: produce tuples for a derived relation
       • Quantifier iterators: restrict tuples for a derived relation
     • Edges
       • Range edges connecting a vertex and a box: access to a stored or a derived relation
       • Qualifier edges connecting one or more vertices: conjunction of predicates
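The vocabulary of slide 13 maps naturally onto a few record types. The sketch below is an illustration only: the class and field names are invented, and it models just enough to build the graph for a two-table join such as `SELECT ... FROM emp e, dept d WHERE e.dno = d.dno AND d.city = 'Oslo'`.

```python
from dataclasses import dataclass, field


@dataclass
class Box:
    """A stored relation, or a derived relation defined by vertices and qualifiers."""
    name: str
    stored: bool = False
    vertices: list = field(default_factory=list)
    qualifiers: list = field(default_factory=list)


@dataclass
class Vertex:
    """An iterator inside a derived box; ranges_over is its range edge."""
    name: str
    kind: str          # "setformer" or "quantifier"
    ranges_over: Box


@dataclass
class Qualifier:
    """A qualifier edge: one predicate connecting one or more vertices."""
    vertices: tuple
    predicate: str


def stored_inputs(box):
    """Follow range edges down to the stored relations a box depends on."""
    if box.stored:
        return {box.name}
    names = set()
    for vertex in box.vertices:
        names |= stored_inputs(vertex.ranges_over)
    return names


emp = Box("emp", stored=True)
dept = Box("dept", stored=True)
e = Vertex("e", "setformer", emp)
d = Vertex("d", "setformer", dept)
query = Box(
    "query",
    vertices=[e, d],
    qualifiers=[
        Qualifier((e, d), "e.dno = d.dno"),    # join predicate
        Qualifier((d,), "d.city = 'Oslo'"),    # local predicate
    ],
)
```

Because a vertex may range over a derived box as well as a stored one, a view is simply another `Box` that other boxes point at, which is what makes view merging expressible as a graph transformation.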

  14. Query Rewrite
     • Objectives:
       • Equivalent representation for alternative phrasings of a query
       • Only the DBMS can rewrite queries involving views
     • Example rules:
       • Views may be merged
       • Redundant joins may be eliminated
       • Selections may be pushed down

  15. Query Rewrite Rules
     • A rule transforms a QGM into another QGM
       • Condition / action: IF ... THEN rules
     • Rule engine
       • Forward chaining
       • Various control strategies for rule application
     • Search strategy
       • Top down (depth first / breadth first) / bottom up
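A condition/action rule engine with forward chaining fits in a few lines. This sketch works on a toy operator tree of nested tuples rather than on a real QGM, and the node shapes, rule names and bottom-up traversal are assumptions made for the example. It implements the "selections may be pushed down" rule plus a trivial cleanup rule, and reapplies rules until none fires.

```python
# Toy node shapes (hypothetical):
#   ("scan", table)
#   ("select", table, predicate, child)   predicate uses only columns of table
#   ("join", left, right)


def tables(node):
    """Names of the tables scanned underneath a node."""
    if node[0] == "scan":
        return {node[1]}
    if node[0] == "select":
        return tables(node[3])
    return tables(node[1]) | tables(node[2])


def push_down(node):
    """Action: move a selection over a join into the input that owns its table."""
    _, table, pred, (_, left, right) = node
    if table in tables(left):
        return ("join", ("select", table, pred, left), right)
    return ("join", left, ("select", table, pred, right))


RULES = [
    # (condition, action) pairs, i.e. IF condition THEN action.
    (lambda n: n[0] == "select" and n[3][0] == "join", push_down),
    (lambda n: n[0] == "select" and n[2] == "TRUE", lambda n: n[3]),
]


def rewrite_once(node, rules):
    """One bottom-up pass; returns (new_node, whether any rule fired)."""
    if not isinstance(node, tuple):
        return node, False
    changed = False
    parts = []
    for part in node:
        new_part, part_changed = rewrite_once(part, rules)
        parts.append(new_part)
        changed = changed or part_changed
    node = tuple(parts)
    for condition, action in rules:
        if condition(node):
            return action(node), True
    return node, changed


def rewrite(node, rules=RULES):
    """Forward chaining: keep applying rules until a pass fires none."""
    changed = True
    while changed:
        node, changed = rewrite_once(node, rules)
    return node


query = ("select", "dept", "city = 'Oslo'",
         ("join", ("scan", "emp"), ("scan", "dept")))
```

Swapping `rewrite_once` for a top-down traversal, or changing the order of `RULES`, gives a different control strategy without touching the rules themselves.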

  16. How to Choose Between Alternative Rules?
     • Cost-based decision
     • Problem: cost estimates are only known at the query execution plan level
     • Approach: several alternatives are kept in the QGM (CHOOSE operation)

  17. Query Execution Plan
     Execution plan represented using production rules:
     • Terminals: low-level plan operators
       • In: 0 or more streams of tuples
       • Out: 0 or more streams of tuples
       • Each stream of tuples is tagged with properties
         • Relational: schema information
         • Operational: order, location
         • Estimated:
     • Non-terminals: STAR
       • Name
       • Alternative definitions in terms of low-level plan operators or other STARs

  18. Query Execution Plan
     • A query execution plan is a tree of low-level plan operators
     • STAR production rules are used for generating query execution plans
     • General-purpose STAR evaluator
       • Search strategy to choose next STAR to apply
       • Vector list of STARs
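Slides 17 and 18 describe plan generation as grammar expansion: STARs are non-terminals, low-level plan operators are terminals, and a plan is a tree of terminals. The sketch below expands a toy grammar exhaustively and picks the cheapest tree. The STAR names, operator names and cost numbers are invented, and a real evaluator would prune using stream properties instead of enumerating everything.

```python
from itertools import product

# Terminals: low-level plan operators with made-up per-operator costs.
OPERATOR_COST = {
    "TableScan": 100,
    "IndexScan": 20,
    "Sort": 50,
    "MergeJoin": 30,
    "NestedLoopJoin": 200,
}

# Non-terminals: each STAR lists alternative definitions. An alternative is
# (operator, [input STARs]); an empty input list means a leaf operator.
STARS = {
    "AccessPlan": [("TableScan", []), ("IndexScan", [])],
    "SortedAccess": [("Sort", ["AccessPlan"]), ("IndexScan", [])],
    "JoinPlan": [
        ("MergeJoin", ["SortedAccess", "SortedAccess"]),
        ("NestedLoopJoin", ["AccessPlan", "AccessPlan"]),
    ],
}


def expand(star):
    """Yield every operator tree that can be derived from a STAR."""
    for operator, inputs in STARS[star]:
        input_plans = [list(expand(s)) for s in inputs]
        for children in product(*input_plans):
            yield (operator, *children)


def cost(plan):
    """Estimated cost: the operator's own cost plus the cost of its inputs."""
    operator, *children = plan
    return OPERATOR_COST[operator] + sum(cost(c) for c in children)


plans = list(expand("JoinPlan"))
best = min(plans, key=cost)
```

Adding a new join method in this sketch means adding one terminal and one alternative under `JoinPlan`; `expand` and `cost` stay unchanged, which mirrors the extensibility argument of the slides.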

  19. Starburst Contributions
     • Revisited internal data structures
       • Query graph model
       • Query execution plan: low-level operators and STARs
     • Mechanisms for extensibility
       • Rules for query rewrite and plan optimization

  20. Outline
     • Introduction
     • Starburst
       • Language extensions
       • Data management extensions
       • Query processing extensions
     • Predator
       • E-ADT processing
     • Summary

  21. Basic Techniques for ADTs
     • Vector list of ADTs
     • Each ADT implements:
       • Common internal interface for access to ADT values
       • Functions for storage and indexed retrieval
       • Methods associated with the ADT
     • ADT methods can be composed
     • DBMS understands minimal semantics about each method
     • “Black box” ADT approach

  22. Motivation for E-ADTs
     • Basic observation:
       • ADT methods can be expensive!
     • Need to identify optimizations on ADT methods
     • Need to define a framework for applying these optimizations systematically

  23. Possible Optimizations
     • Algorithmic:
       • Using different algorithms for each method depending on data characteristics
     • Transformational:
       • Changing the order of methods
     • Constraint:
       • Pushing physical constraints through a method
     • Pipelining:
       • Avoiding materialization of intermediate results

  24. Architectural Framework
     Each E-ADT supports some of the following enhancements:
     • Optimization: transforms a method expression into a query execution plan expression
     • Evaluation: routines to execute the query execution plan expression
     • Catalog management: routines to store schema information and maintain statistics
     • Storage management: physical representation of values of its type

  25. E-ADT Rewrite Rules
     • Some of the optimizations for ADT methods can be applied on a logical representation of queries using rewrite rules
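The difference between a black-box ADT and an E-ADT can be sketched with two hooks: an optimizer that turns a method expression into a plan, and an evaluator that runs the plan. This is an illustration under assumptions, not Predator's interface: the `EADT` base class, the `Series` type and its `scale` and `head` methods are hypothetical. The rewrite shown is a transformational optimization (changing the order of methods) applied to the logical method expression.

```python
class EADT:
    """Hypothetical base type. The defaults give black-box ADT behaviour."""

    def optimize(self, methods):
        # Black box: evaluate the methods exactly in the order written.
        return list(methods)

    def evaluate(self, plan, value):
        # Each plan step is (method name, *arguments).
        for name, *args in plan:
            value = getattr(self, name)(value, *args)
        return value


class Series(EADT):
    """Toy E-ADT over lists of numbers that knows how its methods interact."""

    def scale(self, xs, k):
        return [x * k for x in xs]

    def head(self, xs, n):
        return xs[:n]

    def optimize(self, methods):
        # Rewrite rule: head commutes with element-wise scale, so take the
        # head first and scale fewer elements. Repeat until nothing changes.
        plan = list(methods)
        changed = True
        while changed:
            changed = False
            for i in range(len(plan) - 1):
                if plan[i][0] == "scale" and plan[i + 1][0] == "head":
                    plan[i], plan[i + 1] = plan[i + 1], plan[i]
                    changed = True
        return plan


series = Series()
expression = [("scale", 2), ("scale", 3), ("head", 2)]
plan = series.optimize(expression)
```

Both the written expression and the optimized plan produce the same value, but the plan scales two elements instead of the whole list. A black-box ADT cannot make that swap because the DBMS knows nothing about what `scale` and `head` do.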

  26. Predator Contributions
     • Enhanced abstract data types
       • Encapsulation principle applied to storage, optimization and evaluation
     • Type-centric DBMS design
