1 / 25

Query Processing in the AquaLogic Data Services Platform

This article discusses the query processing capabilities of AquaLogic Data Services Platform and how it helps in accessing and organizing data from various sources.

neuman
Download Presentation

Query Processing in the AquaLogic Data Services Platform

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Query Processing in the AquaLogic Data Services Platform Vinayak Borkar, Michael Carey, Dmitry Lychagin, Till Westmann, Daniel Engovatov, Nicola Onose BEA Systemswww.bea.com

  2. Data Is Everywhere Today • Relational databases made things too easy • Departmental vs. inter-galactic centralized databases • Databases come in many flavors • Relational: Oracle, DB2(s), SQL Server, MySQL, … • Hangers-on: IMS, IDMS, VSAM, … • Not all data is SQL-accessible • Packaged applications: SAP, PeopleSoft, Siebel, Oracle, SalesForce, … • Custom “homegrown” applications • Files of various shapes and sizes (XML, non-XML) • And the list goes on…

  3. Painful to Develop Applications • No one “single view of X” for any X • What data do I have about X? • How do I stitch together the info I have? • What else is X related to? • No uniformity in source model or language • Data about X is stored in many different formats • Accessing or updating X involves many different APIs • Manual coding of “distributed query plans” • No reuse of artifacts • Different access criteria or returned data  different access plans • No model to help organize or find the artifacts anyway

  4. Agenda Why data services? Overview and Example The Query Processor Work in progress at BEA Summary and Q&A

  5. WSDL JAVA API Overview REPORTING JAVA/J2EE WEB SERVICE JDBC/SQL Client API Data Processing Engine Developer Tooling Connectivity Web Services Relational Files Java Functions Business partners, Adapter, .Net Tables, views stored procedures, SQL J2EE Excel LDAP XML, Flat Files Custom Access JCA JMS

  6. Customer Info Order Info Credit Card Info Rating Info CUSTOMER, ORDER CREDIT_CARD getRating(…) Example: Customer Profile Data Service

  7. Data Service – Design View

  8. Data Service – “Get All” Read Method (::pragma function ... kind="read" ...::) declare function tns:getProfile() as element(ns0:PROFILE)* { for $CUSTOMER in db1:CUSTOMER() return <tns:PROFILE> <CID>{ fn:data($CUSTOMER/CID) }</CID> <LAST_NAME>{ fn:data($CUSTOMER/LAST_NAME) }</LAST_NAME> <ORDERS>{ db1:getORDER($CUSTOMER) }</ORDERS> <CREDIT_CARDS>{ db2:CREDIT_CARD()[CID eq $CUSTOMER/CID] }</CREDIT_CARDS> <RATING>{ fn:data(ws1:getRating( <ns5:getRating> <ns5:lName>{ data($CUSTOMER/LAST_NAME) }</ns5:lName> <ns5:ssn>{ data($CUSTOMER/SSN) }</ns5:ssn> </ns5:getRating> ) }</RATING> </tns:PROFILE> };

  9. Data Service – Read & Navigate Methods (::pragma function ... kind="read" ...::) declare function tns:getProfileByID($id as xs:string) as element(ns0:PROFILE)* { tns:getProfile()[CID eq $id] }; ... (::pragma function ... kind="navigate" ...::) declare function tns:getCOMPLAINTs($arg as element(ns0:PROFILE)) as element(ns8:COMPLAINT)* { db3:COMPLAINT()[CID eq $arg/CID] }; ...

  10. Agenda Why data services? Overview and Example The Query Processor Work in progress at BEA Summary and Q&A

  11. Query Processor Overview

  12. Efficient processing • Avoid unnecessary work • Function inlining (view unfolding) • Push work to the sources • Queryable sources can do some of our work

  13. Optimization: Function Inlining Example: This fragment let $x := <CUSTOMER> <LAST_NAME>{$name}</LAST_NAME> <ORDERS>…</ORDERS> </CUSTOMER> return fn:data($x/LAST_NAME) can be replaced by $name But we need to: maintain structural type information for compilation extend “preserve” mode for runtime

  14. Pushdown: Overview • SQL Translator tries to “swallow” as much as possible. • Make translation easy • Remove unnecessary functions (by inlining) • Translation to joins and grouping • Split sorting and grouping • Make sources do as much as possible • DBMS specific code • Maximize Pushdown • Inverse functions

  15. Pushdown: Preparation • Translation to joins and grouping • Split sorting and grouping for $CUSTOMER in db1:CUSTOMER() return <tns:PROFILE> <CID>{ fn:data($CUSTOMER/CID) }</CID> … <ORDERS>{ db1:getORDER($CUSTOMER) }</ORDERS> <CREDIT_CARDS>{ db2:CREDIT_CARD()[CID eq $CUSTOMER/CID] }</CREDIT_CARDS> … </tns:PROFILE>

  16. Pushdown: Inverse Functions I Example fragment of tns:getProfile: <tns:PROFILE> <CID>{fn:data($CUSTOMER/CID)}</CID> <LAST_NAME>{ fn:data($CUSTOMER/LAST_NAME) }</LAST_NAME> <SINCE>{int2date($CUSTOMER/SINCE)}</SINCE> ... <tns:PROFILE> used in this query for $c in tns:getProfile() where $c/SINCE gt $start return $c yields (after inlining) for $c1 in ns3:CUSTOMER() where int2date($c1/SINCE) gt $start return <tns:PROFILE> ... </tns:PROFILE>

  17. Pushdown: Inverse Functions II Register • Inverse functiondate2intforint2date • Transformation rule (gt, int2date) gt-intfromdatewith declare function gt-intfromdate($x1 as xs:dateTime, $x2 as xs:dateTime) as xs:boolean?{ date2int($x1) gt date2int($x2) }; Now we can rewrite the query into for $c1 in ns3:CUSTOMER() where $c1/SINCEgt ns1:date2int($start) return <tns:PROFILE> ... </tns:PROFILE> can be pushed as SELECT * FROM "CUSTOMER" t1 WHERE (t1."SINCE" > ?)

  18. Optimization: PP-k JoinParameter Passing in chunks of k • Prerequisites • Distributed join • Right side is a relational source • Idea: relational source can partition its content • Steps • Read k items from the left into L • Select all items from the right that match any of the items in L into R • Join L and R in the middleware • Repeat until the left is exhausted • Benefit: excellent trade-off between • Memory footprint in the middleware • Roundtrip overhead imposed by the data source • k = 20

  19. Example: “Get All” Read Method Revisited (::pragma function ... kind="read" ...::) declare function tns:getProfile() as element(ns0:PROFILE)* { for $CUSTOMER in db1:CUSTOMER() return <tns:PROFILE> <CID>{ fn:data($CUSTOMER/CID) }</CID> <LAST_NAME>{ fn:data($CUSTOMER/LAST_NAME) }</LAST_NAME> <ORDERS>{ db1:getORDER($CUSTOMER) }</ORDERS> <CREDIT_CARDS>{ db2:CREDIT_CARD()[CID eq $CUSTOMER/CID] }</CREDIT_CARDS> <RATING>{ fn:data(ws1:getRating( <ns5:getRating> <ns5:lName>{ data($CUSTOMER/LAST_NAME) }</ns5:lName> <ns5:ssn>{ data($CUSTOMER/SSN) }</ns5:ssn> </ns5:getRating> ) }</RATING> </tns:PROFILE> };

  20. Evaluation Plan, Example 1 (getProfile)

  21. Evaluation Plan, Example 2 (query getProfile)

  22. Agenda Why data services? Overview and Example The Query Processor Work in progress at BEA Summary and Q&A

  23. Some ALDSP Work in Progress • Native JDBC/SQL92 support (available on Sep 15) • Bilingual engine for efficient reporting/BI tool access • Support for compensating transactions • Extend update facility to support safe non-XA updates (sagas) • XQuery Update support (as well as XQueryP) • Goal: no Java coding for many Web service use cases

  24. Agenda Why data services? Overview and Example The Query Processor Work in progress at BEA Summary and Q&A

  25. Summary • Covered here • Why Data Services? • How are Data Services used? • Some techniques for efficient evaluation • In the paper • XQuery extensions • Runtime architecture • Caches • Updates • Security

More Related