1 / 30

SQL Server 2005: Deep Dive On XML And XQuery

SQL Server 2005: Deep Dive On XML And XQuery. Michael Rys DAT405 Program Manager, SQL Server XML Technologies Microsoft Corporation. XML And Relational Data Today. Relational Data. XML. Relational Data. XML. Relational Data. Relational Data. Query and Combine. XML. XML.

kat
Download Presentation

SQL Server 2005: Deep Dive On XML And XQuery

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. SQL Server 2005: Deep Dive On XML And XQuery Michael Rys DAT405 Program Manager, SQL Server XML Technologies Microsoft Corporation

  2. XML And Relational Data Today RelationalData XML RelationalData XML RelationalData RelationalData Query and Combine XML XML File System

  3. XML Scenarios • Data Exchange • Business-to-business (B2B), business-to-consumer (B2C), application-to-application (A2A) • XML is ubiquitous, extensible, platform independent transport format • Document Management • XHTML, Office XML Documents • Messaging • Simple Object Access Protocol (SOAP), RSS • Mid-Tier Collaboration • Ad-hoc modeling of semistructured data • storing objects with sparse or multi-valued properties that do not fit well into the traditional relational schemata →Transport, Store, and Query XML data

  4. XML Or Relational?

  5. XML And Relational!

  6. SQL Server 2005 XML Architecture Relational XML XML XML Parser XML Schemata SchemaCollection Validation OpenXML/nodes() PATH Index Node Table Rowsets XML data type(binary XML) XML-DML PRIMARYXML INDEX PROP Index FOR XML with TYPE directive VALUE Index XQuery XQuery

  7. Why XQuery? • SQL does not understand XML • XPath 1.0 • W3C Recommendation • Used in SQL Server 2000: SQLXML and OpenXML • Navigation, no reshaping • Limited knowledge about types • XSLT • W3C Recommendation • Data-driven reshaping (uses XPath) • MSXML, System.XML • Hard to author and optimize for large amount of data • No XML data modification language (DML)

  8. What Is XQuery? • Queries and transforms trees • Functional, declarative query language • Combines XPath with node construction • Operates on (XML Schema-)typed and unconstrained XML • Designed to operate on large amounts of data • Optimizable • Current Status: In final Last Call • Recommendations in H2 CY2006 • Fulltext and DML extensions will follow later

  9. XQuery Introduction

  10. Key XQuery Features LET ORDER BY FOR WHERE • FLWOR: FOR / LET / WHERE / ORDER BY / RETURN • Includes XPath 2.0 (/doc[@id = 123]) • Element constructors (<topic>{…}</topic>) • Order-preserving operators • Input order (FLWR) • Document order (XPath, union) • Statically (or dynamically) typed • Strong typing with schema, weak typing without schema RETURN SQL: WITH SELECT FROM WHERE ORDER BY & SET

  11. XQuery Type System • 3 Classes of Item Types: • Node types: element(), attribute(), comment() etc. • Element content types: xs:anyType, user-defined (e.g., my:CustomerT) • Atomic types: built-in and user-defined (e.g., xs:int, my:hatSize) • XQuery uses XML Schema for content and atomic types • “Untyped” data have special types (e.g., xdt:untypedAtomic) • XML Schema (W3C standard) • Rich mechanism for type definitions and validation constraints • Can be used to constrain XML documents • XML Schema Collections will be used for typing (meta-data) • Benefits of typed data • Guarantees shape of data • Provide type specific semantics • Allows storage and query optimizations

  12. Static Typing In XQuery • Type Inference: Infers type of Expression during compilation • Type Check: Inferred Type is subtype of expected type • Benefits: • Compile-time type error discovery • Guarantees correct type at runtime • More efficient execution • Costs: • Sometimes type inference is less precise than data will be (inferring list on /a[1]/b, but there will always be only 1 b) • Requires more explicit casts and “pick first” (/a[1]/b[1])

  13. XML Data Modification • XQuery extensions: Insert, update, and delete • XML sub-tree modification: • Add or delete XML sub-trees • Update values • Generate consistent state

  14. replace value of delete insert XML-DML: insert <notes/>into /Customer replace value of(/Customer/name)[1]with “Nils” delete /Customer/Order[id = 42] insert <notes/>as lastinto /Customer Customer insert <notes/>as firstinto /Customer notes notes insert <notes/>before /Customer/name name: xs:string Order notes insert <notes/>after/Customer/name id: xs:int “Nils” “Janine” 42 Target needs to be statically one node

  15. XQuery And XML-DML In SQL Server 2005

  16. XQuery And XML-DML In SQL Server 2005 • Subset of XQuery implemented • Is aligned with July 2004 XQuery working draft • Added XML Data Modification • Applies to single XML data type instance • Methods on XML data type: • query(), value(), exist(), modify(), nodes() • Use SQL to iterate over collection of instances (XML-typed column) • Can refer to relational data • Take advantage of Schema-collection information to operate on typed XML data • Will make use of XML indices for optimization

  17. XQuery Methods • query() creates new, untyped XML data type instance • value()extracts an XQuery value into the SQL value and type space • Expression has to statically be a singleton • String value of atomized XQuery item is cast to SQL type • SQL type has to be SQL scalar type (no XML or CLR UDT) • exist()returns 1 if the XQuery expression returns at least one item, 0 otherwise

  18. XQuery: nodes() • Provides OpenXML-like functionality on XML data type column in SQL Server 2005 • Returns a row per selected node • Each row contains a special XML data type instance that • References one of the selected nodes • Preserves the original structure and types • Can only be used with the XQuery methods (not modify()), count(*), and IS (NOT) NULL

  19. sql:column()/sql:variable() • Map SQL value and type into XQuery values and types in context of XQuery or XML-DML • sql:variable(): accesses a SQL variable/parameterdeclare @value int set @value=42select * from T where T.x.exist(‘/a/b[@id=sql:variable(“@value”)]’)=1 • sql:column(): accesses another column valuetables: T(key int, x xml), S(key int, val int)select * from T join S on T.key=S.keywhere T.x.exist(‘/a/b[@id=sql:column(“S.val”)]’)=1 • Restrictions in SQL Server 2005: • No XML, CLR UDT, datetime, or deprecated text/ntext/image

  20. XQuery: modify() • Used with SET: declare @xdoc xmlset @xdoc.modify(‘delete /a/b[@id=“42”]’)update T set T.xdoc.modify(‘insert <b/> into /a’)where T.id=1 • Relational row-level concurrency: whole XML instance is locked

  21. Combined SQL And XQuery/DML Processing • SELECT x.query(‘…’), y FROM T WHERE … Metadata Static Phase XQuery Parser SQL Parser XML SchemaCollection Static Typing Static Typing Algebrization Algebrization Static Optimization of combined Logical and Physical Operation Tree Dynamic Phase Runtime Optimization and Execution of physical Op Tree XML and rel.Indices

  22. XML Indices • Create XML index on XML column CREATE PRIMARY XML INDEX idx_1 ON docs (xDoc) • Create secondary indexes on tags, values, paths • Speed up queries • Results can be served directly from index • SQL’s cost based optimizer will consider index • Primary and Secondary Indices will be efficiently maintained during updates • Only subtree that changes will be updated

  23. insert into Person values (42, '<book ISBN=”1-55860-438-3”> <section> <title>Bad Bugs</title> Nobody loves bad bugs. </section> <section> <title>Tree Frogs</title> All right-thinking people <bold>love</bold> tree frogs.</section></book>') Example Index Contents

  24. Primary XML Index CREATE PRIMARY XML INDEX PersonIdx ON Person (Pdesc) Assumes typed data; Columns and Values are simplified, see VLDB 2004 paper for details

  25. Architectural Blueprint: Indexing XML Columnin table T(id, x) Primary XML Index (1 per XML column)Clustered on Primary Key (of table T), XID 1 2 2 4 1 2 3 3 3 1 Non-clustered Secondary Indices (n per primary Index) Value Index Property Index Path Index

  26. XQuery Optimizations With XML Indices

  27. Take-Away: XML Indices • PRIMARY XML Index – use when lot’s of XQuery • FOR VALUE– useful for queries where values are more selective than paths such as //*[.=“Seattle”] • FOR PATH– useful for Path expressions: avoids joins by mapping paths to hierarchical index (HID) numbers. Example: /person/address/zip • FOR PROPERTY– useful when optimizer chooses other index (e.g., on relational column, or FT Index) in addition so row is already known

  28. Session Summary • SQL Server 2005 provides XQuery and XML DML on XML datatype • XQuery subset based on July 2004 WD • Typing provided by XML Schema collections on XML datatype • Node-based Data Manipulation Language (DML) • Integrates with relational processing • Optimization: • Using extended relational algebra and query optimizer • Indexing of XML datatype

  29. Community Resources • At PDC • DAT Track lounge: I’ll be there daily • After PDC • MSDN dev center: http://msdn.microsoft.com/SQL/2005 • XML and Databases whitepapers: http://msdn.microsoft.com/XML/BuildingXML/XMLandDatabase/ • Online WebCasts: http://msdn.microsoft.com/sql/2005/2005webcasts/ • Newsgroups & Forum: news:microsoft.public.sqlserver.xml http://forums.microsoft.com/msdn/ShowForum.aspx?ForumID=89 • My E-mail: mrys@microsoft.com • My Weblog: http://www.sqljunkies.com/weblog/mrys • Please fill out Session Evaluation

  30. © 2005 Microsoft Corporation. All rights reserved. This presentation is for informational purposes only. Microsoft makes no warranties, express or implied, in this summary.

More Related