gregg lippa themis inc http www themisinc com glippa@themisinc com
Download
Skip this Video
Download Presentation
Gregg Lippa Themis Inc. themisinc [email protected]

Loading in 2 Seconds...

play fullscreen
1 / 45

Gregg Lippa Themis Inc. themisinc [email protected] - PowerPoint PPT Presentation


  • 152 Views
  • Uploaded on

DB2 Version 9: Overview of pureXML. Plus an XPath Primer, XML Column Indexes, Support for Validation. Gregg Lippa Themis Inc. http://www.themisinc.com [email protected]

loader
I am the owner, or an agent authorized to act on behalf of the owner, of the copyrighted work described.
capcha
Download Presentation

PowerPoint Slideshow about 'Gregg Lippa Themis Inc. themisinc [email protected]' - venedict


An Image/Link below is provided (as is) to download presentation

Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author.While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server.


- - - - - - - - - - - - - - - - - - - - - - - - - - E N D - - - - - - - - - - - - - - - - - - - - - - - - - -
Presentation Transcript
gregg lippa themis inc http www themisinc com glippa@themisinc com
DB2 Version 9:Overview of pureXML

Plus an XPath Primer,XML Column Indexes, Support for Validation

Gregg Lippa

Themis Inc.

http://www.themisinc.com

[email protected]

slide2
Gregg Lippa is currently a Senior Technical Advisor at Themis Inc. He teaches DB2-related courses on SQL, Application Programming, and optimization, performance and tuning as well as Java, J2EE, distributed computing and Java-related topics. Gregg has worked with DB2 as a consultant and trainer for over 20 years and with Java for 10 years.

This material is taken from the Themis course DB1091: DB2 9 for z/OS pureXML Features. For more information visit http://www.themisinc.com.

Products and company names mentioned may be trademarks of their respective companies. Mention of third party products or software is for reference only and constitutes neither a recommendation nor an endorsement.

db2 version 9
DB2 Version 9
  • pureXML technology
    • Seamlessly integrating XML with relational data
      • Including publishing and parsing functions
    • Storing XML data natively in a database table
      • Including a new XML data type
an xml document
An XML Document

John Doe

123 Main

Anytown

PA

19134

610-654-1234

610-987-4321

[email protected]

2008-09-22

flu shot

gave flu shot

40.00

10.00

2008-05-11

sore foot

referred to specialist

60.00

10.00

Root

Nodes

Elements

Attributes

Values (text)

Well-formed

Valid

xml why
XML – Why?
  • Pervasive
  • Versatile
  • Self-describing
  • Neutral for exchanging data among diverse devices
    • Universal standard for data interchange
  • Growth of XML data
    • XML-based industry and data standards
    • SOA and Web services; Services-based frameworks
      • messages are encapsulated as XML
    • Web 2.0 technologies
      • XML feeds
      • Syndication services - rendered as XML files
  • XML data becoming more critical to enterprise operations
previous approaches to xml document storage
Previous Approaches to XML Document Storage
  • File systems
    • Not storing XML documents in a database
    • Does not scale well
  • Stuffing
    • XML data stored as large objects or VARCHAR
    • Inefficient for querying the XML data
  • Shredding
    • Decomposing XML data into multiple columns & tables
    • Often leads to complex join requirements
    • May be difficult to recreate original XML document
  • Utilizing XML-only database systems
    • Few options and little expertise available
xml vs relational model
XML vs Relational Model
  • Major differences:
    • XML data is hierarchical; relational data is tabular
    • XML data is self-describing; relational data is not
    • XML data is ordered; relational data is not
  • Which approach is right for my data?
    • XML maximizesflexibility of the data structure
    • Relational data provides performance benefits for data retrieval
    • Relational data supports referential integrity requirements
      • Referential constraints cannot be based on XML columns
    • Data warehouses are oriented toward relational data
purexml capabilities
pureXML Capabilities
  • XML data type
    • With specialized hierarchical storage structure
  • Indexing capabilities
    • Based on data within XML documents
  • New query languages (XPath and SQL/XML)
    • New query optimization techniques too
  • XML schema support
    • Including validation
  • Database utilities support
  • Integration with JDBC, ODBC, Embedded SQL
  • XML shredding and publishing facilities
    • For composing and decomposing XML documents
benefits of db2 purexml technology
Benefits of DB2 pureXML Technology
  • Faster development
    • Code simplification
    • Avoiding XML-relational transformations
  • Increased agility
    • Versatile XML schema evolution
      • Quickly modify applications to support new or changing requirements
  • Improved usability
    • Exploit previously unmanaged XML data
    • Speed up query processing through XML-optimized storage and indexing
purexml usage scenarios
pureXML Usage Scenarios
  • Integration ofdiverse data sources
    • DB2 provides the ability to join XML documents
  • Formprocessing
    • Store whole electronic forms in DB2 rather than shredding
  • Document storage and querying
    • Store and manage less structured (document-centric) XML data in DB2
  • Using XML to support transactions
    • Service-oriented architectures (SOA) offer message-based transactions
    • XML data can then be retrieved, updated, searched and analyzed
  • Syndication and XML feeds
    • Serve XML feeds through a Web services interface
    • Provide a repository for XML data to support these feeds
db2 support for xml
DB2 Support for XML

0 – XML Storage

1 – Bind in XML

2 – Store as XML

3 – Shred into Relational

4 – Retrieve XML data

5 – Publish as XML

6 – Bind XML output

7 – XML to XML

8 – XML to Relational

9 – Relational to XML

DB2 ENGINE

7

XML

0

2

1

Applicationthat makesuse of XML

4

Textual

XML

3

8

9

6

5

Relational

XSR

(validation)

purexml architecture
pureXML Architecture
  • DB2 9 – a hybrid database system
    • One database with both relational and native XML data
    • Single hybrid database enginehandles all processing
  • Application may combine SQL and SQL/XML
    • Access relational and XML data in a hybrid database
  • XML data is stored separately from other table contents
    • Has its own table space
  • Supports XML document validation with XML schemas
    • XML schemas used for validation are registered with DB2
      • XML Schema Repository (XSR)
xml data type
XML Data Type
  • DB2 native XML support includes a new XML data type
    • An XML column holds one XML document for each row
  • XML data is stored in a parsed tree structure
  • XML document can also be stuffed into LOB or VARCHAR
    • Provides advantages in certain scenarios
  • XML storage requirements
    • Separate XML tablespace plus space for any needed indexes
the xml data type
The XML Data Type
  • Optimized storage
    • New XML data type
      • Supports insert, update, and delete
      • Stores parsed XML documents
      • Available when creating or altering tables

CREATE TABLE PATIENT (PATIENTID CHAR(6), PATIENT_XML XML)

CREATE TABLE P2 (PID INT, PTYP CHAR(8), PX1 XML, PX2 XML)

    • Supports access to nodes within XML document via XPath
  • Query optimization
    • CREATE INDEX supports specification of an xmlpattern
xml indexes
XML Indexes
  • Indexes are often used to improve query performance
  • Indexes on XML columns are supported in DB2 9
    • Uses an XML pattern (XPath) expression
    • Indexes paths and values in stored XML documents
  • XML index entries provide access to document nodes
    • Not limited to providing access to the beginning of a document
    • Index keys are created based on XML pattern expressions

CREATE INDEX PATINDEX ON PATIENT(PATIENT_XML) GENERATE KEYUSING XMLPATTERN '/patient/service/sdate' as SQL VARCHAR(10)

application development support
Application Development Support
  • Supports developing apps that include XML requirements
    • Language support: C/C++, Java, Assembler, Cobol, PL/I
    • API support: JDBC, DB2 / ODBC, Embedded SQL, SQLJ
    • SQL/XML query support
    • DB2 sample database enhancements
  • Universal DB2 driver for JDBC enhanced to support XML
    • Provided extension XML type: com.ibm.db2.DB2Xml

import com.ibm.db2.jcc.DB2Xml;

DB2Xml xml1 = (DB2Xml) rs.getObject ("patient_xml");

String s = xml1.getDB2String();

InputStream is = xml1.getDB2XMLBinaryStream("UTF-16");

db2 and xpath

DB2 and XPath

Navigating Through the XML Tree

db2 xpath
DB2 XPath
  • XPath is an expression language
    • Designed by the World Wide Web Consortium (W3C)
    • Used to navigate XML documents
    • XPath expressions are similar to file path notations
  • DB2 XPath can be used:
    • With the XMLQUERY SQL built-in function
      • To extract data from an XML column
    • With the XMLEXISTS SQL predicate
      • To evaluate data in an XML column
    • When creating an XML index
      • To determine the XML document nodes to be indexed

'/patient/name'  All name elements within patient elements

xpath expressions
XPath Expressions
  • XPath expressions: the basic building block of Xpath
  • Types of expressions provided by DB2 XPath:
    • Primary expressions: basic primitives of the language
      • Include literals, variable references, and function calls
    • Path expressions: locate nodes within a document tree
      • Include Node Tests and Filter Expressions
    • Arithmetic expressions: add, subtract, multiply, divide, modulus
    • Comparison expressions: compare two values
    • Logical expressions: use boolean logic
  • Anywhere an expression is expected, any kind of expression can be used
    • Operands of an expression are typically other expressions
xpath data model
XPath Data Model

Key:

Document node

Comment node

Element Node

Attribute node

Text node

OneEmp.xml

Sample comment

Employee

Name

Phone

Phone

Address

Dept

type="home"

type="cell"

mgr="bob"

800-555-1234

212-321-4321

Sales

LastName

FirstName

Street

City

State

Zipcode

Roger

Rabbit

123 Main

Smallville

Wyoming

98765

types of xpath nodes
Types of XPath Nodes
  • Document node encapsulates an XML document
    • Parent of root element node
  • Element node encapsulates an XML element
    • Can have one parent and many children
  • Attribute node represents an XML attribute
    • Belongs to an element
  • Text node encapsulates XML character content
    • Elements may have these
  • Processing Instruction (PI) node
    • Encapsulates XML processing instruction
  • Comment node encapsulates an XML comment
  • Namespaces node is considered to be a node in XPath
xpath nodes processing order
XPath Nodes Processing Order

xmlns:th="http://themis.com/test"

th:format="instructor led">

XML for DB2

Lots of information

Document node 1

Element node 2,5,7

Comment node 9

Processing Instruction 10

Text node 6,8

Attribute node 4

Namespace node 3

1

course

2

title

descrip

7

5

10

3

4

th="http://themis.com/test"

name=formatvalue=“instructor led”

target=ourOwnPIforCourseware

6

8

9

Lots of information

XML for DB2

To be determined

path expressions overview
Path Expressions Overview
  • Path expressions navigate the XML tree structure to locate nodes
  • Navigation axes are used in XPath; always start at context node
    • ForwardAxis moves down through the XML tree:
      • attribute (@)
      • child (default)
      • descendant
      • self (.)
      • descendant-or-self(//=/descendant-or-self:node()/)
    • ReverseAxis moves up through the XML tree:
      • parent (..)
  • Three parts of an axis step:
    • Axis specifies a direction of movement
    • Node test specifies node selection criteria
    • Predicates (zero or more) filter returned sequence

parent

Element

self

attribute

Element

Element

Text

Comment

children

descendants

Text

path expressions syntax
Path Expressions Syntax
  • XML document to support upcoming example code:
  • XPath expressions support abbreviated syntax in axis steps

@ is abbreviated syntax for attribute

// is abbreviated syntax for /descendant-or-self::node()/

.. is abbreviated syntax for parent::node()

. is abbreviated syntax for self::node()

is abbreviated syntax for child::node()

  • Sara Lee
  • 33 Maple
  • Nearly
  • NJ
  • 07123
  • 908-842-7531
  • [email protected]
  • 2004-02-29
  • brittle nails
  • prescribed hormones
  • 84.00
  • 15.00
path expression examples
Path Expression Examples
  • ‘ / ’ slash indicates that path begins at root node
    • Whole XML document
  • ‘//service’  two slashes at beginning of path expression
    • Requested node, service, may be located anywhere in the document
    • Returns the entire service element, including all of its children
  • ‘//phone/@*’  All attributes appearing under the phone element
    • Only attribute under phone is: home
  • ‘//@*’  All attributes in the XML document
    • All attributes: 11123 home
  • ‘/patient/name’  All name elements under patient
    • Only name is Sara Lee
  • ‘/patient/addr/city/..’  All child elements of addr
    • Element addr, the parent of city, plus all child elements of addr
filter expressions
Filter Expressions
  • A filter expression is a path expression followed by predicates in square brackets
    • Filter its result based on applying conditions; for example:
  • All service information of patients who have a $15.00 copay

'/patient/*[copay="15.00"]'

2004-02-29

brittle nails

prescribe hormones

84.00

15.00

  • Set context node to patient and filter based on attribute type rather than on an element

/patient/phone[@type="home"]

908-842-7531

  • Any patient that has email

/patient[email]

Entire XML document

  • Sara Lee
  • 33 Maple
  • Nearly
  • NJ
  • 07123
  • 908-842-7531
  • [email protected]
  • 2004-02-29
  • brittle nails
  • prescribed hormones
  • 84.00
  • 15.00
arithmetic expressions
Arithmetic Expressions
  • Arithmetic expressions: perform operations that involve addition, subtraction, multiplication, division, and modulus
  • The XPath arithmetic operators:

*multiplication

div division

idiv integer division

mod modulus

+ addition

- subtraction

  • An arithmetic expression results in a numeric value
    • Or an empty sequence or an error
  • Place arithmetic expressions in parentheses
arithmetic expression example
Arithmetic Expression Example
  • An arithmetic expression to calculate the the remaining balance after the copay (and its return value):

/patient/service/(cost – copay)

69.00

  • Sara Lee
  • 33 Maple
  • Nearly
  • NJ
  • 07123
  • 908-842-7531
  • [email protected]
  • 2004-02-29
  • brittle nails
  • prescribed hormones
  • 84.00
  • 15.00
comparison expressions general comparisons
Comparison Expressions – General Comparisons
  • Comparison expressions allow comparing two values
    • The comparison operators are = != < <= > >=
  • All services with a cost greater than 60

'/patient/service/cost > 60'

true

  • All services with a cost greater than 60, but get the patient’s service info instead of just true or false

'/patient/service[cost > 60]'

2004-02-29

brittle nails

prescribed hormones

84.00

15.00

  • Sara Lee
  • 33 Maple
  • Nearly
  • NJ
  • 07123
  • 908-842-7531
  • [email protected]
  • 2004-02-29
  • brittle nails
  • prescribed hormones
  • 84.00
  • 15.00
comparison expressions logical comparisons
Comparison Expressions – Logical Comparisons
  • Logical expressions using AND return true if both of two expressions are true
  • Logical expressions using OR return true if one or both expressions are true
  • Return phone numbers of the type work or fax (two options shown)'//phone[./@type="work" or ./@type="fax"]''//phone[@type="work" or @type="fax"]'
  • The result is this phone info

908-842-7531908-751-2468

  • Sara Lee
  • 33 Maple
  • Nearly
  • NJ
  • 07123
  • 908-842-7531
  • 908-751-2468
  • [email protected]
  • 2004-02-29
  • brittle nails
  • prescribed hormones
  • 84.00
  • 15.00
the built in function library
The Built-in Function Library
  • Built-in functions offered by the DB2 XPath library:
    • String functions
    • Numeric functions
    • Functions working on boolean values
    • Functions working on sequences
  • Calls to these function are allowed in an XPath expression anywhere an expression is expected
  • Example:

SELECT XMLQUERY('fn:concat($x,$y)' PASSING 'come '

AS "x", 'together' AS "y") FROM SYSIBM.SYSDUMMY1;

a sampling of xpath functions
A Sampling of XPath Functions

fn:compare compares two strings to see which one is greater

fn:concat concatenates two or more strings into a single string

fn:contains determines whether a string contains a given substring

fn:count returns the number of values in a sequence

fn:normalize-space strips leading and trailing whitespace characters

fn:lower-case converts a string to lowercase

fn:matches determines whether a string matches a given pattern

fn:position returns the position of the context item in the sequence

fn:replace replaces characters that match a pattern

fn:round returns the integer that is closest to a numeric value

fn:string returns the string representation of a value

fn:string-length returns the length of a string

fn:substring returns a substring of a string

fn:upper-case converts a string to uppercase

xml namespaces
XML Namespaces
  • XML namespaces prevent naming collisions
    • An XML namespace is a set of names identified by a namespace URI
    • Distinguishes element types or attribute names with the same name associated with different DTDs or Schemas
  • Namespaces allow qualifying names of elements and attributes
    • Contain an optional namespace prefix, a colon, and a local name
  • Example: two elements with the same name bound to different URIs:

excel

dining

DB2

empty prefix; bound to default element namespace

prolog namespace definition
Prolog – Namespace Definition
  • A DB2 XPath expression optionally contains a prolog
    • Establishes the processing environment
  • Prolog declaration may specify multiple namespace declarations
    • May also specify one default namespace declaration
  • Prolog declaration is always followed by a semicolon (;)
    • Syntax:
    • Examples:

declare namespace prefix="namespace string literal";

declare default element namespace "namespace string literal";

declare namespace fn="http://www.w3.org/2005/xpath-functions";

declare default element namespace "http://www.xyz.com/movies";

indexes and xml

Indexes and XML

Indexes Built On Values Within XML Documents

xml indexing
XML Indexing
  • DB2 supports creating indexes on XML columns
    • Generated using XML pattern expressions
    • Support access to nodes in the document
  • Multiple parts of an XML document can satisfy an XML pattern
    • Multiple index keys may be generated for insert of a single document
  • GENERATE KEY USING XMLPATTERN clause of CREATE INDEX
    • Specifies what you want to index
    • Contains XML pattern expression

Same as before XML

CREATE UNIQUE INDEX PATIENT_ID_IX ON PATIENT(PATIENT_XML)

GENERATE KEY USING XMLPATTERN '/patient/@id'

AS SQL VARCHAR(5)

XML node tobe indexed

Required keywords

Type of storedindex values

data types associated with pattern expressions
Data Types Associated With Pattern Expressions
  • Keys from XML pattern expression specified in a CREATE INDEX statement must be associated with a data type
    • May use either DECFLOAT or VARCHAR(n), where n <= 1000
    • Value being inserted/indexed must be convertible to this type

INSERT INTO PATIENT VALUES('12345',

'Jim Beam')

Value too long forVARCHAR(5) index

DSNT408I SQLCODE = -20305, ERROR: AN XML VALUE CANNOT BE INSERTED OR UPDATED BECAUSE OF AN ERROR DETECTED WHEN INSERTING OR UPDATING THE INDEX IDENTIFIED BY 'DBID~132 OBID~23' ON TABLE *N. REASON CODE = 1.

unique keyword in xml index definition
UNIQUE Keyword in XML Index Definition
  • The UNIQUE keyword is supported in XML index definitions
    • However, its meaning is different than in relational index definitions
  • When creating a relational index, the UNIQUE keyword enforces uniqueness across all rows in the table
  • When creating an index over XML data, the UNIQUE keyword enforces uniqueness across all documents in an XML column
example queries and supporting indexes
Example Queries and Supporting Indexes
  • Example 1

SELECT * FROM PATIENT WHERE XMLEXISTS('$Z/patient/service[copay="10.00"]' PASSING BY REF PATIENT_XML AS "Z")

Supporting index

CREATE INDEX copayIdx on PATIENT(PATIENT_XML) GENERATE KEY USING XMLPATTERN '/patient/service/copay' AS SQL DECFLOAT

  • Example 2

SELECT * FROM PATIENT WHERE XMLEXISTS('$Z/patient/phone[@type="work"]' PASSING BY REF PATIENT_XML AS "Z")

Supporting index

CREATE INDEX phoneTypIdx on PATIENT(PATIENT_XML) GENERATE KEY USING XMLPATTERN '/patient/phone/@type' AS SQL VARCHAR(20)

  • Example 3

SELECT * FROM PATIENT WHERE XMLEXISTS('$Z/patient/addr[city="Uptown"]' PASSING BY REF PATIENT_XML AS "Z")

Supporting index

CREATE INDEX cityIdx on PATIENT(PATIENT_XML) GENERATE KEY USING XMLPATTERN '/patient/addr/city' AS SQL VARCHAR(20)

xml schema
XML Schema
  • XML Schema Definition (XSD) defines structure of XML instance documents
    • Published as a recommendation by W3C
    • Defines elements and attributes permitted in a document
    • Defines parent / child relationship between elements
    • Defines data types, constraints and values for elements and attributes

xml schema management with the xsr
XML Schema Management With the XSR
  • XML schema repository (XSR)
    • Set of tables that store XML schemas
    • Created during DB2 installation or migration
  • XML schemas may be added to the XSR
    • Then used to validate XML documents being inserted or updated
  • Registering XML schema documents (adding to XSR):
    • Call DB2-supplied stored procedures from a DB2 application
    • Or invoke a provided JDBC method from a Java application
  • Removing XML schema documents from the DB2 XSR
    • Call DB2-supplied stored procedure or invoke JDBC method

Only schemas,and not DTDs,may be used forXML validation inDB2 Version 9

xml schema validation
XML Schema Validation
  • Use SQL INSERT statement to insert data into XML column
    • Inserted data must be a well-formed XML document
  • Validate the XML against a registered XML schema during insertion using the DSN_XMLVALIDATE function
    • User Defined Function
  • XML validation determines whether the structure, content, and data types of an XML document are valid according to a corresponding schema
  • Validation is optional

INSERT into AutoDealers VALUES( '12345', CURRENT DATE, 'Sams Deals', DSN_XMLValidate(:xmlDealerInfo, SYSXSR.DealerInfoSchema));

xml decomposition
XML Decomposition
  • Decomposition, or shredding, is the process of storing XML document content in columns of relational tables
    • Decomposed data has SQL type of column where it is inserted
  • An XML schema consists of one or more XML schema documents
  • Annotated XML schema decomposition
    • Control of the decomposition process is provided by XML schema annotation
resources
Resources

DB2 Version 9.1 for z/OS XML Guide (SC18-9858-03)

DB2 9 for z/OS Technical Overview (SG24-7330-00)

DB2 Version 9.1 for z/OS Application Programming and SQL Guide (SC18-9841-01)

DB2 Version 9.1 for z/OS Utility Guide and Reference (SC18-9855-02)

DB2 Version 9.1 for z/OS Application Programming Guide and Reference for Java(SC18-9842-01 )

ad