
Semistructured Data and XML


How the Web is Today

  • HTML documents

    • often generated by applications

    • consumed by humans only

    • easy access: across platforms, across organizations

    • only layout, no semantic information

  • No application interoperability:

    • HTML not understood by applications

      • screen scraping brittle

    • Database technology: client-server

      • still vendor specific


XML Data Exchange Format

  • A standard from the W3C (World Wide Web Consortium, http://www.w3.org).

  • The mission of the W3C:

    “. . . developing common protocols that promote its evolution and ensure its interoperability . . .”

  • Basic ideas

    • XML = data

    • XML generated by applications

    • XML consumed by applications

    • Easy access: across platforms, organizations.


Paradigm Shift on the Web

  • For web search engines:

    • From documents (HTML) to data (XML)

    • From document management to document understanding (e.g., question answering)

    • From information retrieval to data management

  • For database systems:

    • From relational (structured) model to semistructured data

    • From data processing to data/query translation

    • From storage to transport


The Semistructured Data Model

[Figure: Object Exchange Model (OEM) graph of the bibliography database Bib. The root object &o1 has arcs paper, book, and paper to the complex objects &o12, &o24, and &o29, which carry arcs such as author, title, year, publisher, http, page, and references. Complex objects are interior nodes; atomic objects (e.g. “Serge”, “Abiteboul”, “Victor”, “Vianu”, 1997, 122, 133) are leaves.]


The Semistructured Data Model

  • Data is self-describing, i.e. the data description is integrated with the data itself rather than in a separate schema.

  • Database is a collection of nodes and arcs (directed graph).

  • Leaf nodes represent data of some atomic type (atomic objects, such as numbers or strings).

  • Interior nodes represent complex objects consisting of components (child nodes), connected by arcs to this node.

  • Arcs are directed and connect two nodes.


The Semistructured Data Model

  • Arc labels indicate the relationship between the two corresponding nodes.

  • The root node is the only interior node without in-arcs, representing the entire database.

  • All database objects are children of the root node.

  • Every node must be reachable from the root.

  • A general graph structure is possible, i.e. the graph need not be a tree structure.
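
The properties listed above can be checked mechanically. The following is a minimal sketch in Python, assuming a plain dictionary encoding of an OEM graph; the object ids &o13 and &o27 and the title strings are hypothetical and only illustrate the idea.

```python
from collections import deque

# Object id -> list of (arc label, child id) for complex objects,
# or a plain value for atomic objects. &o13 and &o27 are made-up ids.
oem = {
    "&o1": [("paper", "&o12"), ("book", "&o24"), ("paper", "&o29")],
    "&o12": [("title", "&o13")],
    "&o13": "A paper title",
    "&o24": [("title", "&o27")],
    "&o27": "A book title",
    "&o29": [("author", "&o52"), ("references", "&o12"), ("references", "&o24")],
    "&o52": "Abiteboul",
}

def is_atomic(oid):
    """Leaf nodes hold a value instead of a list of arcs."""
    return not isinstance(oem[oid], list)

def reachable(root):
    """Breadth-first search along the directed arcs."""
    seen, queue = {root}, deque([root])
    while queue:
        oid = queue.popleft()
        if is_atomic(oid):
            continue
        for _label, child in oem[oid]:
            if child not in seen:
                seen.add(child)
                queue.append(child)
    return seen

# Every node must be reachable from the root.
all_reachable = reachable("&o1") == set(oem)

# &o12 has two incoming arcs (from &o1 and &o29), so the graph is not a tree.
in_degree_o12 = sum(
    child == "&o12"
    for arcs in oem.values() if isinstance(arcs, list)
    for _label, child in arcs
)
```

Because the arcs are stored with the data, no separate schema is needed to interpret the structure, which is the self-describing property mentioned earlier.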


Syntax for Semistructured Data

Bib: &o1 { paper: &o12 { … },
           book: &o24 { … },
           paper: &o29
             { author: &o52 "Abiteboul",
               author: &o96 { firstname: &o243 "Victor",
                              lastname: &o206 "Vianu" },
               title: &o93 "Regular path queries with constraints",
               references: &o12,
               references: &o24,
               page: &o25 { first: &o64 122, last: &o92 133 }
             }
         }

Observe: nested tuples, set values, oids!


Syntax for Semistructured Data

May omit oids:

{ paper: { author: "Abiteboul",
           author: { firstname: "Victor",
                     lastname: "Vianu" },
           title: "Regular path queries …",
           page: { first: 122, last: 133 }
         }
}


Vs. Relational Model

  • Missing attributes

  • Additional attributes

  • Multiple attribute values (set-valued attributes)

  • Objects as attribute values

  • No global schema

     Only the first characteristic (missing attributes, via NULL values) is supported by the relational model; the others are not.


Vs. Relational Model

  • Semistructured data

    • Self-describing,

    • Irregular data,

    • No a-priori structure.

  • Relational DB

    • Separate schema,

    • Regular data,

    • A-priori structure.


    XML


    Important XML Standards

    • XSL/XSLT: presentation and transformation standards

    • RDF: resource description framework (meta-info such as ratings, categorizations, etc.)

    • XPath/XPointer/XLink: standards for addressing and linking to documents and the elements within them

    • Namespaces: for resolving name clashes

    • DOM: Document Object Model for manipulating XML documents

    • SAX: Simple API for XML parsing

    • XQuery: query language


    XML

    • A W3C standard to complement HTML

    • Origins: structured text (SGML)

      • Large-scale electronic publishing

      • Data exchange on the web

    • Motivation:

      • HTML describes presentation

      • XML describes content

    • http://www.w3.org/TR/2000/REC-xml-20001006 (XML 1.0, Second Edition, 10/2000)


    From HTML to XML

    HTML describes the presentation


    HTML

      <h1> Bibliography </h1>
      <p> <i> Foundations of Databases </i>
          Abiteboul, Hull, Vianu
          <br> Addison Wesley, 1995
      <p> <i> Data on the Web </i>
          Abiteboul, Buneman, Suciu
          <br> Morgan Kaufmann, 1999

    HTML describes the presentation


    XML

      <bibliography>
        <book> <title> Foundations… </title>
               <author> Abiteboul </author>
               <author> Hull </author>
               <author> Vianu </author>
               <publisher> Addison Wesley </publisher>
               <year> 1995 </year>
        </book>
      </bibliography>

    XML describes the content
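
To illustrate why content markup matters to applications, here is a small sketch that reads the bibliography with Python's standard xml.etree.ElementTree module. The variable names are illustrative, and the full title is taken from the HTML slide.

```python
import xml.etree.ElementTree as ET

xml_text = """
<bibliography>
  <book> <title> Foundations of Databases </title>
         <author> Abiteboul </author>
         <author> Hull </author>
         <author> Vianu </author>
         <publisher> Addison Wesley </publisher>
         <year> 1995 </year>
  </book>
</bibliography>
"""

root = ET.fromstring(xml_text)          # the <bibliography> element
book = root.find("book")                # first <book> child

# The tags tell the program what each piece of text means.
authors = [a.text.strip() for a in book.findall("author")]
year = int(book.findtext("year"))
```

Extracting the same facts from the HTML version would require guessing from layout (italics, line breaks), which is the brittle screen scraping mentioned at the start.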


    Why are we DB’ers interested?

    • It’s data. That’s us.

    • Database issues:

      • How are we going to model XML? (graphs).

      • How are we going to query XML? (XQuery)

      • How are we going to store XML? (in a relational database? object-oriented? native?)

      • How are we going to process XML efficiently? (many interesting research questions!)


    Elements

    • Tags: book, title, author, …

      • start tag: <book>, end tag: </book>

      • defined by user / programmer (different from HTML!)

    • Elements: <book>…</book>, <author>…</author>

      • An element consists of a matching start and end tag and the enclosed content.

      • Elements can be nested, i.e. the content of one element can consist of a sequence of other elements.


    Attributes

    • Attributes can be associated with any element.

    • Provide additional information about elements.

    • Attributes can have only one value.

    • Example

      <book price="55" currency="USD">
        <title> Foundations of Databases </title>
        <author> Abiteboul </author>
        <year> 1995 </year>
      </book>

    • Attributes can also be used to connect elements.


    Non-tree-like XML

    • So far: only tree-like XML documents, i.e. each element is nested within at most one other element.

    • Attributes can also be used to create non-tree XML documents.

    • Attributes with a domain of ID serve as primary keys of elements.

    • Attributes with a domain of IDREF serve as foreign keys referencing the ID of another element.


    Non-tree-like XML

    • Example of a non-tree structure

      <persons>
        <person personid="o555">
          <name> Jane </name>
        </person>
        <person personid="o456">
          <name> Mary </name>
          <children refs="o123 o555"></children>
        </person>
        <person personid="o123" mother="o456">
          <name> John </name>
        </person>
      </persons>
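
A minimal sketch of how an application might follow such references, using Python's standard library. Without a DTD the parser does not know that personid is an ID or that refs and mother are IDREF(S), so this code treats them as such purely by convention.

```python
import xml.etree.ElementTree as ET

doc = """
<persons>
  <person personid="o555"> <name> Jane </name> </person>
  <person personid="o456"> <name> Mary </name>
    <children refs="o123 o555"/>
  </person>
  <person personid="o123" mother="o456"> <name> John </name> </person>
</persons>
"""

root = ET.fromstring(doc)

# Index the elements by their ID attribute (the "primary key").
by_id = {p.get("personid"): p for p in root.findall("person")}

def name(person):
    return person.findtext("name").strip()

# Follow the IDREFS list on <children> and the IDREF on mother.
mary = by_id["o456"]
children = [name(by_id[ref]) for ref in mary.find("children").get("refs").split()]
mother_of_john = name(by_id[by_id["o123"].get("mother")])
```

John is reachable both as a child of the root and through Mary's reference, so the logical structure is a graph even though the element nesting is a tree.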


    Namespaces

    • An XML document can involve tags that come from multiple sources.

    • One and the same tag can appear in more than one source.

      <table>
        <tr>
          <td>Apples</td>
          <td>Bananas</td>
        </tr>
      </table>

      <table>
        <name>African Coffee Table</name>
        <width>80</width>
        <length>120</length>
      </table>


    Namespaces

    • Name conflicts can be resolved by prefixing tag names according to their source.

      <h:table>
        <h:tr>
          <h:td>Apples</h:td>
          <h:td>Bananas</h:td>
        </h:tr>
      </h:table>

      <f:table>
        <f:name>African Coffee Table</f:name>
        <f:width>80</f:width>
        <f:length>120</f:length>
      </f:table>

    • When using prefixes in XML, a namespace for the prefix must be defined.

    • The namespace must be referenced (via a URI, using an xmlns:prefix attribute) in the start tag of an enclosing element.
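
As a sketch of what the declaration looks like in practice, the example below wraps both tables in a hypothetical root element that binds the prefixes h and f. The two URIs are placeholders chosen for illustration, not part of the original slides.

```python
import xml.etree.ElementTree as ET

# The xmlns:h / xmlns:f attributes bind each prefix to a namespace URI.
doc = """
<root xmlns:h="http://example.org/html"
      xmlns:f="http://example.org/furniture">
  <h:table> <h:tr> <h:td>Apples</h:td> <h:td>Bananas</h:td> </h:tr> </h:table>
  <f:table> <f:name>African Coffee Table</f:name>
            <f:width>80</f:width> <f:length>120</f:length> </f:table>
</root>
"""

# Prefix map used on the query side; the prefixes here need not match the document's.
ns = {"h": "http://example.org/html", "f": "http://example.org/furniture"}

root = ET.fromstring(doc)
cells = [td.text for td in root.findall("h:table/h:tr/h:td", ns)]
furniture_table = root.find("f:table", ns)
furniture_name = furniture_table.findtext("f:name", namespaces=ns)

# Internally the parser identifies a tag by URI plus local name, not by prefix.
qualified_tag = furniture_table.tag
```

The two table elements are now distinct names because they belong to different namespaces.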


    Well-Formed XML

    • A well-formed XML document satisfies the following conditions:

      • Begins with a declaration that it is XML (recommended; XML 1.0 does not strictly require the declaration).

      • Has a single root element that encloses the whole document.

      • Consists of properly nested elements, i.e. start and end tag of an element are within the same enclosing element.

    • standalone="yes" states that the document does not rely on external markup declarations, i.e. it can be processed without an external DTD.

    • In this mode, you can invent your own tags, as in the semistructured data model.


    Well-Formed XML

      <?xml version="1.0" standalone="yes"?>
      <bibliography>
        <book> <title> Foundations… </title>
               <author> Abiteboul </author>
               <author> Hull </author>
               <author> Vianu </author>
               <publisher> Addison Wesley </publisher>
               <year> 1995 </year>
        </book>
        <book> <title> … </title>
               . . .
        </book>
      </bibliography>


    Well-Formed XML

    • HTML browsers will display documents with errors (like missing end tags).

    • The W3C XML specification states that a program should stop processing an XML document if it finds an error.

    • The main reason is that XML is consumed by programs rather than by humans (as HTML is).

    • W3C provides a validator that checks whether an XML document is well-formed.
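
The strict error handling can be observed with any conforming parser. A minimal sketch using Python's standard library; the helper name is_well_formed is made up for this example.

```python
import xml.etree.ElementTree as ET

def is_well_formed(text):
    """Return True if the parser accepts the text, False if it stops with an error."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False

ok = is_well_formed("<book><title>Foundations</title></book>")
missing_end_tag = is_well_formed("<book><title>Foundations</book>")
bad_nesting = is_well_formed("<a><b></a></b>")
two_roots = is_well_formed("<a/><b/>")
```

This only checks well-formedness. Checking validity against a DTD needs a validating parser, which the standard library module used here does not provide.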


    Valid XML

    • The validator can also check whether an XML document is valid, i.e. conforms to a Document Type Definition (DTD).

    • A DTD specifies the allowable tags and how they can be nested.

    • XML with a DTD is no longer semistructured (self-describing).

    • However, a DTD is less rigid than the schema of a relational DB. E.g., a DTD allows missing and multiple attributes / elements.


    DTD


    Document Type Definitions

    • Document Type Definition (DTD): set of rules (grammar) specifying elements, attributes and all other aspects of XML documents.

    • For each element, specify name and content type.

    • Content type can, e.g., be

      • #PCDATA (character string),

      • other elements,

      • regular expression made of the above content types:

        • * = zero or more occurrences

        • ? = zero or one occurrence

        • + = one or more occurrences

        • , = sequence of elements.


    Document Type Descriptors

    • Sort of like a schema but not really.

    • Inherited from SGML DTD standard

    • BNF grammar establishing constraints on element structure and content

    • Definitions of entities


    Example DTD: Product Catalog

      <!DOCTYPE CATALOG [
        <!ELEMENT CATALOG (PRODUCT+)>
        <!ELEMENT PRODUCT (SPECIFICATIONS+,OPTIONS?,PRICE+,NOTES?)>
        <!ATTLIST PRODUCT NAME CDATA #IMPLIED
                          CATEGORY (HandTool|Table|Shop-Professional) "HandTool"
                          PARTNUM CDATA #IMPLIED
                          PLANT (Pittsburgh|Milwaukee|Chicago) "Chicago"
                          INVENTORY (InStock|Backordered|Discontinued) "InStock">
        <!ELEMENT SPECIFICATIONS (#PCDATA)>
        <!ATTLIST SPECIFICATIONS WEIGHT CDATA #IMPLIED
                                 POWER CDATA #IMPLIED>
        <!ELEMENT OPTIONS (#PCDATA)>
        <!ATTLIST OPTIONS FINISH (Metal|Polished|Matte) "Matte"
                          ADAPTER (Included|Optional|NotApplicable) "Included"
                          CASE (HardShell|Soft|NotApplicable) "HardShell">
        <!ELEMENT PRICE (#PCDATA)>
        <!ATTLIST PRICE MSRP CDATA #IMPLIED
                        WHOLESALE CDATA #IMPLIED
                        STREET CDATA #IMPLIED
                        SHIPPING CDATA #IMPLIED>
        <!ELEMENT NOTES (#PCDATA)>
      ]>
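
The content model of PRODUCT is a regular expression over child element names. The sketch below mimics that check with Python's re module. It is an illustration of the idea only, not how a validating parser is implemented, and the helper name is hypothetical.

```python
import re

# PRODUCT content model from the DTD: (SPECIFICATIONS+,OPTIONS?,PRICE+,NOTES?)
# Each child tag is followed by a space so that names cannot run together.
PRODUCT_MODEL = re.compile(r"(SPECIFICATIONS )+(OPTIONS )?(PRICE )+(NOTES )?")

def matches_product_model(child_tags):
    """Check a sequence of child element names against the content model."""
    sequence = "".join(tag + " " for tag in child_tags)
    return PRODUCT_MODEL.fullmatch(sequence) is not None

minimal = matches_product_model(["SPECIFICATIONS", "PRICE"])
full = matches_product_model(
    ["SPECIFICATIONS", "OPTIONS", "PRICE", "PRICE", "NOTES"])
missing_spec = matches_product_model(["PRICE"])
wrong_order = matches_product_model(["SPECIFICATIONS", "PRICE", "OPTIONS"])
```

The optional and repeated parts (? and +) are what make a DTD less rigid than a relational schema, while the comma still fixes the order of the children.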


    Shortcomings of DTDs

    Useful for documents, but not so good for data:

    • Element name and type are associated globally

    • No support for structural re-use

      • Object-oriented-like structures aren’t supported

    • No support for data types

      • Can’t do data validation

    • Can have a single key item (ID), but:

      • No support for multi-attribute keys

      • No support for foreign keys (references to other keys)

      • No constraints on IDREFs (e.g., cannot require that a reference points only to a Section element)


    XML Schema


    XML Schema

    • The successor of DTDs to specify a schema for XML documents.

    • A W3C standard.

    • Includes and extends functionality of DTDs.

    • In particular, XML Schemas support data types. This makes it easier to validate the correctness of data and to work with data from a database.

    • XML Schemas are written in XML. You don't have to learn a new language and can use your XML parser to parse your Schema files.


    Example XML Schema

      <schema version="1.0" xmlns="http://www.w3.org/1999/XMLSchema">
        <element name="author" type="string" />
        <element name="date" type="date" />
        <element name="abstract">
          <type> … </type>
        </element>
        <element name="paper">
          <type>
            <attribute name="keywords" type="string" />
            <element ref="author" minOccurs="0" maxOccurs="*" />
            <element ref="date" />
            <element ref="abstract" minOccurs="0" maxOccurs="1" />
            <element ref="body" />
          </type>
        </element>
      </schema>


    Simple Elements

    • Simple elements contain only text.

    • They can have one of the built-in datatypes:

      xs:string, xs:decimal, xs:integer, xs:boolean, xs:date, xs:time.

    • Example

      <xs:element name="lastname" type="xs:string"/>
      <xs:element name="age" type="xs:integer"/>
      <xs:element name="dateborn" type="xs:date"/>


    Simple Elements

    • Restrictions allow you to further constrain the content of simple elements.

      <xs:element name="age">
        <xs:simpleType>
          <xs:restriction base="xs:integer">
            <xs:minInclusive value="0"/>
            <xs:maxInclusive value="120"/>
          </xs:restriction>
        </xs:simpleType>
      </xs:element>


    Attributes

    • Attributes can be specified using the attribute element:

      <xs:attribute name="xxx" type="yyy"/>

    • Attribute declarations are nested within the declaration of the element with which they are associated.

    • By default, attributes are optional.

    • To make an attribute mandatory, use

      <xs:attribute name="lang" type="xs:string" use="required"/>

    • Attributes can have the same built-in datatypes as simple elements.


    Complex Elements

    • Complex elements can contain other elements and can have attributes.

    • Nested elements need to occur in the order specified.

    • The number of repetitions of an element is controlled by the attributes minOccurs and maxOccurs. The default is one repetition.

    • A complex element with an attribute:

      <xs:element name="product">
        <xs:complexType>
          <xs:attribute name="prodid" type="xs:positiveInteger"/>
        </xs:complexType>
      </xs:element>


    Complex Elements

    • A complex element containing a sequence of nested (simple) elements:

      <xs:element name="employee">
        <xs:complexType>
          <xs:sequence>
            <xs:element name="firstname" type="xs:string"/>
            <xs:element name="lastname" type="xs:string"/>
          </xs:sequence>
        </xs:complexType>
      </xs:element>


    Complex Elements

    • If you name the complex type, other elements can reference and include it:

      <xs:complexType name="persontype">
        <xs:sequence>
          <xs:element name="firstname" type="xs:string"/>
          <xs:element name="lastname" type="xs:string"/>
        </xs:sequence>
      </xs:complexType>

      <xs:element name="person" type="persontype"/>



    XML vs. Semistructured Data

    • Both described best by a graph.

    • Both are schema-less, self-describing (XML without DTD / XML schema).

    • XML is ordered, semistructured data is not.

    • XML can mix text and elements:

      <talk> Making Java easier to type and easier to type

      <speaker> Phil Wadler </speaker>

      </talk>

    • XML has lots of other stuff: attributes, entities, processing instructions, comments.


    XML-Path = XPath


    Query Languages for XML

    • XPath is a simple query language based on describing similar paths in XML documents.

    • XQuery extends XPath in a style similar to SQL, introducing iterations, subqueries, etc.

    • XPath and XQuery expressions are applied to an XML document and return a sequence of qualifying items.

    • Items can be primitive values or nodes (elements, attributes, documents).

    • The items returned do not need to be of the same type.


    XPath

    • A path expression returns the sequence of all qualifying items that are reachable from the input item following the specified path.

    • A path expression is a sequence consisting of tags or attributes and special characters such as slashes (“/”).

    • Absolute path expressions are applied to some XML document and return all elements that are reachable from the document’s root element following the specified path.

    • Relative path expressions are applied to an arbitrary node.


    XPath

      <?xml version="1.0" standalone="yes"?>
      <bibliography>
        <book bookID="b100"> <title> Foundations… </title>
          <author> Abiteboul </author>
          <author> Hull </author>
          <author> Vianu </author>
          <publisher> Addison Wesley </publisher>
          <year> 1995 </year>
        </book>
      </bibliography>

    • Applied to the above document, the XPath expression /bibliography/book/author returns the sequence

      <author> Abiteboul </author>
      <author> Hull </author>
      <author> Vianu </author> . . .
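
The same path can be tried with Python's standard library, which supports a limited XPath subset. One assumption to note: ElementTree evaluates paths relative to an element, so the absolute path /bibliography/book/author is written here as ./book/author relative to the root element.

```python
import xml.etree.ElementTree as ET

doc = """<?xml version="1.0" standalone="yes"?>
<bibliography>
  <book bookID="b100"> <title> Foundations of Databases </title>
    <author> Abiteboul </author>
    <author> Hull </author>
    <author> Vianu </author>
    <publisher> Addison Wesley </publisher>
    <year> 1995 </year>
  </book>
</bibliography>
"""

root = ET.fromstring(doc)               # the <bibliography> root element

# Equivalent of /bibliography/book/author, relative to the root element.
author_elements = root.findall("./book/author")
author_names = [a.text.strip() for a in author_elements]
```

The result is a list of element nodes in document order, matching the sequence shown above.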


    Attributes

    • If we do not want to return the qualifying elements, but the value of one of their attributes, we end the path expression with @attribute.

      <?xml version="1.0" standalone="yes"?>
      <bibliography>
        <book bookID="b100"> <title> Foundations… </title>
          <author> Abiteboul </author>
          <author> Hull </author>
          <author> Vianu </author>
          <publisher> Addison Wesley </publisher>
          <year> 1995 </year>
        </book>
      </bibliography>

      the XPath expression

      /bibliography/book/@bookID

      returns the sequence

      "b100" . . .


    Wildcards

    • We can use wildcards instead of actual tags and attributes: * means any tag, and @* means any attribute.

    • Examples

      /bibliography/*/author returns the sequence

        <author> Abiteboul </author>
        <author> Hull </author>

      /bibliography//author/@* returns the sequence

        "IBM" "a739".
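
A short sketch of wildcards and attribute access with Python's standard library. The paper element is a hypothetical addition so that the wildcard has more than one tag to match. ElementTree's XPath subset cannot end a path with @attribute, so attribute values are read with get() and attrib instead.

```python
import xml.etree.ElementTree as ET

root = ET.fromstring("""
<bibliography>
  <book bookID="b100"> <author> Abiteboul </author> <author> Hull </author> </book>
  <paper> <author> Vianu </author> </paper>
</bibliography>
""")

# /bibliography/*/author : authors under any child tag of the root
any_child_authors = [a.text.strip() for a in root.findall("./*/author")]

# /bibliography//author : authors at any depth below the root
descendant_authors = [a.text.strip() for a in root.findall(".//author")]

# /bibliography/book/@bookID : value of a named attribute
book_ids = [b.get("bookID") for b in root.findall("./book")]

# @* : every attribute value of every element in the document
all_attribute_values = [v for e in root.iter() for v in e.attrib.values()]
```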


    Path Expressions

    Examples:

    • Bib.paper

    • Bib.book.publisher

    • Bib.paper.author.lastname

      Given an OEM instance, the value of a path expression p is a set of objects


    Path Expressions

    [Figure: the OEM graph DB of the bibliography database. Root Bib (&o1) with paper objects &o12 and &o29 and book object &o24; each carries arcs such as author, title, year, publisher, http, page, and references down to atomic objects.]

    Examples (evaluated on DB):

    • Bib.paper = {&o12, &o29}

    • Bib.book.publisher = {&o51}

    • Bib.paper.author.lastname = {&o71, &o206}
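
Evaluating such a path expression amounts to following labeled arcs level by level. A minimal sketch, assuming a dictionary encoding of the graph. Since the figure layout is not recoverable here, the individual arcs below (for example which author object leads to &o71) and the atomic values are assumptions chosen to be consistent with the results stated above.

```python
# Object id -> list of (arc label, child id) for complex objects,
# or a plain value for atomic objects. Arc placement is assumed.
db = {
    "&o1": [("paper", "&o12"), ("book", "&o24"), ("paper", "&o29")],
    "&o12": [("author", "&o44")],
    "&o24": [("publisher", "&o51")],
    "&o29": [("author", "&o52"), ("author", "&o96")],
    "&o44": [("lastname", "&o71")],
    "&o51": "a publisher",
    "&o52": "Abiteboul",
    "&o96": [("firstname", "&o243"), ("lastname", "&o206")],
    "&o71": "Abiteboul",
    "&o243": "Victor",
    "&o206": "Vianu",
}

def eval_path(path, root="&o1"):
    """Return the set of object ids reached from the root via the labels."""
    _root_label, *labels = path.split(".")   # the first label names the root
    current = {root}
    for label in labels:
        current = {
            child
            for oid in current if isinstance(db[oid], list)
            for arc, child in db[oid] if arc == label
        }
    return current

papers = eval_path("Bib.paper")
publishers = eval_path("Bib.book.publisher")
lastnames = eval_path("Bib.paper.author.lastname")
```

Note that &o52 is an atomic author with no lastname arc, so it simply contributes nothing to the last result. Irregular data does not cause an error.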


    XML-Query = XQuery


    XQuery

    Summary:

    • FOR-LET-WHERE-ORDERBY-RETURN = FLWOR

    • Processing pipeline: FOR/LET clauses → list of tuples → WHERE clause → list of tuples → ORDERBY/RETURN clause → instance of the XQuery data model


    XQuery

    • FLWOR expressions are similar to SQL select . . . from . . . where . . . queries.

    • A FLWOR expression has one or more for and/or let clauses.

    • The where clause is optional.

    • There is one optional order by clause.

    • Finally, there is exactly one return clause.

    • XQuery is case-sensitive.

    • XQuery and XPath are W3C standards.


    XQuery Clauses

    • for $x in expr

      • Defines node variable $x.

      • The expression expr evaluates to a sequence of items.

      • The variable $x is assigned to each item, in turn, and the body of the for clause is executed once for each assignment.

    • let $x := expr

      • Defines collection variable $x.

      • The expression expr evaluates to a sequence of items.

      • The variable is bound to the entire sequence of items.

      • Useful for common subexpressions and for aggregations.


    XQuery Clauses

    • where condition

      • The condition is a boolean expression.

      • The clause is applied to some item.

      • If and only if the condition evaluates to true, the following return clause is executed for that item.

    • return expression

      • The result of a FLWOR expression is a sequence of items.

      • Expression defines the result format for the current (qualifying) item.

      • The sequence of items produced by expression is appended to the sequence of items produced so far.


    Interpretation as XQuery

    • XQuery expressions can be used wherever an XML expression of any kind is permitted.

    • Any text string is acceptable as content of a tag or value of an attribute.

    • If a string contains an XQuery expression that should be evaluated, this substring must be surrounded by curly brackets {}.

    • Example

      for $b in doc("bib.xml")/bibliography/book
      return <result id="{$b/@bookID}">{$b/title}</result>


    FOR vs. LET

    • Find all books

      for $x in doc("bib.xml")/bib/book
      return <result>{ $x }</result>

      Returns:

      <result> <book>...</book></result>
      <result> <book>...</book></result>
      <result> <book>...</book></result>
      ...

      let $x := doc("bib.xml")/bib/book
      return <result>{ $x }</result>

      Returns:

      <result> <book>...</book>
               <book>...</book>
               <book>...</book>
               ...
      </result>


    XQuery

    Find all book titles published after 1995:

      for $x in doc("bib.xml")/bib/book
      where $x/year > 1995
      return $x/title

    Result:

      <title> abc </title>
      <title> def </title>
      <title> ghi </title>
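
For readers more familiar with general-purpose languages, the for/where/return pattern can be read like a list comprehension. The sketch below is an analogy in Python, not an XQuery engine, and the sample data is made up. It also contrasts for (one result per bound item) with let (one result for the whole sequence).

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-in for bib.xml
bib = ET.fromstring("""
<bib>
  <book> <title>abc</title> <year>1999</year> </book>
  <book> <title>old</title> <year>1990</year> </book>
  <book> <title>def</title> <year>2001</year> </book>
</bib>
""")
books = bib.findall("book")

titles = [
    book.findtext("title")                  # return $x/title
    for book in books                       # for $x in doc("bib.xml")/bib/book
    if int(book.findtext("year")) > 1995    # where $x/year > 1995
]

# for: the return part runs once per book, giving one result per book
for_results = [[b.findtext("title")] for b in books]

# let: the variable holds the whole sequence, giving a single result
let_result = [[b.findtext("title") for b in books]]
```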


    Ordering the Query Result

    • The order by clause allows you to order the results of an XQuery expression.

      order by list of expressions

    • The sort order is based on the value of the first expression. Ties are broken based on the value of the second (if necessary third etc.) expression.

    • By default, the order is ascending.

    • A descending sort order can be specified using descending.


    Elimination of Duplicates

    • The built-in function distinct-values eliminates duplicates from a sequence of result items.

    • In principle, it applies only to primitive (atomic) types.

    • It can also be applied to elements, but then it removes their tags and returns only their values, shown here as quoted strings.

    • Example

      If $b/title produces <title> aaa </title> <title> bbb </title> <title> aaa </title>,

      then distinct-values($b/title) produces

      “aaa” “bbb”.


    XQuery

    For each author of a book by Morgan Kaufmann, list all books she published:

      for $a in distinct-values(doc("bib.xml")/bib/book[publisher = "Morgan Kaufmann"]/author)
      return <result>
               <author>{ $a }</author>
               { for $t in doc("bib.xml")/bib/book[author = $a]/title
                 return $t }
             </result>

    Result:

      <result>
        <author>Jones</author>
        <title> abc </title>
        <title> def </title>
      </result>
      <result>
        <author>Smith</author>
        <title> ghi </title>
      </result>

    distinct-values = a function that eliminates duplicates


    Joins

    • We can join two or more documents by using one variable for each of the documents.

    • We let a variable range over the elements of the corresponding document, within a for-clause.

    • Need to be careful when comparing elements for equality, since their equality is by element identity, not by element content.

    • Typically, we want to compare the element content.

    • The built-in function data(E) returns the content of an element E.


    XQuery

    Find books whose price is larger than average:

      let $a := avg(doc("bib.xml")/bib/book/price)
      for $b in doc("bib.xml")/bib/book
      where $b/price > $a
      return $b


    Sorting in XQuery

      <publisher_list>
        { for $p in distinct-values(doc("bib.xml")//publisher)
          order by $p
          return <publisher>
                   <name>{ $p }</name>
                   { for $b in doc("bib.xml")//book[publisher = $p]
                     order by $b/price descending
                     return <book>
                              { $b/title, $b/price }
                            </book> }
                 </publisher> }
      </publisher_list>


    If-Then-Else

      for $h in //holding
      order by $h/title
      return <holding>
               { $h/title,
                 if ($h/@type = "Journal")
                 then $h/editor
                 else $h/author }
             </holding>


    Existential Quantifiers

      for $b in //book
      where some $p in $b//para satisfies
              contains($p, "sailing")
              and contains($p, "windsurfing")
      return $b/title


    Quantification

    • XQuery supports the existential and the universal quantifier.

    • Universal quantifier: every $v in expression1 satisfies expression2

    • Existential quantifier: some $v in expression1 satisfies expression2

    • Expression1 evaluates to a sequence of items, expression2 is a boolean expression.


    Aggregation

    • XQuery provides built-in functions for the standard aggregations such as sum, min, count and avg.

    • They can be applied to any XQuery expression, i.e. to any sequence of items.

    • Example

      avg(doc("bib.xml")/bibliography/book/price)
      count(doc("bib.xml")/bibliography/book/price)

      These compute the average book price and the number of books, respectively.


    XQuery Examples

    • Find books whose price is larger than the average price.

    • Uses aggregate operator (avg), applied to the result of a path expression.

      let $a := avg(doc("bib.xml")/bibliography/book/price)
      for $b in doc("bib.xml")/bibliography/book
      where $b/price > $a
      return $b
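
The let-then-for structure of this query can be mirrored in Python as a rough analogy: the aggregate is computed once over the whole sequence, then each book is compared against it. The sample prices are invented for illustration.

```python
import xml.etree.ElementTree as ET
from statistics import mean

# Hypothetical stand-in for bib.xml
bib = ET.fromstring("""
<bibliography>
  <book> <title>abc</title> <price>30</price> </book>
  <book> <title>def</title> <price>50</price> </book>
  <book> <title>ghi</title> <price>70</price> </book>
</bibliography>
""")
books = bib.findall("book")

prices = [float(b.findtext("price")) for b in books]
average = mean(prices)      # let $a := avg(.../book/price)
count = len(prices)         # count(.../book/price)

above_average = [
    b.findtext("title")                        # return (here only the title)
    for b in books                             # for $b in .../book
    if float(b.findtext("price")) > average    # where $b/price > $a
]
```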


    XQuery Examples

    • Find title of books with a paragraph containing the terms “sailing” and “windsurfing”.

    • Uses existential quantifier (some) and string matching (contains).

      for $b in doc("bib.xml")//book
      where some $p in $b//para satisfies
              contains($p, "sailing") and contains($p, "windsurfing")
      return $b/title
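
The quantifiers some and every correspond closely to any() and all() in Python. The sketch below is an analogy with a made-up document; the element names other than book, title, and para are hypothetical.

```python
import xml.etree.ElementTree as ET

doc = ET.fromstring("""
<library>
  <book> <title>Water Sports</title>
    <chapter> <para>Both sailing and windsurfing need wind.</para> </chapter>
  </book>
  <book> <title>Boats</title>
    <para>All about sailing.</para>
    <para>More sailing tips.</para>
  </book>
</library>
""")

def text_of(para):
    """Full text content of an element, like the string value used by contains()."""
    return "".join(para.itertext())

# some $p in $b//para satisfies contains(...) and contains(...)
some_titles = [
    b.findtext("title") for b in doc.iter("book")
    if any("sailing" in text_of(p) and "windsurfing" in text_of(p)
           for p in b.iter("para"))
]

# every $p in $b//para satisfies contains($p, "sailing")
every_titles = [
    b.findtext("title") for b in doc.iter("book")
    if all("sailing" in text_of(p) for p in b.iter("para"))
]
```

Note that the condition must hold within a single paragraph for the existential query, which is why the second book does not qualify even though it mentions sailing twice.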

