

XPath, the best known modal logic ever.

And . . . made in Amsterdam!

Maarten Marx

Information and Language Processing Systems (ILPS)

Informatics Institute,

University of Amsterdam, The Netherlands

XPath, what is that?

• A standard language proposed by the W3C in November 1999.

• XPath is a language for addressing parts of an XML document.

• XPath beats temporal logic as the best known modal logic:

• Google: results 1 - 10 of about 1,870,000 for XPath

• Google: results 1 - 10 of about 242,000 for “temporal logic”

Research aim behind this talk

• Create an expressively complete navigational query language for XML documents.

Aim of this talk

• Show that modal logic is the right paradigm for such a task.

• One can get remarkable results with (for modal logicians) simple proofs.

• The modal logic literature is full of hints and almost-results.

FO logic of “XML documents”

• An XML document can be seen as a finite, node labeled, sibling ordered, unbounded tree.

• Nb. We abstract away from the “data details” and only focus on the skeleton of an XML document.

• The first order language for these models has

1. descendant relation

2. following sibling relation

3. unary predicates corresponding to node labels and attribute–value pairs.

• Nb. we cannot express joins on attribute values!
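The tree model described on this slide can be made concrete with a small sketch. It is only an illustration under assumptions: the nested `(label, children)` encoding, the integer node ids and the function names are all hypothetical choices, not part of the talk. It builds the three ingredients of the first order signature: the descendant relation, the following sibling relation, and one unary predicate per label.

```python
def index_tree(tree):
    """Number the nodes of a nested (label, children) tuple in document order.

    Returns (labels, kids): labels[n] is the label of node n and kids[n]
    is the ordered list of children of node n. The encoding is an
    assumption made for this sketch.
    """
    labels, kids = {}, {}

    def visit(t):
        label, subtrees = t
        n = len(labels)              # next free id, i.e. document order
        labels[n] = label
        kids[n] = [visit(s) for s in subtrees]
        return n

    visit(tree)
    return labels, kids


def descendant(kids, root=0):
    """Pairs (x, y) such that y is a proper descendant of x."""
    rel = set()

    def visit(n, ancestors):
        rel.update((a, n) for a in ancestors)
        for c in kids[n]:
            visit(c, ancestors + [n])

    visit(root, [])
    return rel


def following_sibling(kids):
    """Pairs (x, y) such that x and y share a parent and y comes after x."""
    return {(cs[i], cs[j])
            for cs in kids.values()
            for i in range(len(cs))
            for j in range(i + 1, len(cs))}


def label_predicate(labels, p):
    """The unary predicate for label p: the set of nodes labeled p."""
    return {n for n, label in labels.items() if label == p}


# A small document skeleton: a(b(d), c, b)
doc = ("a", [("b", [("d", [])]), ("c", []), ("b", [])])
labels, kids = index_tree(doc)
desc = descendant(kids)
sib = following_sibling(kids)
b_nodes = label_predicate(labels, "b")
```

Attribute values are deliberately left out, which matches the remark that joins on attribute values are not expressible in this signature.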

Known results

• For binary relations, very little is known. Immerman and Kozen:

1. strings have the 3 variable property;

2. bounded trees have a k variable property.

• For unary relations, more is known:

1. Kamp’s theorem, strings have H-dimension 3;

2. unbounded unordered trees have no finite H-dimension (Schlingloff)

3. unbounded ordered trees have H-dimension 3 (PODS 2004).

• Nb.

1. k-variable property is stronger than H-dimension;

2. k-variable property is independent from “finite complete set of operators property” (Hodkinson–Simon, JPhL).

• Thus we cannot answer our research goal by known results.

Conditional XPath

The syntax is based on

1. XPath 1.0 (W3C)

2. Kleene algebras (= regular path queries) with tests (Kozen)

3. Propositional Dynamic Logic (Pratt, Harel)

step ::= child | parent | right | left.

path wff ::= step | (step/?node wff)+

| ?node wff

| path wff/path wff | path wff ∪ path wff.

node wff ::= p | ⟨path wff⟩ | ¬node wff | node wff ∧ node wff.

Semantics

• Given an ordered tree,

– each path wff denotes a set of pairs of nodes, and

– each node wff denotes a set of nodes.

• All set theoretical operations have their standard meaning.

• ⟨path wff⟩ is true at a node n iff n is in the domain of the relation path wff.

Note! Every path wff (node wff) defines a first order definable binary (unary) relation.
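The semantics above can be sketched as a small evaluator: path wffs are mapped to sets of node pairs, node wffs to sets of nodes. This is a minimal sketch under assumptions, not the author's implementation. The tuple encoding of formulas, the tag names (`"step"`, `"test"`, `"cond+"`, `"seq"`, `"union"`, `"label"`, `"dom"`, `"not"`, `"and"`) and the tree encoding `{node: [children]}` are hypothetical, and `"true"` is added only as a convenience for writing child+ as (child/?true)+.

```python
def steps(kids):
    """The four step relations of an ordered tree given as {node: [children]}."""
    child = {(p, c) for p, cs in kids.items() for c in cs}
    right = {(a, b) for cs in kids.values() for a, b in zip(cs, cs[1:])}
    inverse = lambda r: {(y, x) for x, y in r}
    return {"child": child, "parent": inverse(child),
            "right": right, "left": inverse(right)}


def compose(r, s):
    """Relational composition r/s."""
    return {(x, z) for x, y in r for y2, z in s if y == y2}


def plus(r):
    """Transitive closure of r, computed by iterating to a fixpoint."""
    closure = set(r)
    while True:
        bigger = closure | compose(closure, r)
        if bigger == closure:
            return closure
        closure = bigger


def path(e, kids, labels):
    """Set of node pairs denoted by a path wff (hypothetical tuple encoding)."""
    op = e[0]
    if op == "step":                       # child | parent | right | left
        return steps(kids)[e[1]]
    if op == "test":                       # ?node_wff
        return {(n, n) for n in node(e[1], kids, labels)}
    if op == "cond+":                      # (step/?node_wff)+
        one = compose(steps(kids)[e[1]], path(("test", e[2]), kids, labels))
        return plus(one)
    if op == "seq":                        # path_wff/path_wff
        return compose(path(e[1], kids, labels), path(e[2], kids, labels))
    if op == "union":                      # path_wff ∪ path_wff
        return path(e[1], kids, labels) | path(e[2], kids, labels)
    raise ValueError(f"unknown path operator: {op}")


def node(e, kids, labels):
    """Set of nodes denoted by a node wff (hypothetical tuple encoding)."""
    op = e[0]
    if op == "true":                       # convenience, not in the grammar
        return set(kids)
    if op == "label":                      # p
        return {n for n in kids if labels[n] == e[1]}
    if op == "dom":                        # <path_wff>: domain of the relation
        return {x for x, _ in path(e[1], kids, labels)}
    if op == "not":
        return set(kids) - node(e[1], kids, labels)
    if op == "and":
        return node(e[1], kids, labels) & node(e[2], kids, labels)
    raise ValueError(f"unknown node operator: {op}")


# Tree a(b(d), c, b) with nodes numbered in document order.
kids = {0: [1, 3, 4], 1: [2], 2: [], 3: [], 4: []}
labels = {0: "a", 1: "b", 2: "d", 3: "c", 4: "b"}

# child/?b/?<child+>: b-children that have at least one descendant.
child_plus = ("cond+", "child", ("true",))
query = ("seq", ("seq", ("step", "child"), ("test", ("label", "b"))),
         ("test", ("dom", child_plus)))
result = path(query, kids, labels)

# ¬<parent>: nodes without a parent, i.e. the root.
root = node(("not", ("dom", ("step", "parent"))), kids, labels)
```

The evaluator follows the slide literally: union, complement and intersection are the set operations, and ⟨·⟩ takes the domain of a relation. It makes no attempt at efficiency.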

Example expressions

XPath 1.0                      Conditional XPath

child::pi                      child/?pi

child::pi[descendant::*]       child/?pi/?⟨child+⟩

/descendant::pi                ?¬⟨parent⟩/child+/?pi

child::*                       child

self::pi[child]                ?(pi ∧ ⟨child⟩)

preceding::pi                  parent*/left+/child*/?pi

Equivalent XPath 1.0 and Conditional XPath expressions.

Conditional XPath fulfills our research goal

• Theorem 1 (Kamp/PODS 2004) Every FO definable set of nodes is definable by a Conditional XPath node wff.

• Theorem 2 Every first order definable binary relation is definable by a Conditional XPath path wff.

• Corollary Every FO relation φ(x1, . . . , xn) is equivalent to a union of conjunctive queries consisting of atoms of the form xi path wff xj.

Difference between the two theorems

• Theorem 1 is about node wffs and unary relations. Theorem 2 is about path wffs and binary relations.

• Theorem 2 implies Theorem 1, but not conversely.

• Node wffs have much stronger operators (and, not, bounded quantification).

• Path wffs only have “until”, concatenation and union.

XML document

An XML document can be seen as a finite, node labelled, sibling ordered, unbounded tree.

(Nb. We abstract away from the “data details” and only focus on the skeleton of an XML document.)

Design Constraints

• Stay as close as possible to the existing W3C standard XPath.

• This means:

– no (first or second order) variables.

– express sets of nodes (answer sets) and relations between nodes (paths).

– relations should be “drawable” (use only the regular expression operators)

Navigational XPath

We can give W3C XPath 1.0 a PDL-like definition:

step ::= child | parent | right | left.

path wff ::= step | step+

| ?node wff

| path wff ; path wff | path wff ∪ path wff.

node wff ::= p | ⟨path wff⟩ | ¬node wff | node wff ∧ node wff.

• Note the very restricted use of (·)+!

• We use ⟨path wff⟩ to mean “I start a path wff”.

Modal Logic, XPath and XML, ten Cate Workshop, February 2005.

Examples of Navigational XPath expressions

(1) ¬⟨parent⟩

(2) ¬⟨child⟩

(3) ⟨child;?first;right;?last⟩

(4) root ∧ ¬⟨child;?(¬leaf ∧ ¬⟨child;?first;right;?last⟩)⟩

Not expressible in this version of XPath are

(child;?q)*;child;?p      until p, q holds (as a relation)

(child;child)*;?leaf      the relation of being an even number of steps above a leaf

Results (from the modal logic literature)

1. The node wffs form a modal language, created by Blackburn, Meyer-Viol and de Rijke in the 90’s. The logic is finitely axiomatizable.

2. The SAT problem is EXPTIME-hard (from Fischer-Ladner 79 for PDL).

3. The SAT problem is decidable in EXPTIME (from Vardi, Wolper 86 for PDL with converse).

4. This language is expressively complete w.r.t. first order logic in two variables (use the result for the line of Etessami, Vardi, Wilke 97).

5. Cf. ACM SIGMOD Record March 2005.

Conclusion

• The W3C standard is a well-designed language. Its designers have reinvented a wheel which has been shown to possess very good properties.

• Still, the expressive completeness is not completely satisfactory. (Note that W3C XPath is not complete for two variable FO for paths.)

• Conditional XPath is excellent for expressing first order queries.

• Implementing the conditional axis is still open (special staircase joins?)