XML Constraints: Specification, Analysis, and Applications

Wenfei Fan

School of Informatics, University of Edinburgh

&

Network Data and Services Research, Bell Laboratories

Outline
  • XML Specifications: types and integrity constraints
  • Specification of XML constraints:
    • keys, foreign keys, FDs
    • absolute vs. relative constraints
  • Analysis of XML constraints
    • Consistency analysis
    • Implication analysis
  • Applications of XML constraints, and research issues
    • Relational storage of XML data via constraint propagation
    • Schema-directed XML integration
    • Normal forms, query optimization, updates, data cleaning . . .
Introduction to XML specification
  • XML Specification:
    • types
    • integrity constraints
    • the need for XML constraints
XML data - an example

[Figure: example XML tree rooted at db; node labels include province, capital, city, @name, and @inProvince, with text values “Limburg”, “Hasselt”, and “others”.]

Rooted, node-labeled tree

  • elements: db, province, capital, city; an element may have subelements (a subtree, i.e., a sub-document), e.g., the capital child of province
  • @attributes: @name, @inProvince, carrying text
  • text nodes, e.g., “Hasselt”
XML specification: DTD (type)

[Figure: the running example XML tree (root db with province, city, and capital elements).]
  • Production: constrains the subelement list of each element

<!ELEMENT db (province+, capital+)>

<!ELEMENT province (city*, capital)>

  • Attributes: uniquely identified by name for each element, unordered

province: @name, capital: @inProvince
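A minimal sketch of what the two productions above require, assuming each content model is translated by hand into a regular expression over the sequence of child tags. The helper `conforms` and the two sample documents are hypothetical illustrations, not a DTD validator; element types without a listed production (city, capital) are left unconstrained here.

```python
import re
import xml.etree.ElementTree as ET

# Content models from the slide's DTD, hand-translated into regular
# expressions over space-terminated child tag names (not a DTD parser).
CONTENT_MODELS = {
    "db": r"(province )+(capital )+",   # (province+, capital+)
    "province": r"(city )*(capital )",  # (city*, capital)
}

def conforms(elem):
    """Return True if elem and all its descendants match their content model."""
    model = CONTENT_MODELS.get(elem.tag)
    if model is not None:
        children = "".join(child.tag + " " for child in elem)
        if re.fullmatch(model, children) is None:
            return False
    return all(conforms(child) for child in elem)

# Hypothetical documents: the second one has a province without a capital
# and a db without any capital child.
good = ET.fromstring(
    '<db><province name="Limburg"><city/><capital inProvince="Limburg"/></province>'
    '<capital inProvince="Limburg"/></db>'
)
bad = ET.fromstring('<db><province name="Limburg"><city/></province></db>')

print(conforms(good), conforms(bad))
```

The regular-expression encoding works because a DTD content model is itself a regular expression over element names.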

XML specification: integrity constraints

[Figure: the running example XML tree (root db with province, city, and capital elements).]

Keys and foreign keys (vs. relational constraints):

  • key: the value of a @name uniquely identifies a province

province.@name → province

capital.@inProvince → capital

  • FK: @inProvince of a capital references @name of a province

capital.@inProvince ⊆ province.@name

XML specification
  • A type (DTD) D
  • A set of integrity constraints, Σ

Example:

  • DTD D: structure of the document, vs. types in a PL

<!ELEMENT db (province+, capital+)>

<!ELEMENT province (city*, capital)>

province.@name, capital.@inProvince

  • Constraints Σ: defined in terms of data values across elements

province.@name → province

capital.@inProvince → capital

capital.@inProvince ⊆ province.@name

Why XML constraints?

Supported by W3C XML standard, XML Schema

In databases (supported by SQL standard), constraints are:

  • an essential part of the semantics of data,
  • fundamental to conceptual design,
  • useful for choosing efficient storage and access methods,
  • central to update anomaly prevention, …

In the XML setting: constraints have proved useful in

  • database storage of XML data (via constraint propagation),
  • schema-directed database publishing/integration in XML,
  • XML query optimization and formulation,
  • design theory for XML specifications: normal forms
  • data cleaning, …
Data exchange on the Web: XML publishing

All members of a community (or industry) agree on a schema and exchange data w.r.t. the schema: e-commerce, health-care, ...

Schema-Directed XML Publishing/Integration:

  • mapping data from traditional database to XML
  • satisfying the predefined DTD and constraints

[Figure: databases DB1 and DB2, an XML view Q, and XML documents exchanged over the Web, subject to the DTD and constraints.]

Data exchange on the Web: XML shredding

XML shredding:

  • mapping XML data to relations
  • relational design: normalization via constraint propagation from XML to relations
    • optimal relational storage of XML data
    • semantic connection: query/update optimization

[Figure: XML documents from the Web are shredded into relational databases DB1 and DB2; XML keys are propagated to relational FDs.]

XML constraints
  • Specification of XML constraints:
    • keys, foreign keys, FDs
    • absolute vs. relative constraints
Absolute constraints

[Figure: the running example XML tree (root db with province, city, and capital elements).]

Absolute keys and foreign keys are to hold on the entire document.

province.@name → province

capital.@inProvince → capital

capital.@inProvince ⊆ province.@name

Extensions of relational counterparts

Absolute keys and foreign keys [PODS’00, 01]
  • key: τ[X] → τ. An XML document satisfies the key iff

∀ x, y ∈ ext(τ) ( ⋀ l ∈ X (x.l = y.l) → x = y )

  • foreign key (FK): a combination of an inclusion constraint τ1[X] ⊆ τ2[Y], and a key τ2[Y] → τ2.

A document satisfies the FK iff it satisfies the key and

∀ x ∈ ext(τ1) ∃ y ∈ ext(τ2) (x[X] = y[Y])

    • τ, τ1, τ2: element types; X, Y: sets (lists) of attributes;
    • ext(τ): the set of τ elements in an XML document.

Equality issue:

  • (string) value equality: when comparing attributes
  • node identity: when comparing XML elements

Unary keys and foreign keys: defined in terms of a single attribute.
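The definitions above can be sketched as a direct check over a parsed document. This is an illustrative reading, assuming attributes are compared by string value and elements by node identity; the function names `ext`, `satisfies_key`, `satisfies_fk` and the sample document are hypothetical, and a missing attribute is simply treated as the value None.

```python
import xml.etree.ElementTree as ET

def ext(root, tau):
    """ext(tau): all elements of type tau in the document."""
    return list(root.iter(tau))

def satisfies_key(root, tau, attrs):
    """tau[X] -> tau: no two distinct tau elements agree on all attributes in X."""
    seen = set()
    for x in ext(root, tau):
        value = tuple(x.get(a) for a in attrs)
        if value in seen:
            return False
        seen.add(value)
    return True

def satisfies_fk(root, tau1, xs, tau2, ys):
    """tau1[X] subset-of tau2[Y], together with the key tau2[Y] -> tau2."""
    if not satisfies_key(root, tau2, ys):
        return False
    targets = {tuple(y.get(a) for a in ys) for y in ext(root, tau2)}
    return all(tuple(x.get(a) for a in xs) in targets for x in ext(root, tau1))

# Hypothetical document shaped like the running example: one province with
# its own capital child, plus one capital child of the root.
doc = ET.fromstring(
    '<db>'
    '<province name="Limburg"><capital inProvince="Limburg"/></province>'
    '<capital inProvince="Limburg"/>'
    '</db>'
)

print(satisfies_key(doc, "province", ["name"]))        # province.@name -> province
print(satisfies_key(doc, "capital", ["inProvince"]))   # capital.@inProvince -> capital
print(satisfies_fk(doc, "capital", ["inProvince"], "province", ["name"]))
```

In this document the province key and the foreign key hold, while the capital key fails because two capital elements carry the same @inProvince value.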

Relative constraints [WWW’01, PODS’02]

An XML tree specifies countries, provinces, province capitals.

  • What is a key for a province?
  • What does @inProvince of a capital reference?

[Figure: XML tree rooted at db with two country elements (@name “Belgium” and “Holland”), each containing province and capital elements; a province named “Limburg” occurs under both countries, and the capitals shown are “Hasselt” and “Maastricht”.]

Examples of relative constraints

Relative constraints: on a subdocument rooted at a country:

key: country (province.@name → province)

country (capital.@inProvince → capital)

FK: country (capital.@inProvince ⊆ province.@name)

Absolute: on the entire document: country.@name → country

[Figure: the country/province/capital XML tree (root db with country elements “Belgium” and “Holland”).]
Relative keys and foreign keys
  • key: τ(τ1[X] → τ1). A document satisfies the key iff

∀ c ∈ ext(τ) ∀ y, z ∈ ext(τ1)

( (y ≺ c) ∧ (z ≺ c) ∧ ⋀ l ∈ X (y.l = z.l) → y = z )

  • foreign key (FK): τ(τ1[X] ⊆ τ2[Y]) and a key τ(τ2[Y] → τ2).

A document satisfies the FK iff it satisfies the key and

∀ c ∈ ext(τ) ∀ y ∈ ext(τ1) ( (y ≺ c) →

∃ z ∈ ext(τ2) ( (z ≺ c) ∧ y[X] = z[Y] ) )

where

  • (y ≺ c): y is a descendant of c (y in the subtree rooted at c);
  • τ: context type;
  • ext(τ): the set of τ elements in an XML document.
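A minimal sketch of the relative-key semantics above, assuming the same string-valued attribute comparison as for absolute keys. The function name `satisfies_relative_key` and the sample document are hypothetical; passing the root type as the context recovers the absolute case.

```python
import xml.etree.ElementTree as ET

def satisfies_relative_key(root, context, tau1, attrs):
    """context(tau1[X] -> tau1): within each context element c, no two
    distinct tau1 descendants of c agree on all attributes in X."""
    for c in root.iter(context):
        seen = set()
        for y in c.iter(tau1):
            if y is c:  # Element.iter also yields c itself; keep proper descendants
                continue
            value = tuple(y.get(a) for a in attrs)
            if value in seen:
                return False
            seen.add(value)
    return True

# Hypothetical document: the province name "Limburg" occurs in two countries.
doc = ET.fromstring(
    '<db>'
    '<country name="Belgium"><province name="Limburg"/></country>'
    '<country name="Holland"><province name="Limburg"/></country>'
    '</db>'
)

# Relative to each country the key holds; relative to the root (db) it fails.
print(satisfies_relative_key(doc, "country", "province", ["name"]))
print(satisfies_relative_key(doc, "db", "province", ["name"]))
```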
Relative vs. Absolute
  • Absolute constraints are a special case of relative ones:

country.@name → country is equivalent to db (country.@name → country)

absolute: a fixed context type -- the root type r

  • Absolute constraints are scoped within the entire document, whereas relative ones are scoped within the context of a subdocument.

country (province.@name → province)

country (capital.@inProvince → capital)

country (capital.@inProvince ⊆ province.@name)

country.@name → country

Together they specify constraints on the entire document

  • Beyond relational constraints; important for hierarchically structured data: XML, scientific databases, biomedical data, ...
Define keys with path expressions
  • XML data is hierarchically structured!

“name” as a key for employees of companies only: target set is identified with a path expression: //company//employee

  • XML data is semistructured: it may not have a DTD/schema!
    • key paths may be missing or have multiple occurrences

key specification should be independent of types

[Figure: XML tree rooted at db with company, government, and university subtrees; node labels include dept, employee, name, firstName, lastName, and @id.]

Absolute path constraints [WWW’01]

Absolute key: (Q, {P1, . . ., Pk} )

  • Path expressions Q, Pi: XPath, regular path expressions, …
  • target path Q: to identify a target set [[Q]] of nodes on which the key is defined (vs. relation)
  • a set of key paths {P1, . . ., Pk}: to provide an identification for nodes in [[Q]] (vs. key attributes)
  • semantics: for any two nodes in [[Q]], if they have all the key paths and agree on them by value equality (existential), then they must be the same node (value equality and node identity)

Examples:

(//company//employees, {name, phone}) -- composite key

(//company//employees, {//@id}) -- multiple keys

(//., {@id}) -- capturing ID attributes in DTDs
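The key semantics above (two target nodes that share a value on every key path must be the same node) can be sketched as follows. Assumptions: key paths are restricted to a child tag or an attribute written '@a', target paths use the limited XPath subset of Python's ElementTree (hence the leading '.'), and the names `values`, `satisfies_path_key` and the sample data are hypothetical.

```python
import xml.etree.ElementTree as ET
from itertools import combinations

def values(node, key_path):
    """Values reached from node via a key path: '@a' (attribute) or a child tag."""
    if key_path.startswith("@"):
        v = node.get(key_path[1:])
        return set() if v is None else {v}
    return {(child.text or "").strip() for child in node.findall(key_path)}

def satisfies_path_key(root, target_path, key_paths):
    """(Q, {P1..Pk}): no two distinct nodes in [[Q]] share a value on every Pi."""
    # De-duplicate by node identity in case a path yields the same node twice.
    targets = list({id(n): n for n in root.findall(target_path)}.values())
    for n1, n2 in combinations(targets, 2):
        if all(values(n1, p) & values(n2, p) for p in key_paths):
            return False
    return True

# Hypothetical data: employees of a company and of a university.
doc = ET.fromstring(
    '<db>'
    '<company>'
    '<employee id="1"><name>Ann</name><phone>100</phone></employee>'
    '<employee id="2"><name>Ann</name><phone>200</phone></employee>'
    '</company>'
    '<university>'
    '<employee id="1"><name>Ann</name><phone>100</phone></employee>'
    '</university>'
    '</db>'
)

print(satisfies_path_key(doc, ".//company//employee", ["name", "phone"]))
print(satisfies_path_key(doc, ".//company//employee", ["name"]))
print(satisfies_path_key(doc, ".//employee", ["@id"]))
```

A node that lacks one of the key paths yields an empty value set, so it never participates in a violation; this mirrors the "if they have all the key paths" condition.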

Relative path constraints [WWW’01]

Relative key: (Q, K)

  • path Q identifies a set [[Q]] of nodes, called the context path;
  • K = (Q’, {P1, . . ., Pk} ) is a key on sub-documents rooted at nodes in [[Q]] (relative to Q).

Example. (//country, (province, {@capital}))

(//country, {@name}) -- absolute key

  • Absolute keys are a special case of relative keys:

(Q, K) when Q is the empty path

  • Similarly for foreign keys

Specification of XML constraints is more involved than its relational counterparts

Keys and foreign keys in XML Schema

key: (Q, {P1, . . ., Pk} )

  • Path expressions Q, Pi: fragments of XPath
  • Uniqueness and existence: for each node x in [[Q]] and each i in [1, k], there exists a unique node yi reached via Pi, and yi is either a text node or an attribute

Foreign keys: (Q, {P1, . . ., Pk} ) ⊆ (S, {S1, . . ., Sk} )

  • (S, {S1, . . ., Sk} ) is a key
  • Uniqueness and existence: both Pi and Si

The uniqueness and existence condition complicates the consistency and implication analyses

Absolute constraint

Other constraints for XML

Functional dependencies: {P1, . . ., Pk} → {S1, . . ., Sk}

  • Generalizations of relational FDs – for deriving an extension of relational-schema normal forms
  • Absolute constraints [Arenas and Libkin, PODS’02]

XIGs: ∀x1 … ∀xn ( B(x1, …, xn) → ∨ (i ∈ [1, l]) (∃y1 … ∃yk Ci(x1, …, xn, y1, …, yk)) )

  • Generalization of relational embedded constraints
  • B, Ci: conjunction of simple XPath expressions
  • Subsuming relative keys and foreign keys (Deutsch and Tannen, [KRDB’01])
Constraint analysis
  • Analysis of XML constraints
    • Consistency analysis
    • Implication analysis
    • Absolute, relative, path-expression constraints
Consistency of XML specifications

Given D: a DTD

Σ: a set of integrity constraints over D

Consistency: Is there an XML document that both conforms to D and satisfies Σ?

One wants to know whether XML specifications make sense!

Run-time check: attempt to validate documents with (D, Σ).

This would not tell us whether repeated failures are due to a bad specification or to problems with the documents.

Hence static analysis is desirable.

An inconsistent specification

The specification with D and Σ is inconsistent!

  • DTD D:

<!ELEMENT db (province+, capital+)>

<!ELEMENT province (city*, capital)>

province.@name, capital.@inProvince

  • Constraints Σ:

province.@name → province

capital.@inProvince → capital

capital.@inProvince ⊆ province.@name

In contrast, one can specify keys and foreign keys in SQL without worrying about their consistency with the schema.

Cardinality constraints by keys, foreign keys

Constraints Σ:

province.@name → province

capital.@inProvince → capital

capital.@inProvince ⊆ province.@name

Notation:

  • ext(τ): the set of τ elements in an XML document
  • ext(τ.l): the set of l attribute values of all τ elements

|ext(province.@name)| = |ext(province)|

|ext(capital.@inProvince)| = |ext(capital)|

|ext(capital.@inProvince)| ≤ |ext(province.@name)|

⇒ |ext(capital)| ≤ |ext(province)|

Cardinality constraints imposed by DTDs

DTD D: <!ELEMENT db (province+, capital+)>

<!ELEMENT province (city*, capital)>

Variables:

  • Xprovince: the number of province elements under the root
  • Xcapital: the number of capital subelements of the root
  • Ycapital: the number of capital subelements of province elements

Xprovince ≥ 1, Xcapital ≥ 1

|ext(province)| = Xprovince, Xprovince = Ycapital

|ext(capital)| = Xcapital + Ycapital

|ext(capital)| > |ext(province)|

The interaction

[Figure: the running example XML tree (root db with province, city, and capital elements).]

Contradiction:

  • From the constraints Σ: |ext(capital)| ≤ |ext(province)|
  • From the DTD D: |ext(capital)| > |ext(province)|

Thus there exists NO XML document that both conforms to D and satisfies Σ.
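The contradiction can be illustrated with a small bounded search over the counting variables of the DTD. This is only a sanity check under an arbitrary bound, not a proof and not the decision procedure from the cited papers; the function name `consistent_counts` is hypothetical. The algebraic reason is that |ext(capital)| = Xcapital + Xprovince with Xcapital ≥ 1, so it always exceeds |ext(province)| = Xprovince.

```python
def consistent_counts(bound):
    """Search for (X_province, X_capital) satisfying both families of
    cardinality constraints; returns the list of feasible pairs."""
    solutions = []
    for x_province in range(1, bound + 1):     # province+ : at least one
        for x_capital in range(1, bound + 1):  # capital+  : at least one
            y_capital = x_province             # each province has exactly one capital
            ext_province = x_province
            ext_capital = x_capital + y_capital
            # Keys plus the foreign key force |ext(capital)| <= |ext(province)|.
            if ext_capital <= ext_province:
                solutions.append((x_province, x_capital))
    return solutions

print(consistent_counts(50))
```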

Consistency analysis [PODS’01, 02]
  • Trivial for relational databases: given any schema and keys, foreign keys, one can always find a nonempty instance of the schema satisfying the constraints.
  • Hard for XML: XML specifications may not be consistent!
    • Both DTDs and constraints impose cardinality constraints
    • The interaction between these two classes of cardinality constraints is rather complicated.
Consistency analysis of XML constraints

Theorem: The consistency problem is

  • undecidable for multi-attribute absolute keys and foreign keys;
  • NP-complete for unary absolute keys and foreign keys, even for primary keys (primary: at most one key for each element type);
  • in NEXPTIME for primary multi-attribute absolute keys and unary foreign keys;
  • in NEXPTIME and PSPACE-hard for unary absolute regular keys and foreign keys (target path: β/τ, where β is a regular path expression and τ an element type; key paths: attributes);
  • undecidable for relative keys and foreign keys, even when all the constraints are unary and primary.

As opposed to the trivial analysis of the relational counterpart.

Some tractable cases
  • Restrictions on constraints.

Theorem: For multi-attribute relative keys only, the consistency problem is in linear time for arbitrary DTDs.

Recall relative keys: country (province.@name → province)

In contrast, due to the existence and uniqueness condition:

Theorem: It is intractable for unary keys alone in XML Schema.

  • Restrictions on DTDs:

Theorem: When the DTD is fixed, the consistency problem is in PTIME for absolute unary keys and foreign keys.

In practice, the DTD is designed once, but constraints are written in stages: they are added incrementally.

Implication analysis [PODS’00, 01, 02, DBPL’01]

Given D: a DTD

Σ: a set of constraints expressed in C

φ: a property (a constraint of C)

Implication (C): Is it the case that for any XML document, if it conforms to D and satisfies Σ, then it must satisfy φ?

C: a constraint language

The need for studying implication:

  • data integration: constraints checking at virtual views
  • optimization of XML queries and XML relational storage
  • design theory for XML specifications: normalization
Some complexity results for implication analysis

Theorem: The implication problem is

  • undecidable for multi-attribute absolute keys and foreign keys, and for unary relative keys and foreign keys;
  • PSPACE-hard for unary regular absolute keys and foreign keys;
  • coNP-complete for unary absolute keys and foreign keys;
  • coNP-hard for XML-Schema unary keys;
  • in linear time for absolute multi-attribute keys;
  • in PTIME for arbitrary absolute keys and foreign keys when the DTD is fixed; and
  • in PTIME for relative path keys in the absence of DTDs.

The analysis of XML constraints is far more intricate than its relational counterpart

Applications
  • Application of XML constraints, and open problems
    • Constraint propagation
    • Schema-directed XML integration
    • Normal form
    • Query rewriting/optimization
    • Update processing
    • Data cleaning
    • . . .
XML shredding: relational storage of XML data

XML shredding:

  • mapping XML data to relations
  • relational design: normalization
    • optimal relational storage of XML data
    • semantic connection: query/update optimization

[Figure: XML documents from the Web are shredded into relational databases DB1 and DB2; XML keys are propagated to relational FDs.]

Example: XML constraints

[Figure: XML tree rooted at db with book elements; node labels include isbn, title, chapter, section, number, and text, with values such as “XML”, “XPath”, “DTD”, “1”, “6”, and “10”.]
  • (//book, {isbn}) -- isbn is an (absolute) key of book
  • (//book, (chapter, {number})) -- number is a key of chapter relative to book
  • (//book, (title, {})) -- each book has a unique title

Mapping from XML to a predefined relation

[Figure: the book/chapter/section XML tree (root db with book elements).]

Predefined RDB: chapter(bookTitle, chapterNum, chapterTitle)

  • Mapping: for each book, extract its title, and the numbers and titles of all its chapters
  • Predefined relational key: (bookTitle, chapterNum)

Can the XML data be mapped to the RDB without violating the key?

A safe mapping

[Figure: the book/chapter/section XML tree (root db with book elements).]

Now change the relational schema to

RDB: chapter(isbn, chapterNum, chapterTitle)

The relation can be populated without any violation. Why?

The relational key (isbn, chapterNum) for chapter is implied (entailed) by the keys on the original XML data:

(//book, {isbn}), (//book, (chapter, {number})), (//book, (title, {}))
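The difference between the two predefined schemas can be tried out with an in-memory SQLite table. This is a minimal sketch: the sample document, the isbn values, the chapter title "Intro", and the helper `shred` are hypothetical, and isbn, title, and number are assumed to be child elements. The document satisfies the three XML keys, yet two books share the title “XML”.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical document satisfying the XML keys: isbn identifies a book,
# number identifies a chapter within its book, each book has one title.
doc = ET.fromstring(
    '<db>'
    '<book><isbn>111</isbn><title>XML</title>'
    '<chapter><number>1</number><title>XPath</title></chapter>'
    '<chapter><number>6</number><title>DTD</title></chapter>'
    '</book>'
    '<book><isbn>222</isbn><title>XML</title>'
    '<chapter><number>1</number><title>Intro</title></chapter>'
    '</book>'
    '</db>'
)

def shred(root, key_column):
    """Load chapter rows; key_column ('bookTitle' or 'isbn') is the first
    attribute of the relational key. Returns the row count, or None if the
    relational key is violated during loading."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        f"CREATE TABLE chapter ({key_column} TEXT, chapterNum TEXT, "
        f"chapterTitle TEXT, PRIMARY KEY ({key_column}, chapterNum))"
    )
    source = "title" if key_column == "bookTitle" else "isbn"
    try:
        for book in root.iter("book"):
            for ch in book.findall("chapter"):
                conn.execute(
                    "INSERT INTO chapter VALUES (?, ?, ?)",
                    (book.findtext(source), ch.findtext("number"), ch.findtext("title")),
                )
    except sqlite3.IntegrityError:
        return None
    return conn.execute("SELECT COUNT(*) FROM chapter").fetchone()[0]

print(shred(doc, "bookTitle"))  # key (bookTitle, chapterNum)
print(shred(doc, "isbn"))       # key (isbn, chapterNum)
```

With (bookTitle, chapterNum) the load fails on the second book's chapter 1; with (isbn, chapterNum) all three rows load, as the XML keys guarantee.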

Constraint Propagation [ICDE’03]
  • Input:
    • a set K of XML keys (context and target path: a fragment of XPath, key paths: attributes)
    • a predefined relational schema S,
    • a mapping f from XML to S (XPath, projection, join, union)
    • and a relational functional dependency FD over S
  • Output: is the FD propagated from K via f? I.e., does FD hold over the DB f(T) for any XML document T that satisfies K?

Theorem: The constraint propagation problem is in PTIME.

  • Checking the consistency of a predefined relational schema for storing XML data
  • XML schema/DTD is not required – K is the only semantics
Deriving relational schema for storing XML

[Figure: the book/chapter/section XML tree (root db with book elements).]

One wants to find a “good” relational schema to store:

chapter(isbn, bookTitle, author, chapterNum, chapterTitle)

What is a good schema? In normal form: BCNF, 3NF, …

  • Prevent update anomaly (the relational theory)
  • Efficient storage, query optimization …

But how to find a normalized design?

Constraint propagation and normalization

From the given XML keys:

(//book, {isbn}), (//book, (chapter, {number})), (//book, (title, {}))

one can derive functional dependencies:

isbn → bookTitle; isbn, chapterNum → chapterTitle

Normalize the relation by using these functional dependencies:

chapter(isbn, bookTitle, author, chapterNum, chapterTitle)

book(isbn, bookTitle),

chapter(isbn, chapterNum, chapterTitle),

author(isbn, author)

The new schema is in BCNF!
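The BCNF claim can be checked with the textbook attribute-closure test. This is a sketch of the standard relational technique, not of the propagation algorithm itself; the helper names `closure` and `is_bcnf` are hypothetical, and the subset enumeration is exponential, which is acceptable only for schemas this small.

```python
from itertools import combinations

def closure(attrs, fds):
    """Attribute closure of attrs under fds, a list of (lhs, rhs) set pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

def is_bcnf(schema, fds):
    """Every attribute set X that determines anything new inside the schema
    must determine the whole schema (i.e., be a superkey)."""
    attrs = sorted(schema)
    for r in range(1, len(attrs) + 1):
        for subset in combinations(attrs, r):
            x = set(subset)
            determined = closure(x, fds) & schema
            if determined != x and determined != schema:
                return False
    return True

# The two FDs derived from the XML keys on the slide.
FDS = [
    ({"isbn"}, {"bookTitle"}),
    ({"isbn", "chapterNum"}, {"chapterTitle"}),
]

universal = {"isbn", "bookTitle", "author", "chapterNum", "chapterTitle"}
decomposed = [
    {"isbn", "bookTitle"},
    {"isbn", "chapterNum", "chapterTitle"},
    {"isbn", "author"},
]

print(is_bcnf(universal, FDS))
print([is_bcnf(s, FDS) for s in decomposed])
```

The universal relation fails because isbn determines bookTitle without being a superkey; each of the three decomposed relations passes.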

Computing minimum cover of propagated FDs
  • Input: a set K of XML keys, and a mapping f from XML to a universal schema U
  • Output: a minimum cover F of all the functional dependencies (FDs) propagated from the XML keys K via f
    • F is a cover (a set of FDs): any FD propagated from K via f is implied by F
    • F is minimum: F contains no redundant FDs, i.e., any FD in F is not entailed by other FDs in F.

Theorem: There is a PTIME algorithm for computing a minimum cover of propagated FDs.

Normalize relational schema for storing/querying XML data!
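To make "F contains no redundant FDs" concrete, here is the textbook cover-minimization procedure for a set of relational FDs that is already given. It is an assumption-laden sketch, not the PTIME algorithm of the theorem, which must first derive the propagated FDs from the XML keys and the mapping; the function names and the redundant input set are hypothetical.

```python
def closure(attrs, fds):
    """Attribute closure of attrs under fds, a list of (lhs, rhs) set pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

def minimum_cover(fds):
    # 1. Split right-hand sides into single attributes.
    cover = [(frozenset(l), frozenset([a])) for l, r in fds for a in sorted(r)]
    # 2. Drop extraneous left-hand-side attributes.
    reduced = []
    for lhs, rhs in cover:
        for a in sorted(lhs):
            smaller = lhs - {a}
            if smaller and rhs <= closure(smaller, cover):
                lhs = smaller
        reduced.append((lhs, rhs))
    # 3. Drop duplicates, then FDs implied by the remaining ones.
    result = list(dict.fromkeys(reduced))
    for fd in list(result):
        rest = [g for g in result if g != fd]
        if fd[1] <= closure(fd[0], rest):
            result = rest
    return result

# Hypothetical input with a redundancy: bookTitle is also listed as
# determined by (isbn, chapterNum), which isbn -> bookTitle already implies.
fds = [
    ({"isbn"}, {"bookTitle"}),
    ({"isbn", "chapterNum"}, {"chapterTitle", "bookTitle"}),
]

for lhs, rhs in minimum_cover(fds):
    print(sorted(lhs), "->", sorted(rhs))
```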

Research issues

For general constraints/mapping languages: undecidable

  • if the mapping language is relationally complete (selection, projection, join, union, difference), even for XML keys alone
  • if both XML keys and foreign keys are considered, even for the identity “transformation”

Open:

  • To identify (a) practical mapping languages and (b) practical XML constraints that allow efficient constraint propagation
  • Constraint propagation from relations to XML
    • Information preserving (lossless) data exchange
    • Query/update rewriting/optimization
    • Overcoming incompleteness of source data (foreign keys)
XML publishing/integration

All members of a community (or industry) agree on a schema and exchange data w.r.t. the schema: e-commerce, health-care, ...

Schema-directed XML Publishing/Integration:

  • mapping data from traditional database to XML
  • satisfying the predefined DTD and constraints

[Figure: databases DB1 and DB2, an XML view Q, and XML documents exchanged over the Web, subject to the DTD and constraints.]

Schema-directed integration [SIGMOD’03]

[Figure: multiple, distributed source databases (DB) are integrated into an XML view that conforms to a DTD and constraints.]

  • Schema-directed: XML view conforming to a schema (D, Σ)
    • D: a DTD
    • Σ: a set of XML constraints (relative keys, foreign keys)
  • Attribute Integration Grammar (AIG)
    • DTD-directed view definition: recursive, nondeterministic
  • Inherited and synthesized attributes
    • Constraint compilation: automatically captures integrity constraints and DTD in a uniform framework
XML normal forms
  • Extensions of (nested) relational normal forms, via XML FDs
    • M. Arenas and L. Libkin. A Normal Form for XML Documents, [PODS 02]. XNFs, decomposition algorithms, complexity, …
    • M. Vincent, J. Liu and C. Liu. Strong functional dependencies and their application to normal forms in XML. [TODS 29(3), 2004]
    • X. Wu, T.W. Ling, S. Lee, M. Lee, G. Dobbie. NF-SS: A Normal Form for Semistructured Schema. [ER (Workshops) 2001]
  • Research issues
    • Implication analysis: more intriguing than relational FDs
    • Relative functional dependencies: hierarchical nature of XML
    • “Right” normal form: XML data is typically stored in RDBMS
      • redundancy often helps, e.g., performance and reliability
      • XML data is often “static”: update anomalies?
Run-time analysis: incremental constraint checking

Input: XML tree T, constraints Σ, update ∆T, where T satisfies Σ

Question: does (T + ∆T) satisfy Σ?

  • ∆X . Code generator: incremental checking. Lucent applications

M. Benedikt, G. Brun, J. Gibson, R. Kuss and A. Ng. Automated update management for XML integrity constraints. [PLANX’02]

  • Application of incremental techniques for attribute grammar

M. Abrao, B. Bouchou, M. Alves, D. Laurent, M. Musicante. Incremental Constraint Checking for XML Documents [XSym’04]

Research issues:

  • Complexity of incremental constraint checking
  • XML editors: broken link detection and repair
  • Incremental checking techniques for XML data stored in RDBMS
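The question "does (T + ∆T) satisfy Σ?" can be sketched for the simplest case: Σ is a single absolute key and ∆T inserts a subtree. This illustration maintains a hash index of key values so that the check touches only the inserted subtree; it is not the ∆X code generator or the attribute-grammar technique cited above, and the class `IncrementalKeyChecker`, the sample tree, and the province name "Liege" are hypothetical.

```python
import xml.etree.ElementTree as ET

class IncrementalKeyChecker:
    """Keeps an index of key values for the absolute key tau[X] -> tau over a
    tree T that is assumed to satisfy the key already."""

    def __init__(self, root, tau, attrs):
        self.tau, self.attrs = tau, attrs
        self.index = {self._value(e) for e in root.iter(tau)}

    def _value(self, elem):
        return tuple(elem.get(a) for a in self.attrs)

    def try_insert(self, parent, new_elem):
        """Append new_elem under parent only if T + delta-T still satisfies the key."""
        new_values = [self._value(e) for e in new_elem.iter(self.tau)]
        clash_inside = len(set(new_values)) < len(new_values)
        clash_with_tree = bool(self.index & set(new_values))
        if clash_inside or clash_with_tree:
            return False
        parent.append(new_elem)
        self.index.update(new_values)
        return True

root = ET.fromstring('<db><province name="Limburg"/></db>')
checker = IncrementalKeyChecker(root, "province", ["name"])

print(checker.try_insert(root, ET.Element("province", {"name": "Limburg"})))  # rejected
print(checker.try_insert(root, ET.Element("province", {"name": "Liege"})))    # accepted
print(len(root))
```

The cost of each check depends on the size of ∆T rather than of T, which is the point of incremental checking; deletions and foreign keys would need additional bookkeeping.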
Query rewriting and optimization

Query translation from XQuery to SQL: XML data stored in RDBMS

    • encode XIGs and XQuery in relational queries and constraints
    • extensions of chase and backchase

A. Deutsch and V. Tannen

    • Reformulation of XML Queries and Constraints [ICDT’03]
    • MARS: A System for Publishing XML from Mixed and Redundant Storage [VLDB’03]

R. Krishnamurthy, R. Kaushik, J. Naughton. Efficient XML-to-SQL Query Translation: Where to Add the Intelligence? [VLDB 2004]

Research issues:

  • Rewriting queries over (recursive security) views of XML data
  • Query optimization for (compressed) XML data in native store
Data cleaning

Input: XML tree T, constraints Σ, DTD D

Question: if T does not satisfy D + Σ, find a repair T’ such that (a) T’ satisfies D + Σ, and (b) the distance between T and T’ is minimal (update operations: insert, delete, modify)

  • G. Flesca, F. Furfaro, S. Greco, E. Zumpano. Repairs and Consistent Answers for XML Data with Functional Dependencies [XSym’03]

Research issues:

  • Effective techniques for repairing integrated XML data: conflicts and inconsistencies may emerge as violations of constraints.
    • Various constraint languages,
    • XML schema
  • Automated tools for repairing Web pages: broken links
Summary
  • Specification of XML constraints:
    • absolute vs. relative, path constraints: XML data is hierarchical and semi-structured
    • mild extensions of relational constraints are not sufficient
  • Consistency and implication analysis of XML constraints
    • DTDs interact with XML constraints
    • far more intricate than their relational counterparts
  • Applications of XML constraints
    • XML storage, query, update, integration, cleaning, …
    • many practical issues remain to be explored