
Self Maintenance of materialized XML views with non-cooperative data sources

DBDBD – 2006

Virginie Sans –ETIS/CNRS Laboratory– MIDI Team

Summary

  • Issue and context
    • Pre-requisite
    • The issue
    • Context
    • State of the art
  • Contributions
    • View computation with the XAlgebra
    • Detection and identification of source updates
    • View maintenance
    • Applications and performances
  • Conclusion
Mediation architecture

  • Introduced by Wiederhold
  • The architecture
    • Mediator
    • Wrappers
    • Sources
    • Query language

1.1 Pre-requisite

Mediation architecture

  • Mediator
    • Handles the user request: canonization, atomization
    • Sends each atomic request to a source via its wrapper
  • Wrappers
    • Translate a query coming from the mediator into a query in the native language of the web source
    • Return the answer to the mediator in XML
  • Data sources
    • Heterogeneous
    • Distributed
    • In a web context: partially unavailable

[Figure: Mediator ↔ Wrapper (atomic request / XML) ↔ SQL source (SQL / tuples)]
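The division of labour between mediator and wrapper can be sketched in Python. This is only an illustration under stated assumptions, not the system presented in the talk: the class names `Wrapper` and `Mediator`, the shape of an atomic request (a field and an upper bound), the in-memory SQLite source, and the `person` table are all hypothetical.

```python
import sqlite3
import xml.etree.ElementTree as ET


class Wrapper:
    """Hypothetical wrapper: translates an atomic request to SQL, answers in XML."""

    def __init__(self, connection, table, columns, row_tag):
        self.connection = connection
        self.table = table
        self.columns = columns
        self.row_tag = row_tag

    def ask(self, field, upper_bound):
        if field not in self.columns:
            raise ValueError(f"unknown field: {field}")
        # Translation into the native language of the source (here SQL).
        sql = (f"SELECT {', '.join(self.columns)} FROM {self.table} "
               f"WHERE {field} < ?")
        answer = ET.Element("answer")
        # Tuples returned by the source are re-tagged as XML for the mediator.
        for row in self.connection.execute(sql, (upper_bound,)):
            element = ET.SubElement(answer, self.row_tag)
            for name, value in zip(self.columns, row):
                ET.SubElement(element, name).text = str(value)
        return answer


class Mediator:
    """Hypothetical mediator: sends one atomic request per source."""

    def __init__(self, wrappers):
        self.wrappers = wrappers  # maps a source name to its wrapper

    def ask(self, atomic_requests):
        return {source: self.wrappers[source].ask(field, bound)
                for source, (field, bound) in atomic_requests.items()}


# Illustrative source: an in-memory SQLite database with made-up rows.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE person (nom TEXT, age INTEGER)")
db.executemany("INSERT INTO person VALUES (?, ?)",
               [("Durand", 25), ("Leclerc", 40)])

mediator = Mediator({"s1": Wrapper(db, "person", ["nom", "age"], "personne")})
answers = mediator.ask({"s1": ("age", 27)})
xml_answer = ET.tostring(answers["s1"], encoding="unicode")
```

The mediator never sees SQL or tuples; it only handles atomic requests and XML answers, which is what lets heterogeneous sources sit behind the same interface.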

Views

  • What about views?
    • Data integration
    • Access control, security
    • Data warehouses
  • Why?
    • Interoperability
    • Heterogeneous data
  • Materializing views
    • Fast access to complex queries
    • Better availability
    • Request optimization

[Figure: a mediator holding materialized views, on top of three wrappers over RDB, SQL, and HTML sources]

Issue: View maintenance

When data sources are updated, the view must be kept consistent with them.

Maintenance process

  • Recomputation
    • Recompute the whole view from scratch
  • Incremental maintenance
    • Compute the changes to the view in response to changes to the base sources

[Figure: an update takes the source from t to t+1; the view at t+1 is obtained either by recomputation from the source at t+1 or by incremental maintenance of the view at t]

1.2 Issue
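The difference between the two maintenance strategies can be shown with a small Python sketch. It assumes a very simple view (titles of books cheaper than a threshold) over a list of `(title, price)` rows; the function names and the inserted book are hypothetical, and only the two priced titles come from the slides.

```python
def compute_view(source, max_price):
    """Recomputation: rebuild the whole view from the source."""
    return [title for title, price in source if price < max_price]


def maintain_view(view, inserted, deleted, max_price):
    """Incremental maintenance: apply only the source changes to the view."""
    updated = list(view)
    for title, price in deleted:
        if price < max_price and title in updated:
            updated.remove(title)
    for title, price in inserted:
        if price < max_price:
            updated.append(title)
    return updated


source_t = [("Advanced Programming in the Unix environment", 65.95),
            ("Data on the Web", 39.95)]
view_t = compute_view(source_t, max_price=60)

# Update between t and t+1: one book removed, one (made-up) book added.
deleted = [("Data on the Web", 39.95)]
inserted = [("A hypothetical cheap book", 20.0)]
source_t1 = [row for row in source_t if row not in deleted] + inserted

# Both strategies must agree on the view at t+1.
view_t1 = maintain_view(view_t, inserted, deleted, max_price=60)
```

Incremental maintenance only touches the rows in the delta, but it needs to know what the delta is, which is exactly what non-cooperative sources do not provide.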

Context: semi-structured XML data

<bib>
  <book>
    <price> 65.95 </price>
    <title> Advanced Programming in the Unix environment </title>
  </book>
  <book>
    <title> TCP/IP Illustrated </title>
  </book>
  <book>
    <price> 65.95 </price>
    <title> Advanced Programming in the Unix environment </title>
  </book>
  <book>
    <price>39.95</price>
    <title> Data on the Web </title>
    <title> Données sur le Web </title>
  </book>
</bib>

  • XML views are materialized at the mediator level
  • Hierarchical data
  • No schema, except the one given by the query

1.3 Context

Context: XQuery

  • XQuery
  • Dedicated to XML data
    • Relational operators (projection, selection, join, union, …)
    • XML operators (tagging, unnesting, aggregation, …)
  • FLWOR syntax (pronounced "flower")

FLWOR syntax:

for $var in forest [, $var in forest]*
let $var := subtree
where condition
return result

Example:

<result>
{
  for $b in doc("bib.xml")/bib/book
  let $a := $b/author
  where $b/price/text() < 60
  order by $b/year
  return
    <cheap_book>
      { $b/title }
    </cheap_book>
}
</result>
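To make the semantics of the FLWOR example concrete, here is a rough Python equivalent using the standard `xml.etree.ElementTree` module. It is a sketch, not an XQuery engine: the function name `cheap_books` is hypothetical, the document is inlined rather than read from `bib.xml`, and the `order by` clause is omitted because the sample books carry no `year` element.

```python
import xml.etree.ElementTree as ET

BIB = """
<bib>
  <book><price> 65.95 </price><title>Advanced Programming in the Unix environment</title></book>
  <book><title>TCP/IP Illustrated</title></book>
  <book><price>39.95</price><title>Data on the Web</title></book>
</bib>
"""


def cheap_books(bib_xml, max_price=60.0):
    bib = ET.fromstring(bib_xml)
    result = ET.Element("result")
    for book in bib.findall("book"):              # for $b in .../bib/book
        price = book.findtext("price")            # $b/price/text()
        if price is None or float(price) >= max_price:
            continue                              # where $b/price/text() < 60
        cheap = ET.SubElement(result, "cheap_book")
        for title in book.findall("title"):       # return $b/title
            ET.SubElement(cheap, "title").text = title.text.strip()
    return result


titles = [t.text for t in cheap_books(BIB).iter("title")]
```

A book without a `price` element is skipped, mirroring the XQuery comparison, which is false when one operand is the empty sequence.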

Context: Other specificities

  • Views are computed using the XAlgebra
    • See "View computation"
  • Wrappers have limited resources
    • Few computation possibilities
    • A component named the logger stores the last modification date and a checksum of each source
  • Non-cooperative web sources
    • No information about their updates
    • Not always available
    • Not enough granularity

State of the art (1/2)

  • Relational views
    • Not fit for semi-structured data
  • Abiteboul et al.
    • OEM (Object Exchange Model)
    • LOREL language
    • Some operators are missing
  • VOX – Rainbow Team
    • Needs to know the exact position in the XML tree where the update was made

1.4 State of the art

State of the art (2/2)

  • Cobena et al.
    • XDiff: an algorithm for comparing XML files
    • Needs a copy of the source at the wrapper level
  • Bonnet et al. / Papadimos et al.
    • Parachute queries
    • Mutant query plans

What happens when sources are truly unavailable?

Our goal:

  • Reduce source accesses to a minimum
  • Use the information that is stored in the view

View maintenance: The process

  • View computation
    • An algebraic approach using the XAlgebra
    • Extension of the XAlgebra (identifiers)
  • Update detection
    • Compare the current information of the source with that stored in the logger
  • Update identification
    • Recovery process
    • Diff algorithm
  • View maintenance
    • Propagation rules for each operator

2.1 View computation

View computation

Steps: [figure not included in the transcript]

The XAlgebra data model

  • Operators:
    • XSource, XConstruct, XUnion, …
  • Data structures:
    • XRelation, XTuple, XAttributes

XSource operator: Step 1

  • XQuery analysis

for $f in doc("informations.xml")/personnes/personne
let $a := $f/nom
where $f/age < 27 and $a = "Durand"
return
  (<nom>{$a}</nom>,
   <prenom>{$f/prenom}</prenom>)

  • Path extraction:
    • Optional
    • Mandatory
    • Hidden
  • We obtain:
    • A context
    • A set of patterns

XSource operator: Steps 2 and 3

  • From XML subtrees to the tabular structure
    • 1 subtree => 1 XTuple
    • XRelation = set of XTuples
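The mapping from subtrees to a tabular structure can be sketched as follows. This is an assumption-laden illustration, not the XSource operator itself: the function name `xsource`, the use of plain dictionaries for XTuples, and the sample document are hypothetical; the context `personne` and the patterns `nom`, `prenom`, `age` are borrowed from the Step 1 query.

```python
import xml.etree.ElementTree as ET

SOURCE = """
<personnes>
  <personne><nom>Durand</nom><prenom>Marcel</prenom><prenom>Eric</prenom><age>25</age></personne>
  <personne><nom>Leclerc</nom><prenom>John</prenom></personne>
</personnes>
"""


def xsource(xml_text, context, patterns):
    """Return one tuple (a dict) per subtree matching `context`.

    Each attribute holds the list of text values found under one pattern,
    so repeated or missing elements are represented without a schema.
    """
    root = ET.fromstring(xml_text)
    relation = []
    for subtree in root.findall(context):
        xtuple = {p: [e.text for e in subtree.findall(p)] for p in patterns}
        relation.append(xtuple)
    return relation


xrelation = xsource(SOURCE, "personne", ["nom", "prenom", "age"])
```

Holding a list per attribute is one way to keep the nesting of semi-structured data (two `prenom` values, no `age`) inside an otherwise relational-looking row.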

XSource operator: Extending the algebra

  • Adding identifiers: XTids

An XTid is a set of pairs:

{(idsource, idfragment), …}

View computation - XOperator

  • XProject
  • XJoin
    • XTid propagation: card(XTID) > 1 for some nodes
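A small Python sketch of how XTids might propagate through a join. The representation is an assumption: an XValue is written as a `(value, XTid)` pair, an XTid as a `frozenset` of `(idsource, idfragment)` pairs, and the helper names `xvalue` and `xjoin`, the second source, and its `ville` attribute are hypothetical.

```python
def xvalue(value, *xtids):
    """A value tagged with an XTid: a set of (idsource, idfragment) pairs."""
    return (value, frozenset(xtids))


def xjoin(left, right, key):
    """Join two relations on `key`; the joined value carries both XTids."""
    joined = []
    for lrow in left:
        for rrow in right:
            if lrow[key][0] == rrow[key][0]:
                row = {**lrow, **rrow}
                # The join value comes from both sources, so its XTid is
                # the union of the two original XTids.
                row[key] = (lrow[key][0], lrow[key][1] | rrow[key][1])
                joined.append(row)
    return joined


xr1 = [{"nom": xvalue("Durand", (1, 11)), "prenom": xvalue("Marcel", (1, 11))}]
xr2 = [{"nom": xvalue("Durand", (2, 4)), "ville": xvalue("Cergy", (2, 4))}]
view = xjoin(xr1, xr2, "nom")
```

After the join, the `nom` value has an XTid of cardinality 2 while `prenom` and `ville` keep a cardinality of 1, which is one reading of "card(XTID) > 1 for some nodes".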

Update detection and identification

  • Detection
    • Compare the current information of the source with that stored in the logger:
      • The last modification date
      • The checksum of the source
  • Identification
    • Partial recovery of the source information based on XTids
    • Comparison of the recovered XRelation with the updated source
    • Δ computation

2.2 Update detection and identification
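The detection step can be sketched with the standard `hashlib` module. The slides do not say which checksum is used or how the date and the checksum are combined; the sketch assumes SHA-256 and reports an update as soon as either piece of information differs. The function names are hypothetical.

```python
import hashlib


def fingerprint(content: bytes, last_modified: str):
    """What the logger keeps for a source: modification date and checksum."""
    return {"last_modified": last_modified,
            "checksum": hashlib.sha256(content).hexdigest()}


def source_has_changed(logged, content: bytes, last_modified: str) -> bool:
    """Detect an update by comparing the source with the logged fingerprint."""
    return fingerprint(content, last_modified) != logged


logged = fingerprint(b"<bib/>", "2006-01-01")
unchanged = source_has_changed(logged, b"<bib/>", "2006-01-01")
changed = source_has_changed(logged, b"<bib><book/></bib>", "2006-01-02")
```

Detection only says that something changed; finding out what changed is the job of the identification step.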

XRecover

  • Step 1: project XRv on the patterns of XR1
  • Step 2: filter the XTuple values
  • Step 3: re-order the XTuples
    • XTidUnnest: XTuples are unnested depending on their XTids
    • XTidNest: XTuples are nested by their XTids, then re-ordered
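One possible reading of Step 3 is sketched below in Python. The slides give no definition of XTidUnnest and XTidNest beyond one line each, so everything here is an assumption: XValues are `(value, set of XTid pairs)`, fragment identifiers are assumed to reflect the order of the fragments in the source, and the function names and sample view are hypothetical. Steps 1 and 2 are not covered.

```python
from collections import defaultdict


def xtid_unnest(view, source_id):
    """Emit one (fragment id, attribute, value) triple per XTid pair of the source."""
    triples = []
    for xtuple in view:
        for attribute, (value, xtids) in xtuple.items():
            for id_source, id_fragment in sorted(xtids):
                if id_source == source_id:
                    triples.append((id_fragment, attribute, value))
    return triples


def xtid_nest(triples):
    """Group triples by fragment id and return the tuples in fragment order."""
    grouped = defaultdict(dict)
    for id_fragment, attribute, value in triples:
        grouped[id_fragment][attribute] = value
    return [grouped[id_fragment] for id_fragment in sorted(grouped)]


# A view whose values remember which source fragments they came from.
view = [
    {"nom": ("Durand", {(1, 11), (2, 4)}), "prenom": ("Marcel", {(1, 11)})},
    {"nom": ("Leclerc", {(1, 3)}), "prenom": ("John", {(1, 3)})},
]
recovered = xtid_nest(xtid_unnest(view, source_id=1))
```

The recovery uses only what is stored in the view, which matches the stated goal of avoiding source accesses; it is partial because values that never reached the view cannot be recovered.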

Update identification: Comparison algorithm

  • Comparison of XR1(t+1) with XR(t)′
    • XR1(t+1) is the XRelation obtained by applying XSource to source 1 at t+1
    • XR(t)′ is the partial recovery of the XRelation of source 1 at t

Remark: XR1(t+1) can also be filtered using predicates before the comparison.

The Diff algorithm is based on Unix diff (Hunt & McIlroy).

The symbol being compared is the XTuple rather than the line.

Update identification: Diff algorithm

  • Delta with hunks:
    • Insert(pos; XTuple)
    • Delete(pos; XTuple)
    • Replace(pos; XTuple_old, XTuple_new)

Example:

Insert(2; { {Leclerc, Avide, {(1,3)}}, {John, Avide, {(1,3)}} })
Delete(4; { {Durand, Avide, {(1,11)}}, {Marcel, Avide, {(1,11)}}, {Eric, Avide, {(1,11)}} })
etc.
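A delta made of insert, delete, and replace hunks can be computed over sequences of tuples with Python's standard `difflib.SequenceMatcher`. Note the substitution: `difflib` does not implement the Hunt and McIlroy algorithm cited in the talk, it is used here only as a readily available stand-in that compares whole tuples rather than lines. The function name `xdelta`, the 0-based positions, and the sample rows are assumptions.

```python
from difflib import SequenceMatcher


def xdelta(old, new):
    """Return hunks describing how to turn `old` into `new`.

    Positions are 0-based indexes into `old`.
    """
    hunks = []
    matcher = SequenceMatcher(a=old, b=new, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "insert":
            hunks.append(("insert", i1, new[j1:j2]))
        elif tag == "delete":
            hunks.append(("delete", i1, old[i1:i2]))
        elif tag == "replace":
            hunks.append(("replace", i1, old[i1:i2], new[j1:j2]))
        # "equal" runs produce no hunk.
    return hunks


# Recovered relation at t versus relation extracted from the source at t+1.
recovered = [("Durand", "Marcel"), ("Martin", "Alain"), ("Petit", "Eric")]
updated = [("Durand", "Marcel"), ("Leclerc", "John"), ("Martin", "Alain")]
delta = xdelta(recovered, updated)
```

Because the elements are tuples, they must be hashable; an XTuple holding nested lists would need to be converted to nested tuples first.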

Maintenance rules: from delta to view maintenance

  • Case of a deletion: delete(pos; XTuple)

The deleted XTuple is associated with an XTid {(x)} of cardinality 1. Each XValue of the view carries a set of XTid pairs, noted XTID.

1) In each XValue such that x ∈ XTID, delete the pair x from XTID.

Example: the XTuple whose XTid is x = (1,3) has been deleted. The XValue {Alain} with XTID {(1,3), (1,4)} becomes the XValue {Alain} with XTID {(1,4)}.

2) Delete each XValue such that card(XTID) = 0.

Example: if the XValue {Alain} with XTID {(1,3)} is left with an empty XTID, the XValue is deleted entirely.

3) If the deleted XValue was involved in the predicate, delete the whole XTuple (join and restriction case).

2.3 View maintenance
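The three deletion rules can be sketched in Python as follows. The data layout is an assumption: an XTuple is a dictionary from attribute name to an XValue written as `(value, set of XTid pairs)`, and `predicate_attributes` names the attributes involved in a join or restriction predicate. The function name and the `age` attribute are hypothetical; the `{Alain}` example comes from the slide.

```python
def propagate_delete(view, deleted_xtid, predicate_attributes=()):
    """Apply the deletion of the source XTuple identified by `deleted_xtid`."""
    maintained = []
    for xtuple in view:
        kept = {}
        drop_tuple = False
        for attribute, (value, xtids) in xtuple.items():
            remaining = set(xtids) - {deleted_xtid}      # rule 1
            if remaining:
                kept[attribute] = (value, remaining)
            elif attribute in predicate_attributes:      # rule 3
                drop_tuple = True
            # rule 2: otherwise the XValue is simply not kept
        if kept and not drop_tuple:
            maintained.append(kept)
    return maintained


view = [{"prenom": ("Alain", {(1, 3), (1, 4)}), "age": ("25", {(1, 3)})}]
after = propagate_delete(view, (1, 3))
after_with_predicate = propagate_delete(view, (1, 3), predicate_attributes=("age",))
```

In the first call `Alain` survives with the single pair (1,4) and `age` disappears; in the second call `age` is part of the predicate, so the whole XTuple leaves the view.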

Maintenance rules: from delta to view maintenance

  • Case of an insertion: insert(pos; XTuple)

1) A new XTid is created.

Goal: preserve the order of the XTuples for a later recovery.

2) Depending on the operator, we obtain different maintenance instructions:

  • Projection: insert the projection of the XTuple
  • Select: the XTuple satisfies the predicate ⇒ insertion
  • Join XR1 * XR2: compute XT = XTuple * XR2; if XT ≠ ∅ ⇒ insertion of XT
  • Union and Intersect: duplicates are preserved
    • Union: handled as a Select whose predicate is always true
    • Intersect: handled as a Join

Depending on the predicate, we can query either XR2 or its recovery.
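The per-operator insertion rules can be sketched in Python. This is a simplified illustration: XTuples are plain dictionaries without XTids, the join is an equality join on one attribute, and the function names and sample data are hypothetical.

```python
def insert_project(view, xtuple, attributes):
    """Projection: insert the projection of the new XTuple."""
    view.append({a: xtuple[a] for a in attributes if a in xtuple})


def insert_select(view, xtuple, predicate):
    """Select: insert the new XTuple only if it satisfies the predicate."""
    if predicate(xtuple):
        view.append(xtuple)


def insert_join(view, xtuple, xr2, key):
    """Join XR1 * XR2: compute XT = xtuple * XR2 and insert it if non-empty."""
    xt = [{**xtuple, **other} for other in xr2 if other[key] == xtuple[key]]
    view.extend(xt)  # extending by an empty XT leaves the view unchanged
    return xt


new = {"nom": "Leclerc", "prenom": "John", "age": 30}

projected, selected, unioned, joined = [], [], [], []
insert_project(projected, new, ["nom"])
insert_select(selected, new, lambda t: t["age"] < 27)
insert_select(unioned, new, lambda t: True)   # union as an always-true select
insert_join(joined, new, [{"nom": "Leclerc", "ville": "Cergy"}], "nom")
```

Only the join needs data beyond the inserted XTuple itself: `xr2` stands for either the second source or its recovery, which is where the question of missing information arises.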

Maintenance rules: from delta to view maintenance

  • Case of a modification: Replace(pos; XTuple_old, XTuple_new)

An XTuple modification is either an XValue modification, or an XValue deletion followed by an insertion.

  • Project and Union: modification of the XValues concerned
  • Select and Intersect:
    • If the modification applies to an XValue that must satisfy the condition: deletion of the XTuple
    • Otherwise: modification of the XValues
    • Intersect is handled like Select
  • Join: deletion followed by insertion

Maintenance rules: missing information

  • Missing information (join?)
    • Source recovery
    • Multi-view strategy
    • Source request

Goal: limit access to the sources!

Example:

  • View = S1 * S2
  • The XTuple x is inserted in S1
  • Insertion into the view: x * S2′
  • This requires the computation of S2′

[Figure: a mediator holding materialized views, on top of two wrappers over an HTML source and an SQL source]

Applications

  • On the web
    • When the necessary sources are unavailable
    • Goal: limit access to them
  • With sensors (ANR project)
    • With wireless sensors
    • Goal: preserve power resources

2.4 Applications and performances

Performance

  • Comparison between XRecover and recomputation

Contributions

  • Maintenance process in the context of non-cooperative web sources
  • Contribution to the XAlgebra
    • New operators: XRecover, XTidUnnest, XTidNest
    • New data structure: XTids
  • Future work
    • Order-sensitive view maintenance
    • A better Diff algorithm

Conclusion