Containment of partially specified tree pattern queries

Containment of Partially Specified Tree-Pattern Queries. Dimitri Theodoratos (NJIT, USA) Theodore Dalamagas (NTUA, GREECE) Pawel Placek (NJIT, USA) Stefanos Souldatos (NTUA, GREECE) Timos Sellis (NTUA, GREECE).


Containment of Partially Specified Tree-Pattern Queries

Dimitri Theodoratos (NJIT, USA)

Theodore Dalamagas (NTUA, GREECE)

Pawel Placek (NJIT, USA)

Stefanos Souldatos (NTUA, GREECE)

Timos Sellis (NTUA, GREECE)


Outline: Introduction, Data Model, Additional Concepts, Query Containment, Experiments, Conclusion


Motivating Example

[Figure: example tree rooted at r, with node labels GREECE, USA, ATHENS, NJ, YAMAHA, BMW, HONDA, ON-OFF, TRAVEL, SERROW, F650, F650GS, VARADERO, 125cc, 200cc, 650cc and 1000cc.]

  • Tree structure (e.g. XML) with motorbike spare parts.

  • We search for spare parts.

  • BUT…

Stefanos Souldatos - HDMS 2006


Motivating Example

[Figure: the motorbike spare-parts tree, annotated with a "?".]

  • Dimitri Theodoratos lives in NJ.

  • He has a Yamaha Serrow motorbike in Greece.

  • He searches for spare parts in Greece or USA.

    → structural difference

Motivating Example

[Figure: the motorbike spare-parts tree.]

  • Theodore Dalamagas has a BMW motorbike.

  • He looks for spare parts worldwide.

    → structural inconsistency

../650cc/F650GS

../F650GS/650cc

Motivating Example

[Figure: the motorbike spare-parts tree.]

  • Stefanos Souldatos has a Honda Varadero.

  • But, he is not fully aware of the tree structure.

    → unknown structure

Motivating Example

[Figure: three copies of the motorbike spare-parts tree.]

  • Pawel Placek wants to buy a motorbike that he can easily find spare parts for.

  • He searches in many different tree structures.

    → source integration

Motivation

Querying tree-structured data

BUT

  • structure is not always strictly defined

  • the user does not always deal with structure:

    Find Honda spare parts in Greece.

Dimension Graph

[Figure: the motorbike spare-parts tree alongside its dimension graph, whose nodes are R, C, L, B, T, M and E.]

dimension graph = summary of the tree structure

DIMENSIONS: R(oot), C(ountry), L(ocation), B(rand), T(ype), M(odel), E(ngine)
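
As a rough illustration of "dimension graph = summary of the tree structure", the sketch below collapses a tree whose nodes carry a dimension letter into a graph over dimensions. The tuple encoding of nodes, the function name `dimension_graph` and the small sample tree are assumptions made for this example and do not come from the presentation.

```python
from collections import defaultdict

# Hypothetical encoding: a node is (dimension, value, list_of_children).
def dimension_graph(root):
    """Return {parent_dimension: set(child_dimensions)} for the whole tree."""
    graph = defaultdict(set)
    stack = [root]
    while stack:
        dim, _value, children = stack.pop()
        for child in children:
            graph[dim].add(child[0])  # one edge per parent/child dimension pair
            stack.append(child)
    return dict(graph)

# Small made-up fragment in the spirit of the spare-parts tree.
tree = ("R", "r", [
    ("C", "GREECE", [
        ("L", "ATHENS", [("B", "YAMAHA", [])]),
        ("B", "BMW", [("T", "TRAVEL", [])]),
    ]),
])

print(dimension_graph(tree))
```

Many tree nodes with the same pair of dimensions contribute a single edge, which is why the graph stays small even for large trees.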

Partially Specified Tree-pattern Query

  • Query: Find shops with spare parts for all models and all engines of BMW motorbikes in Greece. (+ structural info)

[Figure: the query drawn next to the dimension graph (R, C, L, B, T, M, E). The query consists of two partially specified paths (PSPs), p1 and *p2, over the nodes C = {Greece}, B = {BMW} (twice), M = ? and E = ?. The asterisk marks the output path. The legend distinguishes parent-child from ancestor-descendant edges and labels a node sharing expression (NSE).]

DIMENSIONS: R(oot), C(ountry), L(ocation), B(rand), T(ype), M(odel), E(ngine)
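
One possible in-memory shape for such a query is sketched below. The class names `PSP` and `Query`, their fields, and the particular edges are all assumptions for illustration: the transcript does not preserve which nodes the figure connects, so the edges here are made up.

```python
from dataclasses import dataclass, field

@dataclass
class PSP:
    """A partially specified path (hypothetical representation)."""
    name: str
    output: bool = False                             # True for the path marked with *
    nodes: dict = field(default_factory=dict)        # dimension -> set of values, None for "?"
    child: set = field(default_factory=set)          # (parent_dim, child_dim) pairs
    descendant: set = field(default_factory=set)     # (ancestor_dim, descendant_dim) pairs

@dataclass
class Query:
    psps: list
    node_sharing: set = field(default_factory=set)   # (dimension, psp_name, psp_name)

# Assumed reading of the example: both paths mention BMW and share that node.
p1 = PSP("p1", nodes={"C": {"Greece"}, "B": {"BMW"}, "M": None},
         descendant={("C", "B"), ("B", "M")})
p2 = PSP("p2", output=True, nodes={"B": {"BMW"}, "E": None},
         descendant={("B", "E")})
query = Query(psps=[p1, p2], node_sharing={("B", "p1", "p2")})

print([p.name for p in query.psps if p.output])
```

Keeping value annotations, structural edges and node sharing in separate fields mirrors the three ingredients named on the slide.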

Additional Concepts

Full Form Query

[Figure: the full form of the example query, with PSPs p1 and *p2 over the nodes C = {Greece} (twice), B = {BMW} (twice), M = ? and E = ?.]

Dimension Trees

[Figure: the full form query and the dimension graph (R, C, L, B, T, M, E), together with a dimension tree over R, C = {Greece}, B = {BMW}, T, E and M.]

DIMENSION TREES = QUERY + GRAPH

Absolute Containment

Each result of Q1 is a result of Q2.

Q1 ⊆ Q2

homomorphism from Q2 to Q1

[Figure: example queries Q1 and Q2 with PSPs *p1, p2, *p3 and p4 over nodes labeled C, B, M and E.]
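
To make the direction of the homomorphism concrete, here is a minimal sketch for a much simpler setting than the one in the talk: fully specified, linear path patterns whose steps are either child or descendant steps, with the last node as output. The step encoding and the name `has_homomorphism` are assumptions for this example, and this is not the authors' procedure for PSTPQs.

```python
from functools import lru_cache

# Hypothetical encoding: a pattern is a sequence of (axis, label) steps from the
# root, with axis "child" or "desc"; the last step is the output node.
def has_homomorphism(q2, q1):
    """True if the nodes of q2 can be mapped onto nodes of q1, preserving structure."""
    q1, q2 = tuple(q1), tuple(q2)

    @lru_cache(maxsize=None)
    def place(i, prev):
        # i: next step of q2 to map; prev: index in q1 of the previous image (-1 = root)
        if i == len(q2):
            return prev == len(q1) - 1          # output node maps to output node
        axis, label = q2[i]
        if axis == "child":                      # child edge must map to a child edge
            j = prev + 1
            return j < len(q1) and q1[j] == ("child", label) and place(i + 1, j)
        # descendant edge may map to any downward path of length >= 1
        return any(q1[j][1] == label and place(i + 1, j)
                   for j in range(prev + 1, len(q1)))

    return place(0, -1)

Q1 = [("child", "C"), ("child", "B"), ("child", "M")]
Q2 = [("desc", "B"), ("desc", "M")]
print(has_homomorphism(Q2, Q1), has_homomorphism(Q1, Q2))
```

Q1 is the more specific pattern, so the mapping exists from Q2 to Q1 but not the other way round, matching the direction stated on the slide.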

Relative Containment (w.r.t. G)

Each result of Q1 in G is a result of Q2 in G.

Q1 ⊆_G Q2

homomorphism from the Dimension Trees of Q2 to the Dimension Trees of Q1

[Figure: a dimension tree of Q1 and a dimension tree of Q2, over the dimensions R, C, B, T, M and E.]

Relative Containment Heuristic

  • Absolute Containment (AC): 1 msec

  • Relative Containment (RC): 100 msec

  • Relative Containment Heuristic (RCH): sound but not complete

    • extract structural information from the Dimension Graph

    • insert it in the query Q1

    • check Q1 ⊆ Q2 instead of Q1 ⊆_G Q2
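
The first step, extracting structural information from the dimension graph, might look roughly like the sketch below. It assumes that every root-to-node path in the data follows a path of the graph, and it reports only relationships that hold on every cycle-free path from the root to a given dimension. The graph `G`, its edges and both function names are hypothetical; the talk does not spell out its extraction procedure.

```python
def simple_paths(graph, start, goal, seen=()):
    """Yield every cycle-free path from start to goal as a tuple of dimensions."""
    path = seen + (start,)
    if start == goal:
        yield path
        return
    for nxt in graph.get(start, ()):
        if nxt not in path:
            yield from simple_paths(graph, nxt, goal, path)

def inferred_structure(graph, root, target):
    """Relationships that hold on every root-to-target path of the graph."""
    paths = list(simple_paths(graph, root, target))
    if not paths:
        return set()
    common = set(paths[0]).intersection(*paths[1:])
    facts = set()
    for a in common:
        for b in common:
            if a == b:
                continue
            gaps = [p.index(b) - p.index(a) for p in paths]
            if all(g == 1 for g in gaps):
                facts.add(f"{a}->{b}")   # b is a child of a on every path
            elif all(g >= 1 for g in gaps):
                facts.add(f"{a}=>{b}")   # b is a descendant of a on every path
    return facts

# Made-up dimension graph: B is reachable from C directly or through L.
G = {"R": ["C"], "C": ["L", "B"], "L": ["B"], "B": ["T"], "T": ["M", "E"]}
print(sorted(inferred_structure(G, "R", "B")))
```

Because only relationships common to all paths are kept, facts produced this way can be added to Q1 without changing its answers on data that conforms to the graph, under the assumption stated above.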

Relative Containment Heuristic

  • Example

[Figure: queries Q1 and Q2, with PSPs *p1 and *p2 and nodes R = ?, C = ?, B = ? and T = ?, next to the dimension graph (R, C, L, B, T, M, E).]

B=>T : R->C, C=>B

Q1 ⊆ Q2

Q1 ⊆_G Q2

Experiments

  • We measured…

    • execution time for

      • Absolute Containment (AC)

      • Relative Containment (RC)

      • Relative Containment Heuristic (RCH)

    • accuracy for RCH

  • …for various graph sizes

  • …for various query sizes

Time

[Plots: execution time (msec) of AC, RC and RCH. Three panels cover graphs with 20, 30 and 40 dimensions, with graph paths ranging over 10 - 80, 15 - 120 and 20 - 160. Two further panels cover queries with 1 and 2 PSPs, with 3 - 6 nodes per PSP.]

Accuracy of RCH

  • 80% for graphs of common sizes

    • based on XML benchmarks (XMach, XMark, etc.)

  • 50% for graphs of higher density

Conclusion

  • Query Containment for Partially Specified Tree-Pattern Queries (PSTPQs).

  • Sound technique for checking Relative Query Containment

    • Time: one order of magnitude

    • Accuracy: over 80%

Future Work

  • Heuristics for checking Relative Containment

    • precomputed and on-the-fly

    • trade-off between time and accuracy

  • Special forms of queries, e.g. swings:

[Figure: a swing query with PSPs p1, p2 and *p3 over nodes labeled A, B and C.]



Links

Introduction

Data Model

Additional Concepts

Query Containment

Experiments

Conclusion

Appendix



Who defines the dimensions?

  • Automatic

    • XML tags (dimension graph = “path summary”, “path index”, “structural summary”)

  • Semi-automatic

    • Graph administrator + XML tags

      (dimension = group of XML tags)

    • Graph administrator + ontology

  • Manual

    • Graph administrator

Inference Rules

[Figure: the example query (PSPs p1 and *p2) and the dimension graph.]

INFERENCE RULES (-> child, => descendant, ≈ node sharing)

(IR1) |- R[p1] ≈ R[p2]

(IR2) A[p1] ≈ A[p2], A[p2] ≈ A[p3] |- A[p1] ≈ A[p3]

(IR3) a structural expression that involves A[p] |- R[p] => A[p]

(IR4) A[p] -> B[p] |- A[p] => B[p]

(IR5) A[p] => B[p], B[p] => C[p] |- A[p] => C[p]

(IR6) A[p] -> B[p], A[p] => C[p] |- B[p] => C[p]

(IR7) A[p] -> B[p], C[p] => B[p] |- C[p] => A[p]

(IR8) A[p1] -> B[p1], B[p1] ≈ B[p2] |- A[p2] -> B[p2]

(IR9) A[p1] => B[p1], B[p1] ≈ B[p2] |- A[p2] => B[p2]

(IR10) A[p1] => B[p1], A[p1] ≈ A[p2], R[p2] => B[p2] |- A[p2] => B[p2]

(IR11) A[p1] => B[p1], B[p1] ≈ B[p2] |- A[p1] ≈ A[p2]

(IR12) A[p1] -> B[p1], C[p2] -> B[p2], D[p1] ≈ D[p2] |- D[p1] => A[p1]

(IR13) A[p1] -> B[p1], A[p2] -> C[p2], D[p1] ≈ D[p2] |- D[p1] => A[p1]

(IR14) A[p1] => B[p1], B[p2] => A[p2], C[p1] ≈ C[p2] |- C[p1] => A[p1]

1. Full Form Query
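
As a small worked example of how such rules are applied until nothing new can be derived, the sketch below closes the relationships of a single path under two of them only: IR5 (descendant is transitive) and IR4, read here under the assumption that it states "child implies descendant". The function name and the pair encoding are hypothetical, and the remaining rules are not covered.

```python
def descendant_closure(child_edges, desc_edges):
    """Close one path's relationships under IR4 and IR5 (a partial sketch)."""
    desc = set(desc_edges) | set(child_edges)     # IR4: every child is a descendant
    changed = True
    while changed:                                # repeat until a fixpoint is reached
        changed = False
        for a, b in list(desc):
            for c, d in list(desc):
                if b == c and (a, d) not in desc:
                    desc.add((a, d))              # IR5: A => B, B => C |- A => C
                    changed = True
    return desc

closure = descendant_closure(child_edges={("R", "C")},
                             desc_edges={("C", "B"), ("B", "T")})
print(sorted(closure))
```

A full form query would be obtained by running all fourteen rules to a fixpoint in the same fashion.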

Dimension Trees

[Figure: the example query, the dimension graph and four dimension trees over R, C = {Greece}, B = {BMW}, T, M and E.]

  • r/Greece/BMW/*T[*E]/*M

  • r/Greece/BMW/*T/*M[*E]

  • r/Greece/BMW/*T/*E/*M

  • r/Greece/BMW/*T[*M/*E]/*E*M

Previous Approaches

  • Keyword-based search approach

    • Absence of structure

  • Naive approach

    • All possible query patterns are generated

      (Honda=>Greece, Greece=>Honda)

  • Approximation techniques

    • Relax the query → more answers

  • Traditional integration approach

    • Global structure and mapping rules
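
The cost of the naive approach is easy to see in a few lines: with n keywords and no structural knowledge, every ordering has to be tried. The helper `naive_patterns` below is a hypothetical illustration that writes each ordering as a chain of descendant steps, as in the slide's two-keyword example.

```python
from itertools import permutations

def naive_patterns(keywords):
    """Every ordering of the keywords, written as a chain of descendant steps."""
    return ["=>".join(order) for order in permutations(keywords)]

print(naive_patterns(["Honda", "Greece"]))
# prints ['Honda=>Greece', 'Greece=>Honda']
```

The number of generated patterns grows as n!, so three keywords already give six patterns and five give 120.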
