
Containment of Partially Specified Tree-Pattern Queries

Dimitri Theodoratos (NJIT, USA)

Theodore Dalamagas (NTUA, GREECE)

Pawel Placek (NJIT, USA)

Stefanos Souldatos (NTUA, GREECE)

Timos Sellis (NTUA, GREECE)

  • Introduction
  • Data Model
  • Additional Concepts
  • Query Containment
  • Experiments
  • Conclusion

Motivating Example

[Figure: tree of motorbike spare parts rooted at r, with nodes for countries (GREECE, USA), locations (ATHENS, NJ), brands (YAMAHA, BMW, HONDA), types (ON-OFF, TRAVEL), models (SERROW, F650GS, F650, VARADERO) and engines (125cc, 200cc, 650cc, 1000cc).]

  • Tree structure (e.g. XML) with motorbike spare parts.
  • We search for spare parts.
  • BUT…

Stefanos Souldatos - HDMS 2006

Motivating Example

[Figure: the motorbike spare-parts tree, with a "?" marker.]

  • Dimitri Theodoratos lives in NJ.
  • He has a Yamaha Serrow motorbike in Greece.
  • He searches for spare parts in Greece or the USA.

structural difference

Motivating Example

[Figure: the motorbike spare-parts tree.]

  • Theodore Dalamagas has a BMW motorbike.
  • He looks for spare parts worldwide.

structural inconsistency

../650cc/F650GS

../F650GS/650cc

Motivating Example

[Figure: the motorbike spare-parts tree.]

  • Stefanos Souldatos has a Honda Varadero.
  • However, he is not fully aware of the tree structure.

unknown structure

Motivating Example

[Figure: three copies of the motorbike spare-parts tree.]

  • Pawel Placek wants to buy a motorbike that he can easily find spare parts for.
  • He searches in many different tree structures.

source integration

Motivation

Querying tree-structured data, BUT:
  • the structure is not always strictly defined
  • the user does not always deal with structure, e.g. "Find Honda spare parts in Greece."

Dimension Graph

[Figure: the dimension graph over the dimensions R, C, L, E, B, M and T, shown next to the motorbike spare-parts tree.]

dimension graph = summary of the tree structure

DIMENSIONS
  • R (Root)
  • C (Country)
  • L (Location)
  • B (Brand)
  • T (Type)
  • M (Model)
  • E (Engine)
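
One way to read "dimension graph = summary of the tree structure" is sketched below: every parent-child pair of nodes in the data tree contributes one edge between their dimensions. The tree fragment and the function name `dimension_graph` are hypothetical, chosen only for illustration; the slides do not spell out the construction.

```python
# Hypothetical fragment of a spare-parts tree.
# Each node is (dimension, value, children).
tree = ("R", "r", [
    ("C", "GREECE", [
        ("L", "ATHENS", [
            ("B", "BMW", [("T", "TRAVEL", [])]),
        ]),
        ("B", "HONDA", [("T", "TRAVEL", [])]),
    ]),
    ("C", "USA", [
        ("B", "YAMAHA", [("T", "ON-OFF", [])]),
    ]),
])


def dimension_graph(node, edges=None):
    """Collect (parent dimension, child dimension) pairs seen in the tree."""
    edges = set() if edges is None else edges
    dim, _value, children = node
    for child in children:
        edges.add((dim, child[0]))
        dimension_graph(child, edges)
    return edges


graph = dimension_graph(tree)
print(sorted(graph))
```

In this sketch the two different ways of reaching a brand (directly under a country, or under a location) both show up as edges, which is the kind of structural difference the motivating example points at.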

Partially Specified Tree-pattern Query

[Figure: the example query, with annotated nodes C = {Greece}, B = {BMW}, B = {BMW}, M = ? and E = ?, drawn next to the dimension graph over R, C, L, E, B, M and T.]
  • Query: Find shops with spare parts for all models and all engines of BMW motorbikes in Greece. (+ structural info)

The query figure carries the following annotations:
  • PSP p1 and PSP *p2: the partially specified paths (PSPs) of the query
  • *: marks the output path
  • edges: parent-child and ancestor-descendant relationships
  • node sharing expression (NSE)
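
The ingredients named on this slide (PSPs, an output path, two kinds of edges, node sharing expressions) can be held in a small data structure. The sketch below is only an illustration: the class names, the assignment of nodes to p1 and p2, and the edge kinds are assumptions, since the figure itself did not survive in this transcript.

```python
from dataclasses import dataclass, field


@dataclass
class PSP:
    """One partially specified path (hypothetical representation)."""
    name: str
    nodes: dict                 # dimension -> set of values, or None for "?"
    edges: list = field(default_factory=list)  # (upper, lower, "->" or "=>")
    output: bool = False        # True for the path marked with *


@dataclass
class Query:
    psps: list
    nses: list = field(default_factory=list)   # (dimension, psp_a, psp_b)


def output_path(query):
    """Return the single PSP flagged as the output path."""
    (path,) = [p for p in query.psps if p.output]
    return path


# Assumed layout of the BMW-in-Greece example; the real figure may differ.
p1 = PSP("p1", {"C": {"Greece"}, "B": {"BMW"}, "M": None},
         [("C", "B", "=>"), ("B", "M", "=>")])
p2 = PSP("p2", {"B": {"BMW"}, "E": None}, [("B", "E", "=>")], output=True)
query = Query([p1, p2], nses=[("B", "p1", "p2")])

print(output_path(query).name)
```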

Additional Concepts

Full Form Query

[Figure: the full form of the example query, with PSP p1 and PSP *p2 over nodes C = {Greece}, C = {Greece}, B = {BMW}, B = {BMW}, M = ? and E = ?.]

Dimension Trees

[Figure: the full form query, the dimension graph and a dimension tree over R, C = {Greece}, B = {BMW}, T, E and M.]

DIMENSION TREES = QUERY + GRAPH

Absolute Containment

Each result of Q1 is a result of Q2.

Q1 ⊆ Q2

homomorphism from Q2 to Q1

[Figure: two example queries, Q1 and Q2, with PSPs *p1, p2, *p3 and p4 over the dimensions C, B, M and E.]
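
To make the homomorphism idea concrete, here is a brute-force check on a deliberately simplified query model. It is the classic tree-pattern homomorphism test, not the authors' algorithm: PSPs, node sharing expressions and output nodes are ignored, and the dictionary layout, function names and sample queries are all assumptions made for this sketch.

```python
from itertools import product


def reaches(q, u, v):
    """True if v lies strictly below u via child or descendant edges of q."""
    edges = q["child"] | q["desc"]
    stack, seen = [u], set()
    while stack:
        x = stack.pop()
        for a, b in edges:
            if a == x and b not in seen:
                if b == v:
                    return True
                seen.add(b)
                stack.append(b)
    return False


def compatible(n2, n1):
    """A node of Q2 may map to a node of Q1 with the same dimension
    and an annotation that is at least as restrictive."""
    (dim2, vals2), (dim1, vals1) = n2, n1
    if dim2 != dim1:
        return False
    if vals2 is None:            # "?" accepts any annotation
        return True
    return vals1 is not None and vals1 <= vals2


def has_homomorphism(q2, q1):
    """Try every dimension-compatible mapping from Q2's nodes to Q1's nodes."""
    names = list(q2["nodes"])
    options = [[m for m, n1 in q1["nodes"].items()
                if compatible(q2["nodes"][n], n1)] for n in names]
    for choice in product(*options):
        h = dict(zip(names, choice))
        child_ok = all((h[u], h[v]) in q1["child"] for u, v in q2["child"])
        desc_ok = all(reaches(q1, h[u], h[v]) for u, v in q2["desc"])
        if child_ok and desc_ok:
            return True
    return False


# Q1: Greece -> BMW -> any model.  Q2: any country => any model.
q1 = {"nodes": {"c": ("C", {"Greece"}), "b": ("B", {"BMW"}), "m": ("M", None)},
      "child": {("c", "b"), ("b", "m")}, "desc": set()}
q2 = {"nodes": {"c": ("C", None), "m": ("M", None)},
      "child": set(), "desc": {("c", "m")}}

print(has_homomorphism(q2, q1))  # mapping from Q2 to Q1, i.e. Q1 contained in Q2
print(has_homomorphism(q1, q2))
```

The search is exponential in the number of query nodes, which is acceptable for a sketch on queries with a handful of nodes.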

Relative Containment (w.r.t. G)

Each result of Q1 in G is a result of Q2 in G.

Q1 ⊆_G Q2

homomorphism from the Dimension Trees of Q2 to the Dimension Trees of Q1

[Figure: a dimension tree of Q1 and a dimension tree of Q2, over the dimensions R, C, B, T, M and E.]

Relative Containment Heuristic

  • Relative Containment (RC): 100 msec
  • Absolute Containment (AC): 1 msec

Relative Containment Heuristic (RCH): sound but not complete
  • extract structural information from the Dimension Graph
  • insert it in the query Q1
  • check Q1 ⊆ Q2 instead of Q1 ⊆_G Q2

Relative Containment Heuristic: Example

[Figure: the dimension graph over R, C, L, E, B, M and T, next to queries Q1 and Q2 (PSPs *p1 and *p2) with nodes C = ?, B = ?, B = ? and T = ?.]

Structural information extracted from the Dimension Graph:

B=>T : R->C, C=>B

[Figure: the same queries after nodes R = ? and C = ? have been added, annotated with Q1 ⊆_G Q2.]
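
One plausible reading of "extract structural information from the Dimension Graph" is sketched below: a dimension that lies on every path from the root to B must be an ancestor of B, and a dimension with a single parent in the graph must be a child of that parent. This is an assumption about what the extraction step does, not the authors' procedure, and the graph edges are hypothetical because the figure's edges are not recoverable from the transcript.

```python
def dominators(graph, root):
    """For each dimension reachable from root, compute the set of dimensions
    that appear on every root-to-dimension path (iterative dominator sets).
    Also returns the predecessor sets."""
    reach, stack = {root}, [root]
    while stack:
        u = stack.pop()
        for v in graph.get(u, ()):
            if v not in reach:
                reach.add(v)
                stack.append(v)
    preds = {n: set() for n in reach}
    for u in reach:
        for v in graph.get(u, ()):
            preds[v].add(u)
    dom = {n: set(reach) for n in reach}
    dom[root] = {root}
    changed = True
    while changed:
        changed = False
        for n in reach - {root}:
            new = {n} | set.intersection(*(dom[p] for p in preds[n]))
            if new != dom[n]:
                dom[n] = new
                changed = True
    return dom, preds


# Hypothetical dimension graph (edges assumed for illustration only).
graph = {"R": ["C"], "C": ["L", "B"], "L": ["B"], "B": ["T"],
         "T": ["M", "E"], "M": ["E"], "E": ["M"]}

dom, preds = dominators(graph, "R")
print(sorted(dom["B"]))    # dimensions on every path from R to B
print(sorted(preds["C"]))  # C has a single parent dimension
```

With these assumed edges, C dominates B (read as C=>B) and R is the only parent of C (read as R->C), which matches the shape of the information shown on the slide for a query containing B=>T.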

Experiments
  • We measured…
    • execution time for
      • Absolute Containment (AC)
      • Relative Containment (RC)
      • Relative Containment Heuristic (RCH)
    • accuracy for RCH
  • …for various graph sizes
  • …for various query sizes

Time

[Figure: execution time (msec) of RC, RCH and AC. Three panels for graphs with 20, 30 and 40 dimensions (graph paths: 10 - 80, 15 - 120 and 20 - 160), and two panels for queries with 1 and 2 PSPs (nodes per PSP: 3 - 6).]

Accuracy of RCH
  • 80% for graphs of common sizes
    • based on XML benchmarks (XMach, XMark, etc.)
  • 50% for graphs of higher density

Conclusion
  • Query Containment for Partially Specified Tree-Pattern Queries (PSTPQs).
  • Sound technique for checking Relative Query Containment
    • Time: one order of magnitude
    • Accuracy: over 80%

Future Work
  • Heuristics for checking Relative Containment
    • precomputed and on-the-fly
    • trade-off between time and accuracy
  • Special forms of queries, e.g. swings:

[Figure: a swing query with PSPs p1, p2 and *p3 over nodes A, B and C.]

Links
  • Introduction
  • Data Model
  • Additional Concepts
  • Query Containment
  • Experiments
  • Conclusion
  • Appendix

Who defines the dimensions?
  • Automatic
    • XML tags (dimension graph = “path summary”, “path index”, “structural summary”)
  • Semi-automatic
    • Graph administrator + XML tags (dimension = group of XML tags)
    • Graph administrator + ontology
  • Manual
    • Graph administrator

Inference Rules

[Figure: the example query (PSP p1, PSP *p2) next to the dimension graph.]

(IR1) |- R[p1] ≈ R[p2]

(IR2) A[p1] ≈ A[p2], A[p2] ≈ A[p3] |- A[p1] ≈ A[p3]

(IR3) a structural expression that involves A[p] |- R[p] => A[p]

(IR4) A[p] -> B[p] |- A[p] => B[p]

(IR5) A[p] => B[p], B[p] => C[p] |- A[p] => C[p]

(IR6) A[p] -> B[p], A[p] => C[p] |- B[p] => C[p]

(IR7) A[p] -> B[p], C[p] => B[p] |- C[p] => A[p]

(IR8) A[p1] -> B[p1], B[p1] ≈ B[p2] |- A[p2] -> B[p2]

(IR9) A[p1] => B[p1], B[p1] ≈ B[p2] |- A[p2] => B[p2]

(IR10) A[p1] => B[p1], A[p1] ≈ A[p2], R[p2] => B[p2] |- A[p2] => B[p2]

(IR11) A[p1] => B[p1], B[p1] ≈ B[p2] |- A[p1] ≈ A[p2]

(IR12) A[p1] -> B[p1], C[p2] -> B[p2], D[p1] ≈ D[p2] |- D[p1] => A[p1]

(IR13) A[p1] -> B[p1], A[p2] -> C[p2], D[p1] ≈ D[p2] |- D[p1] => A[p1]

(IR14) A[p1] => B[p1], B[p2] => A[p2], C[p1] ≈ C[p2] |- C[p1] => A[p1]

1. Full Form Query
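
Rules of this kind are typically applied until no new expression can be derived. The sketch below runs such a fixpoint for the single-path rules IR4 to IR7 only, assuming the first premise of IR4, IR6 and IR7 is a parent-child edge between two nodes of the same path. The fact encoding and the function name `close` are hypothetical, and the cross-path rules that use node sharing are left out.

```python
def close(facts):
    """Saturate a set of facts on one path under IR4-IR7.
    A fact is ("child", a, b) for a -> b or ("desc", a, b) for a => b."""
    facts = set(facts)
    while True:
        new = set()
        for kind, a, b in facts:
            if kind == "child":
                new.add(("desc", a, b))                      # IR4
        for k1, a, b in facts:
            for k2, c, d in facts:
                if k1 == "desc" and k2 == "desc" and b == c:
                    new.add(("desc", a, d))                  # IR5
                if k1 == "child" and k2 == "desc":
                    if a == c and d != b:
                        new.add(("desc", b, d))              # IR6
                    if b == d and c != a:
                        new.add(("desc", c, a))              # IR7
        if new <= facts:
            return facts
        facts |= new


# Hypothetical path: R -> C, C => B, B -> T.
closure = close({("child", "R", "C"), ("desc", "C", "B"), ("child", "B", "T")})
print(sorted(f for f in closure if f[0] == "desc"))
```

The guards `d != b` and `c != a` keep the loop from deriving a node as its own descendant once IR4 has turned a child edge into a descendant edge.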

Dimension Trees

[Figure: the example query, the dimension graph and four dimension trees.]

r/Greece/BMW/*T[*E]/*M

r/Greece/BMW/*T/*M [*E]

r/Greece/BMW/*T/*E/*M

r/Greece/BMW/*T[*M/*E]/*E*M

Previous Approaches
  • Keyword-based search approach
    • Absence of structure
  • Naive approach
    • All possible query patterns are generated (Honda=>Greece, Greece=>Honda)
  • Approximation techniques
    • Relax the query to get more answers
  • Traditional integration approach
    • Global structure and mapping rules
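
The naive approach can be pictured as enumerating every ordering of the requested values. The sketch below does this for path-shaped patterns only; the function name `naive_patterns` is hypothetical, and the slide does not say which pattern shapes the naive approach covers.

```python
from itertools import permutations


def naive_patterns(values):
    """List every ancestor-descendant ordering of the given values."""
    return ["=>".join(order) for order in permutations(values)]


patterns = naive_patterns(["Honda", "Greece"])
print(patterns)
```

The number of orderings grows factorially with the number of values, which suggests why generating all patterns does not scale.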
