The role of a Mediator in R-GMA

Manfred Oevers, IBM

Andrew Cooke, Heriot Watt

Laurence Field, RAL

Steve Fisher, RAL

James Magowan, IBM

Werner Nutt, Heriot Watt

Howard Williams, Heriot Watt

Contributions are Views

SELECT * FROM CPULoad WHERE Country = 'UK' AND Site = 'RAL'

SELECT * FROM CPULoad WHERE Country = 'UK' AND Site = 'GLA'

The Scenario

G: a relational schema (for a virtual database)

q: queries posed against G

p: producers, associated with views on G

Currently views have the form:

SELECT *

FROM r

WHERE < ??? >

The Mediator: how to match q with the p’s

A Concise Notation
  • CREATE TABLE cpuLoad(Loc, M, L)
  • SELECT Loc, M FROM cpuLoad WHERE Loc = 'RAL' AND L >= 70
  • (RAL, M) | cpuLoad(RAL, M, L) & L >= 70
Satisfiability

The Problem: “For all locations give me all machines with a cpu load L >= 70”

q: (Loc, M) | cpuLoad(Loc, M, L) & L >= 70

p1: (ral, M, L) | cpuLoad(ral, M, L) & L >= 80

p2: (hw, M, L) | cpuLoad(hw, M, L) & L >= 50

p3: (gla, M, L) | cpuLoad(gla, M, L) & L <= 20

The Query Plan:

(Loc, M) | p1(Loc, M, L)

∪

(Loc, M) | p2(Loc, M, L) & L >= 70
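
The matching step behind this plan can be sketched in a few lines of Python. This is only an illustration, not the R-GMA implementation: the `View` class, the `plan` function and the restriction to one location attribute plus an interval on L are assumptions made for the example. A producer is kept when its constraints are jointly satisfiable with those of the query, and a residual selection is attached when the producer's range is not already contained in the query's range.

```python
import math
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class View:
    """Hypothetical description of a query or producer view on cpuLoad(Loc, M, L)."""
    name: str
    loc: Optional[str]  # None means "any location"
    lo: float           # inclusive lower bound on L
    hi: float           # inclusive upper bound on L


def plan(query: View, producers: List[View]) -> List[Tuple[str, Optional[Tuple[float, float]]]]:
    """Return (producer name, residual range on L or None) for each relevant producer."""
    steps = []
    for p in producers:
        # Location constraints must not contradict each other.
        if query.loc is not None and p.loc is not None and query.loc != p.loc:
            continue
        # The two ranges on L must overlap, otherwise the conjunction is unsatisfiable.
        if max(query.lo, p.lo) > min(query.hi, p.hi):
            continue
        # No residual selection is needed if every tuple of p already satisfies the query.
        contained = query.lo <= p.lo and p.hi <= query.hi
        steps.append((p.name, None if contained else (query.lo, query.hi)))
    return steps


q = View("q", None, 70, math.inf)
p1 = View("p1", "ral", 80, math.inf)
p2 = View("p2", "hw", 50, math.inf)
p3 = View("p3", "gla", -math.inf, 20)

result = plan(q, [p1, p2, p3])
print(result)
```

Under these assumptions p3 is discarded, p1 is used as is, and p2 is used with the extra selection L >= 70, which mirrors the plan on the slide.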


Satisfiability (issues)

Implementation:

  • What are suitable sources?

This involves checking satisfiability of constraints - a task for the Registry?

  • Who computes “load L >= 70” ?
    • The Mediator? Or the Producer?
    • What are the capabilities of a Producer?
    • Which are relevant?
    • Where are these recorded?

Completeness

The Problem: “Find all machines that are not in the USA and have diskspace S > 100”

q: M | DiskSpace(M, S) & S > 100 & NOT InUSA(M)

p1: (M, S) | DiskSpace(M, S)

p2: M | InUSA(M)

The Query Plan:

M | p1(M, S) & S > 100 & NOT p2(M)


Completeness (issues)

Implementation:

  • What if p1 doesn’t know about all machines?
    • We might not get all answers for our query (“incompleteness”).
  • What if p2 doesn’t know about all US machines?
    • We might get answers that don’t satisfy our query (“incorrect” answers).
    • What is the yardstick for completeness?
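
Both failure modes can be reproduced with a toy evaluation of the plan, in which the DiskSpace lookup is answered by p1 and the InUSA test by p2. The sketch below is an illustration only: the machine names, the disk space values and the `run_plan` function are invented for the example.

```python
from typing import Dict, Set


def run_plan(p1: Dict[str, int], p2: Set[str], threshold: int = 100) -> Set[str]:
    """Evaluate M | p1(M, S) & S > threshold & NOT p2(M) over the producers' contents."""
    return {m for m, s in p1.items() if s > threshold and m not in p2}


# Hypothetical "real world": what complete producers would publish.
disk_space = {"ral007": 120, "hw666": 150, "us001": 200, "us002": 300}
in_usa = {"us001", "us002"}
truth = run_plan(disk_space, in_usa)

# p1 misses hw666: a correct answer is lost (incompleteness).
p1_partial = {m: s for m, s in disk_space.items() if m != "hw666"}
incomplete = run_plan(p1_partial, in_usa)

# p2 misses us002: a US machine slips through the negation (incorrect answer).
incorrect = run_plan(disk_space, {"us001"})

print(sorted(truth), sorted(incomplete), sorted(incorrect))
```

The asymmetry comes from the negation: a gap in a positively used producer only removes answers, while a gap in a negated producer adds wrong ones.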

Projection Views (1)

Popular queries stored by an Archiver ar may involve projection, e.g.

“all machines with disk space S >= 50”

ar: M | DiskSpace(M, S) & S >= 50

The Problem: “get all machines with S >= 30”

q: M | DiskSpace(M, S) & S >= 30

Can we compute answers for q, even though no diskspace values are stored?


Projection Views (2)

Query Plan:

  • In all possible instances of this database, machines stored in ar have diskspace S >= 50.
  • Thus, ar provides certain answers to query q.

What if the values 50 and 30 are swapped?


Projection Views (3)

“all machines with disk space S >= 30”

ar: M | DiskSpace(M, S) & S >= 30

The Problem: “get all machines with S >= 50”

q: M | DiskSpace(M, S) & S >= 50

  • In some instances, all machines in ar will be correct answers to q; in others, not.
  • Thus, ar would not provide certain answers.
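
For views and queries of this simple shape (a single lower bound on the projected-away attribute S), the two cases reduce to an implication test between the bounds. The sketch below states that test and cross-checks it by enumerating candidate hidden values; the function names and the candidate range 0..200 are assumptions made for the illustration.

```python
from typing import Iterable


def view_implies_query(view_min: int, query_min: int) -> bool:
    """S >= view_min implies S >= query_min exactly when view_min >= query_min."""
    return view_min >= query_min


def certain_by_enumeration(view_min: int, query_min: int, candidates: Iterable[int]) -> bool:
    """A machine in the view is a certain answer only if every hidden value of S
    that is consistent with the view also satisfies the query."""
    return all(s >= query_min for s in candidates if s >= view_min)


values = range(0, 201)  # hypothetical domain for the hidden diskspace value

case_50_30 = certain_by_enumeration(50, 30, values)  # ar: S >= 50, q: S >= 30
case_30_50 = certain_by_enumeration(30, 50, values)  # ar: S >= 30, q: S >= 50
print(case_50_30, case_30_50)
```

In the first case the archived machines can be returned without knowing S; in the second, a machine with S = 30 is a counterexample, so nothing in ar is a certain answer.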

Computing certain answers can be costly (1)

[Figure: machines connected by Link(x,y) edges. Node labels: ral007 (Diskspace = 24), ibm747 (Diskspace = 90), hw666 (Diskspace = ?), gla999 (Diskspace = 10).]

The Problem:

q: X | Link(X,Y) & DiskSpace(Y, S1) & S1 >= 50 & Link(Y,Z) & DiskSpace(Z, S2) & S2 < 50

Is ral007 a certain answer?


Computing certain answers can be costly (2)

The Problem:

“Find all machines that are linked to another with a diskspace >= 50, which is in turn linked to one with a diskspace < 50.”

q: X | Link(X,Y) & DiskSpace(Y, S1) & S1 >= 50 & Link(Y,Z) & DiskSpace(Z, S2) & S2 < 50

Is ral007 a certain answer?

The Answer:

It is! But we have to reason about all cases...
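
The case analysis can be made concrete with a small sketch. The link structure is not recoverable from the transcript, so the edges in `LINKS` below are an assumption chosen so that the argument goes through; the diskspace values are those shown for the four machines, with hw666 unknown. Since the query only compares diskspace against 50, one representative value below 50 and one at or above 50 cover all cases for an unknown value.

```python
import itertools
from typing import Dict, Optional, Set, Tuple

# ASSUMED edges (not given in the transcript): ral007 links to ibm747 and hw666,
# ibm747 links to hw666, and hw666 links to gla999.
LINKS = {("ral007", "ibm747"), ("ral007", "hw666"),
         ("ibm747", "hw666"), ("hw666", "gla999")}
DISK: Dict[str, Optional[int]] = {"ral007": 24, "ibm747": 90, "hw666": None, "gla999": 10}


def answers(disk: Dict[str, int]) -> Set[str]:
    """X | Link(X,Y) & DiskSpace(Y,S1) & S1 >= 50 & Link(Y,Z) & DiskSpace(Z,S2) & S2 < 50"""
    return {x for (x, y) in LINKS for (y2, z) in LINKS
            if y == y2 and disk[y] >= 50 and disk[z] < 50}


def certain_answers(disk: Dict[str, Optional[int]],
                    representatives: Tuple[int, ...] = (0, 50)) -> Set[str]:
    """Intersect the answers over every way of filling in the unknown values."""
    unknown = [m for m, s in disk.items() if s is None]
    result: Optional[Set[str]] = None
    for values in itertools.product(representatives, repeat=len(unknown)):
        world = {**disk, **dict(zip(unknown, values))}
        found = answers(world)
        result = found if result is None else result & found
    return result if result is not None else set()


print(certain_answers(DISK))
```

With these assumed edges, ral007 is an answer via ibm747 when hw666 has less than 50 and via hw666 itself otherwise, so it is certain. The loop visits 2^k combinations for k unknown values, which is where the cost comes from.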

Early Conclusions (1)

First Problem: Semantics

  • What are the answers we expect from our queries? Certain answers? A subset of these?
  • So far we have not looked at time, which will raise further questions.
  • We need to clarify what producer views mean. (Completeness? To what degree?)

Semantics are not too difficult when there are no projection views (or aggregation).

Query planning techniques exist for special cases, e.g. select/project/join views and queries without comparisons (<, >, …).

Early Conclusions (2)

The Mediator needs Helpers

  • Who decides which sources are relevant for a query?
    • The Registry?
    • The Mediator? (but higher network load).
  • Can Producers do:
    • selections?
    • joins?

(several producers may be attached to one DBMS)

Early Conclusions (3)

What will the Mediator do?

  • Construct a set of logical plans (a logical plan = a query over some producers)
  • Identify logical plans that are feasible (e.g. input bindings: “no phone no. without a name”)
  • Construct an execution plan
    • which concrete operations, when (e.g. selection, sort-merge join, ...)
    • joining becomes complex!
  • Choose the best/cheapest plan
  • Execute the plan
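
The feasibility and selection steps might look roughly like the following sketch. Everything in it is hypothetical: the `LogicalPlan` fields, the producer names and the cost figures are invented to illustrate input bindings (a source that returns a phone number only when a name is supplied) and cost-based choice, not taken from R-GMA.

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable, Optional, Tuple


@dataclass(frozen=True)
class LogicalPlan:
    producers: Tuple[str, ...]       # sources the plan queries
    required_inputs: FrozenSet[str]  # attributes that must be bound before calling them
    estimated_cost: float            # assumed to come from some cost model


def choose_plan(plans: Iterable[LogicalPlan], bound: FrozenSet[str]) -> Optional[LogicalPlan]:
    """Keep plans whose input bindings are satisfied, then pick the cheapest one."""
    feasible = [p for p in plans if p.required_inputs <= bound]
    if not feasible:
        return None
    return min(feasible, key=lambda p: p.estimated_cost)


plans = [
    LogicalPlan(("phone_book",), frozenset({"name"}), 1.0),  # needs a name as input
    LogicalPlan(("full_dump",), frozenset(), 25.0),          # no inputs, but expensive
]

best_without_name = choose_plan(plans, frozenset())
best_with_name = choose_plan(plans, frozenset({"name"}))
print(best_without_name.producers, best_with_name.producers)
```

Without a bound name only the expensive plan is feasible; once the name is bound, the cheaper lookup wins.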