
Data Integration under the Schema Tuple Query Assumption

Michael Minock

The University of Umeå, Sweden

Michael Minock ([email protected])


Introduction

  • Problem:

    • Queries may be over information that is not (yet) covered by the data integration system

      • ”List museums in Vienna or Bratislava holding paintings by Klimt or Picasso.”

        • A purely extensional response misleads

  • Solution:

    • Give available extension, but contextualize with intensional descriptions of coverage

      • Certain: ”The following are all the museums in Vienna that hold paintings by Picasso: …”

      • Possible: ”The following museums in Vienna do not provide inventory records, so they may have paintings by Klimt: …”

      • Incomplete: ”There is no information for museums in Bratislava.”

Approach

  • LAV (Local as View) architecture

    • user queries and data source descriptions restricted to schema tuple queries in L (or Q)

    • currently sources must contain complete and correct views

    • broker mediates user query over sources and supplies a mixed extensional/intensional response

  • Use ’algebraic’ properties of L (or Q) to derive:

    • query plan (using cache)

    • logical descriptions of certain, uncertain and incomplete sets

  • Exploit subsumption properties for:

    • query simplification

    • natural language generation

The Schema Tuple Query Languages L (and Q)

  • Assumptions:

    • L: Tuple relational queries of the form:

    • Q:

  • Properties:

    • L and Q decidable for satisfiability

    • Unlike L, Q is closed under negation

    • May calculate difference and intersection and decide containment, equivalence and disjointness for queries built using L and Q
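
The operations and decision problems in the last bullet can be related to satisfiability in the standard way. The block below is a sketch under assumptions, not a statement of how the system is implemented: q_1 and q_2 are read as formulas over the same free tuple variable, and the negated formula is assumed to stay inside a fragment that is decidable for satisfiability (the slide claims closure under negation for Q, not for L).

```latex
\begin{align*}
q_1 \cap q_2 &= q_1 \wedge q_2 \\
q_1 - q_2 &= q_1 \wedge \neg q_2 \\
q_1 \subseteq q_2 &\iff q_1 \wedge \neg q_2 \text{ is unsatisfiable} \\
q_1 \equiv q_2 &\iff q_1 \subseteq q_2 \text{ and } q_2 \subseteq q_1 \\
q_1, q_2 \text{ disjoint} &\iff q_1 \wedge q_2 \text{ is unsatisfiable}
\end{align*}
```

Under this reading, a decision procedure for satisfiability is enough to decide containment, equivalence and disjointness.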

Example: Art museum domain

  • QUERY: ”List museums in Vienna or Bratislava holding paintings by Klimt or Picasso.”

Artist(id, name, country, DOB, DOD)

Museum(id, name, address, city, country)

Painting(id, title, year, artistId)

HasPainting(museumId, paintingId)

Data sources: Central European Museums, MAK Inventory, Picasso Locator, Albertina Inventory

Example: Input Expressions …

User query:

(m Museum
   (IN m city ("Vienna" "Bratislava"))
   (+ (y1 y2 y3)
      (HasPainting y1)(Painting y2)(Artist y3)
      (= m id y1 museumId)(= y1 paintingId y2 id)(= y2 artistId y3 id)
      (IN y3 name ("Klimt" "Picasso"))))

Picasso Locator:

(h HasPainting
   (+ (y1 y2)
      (Painting y1)
      (Artist y2)
      (= h paintingId y1 id)
      (= y1 artistId y2 id)
      (= y2 name "Picasso")))

Central European Museums:

(m Museum
   (IN m city ("Vienna" "Prague" "Berlin" …)))

MAK Inventory:

(h HasPainting
   (+ (y1)
      (Museum y1)
      (= h museumId y1 id)
      (= y1 name "MAK")
      (= y1 city "Vienna")))

Albertina Inventory:

(h HasPainting
   (+ (y1)
      (Museum y1)
      (= h museumId y1 id)
      (= y1 name "Albertina")
      (= y1 city "Vienna")))
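
To make the structure of these expressions easier to inspect, here is a minimal sketch of a reader that turns one parenthesized expression into nested Python lists. The function name `parse` and the token rules are assumptions made for illustration; the slides do not describe the actual parser.

```python
import re

# Tokens: a parenthesis, a double-quoted string, or a bare symbol.
_TOKEN = re.compile(r'\(|\)|"[^"]*"|[^\s()"]+')


def parse(text):
    """Parse one parenthesized expression into nested lists (hypothetical helper)."""
    tokens = _TOKEN.findall(text)
    pos = 0

    def read():
        nonlocal pos
        if pos >= len(tokens):
            raise ValueError("unexpected end of input")
        tok = tokens[pos]
        pos += 1
        if tok == ")":
            raise ValueError("unexpected ')'")
        if tok != "(":
            return tok  # symbol, or string literal with its quotes kept
        items = []
        while pos < len(tokens) and tokens[pos] != ")":
            items.append(read())
        if pos >= len(tokens):
            raise ValueError("missing ')'")
        pos += 1  # consume the closing parenthesis
        return items

    tree = read()
    if pos != len(tokens):
        raise ValueError("trailing tokens after expression")
    return tree


mak_inventory = parse(
    '(h HasPainting (+ (y1) (Museum y1) (= h museumId y1 id)'
    ' (= y1 name "MAK") (= y1 city "Vienna")))'
)
```

The reader rejects unbalanced input with a `ValueError`, which is a cheap sanity check on hand-written source descriptions.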

Example: Output Expressions …

Certain:

(m Museum
   (= m city "Vienna")
   (+ (y1 y2 y3)
      (HasPainting y1)(Painting y2)(Artist y3)
      (= m id y1 museumId)(= y1 paintingId y2 id)
      (= y2 artistId y3 id)(= y3 name "Picasso")))

(m Museum
   (= m city "Vienna")
   (IN m name ("Albertina" "MAK"))
   (+ (y1 y2 y3)
      (HasPainting y1)(Painting y2)(Artist y3)
      (= m id y1 museumId)(= y1 paintingId y2 id)
      (= y2 artistId y3 id)(= y3 name "Klimt")))

Uncertain:

(m Museum
   (= m city "Vienna")
   (NOT_IN m name ("Albertina" "MAK"))
   (+ (y1 y2 y3)
      (HasPainting y1)(Painting y2)(Artist y3)
      (= m id y1 museumId)(= y1 paintingId y2 id)(= y2 artistId y3 id)
      (= y3 name "Klimt")))

Incomplete:

(m Museum
   (= m city "Bratislava")
   (+ (y1 y2 y3)
      (HasPainting y1)(Painting y2)(Artist y3)
      (= m id y1 museumId)(= y1 paintingId y2 id)(= y2 artistId y3 id)
      (IN y3 name ("Klimt" "Picasso"))))
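
The three-way split on this slide can be reproduced with a small hand-written case analysis. The sketch below is only an illustration for this one example: it hard-codes what each source covers instead of deriving it by query differencing, and all names (`partition`, the constants) are hypothetical. The covered-city set lists only the three cities named in the source description, which continues with "…".

```python
from itertools import product

# Central European Museums source: cities whose museums are listed.
# The source description continues with "…"; only the named cities appear here.
MUSEUM_CITIES_COVERED = {"Vienna", "Prague", "Berlin"}
# Picasso Locator: complete HasPainting coverage for this artist.
ARTISTS_COVERED_EVERYWHERE = {"Picasso"}
# MAK and Albertina inventories: complete HasPainting coverage per museum.
INVENTORIED = {("MAK", "Vienna"), ("Albertina", "Vienna")}


def partition(cities, artists):
    """Split the (city, artist) cells of the query into three lists.

    Each entry is (city, artist, name_constraint); name_constraint is None
    or a pair ("IN" | "NOT_IN", [museum names]).
    """
    certain, uncertain, incomplete = [], [], []
    for city, artist in product(sorted(cities), sorted(artists)):
        if city not in MUSEUM_CITIES_COVERED:
            # No source even lists the museums of this city.
            incomplete.append((city, artist, None))
        elif artist in ARTISTS_COVERED_EVERYWHERE:
            certain.append((city, artist, None))
        else:
            names = sorted(n for n, c in INVENTORIED if c == city)
            if names:
                certain.append((city, artist, ("IN", names)))
                uncertain.append((city, artist, ("NOT_IN", names)))
            else:
                uncertain.append((city, artist, None))
    return certain, uncertain, incomplete


certain, uncertain, incomplete = partition(
    {"Vienna", "Bratislava"}, {"Klimt", "Picasso"}
)
```

For the example query this yields two certain regions (Picasso anywhere in Vienna; Klimt in Albertina or MAK), one uncertain region (Klimt in other Vienna museums) and the Bratislava cells as incomplete, matching the expressions above.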

Example: To Natural Language

  • QUERY: ”List museums in Vienna or Bratislava holding paintings by Klimt or Picasso.”

  • Certain:

    • ”Museums in Vienna named ’Albertina’ or ’MAK’ that have paintings by Klimt.”

    • ”Museums in Vienna that have paintings by Picasso.”

  • Uncertain:

    • ”Museums in Vienna not named ’Albertina’ or ’MAK’ that have paintings by Klimt.”

  • Incomplete:

    • ”Museums in Bratislava that have paintings by Picasso or Klimt.”
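
The phrases on this slide follow a regular pattern, so a fixed template is enough to reproduce them. The sketch below is a stand-in for illustration only: the function `describe` and its parameters are hypothetical, and it does not model a phrasal lexicon or subsumption-based simplification.

```python
def describe(city, artists, name_constraint=None):
    """Render one museum region as an English noun phrase (hypothetical helper).

    name_constraint is None or a pair ("IN" | "NOT_IN", [museum names]).
    """
    parts = [f"Museums in {city}"]
    if name_constraint is not None:
        op, names = name_constraint
        quoted = " or ".join(f"'{n}'" for n in names)
        parts.append(("named " if op == "IN" else "not named ") + quoted)
    parts.append("that have paintings by " + " or ".join(artists))
    return " ".join(parts) + "."


certain_phrase = describe("Vienna", ["Klimt"], ("IN", ["Albertina", "MAK"]))
uncertain_phrase = describe("Vienna", ["Klimt"], ("NOT_IN", ["Albertina", "MAK"]))
incomplete_phrase = describe("Bratislava", ["Picasso", "Klimt"])
```

A template like this only works because every region in the example has the same shape (city, optional name restriction, artist list); more varied expressions would need a more general generator.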

Pros and cons of L and Q

  • Pros

    • May represent n-ary relations

      • Direct translation to SQL!

    • Some negation

    • General cyclic queries

      • ”The artists without paintings in a museum in the country of their origin.”

  • Cons

    • No projection!

    • Certain quantifier prefixes prohibited

      • ”The artists with paintings in all of the museums in the country of their origin.”
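
To illustrate the "direct translation to SQL" point, the sketch below writes the example user query (museums in Vienna or Bratislava holding paintings by Klimt or Picasso) as SQL over the four-relation art museum schema and runs it with Python's `sqlite3`. It assumes that a `(+ (vars) …)` block is an existential quantifier, which maps to `EXISTS`; the slides do not spell out the translation rules. The rows are made up purely so the query has something to run against.

```python
import sqlite3

SCHEMA = """
CREATE TABLE Artist (id INTEGER PRIMARY KEY, name TEXT, country TEXT, DOB TEXT, DOD TEXT);
CREATE TABLE Museum (id INTEGER PRIMARY KEY, name TEXT, address TEXT, city TEXT, country TEXT);
CREATE TABLE Painting (id INTEGER PRIMARY KEY, title TEXT, year INTEGER, artistId INTEGER);
CREATE TABLE HasPainting (museumId INTEGER, paintingId INTEGER);
"""

# Made-up rows for illustration only; they state nothing about real collections.
TOY_ROWS = """
INSERT INTO Artist VALUES (1, 'Klimt', NULL, NULL, NULL), (2, 'Other', NULL, NULL, NULL);
INSERT INTO Museum VALUES (1, 'Museum A', NULL, 'Vienna', NULL),
                          (2, 'Museum B', NULL, 'Vienna', NULL),
                          (3, 'Museum C', NULL, 'Prague', NULL);
INSERT INTO Painting VALUES (1, 'Painting 1', NULL, 1), (2, 'Painting 2', NULL, 2);
INSERT INTO HasPainting VALUES (1, 1), (2, 2), (3, 1);
"""

# Whole Museum tuples are returned (the languages have no projection).
QUERY = """
SELECT m.*
FROM Museum m
WHERE m.city IN ('Vienna', 'Bratislava')
  AND EXISTS (
    SELECT 1
    FROM HasPainting y1, Painting y2, Artist y3
    WHERE m.id = y1.museumId
      AND y1.paintingId = y2.id
      AND y2.artistId = y3.id
      AND y3.name IN ('Klimt', 'Picasso')
  )
ORDER BY m.id
"""


def run_example():
    """Build the toy database in memory and evaluate the query."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA + TOY_ROWS)
    rows = conn.execute(QUERY).fetchall()
    conn.close()
    return rows


rows = run_example()
```

With the toy rows, only 'Museum A' qualifies: 'Museum B' holds no painting by the requested artists, and 'Museum C' is outside the requested cities.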

Next ’STEP’…

  • STEP 1.0 (Schema Tuple Expression Processor)

  • Incomplete and/or incorrect source views

  • Real applications

Architecture diagram components: Datasource Descriptions, Phrasal Lexicon, Cache DB, Broker, NLG, Differencing Engine/Simplifier, L2DomainCalculus, SPASS theorem prover
