
The Identification of Missing Information Resources through the Query Difference Operator

Michael Minock


Presentation Transcript


  1. The Identification of Missing Information Resources through the Query Difference Operator Michael Minock

  2. Global Schema Agree on a non-cyclic set of equi-joins over a set of relations Virtual outer-join into single relation
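A minimal sketch of the "virtual outer-join into a single relation" idea, assuming the Movie and Review relations from the movie example later in the deck are equi-joined on `title`. The helper `full_outer_join` and the sample rows are hypothetical illustrations, not part of the presentation.

```python
def full_outer_join(left, right, key):
    """Full outer equi-join of two lists of dict rows on a shared attribute."""
    left_cols = {c for row in left for c in row}
    right_cols = {c for row in right for c in row}
    joined, matched = [], set()
    for lrow in left:
        partners = [rrow for rrow in right if rrow[key] == lrow[key]]
        for rrow in partners:
            joined.append({**lrow, **rrow})
        if not partners:
            # No partner on the right: pad the right-hand attributes with None.
            joined.append({**dict.fromkeys(right_cols), **lrow})
        matched.update(id(rrow) for rrow in partners)
    for rrow in right:
        if id(rrow) not in matched:
            # No partner on the left: pad the left-hand attributes with None.
            joined.append({**dict.fromkeys(left_cols), **rrow})
    return joined

# Hypothetical rows, for illustration only.
movies = [{"title": "A", "year": 1950, "type": "drama"},
          {"title": "B", "year": 1960, "type": "comedy"}]
reviews = [{"title": "A", "source": "S1", "evaluation": "good"},
           {"title": "C", "source": "S2", "evaluation": "poor"}]

rows = full_outer_join(movies, reviews, "title")
for row in rows:
    print(row)
```

Every tuple of either relation survives in the joined relation, padded with `None` where it has no join partner. Queries can then be phrased over one wide relation.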

  3. Queries • Simple queries have a projection (entire relations) and a selection (a conjunction of simple conditions) • Despite the similar appearance, this is not (exactly) relational algebra • The query super-imposition operator gives a “union” over non-union-compatible projections; such queries are termed compound queries

  4. The Query Difference Operator • Theorem 1: • May compute query intersection, subsumption, and equivalence. • Theorem 2: • Query Difference is distributive over compound queries
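The formal theorem statements did not survive in this transcript. As a hedged reading only, the two claims correspond to familiar set identities when queries are viewed as sets of answers. The symbols $Q_1, Q_2, Q_3$ and the use of $\cup$ for super-imposition are assumptions made for illustration, not the author's notation.

```latex
\begin{align*}
Q_1 \cap Q_2 &= Q_1 - (Q_1 - Q_2)
  && \text{(intersection)} \\
Q_1 \subseteq Q_2 &\iff Q_1 - Q_2 = \emptyset
  && \text{(subsumption)} \\
Q_1 \equiv Q_2 &\iff Q_1 - Q_2 = \emptyset \ \text{and}\ Q_2 - Q_1 = \emptyset
  && \text{(equivalence)} \\
(Q_1 \cup Q_2) - Q_3 &= (Q_1 - Q_3) \cup (Q_2 - Q_3)
  && \text{(distributivity)}
\end{align*}
```

The presentation's contribution, as stated on the slides, is computing such differences syntactically over queries rather than over their answer sets.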

  5. Query Simplification • Horizontal Merge • Vertical Merge • Absorption

  6. Limitations • Caveat - take care in applying negation! • Simplify when condition attribute is both: • 1.) functionally dependent on the key of each relation in projection set • 2.) not a proper subset of a primary key • Schema • Requires a non-cyclic set of equi-joins to be predetermined among a set of relations • Queries • Projections are (currently) entire relations • No self-joins • Implicit inclusion of predetermined schema equi-joins

  7. Distributed Conceptual Information Spaces • User: “Where can I see JAWS?” • Answer: “Jaws Austin, Jaws LA … but no agent knows about AA” • [Diagram: Internet, Broker (B)] • User asks a conceptual query to the Broker • Broker agent knows the schema • Global schema models the domain (e.g. Movies, Social Events, Electronics, etc.) • Data resources (agents) advertise their contents to the Broker

  8. Problems • Complex Semantics of Global Schema • Agreement • Common Understanding • Quality of Data • Completeness • Consistency • Quality of Access • Novice query construction • Non-misleading answers • (identification of missing resources, conceptual answers, etc.)

  9. Approach • Use the Query Difference Operator (defined here) : • Valid only over a restricted class of schemas • Defines a syntactic method of computing query (concept) difference • Applied here to: • Identify the exact portion of a user’s query that is not covered by any agent in the information space • Relevant to other problems as well...

  10. Movie Example • Schema: Movie(title, year, type), Review(title, source, evaluation), Show(title, theater, time, city), ... • “All films made after 1927” • Resources: Catalog, Textbook • Movie information and reviews for all the drama and comedy films made between 1927 and 1983 • Movie information for all the action, documentary, drama, and comedy films made between 1955 and 1999

  11. Example Queries/Responses • Query 1: “Give movie information and reviews for all dramas.” Response: “…, but no reviews for dramas made after 1983.” • Query 2: “Give movie information and reviews for all dramas and documentaries made in the 1950’s.” Response: “…, but no movie information or reviews for documentaries made between 1950 and 1954. Also no review information for documentaries made between 1955 and 1959.” • Query 3: “Give movie information for all films made in the thirties, forties, or fifties.” Response: “…, but no movie information for non-drama and non-comedy films made between 1930 and 1954. Also no movie information for non-drama, non-action, non-comedy and non-documentary films made between 1930 and 1959.”

  12. Example Calculation (1 of 3) • Query: “Give movie information and reviews for all dramas and documentaries made in the 1950’s.” • Query’ = Query - Textbook

  13. Example Calculation (2 of 3) • Query’’ = Query’ - Catalog

  14. Example Calculation (3 of 3) • Query’’ is simplified via Horizontal Merge and Vertical Merge • Result: “No movie information or reviews for documentaries made between 1950 and 1954. Also no review information for documentaries made between 1955 and 1959.”
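The three-step calculation can be approximated with a small sketch. It is not the author's operator: it assumes a simple query is a set of relations plus a conjunction of a `type` condition and a `year` condition, both held as finite value sets. It also assumes Textbook is the resource with reviews (1927 to 1983) and Catalog the one with movie information only (1955 to 1999); the transcript does not settle which label goes with which description, though the final residual is the same either way. All function names are hypothetical, and the merge and absorption simplifications are not implemented.

```python
from itertools import chain

def simple(relations, types, first_year, last_year):
    """A simple query: a projection set plus a conjunction of two conditions."""
    return (frozenset(relations),
            {"type": frozenset(types),
             "year": frozenset(range(first_year, last_year + 1))})

def difference(q, r):
    """Simple queries covering what q asks for and r does not supply."""
    (q_rel, q_cond), (r_rel, r_cond) = q, r
    pieces, shared = [], {}
    for attr in q_cond:
        # Earlier attributes overlap r, this attribute falls outside r,
        # later attributes keep q's own condition.
        outside = q_cond[attr] - r_cond[attr]
        pieces.append((q_rel, {**q_cond, **shared, attr: outside}))
        shared[attr] = q_cond[attr] & r_cond[attr]
    # Every condition overlaps r: only the relations r does not project remain.
    pieces.append((q_rel - r_rel, {**q_cond, **shared}))
    # Drop pieces with an empty projection or an unsatisfiable condition.
    return [(rel, cond) for rel, cond in pieces if rel and all(cond.values())]

def residual(query, resources):
    """Subtract each resource in turn, piece by piece, from the query."""
    remaining = [query]
    for res in resources:
        remaining = list(chain.from_iterable(difference(q, res) for q in remaining))
    return remaining

def describe(piece):
    """Summarize a piece; the year range is only valid for contiguous years."""
    rel, cond = piece
    years = sorted(cond["year"])
    return (sorted(rel), sorted(cond["type"]), years[0], years[-1])

# Assumed assignment of descriptions to resource names (see lead-in).
textbook = simple({"Movie", "Review"}, {"drama", "comedy"}, 1927, 1983)
catalog = simple({"Movie"}, {"action", "documentary", "drama", "comedy"}, 1955, 1999)
query2 = simple({"Movie", "Review"}, {"drama", "documentary"}, 1950, 1959)

result = [describe(p) for p in residual(query2, [textbook, catalog])]
for line in result:
    print(line)
```

Under these assumptions the residual has two pieces: Movie and Review for documentaries from 1950 to 1954, and Review alone for documentaries from 1955 to 1959, matching the response to Query 2. Subtracting piece by piece in `residual` relies on difference distributing over compound queries.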

  15. Prototype • Proof of concept in LISP • Condition types: • On a PC, calculates a query plan over 1000 resources in <10 seconds • On a PC, calculates a query residual over 1000 resources in <30 seconds • Worst-case query lengths are governed by the number of attributes • Planning a straight C (or Java) implementation

  16. Formal Extensions • Materialized values, open and closed-world modifiers in language • Mixed Intensional/Extensional responses • Summarization • Encompassing less-restrictive schemas and queries • Inheritance, Cycles through Cliques • Spatial and Temporal conditions • Meta-schema (Meta-Meta-schema, …) • Row types and Semi-structured Data (XML) • Agents • Knowledge and Belief Operators

  17. Applications • Conversational (mixed intensional/extensional) access to distributed information spaces • ‘Perfect’ semantic caching over distributed agent systems • Reasoning over contracts, access restrictions, and regulations for electronic commerce among competitive agents • And more...

  18. Conclusions • The Query Difference Operator solves conceptual equations over schema of non-cyclic equi-joins • Applied to the problem of identifying missing information resources • Prototype proves concept and gives good performance • Set of formal extensions and new application ideas proposed
