
LORE: Lightweight Object Repository

by

Othman Chhoul

CSC5370 Fall 2003


Outline

  • Introduction

  • What is Lore?

  • History

  • Lore’s Internals

  • Conclusion

  • Questions

  • Demo


Introduction

  • Limitations of traditional databases:

    • They force all data to adhere to an explicitly specified schema

    • Data elements may change

    • Structures may change along the execution path of an application

    • Deciding on a fixed schema for irregular or unstable data is a headache


Semistructured Data

  • Semistructured data is widespread and is often described as:

    • “Self-describing”

    • “Schemaless”

  • Examples:

    • Data from the web

      • Overall site structure may change often.

      • It would be nice to be able to query a web site.

    • Data integrated from multiple, heterogeneous data sources.

      • Information sources change, or new sources are added.


What is Lore?

  • Lore is a DBMS designed specifically for managing semistructured information, such as XML

  • Among the pioneers in this domain


History

  • Built, from scratch, by the DB Group at Stanford University, with research funding from DARPA, NASA and others.

  • Introduced in 1995 with the first version of its query language, Lorel, and with OEM as its data model.

  • A lightweight system, because it was designed for single-user, read-only access.

  • 1999: changed to support XML


Lore’s Internals

  • Lore’s Data model

  • Lore’s Query Language

  • Lore’s General Architecture

  • When XML gets into action


OEM (Object Exchange Model)

  • Simple, self-describing, nested object model for semistructured data (XML?)

  • Data in this model can be thought of as a labeled directed graph

  • Vertices in graph are objects.

    • Each object has a unique object identifier (oid), such as &5.

    • Atomic objects have no outgoing edges and are of types such as int, real, string, gif, etc.

    • All other objects that have outgoing edges are called complex objects.


OEM (Summary)

  • An OEM object has:

    • Label: a character string, object aliases

    • OID: Object unique identifier

    • Type: Atomic (int, real, string), Complex

    • Value: for a complex object, a list of OIDs; for an atomic object, an atomic value of type int, real, string…
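
The object model summarized above can be sketched in a few lines of Python. This is only an illustration, not Lore's actual storage format: the class name OEMObject, the helper make_db, and the sample oids and values are assumptions, and labels are kept on the edges leading to sub-objects, matching the labeled-graph view of OEM.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple, Union

Atomic = Union[int, float, str]
Edge = Tuple[str, str]  # (label, oid of the sub-object)


@dataclass
class OEMObject:
    """Hypothetical in-memory form of one OEM object."""
    oid: str
    value: Union[Atomic, List[Edge]]  # atomic value, or labeled edges to sub-objects

    @property
    def is_atomic(self) -> bool:
        # Atomic objects carry a value and have no outgoing edges.
        return not isinstance(self.value, list)


def make_db(objects: List[OEMObject]) -> Dict[str, OEMObject]:
    """Index the objects by oid so edges can be followed."""
    return {obj.oid: obj for obj in objects}


# Hypothetical sample data, for illustration only.
db = make_db([
    OEMObject("&1", [("Member", "&2")]),
    OEMObject("&2", [("Name", "&3"), ("Age", "&4")]),
    OEMObject("&3", "Smith"),
    OEMObject("&4", 46),
])

member = db["&2"]
labels = [label for label, _ in member.value]
```

Here db["&2"] is a complex object whose value lists its labeled sub-objects, while db["&3"] and db["&4"] are atomic.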


OEM (Example)


Lorel (Lore’s Query Language)

  • Lorel is an extension of OQL

  • Lorel supports path expressions for traversing graph data

  • A simple path expression is a name followed by a sequence of labels.

    • DBGroup.Member.Office: the set of objects that can be reached by starting at the DBGroup object and following edges labeled Member and then Office.
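
As a rough sketch of how a simple path expression could be evaluated over a labeled graph, the Python below walks one label at a time. The function names follow and eval_path, the dict-based graph encoding, and the sample objects are assumptions made for this example; Lore's own evaluator is not shown in these slides.

```python
from typing import Dict, List, Tuple, Union

# oid -> atomic value, or list of (label, child oid) edges
Graph = Dict[str, Union[int, float, str, List[Tuple[str, str]]]]


def follow(db: Graph, start_oids: List[str], labels: List[str]) -> List[str]:
    """Follow the given edge labels, in order, from the starting objects."""
    current = list(start_oids)
    for label in labels:
        reached: List[str] = []
        for oid in current:
            value = db[oid]
            if isinstance(value, list):  # only complex objects have edges
                reached.extend(child for edge, child in value if edge == label)
        current = list(dict.fromkeys(reached))  # drop duplicates, keep order
    return current


def eval_path(db: Graph, names: Dict[str, str], path: str) -> List[str]:
    """Evaluate 'Name.Label1.Label2...' where Name is a named entry point."""
    name, *labels = path.split(".")
    return follow(db, [names[name]], labels)


# Hypothetical sample database, for illustration only.
db: Graph = {
    "&1": [("Member", "&2"), ("Member", "&3")],
    "&2": [("Name", "&4"), ("Office", "&6")],
    "&3": [("Name", "&7"), ("Office", "&9")],
    "&4": "Jones",
    "&6": "Gates252",
    "&7": "Smith",
    "&9": [("Building", "&10"), ("Room", "&11")],
    "&10": "CIS",
    "&11": "411",
}

offices = eval_path(db, {"DBGroup": "&1"}, "DBGroup.Member.Office")
```

With this sample data, offices holds the oids of one atomic Office object and one complex Office object, which shows that a single path can reach objects of different shapes.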


Lorel

  • Range variables can be assigned to path expressions

  • Path expressions are used directly in queries, in an SQL style:

    select DBGroup.Member.Office where DBGroup.Member.Age > 30


Lorel

Result:

Office “Gates252”

Office

    Building “CIS”

    Room “411”


Lorel (Behind the scenes)

  • The previous query is rewritten into OQL style:

    • select O from DBGroup.Member M, M.Office O where exists y in M.Age : y > 30

  • The comparison on Age is transformed into an existential condition:

    • A user can ask DBGroup.Member.Age < 30 regardless of whether Age is single-valued, set-valued, or unknown.
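
The existential reading of the comparison can be illustrated with a short Python sketch. The member records, the function name select_offices, and all sample values are hypothetical, and type coercion between strings and numbers is left out for brevity.

```python
from typing import Any, Dict, List

# Each member maps a label to the list of values found under that label.
Member = Dict[str, List[Any]]


def select_offices(members: List[Member], min_age: float) -> List[Any]:
    """select O from Member M, M.Office O where exists y in M.Age : y > min_age"""
    result: List[Any] = []
    for member in members:
        ages = member.get("Age", [])  # a missing Age is just an empty list
        if any(isinstance(y, (int, float)) and y > min_age for y in ages):
            result.extend(member.get("Office", []))
    return result


# Hypothetical data: single-valued, set-valued, and missing Age.
members: List[Member] = [
    {"Name": ["Jones"], "Age": [46], "Office": ["Gates252"]},
    {"Name": ["Smith"], "Age": [25, 52], "Office": ["CIS 411"]},
    {"Name": ["Clark"], "Office": ["Gates100"]},
]

offices = select_offices(members, 30)
```

A member with several Age values qualifies as soon as one of them satisfies the condition, and a member with no Age at all simply fails the test instead of raising an error.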


Lorel (More examples)

  • select DBGroup.Member.Name where DBGroup.Member.Office(.Room%)? like “%252”

  • Result: Name “Jones”, Name “Smith”

  • Update: update P.Member += ( select DBGroup.Member where DBGroup.Member.Name = "Clark" )

    from DBGroup.Project P

    where P.Title = "Lore" or P.Title = "Tsimmis"



Lore’s General Architecture

  • Query and Update Processing

  • External Data

  • DataGuides


Query and Update Processing

[Figure: queries enter the data engine, which returns results (a set of OEM objects).]


Query Plan Generator

  • select O from DBGroup.Member M, M.Office O where exists y in M.Age : y > 30


Query Iterators

  • Uses a recursive iterator approach:

    • Execution begins at the top of the query plan

    • Each node in the plan requests one tuple at a time from its children and performs some operation on the tuple(s)

    • Each node passes its result tuples up to its parent
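
The iterator approach can be sketched as a small pull-based pipeline in Python. The node classes ListScan, Select and Project, the run helper, and the sample rows are assumptions chosen for illustration; they are not Lore's physical operators.

```python
from typing import Any, Callable, Dict, Iterable, List, Optional

Row = Dict[str, Any]


class ListScan:
    """Leaf node: hands out rows from an in-memory list, one per call."""

    def __init__(self, rows: Iterable[Row]) -> None:
        self._rows = iter(rows)

    def next(self) -> Optional[Row]:
        return next(self._rows, None)  # None signals "no more tuples"


class Select:
    """Pulls from its child until a row satisfies the predicate."""

    def __init__(self, child: Any, predicate: Callable[[Row], bool]) -> None:
        self.child = child
        self.predicate = predicate

    def next(self) -> Optional[Row]:
        while True:
            row = self.child.next()
            if row is None or self.predicate(row):
                return row


class Project:
    """Applies a function to each row pulled from its child."""

    def __init__(self, child: Any, fn: Callable[[Row], Any]) -> None:
        self.child = child
        self.fn = fn

    def next(self) -> Optional[Any]:
        row = self.child.next()
        return None if row is None else self.fn(row)


def run(root: Any) -> List[Any]:
    """Execution starts at the top node and pulls until it is exhausted."""
    results: List[Any] = []
    while True:
        row = root.next()
        if row is None:
            return results
        results.append(row)


# Hypothetical rows, for illustration only.
rows = [{"age": 46, "office": "Gates252"}, {"age": 28, "office": "Gates100"}]
plan = Project(Select(ListScan(rows), lambda r: r["age"] > 30), lambda r: r["office"])
answer = run(plan)
```

No node materializes its whole input: each call to next on the top node triggers just enough work in the nodes below to produce one result.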


Tuples (Object Assignment)

  • An Object Assignment (OA) is a data structure containing slots for range variables, with additional slots depending on the query.

  • Each slot within an OA holds the oid of a vertex on a path being considered by the query engine.

  • By the end of a query, the engine should have produced complete OAs


Query Operators

  • The Scan operator returns all oids that are sub-objects of a given object following a specified path expression:

    • Scan (StartingOASlot, Path_expression, TargetOASlot)

    • For each oid in StartingOASlot, check whether the object satisfies Path_expression and place the matching oid into TargetOASlot.

  • For each OA returned by the left child, the Join operator calls the right child repeatedly until it returns no more OAs
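
A minimal sketch of Scan and Join working on object assignments, assuming an OA is a Python dict from slot name to oid and that the right child of a join is rebuilt for each OA coming from the left. The function names, the slot names root, M and O, and the sample graph are all hypothetical.

```python
from typing import Callable, Dict, Iterator, List, Tuple, Union

Graph = Dict[str, Union[int, float, str, List[Tuple[str, str]]]]
OA = Dict[str, str]  # slot name -> oid


def scan(db: Graph, oa: OA, start_slot: str, labels: List[str],
         target_slot: str) -> Iterator[OA]:
    """Yield one extended OA per object reachable from oa[start_slot] via labels."""
    frontier = [oa[start_slot]]
    for label in labels:
        frontier = [child
                    for oid in frontier if isinstance(db[oid], list)
                    for edge, child in db[oid] if edge == label]
    for oid in frontier:
        yield {**oa, target_slot: oid}


def join(left: Iterator[OA], right: Callable[[OA], Iterator[OA]]) -> Iterator[OA]:
    """For each OA from the left child, drain the right child completely."""
    for left_oa in left:
        for oa in right(left_oa):
            yield oa


# Hypothetical sample graph, for illustration only.
db: Graph = {
    "&1": [("Member", "&2"), ("Member", "&3")],
    "&2": [("Office", "&6")],
    "&3": [("Office", "&9")],
    "&6": "Gates252",
    "&9": "CIS 411",
}

# from DBGroup.Member M, M.Office O
members = scan(db, {"root": "&1"}, "root", ["Member"], "M")
plan = join(members, lambda oa: scan(db, oa, "M", ["Office"], "O"))
complete_oas = list(plan)
```

Each OA leaving the join has every slot filled in, which corresponds to the complete OAs expected at the end of a query.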


Query Operators (cont)

  • The aggregation operator (Aggr) adds to the target slot the result of the aggregation.

  • Join, Project and Select are almost identical to their corresponding relational operators

  • Other operators: CreateSet, GroupBy, ArithOp




Query Optimizer

  • Does only a few optimizations:

    • Push selection ops down query tree.

    • Eliminate/combine redundant query operators.

  • Explores query plans that use indexes when possible.

    • Two kinds of indexes:

      • Lindex (link index): returns all parent OIDs of a given OID via a given label; implemented with hashing.

      • Vindex (value index): returns all atomic objects with a given label that satisfy a condition; implemented as B+-trees.
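
The two index kinds can be approximated in Python as follows. This is a sketch under stated assumptions: a dict stands in for the hash-based Lindex, a sorted list searched with bisect stands in for the B+-tree Vindex, only numeric values are indexed, and the function names and sample data are invented for the example.

```python
import bisect
from collections import defaultdict
from typing import Dict, List, Tuple, Union

Graph = Dict[str, Union[int, float, str, List[Tuple[str, str]]]]


def build_lindex(db: Graph) -> Dict[Tuple[str, str], List[str]]:
    """Map (child oid, label) to the parents that reach the child via that label."""
    lindex: Dict[Tuple[str, str], List[str]] = defaultdict(list)
    for parent, value in db.items():
        if isinstance(value, list):
            for label, child in value:
                lindex[(child, label)].append(parent)
    return lindex


def build_vindex(db: Graph) -> Dict[str, Tuple[List[float], List[str]]]:
    """Per label, keep numeric atomic values sorted alongside their oids."""
    pairs_by_label: Dict[str, List[Tuple[float, str]]] = defaultdict(list)
    for value in db.values():
        if isinstance(value, list):
            for label, child in value:
                atom = db[child]
                if isinstance(atom, (int, float)):
                    pairs_by_label[label].append((atom, child))
    vindex = {}
    for label, pairs in pairs_by_label.items():
        pairs.sort()
        vindex[label] = ([v for v, _ in pairs], [oid for _, oid in pairs])
    return vindex


def vindex_greater_than(vindex, label: str, bound: float) -> List[str]:
    """Return oids of atomic objects with this label whose value exceeds bound."""
    keys, oids = vindex.get(label, ([], []))
    return oids[bisect.bisect_right(keys, bound):]


# Hypothetical sample graph, for illustration only.
db: Graph = {
    "&1": [("Member", "&2"), ("Member", "&3")],
    "&2": [("Age", "&5")],
    "&3": [("Age", "&8")],
    "&5": 46,
    "&8": 28,
}

lindex = build_lindex(db)
vindex = build_vindex(db)
old_ages = vindex_greater_than(vindex, "Age", 30)
owners = [parent for oid in old_ages for parent in lindex[(oid, "Age")]]
```

The last two lines show the bottom-up style these indexes allow: the Vindex finds the qualifying Age objects first, and the Lindex then walks from each of them back to its parent.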


Vindexes

  • Because of the non-strict typing system, there are three Vindexes: String Vindex, Real Vindex, and String-coerced-to-real Vindex.

  • Separate B+-trees of each type are constructed for each label.

  • Using a Vindex for a comparison:

    • If the value's type is string, do a lookup in the String Vindex

    • If the string can be converted to a real, then do a lookup in the String-coerced-to-real Vindex

    • If the type is real or int, do almost the same thing
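
To make the three Vindexes concrete, the sketch below sorts the atomic values of one label into a string index, a real index, and a string-coerced-to-real index. It then answers a numeric comparison by consulting the real and the coerced index; that lookup rule is an assumption about what the slide leaves unspecified, and the function names and sample values are hypothetical. Sorted lists stand in for B+-trees.

```python
import bisect
from typing import Dict, List, Optional, Tuple, Union


def to_real(text: str) -> Optional[float]:
    """Return the numeric reading of a string, or None if it has none."""
    try:
        return float(text)
    except ValueError:
        return None


def build_vindexes(atoms: List[Tuple[str, Union[int, float, str]]]) -> Dict[str, list]:
    """atoms holds (oid, value) pairs for all atomic objects under one label."""
    string_idx, real_idx, coerced_idx = [], [], []
    for oid, value in atoms:
        if isinstance(value, str):
            string_idx.append((value, oid))
            real = to_real(value)
            if real is not None:
                coerced_idx.append((real, oid))  # string that reads as a number
        else:
            real_idx.append((float(value), oid))
    return {"string": sorted(string_idx),
            "real": sorted(real_idx),
            "coerced": sorted(coerced_idx)}


def greater_than_real(indexes: Dict[str, list], bound: float) -> List[str]:
    """Assumed rule for a numeric bound: check real and coerced indexes."""
    hits: List[str] = []
    for name in ("real", "coerced"):
        pairs = indexes[name]
        keys = [key for key, _ in pairs]
        hits.extend(oid for _, oid in pairs[bisect.bisect_right(keys, bound):])
    return hits


# Hypothetical Age values stored with mixed types.
ages = [("&5", 46), ("&8", "52"), ("&12", "unknown"), ("&13", 28)]
indexes = build_vindexes(ages)
over_30 = greater_than_real(indexes, 30)
```

The string "52" ends up in both the string index and the coerced index, so it can take part in a numeric comparison even though it was stored as text.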



Index Query plans

  • If the user’s query contains a comparison between a path expression and a value, and the appropriate Vindex and Lindex exist, an index query plan is generated

  • Previous query:

    select O from DBGroup.Member M, M.Office O where exists y in M.Age : y > 30



Update Query plans

update P.Member +=( select DBGroup.Member where DBGroup.Member.Name = "Clark" )

from DBGroup.Project P

where P.Title = "Lore" or P.Title = "Tsimmis"


External Data

  • Enables retrieval of information from other data sources, transparent to the user.

  • An external object in Lore is a “placeholder” for the external data and specifies how Lore interacts with an external data source.


External Data

  • During query processing, the Scan operator notifies the external data manager whenever an external object is encountered

  • The specification for an external object includes:

    • The location of a wrapper program that fetches data and converts it to OEM

    • A timeout interval

    • A set of arguments used to limit the information fetched from the external source
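
One way to picture an external object is as a small record holding the three items above plus a cache. The sketch below is an assumption-heavy illustration: the class ExternalObject, its fetch method, the reading of the timeout as the period during which fetched data is reused, and the fake wrapper are all invented here, and a Python callable stands in for the wrapper program.

```python
import time
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, Optional


@dataclass
class ExternalObject:
    """Hypothetical placeholder for data that lives outside the database."""
    wrapper: Callable[..., Any]      # stands in for the wrapper program
    timeout: float                   # seconds during which fetched data is reused
    args: Dict[str, Any] = field(default_factory=dict)
    _cached: Any = field(default=None, init=False, repr=False)
    _fetched_at: Optional[float] = field(default=None, init=False, repr=False)

    def fetch(self, now: Optional[float] = None) -> Any:
        """Call the wrapper only if nothing is cached or the cache has expired."""
        now = time.monotonic() if now is None else now
        if self._fetched_at is None or now - self._fetched_at >= self.timeout:
            self._cached = self.wrapper(**self.args)
            self._fetched_at = now
        return self._cached


calls = []


def fake_wrapper(city: str) -> Dict[str, Any]:
    """Hypothetical wrapper returning OEM-like data for one city."""
    calls.append(city)
    return {"City": city, "Reading": len(calls)}


ext = ExternalObject(wrapper=fake_wrapper, timeout=60.0, args={"city": "Palo Alto"})
first = ext.fetch(now=0.0)
second = ext.fetch(now=30.0)   # still within the timeout, served from the cache
third = ext.fetch(now=90.0)    # timeout elapsed, wrapper is called again
```

The args dict plays the role of the arguments that limit what is fetched, so the wrapper never has to pull the whole external source.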


DataGuides

  • A DataGuide is a concise and accurate summary of the structure of an OEM database (stored as an OEM database itself, somewhat like a system catalog).

  • Why it is helpful:

    • With no explicit database schema, it is difficult to formulate meaningful queries

    • With no knowledge of the database structure, the query processor may perform unnecessary work

    • Evaluating a path expression that does not exist in the database is wasted effort

  • Each possible path expression is encoded once.


DataGuides (cont)

  • DataGuides are dynamically generated and maintained over an existing database

  • Statistics can be stored in the DataGuide, for example the number of atomic objects of each type reachable by a path p.
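
A simplified DataGuide can be built by recording every distinct label path once, together with the objects it reaches. The sketch assumes an acyclic database so that the set of paths is finite; the function names, the dict-of-paths representation, and the sample data are hypothetical and much simpler than Lore's own DataGuide structures.

```python
from collections import Counter
from typing import Dict, List, Tuple, Union

Graph = Dict[str, Union[int, float, str, List[Tuple[str, str]]]]
Path = Tuple[str, ...]


def build_dataguide(db: Graph, root: str) -> Dict[Path, List[str]]:
    """Map each distinct label path from root to the oids it reaches."""
    guide: Dict[Path, List[str]] = {}
    pending: List[Tuple[Path, List[str]]] = [((), [root])]
    while pending:
        path, targets = pending.pop()
        by_label: Dict[str, List[str]] = {}
        for oid in targets:
            value = db[oid]
            if isinstance(value, list):
                for label, child in value:
                    by_label.setdefault(label, []).append(child)
        for label, children in by_label.items():
            new_path = path + (label,)
            guide[new_path] = list(dict.fromkeys(children))  # each path stored once
            pending.append((new_path, guide[new_path]))
    return guide


def atomic_type_counts(db: Graph, guide: Dict[Path, List[str]], path: Path) -> Counter:
    """Statistic: number of atomic objects of each type reachable by path."""
    return Counter(type(db[oid]).__name__
                   for oid in guide[path] if not isinstance(db[oid], list))


# Hypothetical sample database, for illustration only.
db: Graph = {
    "&1": [("Member", "&2"), ("Member", "&3")],
    "&2": [("Name", "&4"), ("Age", "&5")],
    "&3": [("Name", "&7"), ("Office", "&9")],
    "&4": "Jones",
    "&5": 46,
    "&7": "Smith",
    "&9": "Gates252",
}

guide = build_dataguide(db, "&1")
name_stats = atomic_type_counts(db, guide, ("Member", "Name"))
```

Both members have a Name, yet the path Member.Name appears only once in the guide; a query over a path that is absent from the guide can be rejected without touching the data.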



When XML gets into Action

  • Little reminder:

    • Lore was first proposed in 1995

    • XML then emerged as a new standard for data representation and data exchange over the WWW

    • public class XML_data extends Semi_structured_data

    • Lore was among the pioneers in integrating XML into a DBMS architecture


From Semistructured Data to XML

  • Data Model

  • Query Language

  • DataGuides


Changes in The Data Model

  • Similar to an OEM object, an XML element in Lore is a pair <EID, VALUE>

  • EID: a unique element identifier

  • VALUE: either an atomic text string or a complex value containing:

    • A string-valued tag corresponding to the XML tag

    • An ordered list of attribute-name/atomic-value pairs

    • An ordered list of crosslink subelements of the form <label, EID>, reachable via IDREF or IDREFS

    • An ordered list of subelements of the form <label, EID>
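
The element model above can be sketched by loading a small XML document with Python's standard xml.etree.ElementTree parser. Several choices here are assumptions rather than Lore's behavior: the names ComplexValue and load, the EID numbering, storing element text as an atomic element under the label Text, and leaving the crosslink list empty because resolving IDREF attributes needs DTD information this sketch does not read.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass, field
from itertools import count
from typing import Dict, List, Tuple, Union


@dataclass
class ComplexValue:
    tag: str
    attributes: List[Tuple[str, str]] = field(default_factory=list)   # (name, value)
    crosslinks: List[Tuple[str, str]] = field(default_factory=list)   # (label, EID)
    subelements: List[Tuple[str, str]] = field(default_factory=list)  # (label, EID)


Store = Dict[str, Union[str, ComplexValue]]  # EID -> VALUE


def load(xml_text: str) -> Tuple[str, Store]:
    """Parse XML text and return (root EID, store of <EID, VALUE> pairs)."""
    store: Store = {}
    ids = count(1)

    def visit(node: ET.Element) -> str:
        eid = f"&{next(ids)}"
        value = ComplexValue(tag=node.tag, attributes=list(node.attrib.items()))
        store[eid] = value
        if node.text and node.text.strip():
            text_eid = f"&{next(ids)}"
            store[text_eid] = node.text.strip()        # atomic text element
            value.subelements.append(("Text", text_eid))
        for child in node:
            value.subelements.append((child.tag, visit(child)))
        return eid

    return visit(ET.fromstring(xml_text)), store


# Hypothetical document, for illustration only.
doc = '<Member Id="m1"><!-- a comment --><Name>Smith</Name></Member>'
root_eid, store = load(doc)
```

The default ElementTree parser drops the comment while parsing, so it never appears in the store.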


Changes in The Data Model (cont)

  • Comments are ignored

  • When an XML document is mapped into this new data model, it can be seen as a directed labeled graph



Query Language

  • Extended path expression to distinguish between subelements and attributes, by using qualifiers:

    • DBGroup.Member.>Name returns &6: use > to match subelements only

    • DBGroup.Member.@Name returns “Smith”: use @ to match attributes only

    • DBGroup.Member.Name returns &6 and “Smith”: when no @ or > qualifier is used, both attributes and subelements are matched


DataGuides

  • If a DTD is provided, Lore builds the corresponding DataGuide from it

  • Otherwise, if no DTD is provided, a DataGuide is generated from the XML document

  • Problems when updating:

    • When a DTD is provided, validity is assured

    • With no DTD, the DataGuide is updated as the XML document is updated


Conclusion

  • Lore was originally developed for the OEM data model in 1995; XML support was integrated later, in 1999

  • Lore provided a clear and robust solution for storing, querying, and updating semistructured data (XML came after)

  • The Stanford Database Group declared the Lore project essentially finished in 2000