MDBS Schema Integration: The Relational Integration Model

Researchers:

Ramon Lawrence, Ken Barker

University of Manitoba

TRLabs - Winnipeg


Outline

  • Introduction

  • The MDBS architecture and the Integration problem

  • A schema integration taxonomy

  • Previous Work

  • The RIM Architecture and the RIM Model

  • Future work and conclusions


Database Terminology

  • database system - a database and a system to manage the data

  • transaction - an atomic sequence of operations applied to the database

  • global transaction - a transaction spanning more than one database

  • schema integration - the process of combining local schemas into a global, integrated schema

  • multidatabase system (MDBS) - a collection of autonomous, local databases participating in a global database system to share data


MDBS Architecture

[Figure: MDBS architecture diagram. Labels: Global Transactions, GTM, subtransactions, GTS (x4), LDBS (x4), Local Transactions]

  • Global Transaction Manager (GTM)

    • processes global transactions

    • ensures information in all LDBSs is consistent

    • submits subtransactions to the GTSs for each LDBS

  • Global Transaction Servers (GTSs)

    • one for each LDBS

    • convert subtransactions from the GTM into a form usable by the LDBS and vice versa

  • Local Database Systems (LDBSs)

    • databases combined into MDBS

    • unchanged; they still process local transactions
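
The division of work between the GTM, the GTSs, and the LDBSs can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the class names, the string-based "conversion" step, and the sample operations are hypothetical, and the consistency guarantees the GTM is responsible for are not modeled.

```python
from dataclasses import dataclass, field


@dataclass
class LocalDatabase:
    """Stand-in for an LDBS: it only records the operations it receives."""
    name: str
    log: list = field(default_factory=list)

    def execute(self, native_op: str) -> str:
        self.log.append(native_op)
        return f"{self.name}: ok"


@dataclass
class GlobalTransactionServer:
    """One GTS per LDBS: converts a subtransaction into the LDBS's own form."""
    ldbs: LocalDatabase

    def submit(self, subtransaction: str) -> str:
        # Hypothetical conversion step: tag the operation with the target name.
        native_op = f"[{self.ldbs.name}] {subtransaction}"
        return self.ldbs.execute(native_op)


class GlobalTransactionManager:
    """Hands each subtransaction of a global transaction to the matching GTS."""

    def __init__(self, servers: dict):
        self.servers = servers  # maps LDBS name -> GTS

    def run(self, global_transaction: dict) -> list:
        # global_transaction maps LDBS name -> subtransaction text
        return [self.servers[name].submit(sub)
                for name, sub in global_transaction.items()]


claims = LocalDatabase("claims")
billing = LocalDatabase("billing")
gtm = GlobalTransactionManager({
    "claims": GlobalTransactionServer(claims),
    "billing": GlobalTransactionServer(billing),
})
results = gtm.run({"claims": "read claim 7", "billing": "read invoice 7"})
```

Local transactions would call `LocalDatabase.execute` directly, which is why the LDBSs themselves stay unchanged in this picture.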


The Integration Problem

  • Integrating diverse data sources is an important issue as organizations interconnect their operations and demand more from their database systems

  • Integration is a hard problem because structural and semantic conflicts exist

  • Two levels of integration:

    • schema integration

    • data integration


Schema Integration

  • Schema integration is the process of combining database schemas into a coherent global view

  • Integration problems include:

    • different data models

    • incompatible concept representations

    • different user or view perspectives

    • structural conflicts within a model

    • naming conflicts (homonym, synonym)
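
To make the two kinds of naming conflict concrete, here is a small Python sketch. It assumes the intended meaning of every local name is already known; the database names, local names, and meanings are invented for illustration. A synonym is one meaning stored under several names, and a homonym is one name used for several meanings.

```python
from collections import defaultdict

# Hypothetical input: (database, local name) -> intended meaning
local_names = {
    ("db1", "client"): "customer",
    ("db2", "customer"): "customer",
    ("db1", "balance"): "account balance",
    ("db2", "balance"): "stock on hand",
}


def find_naming_conflicts(names):
    """Return (synonyms, homonyms) found across the given local names."""
    names_by_meaning = defaultdict(set)
    meanings_by_name = defaultdict(set)
    for (_db, local), meaning in names.items():
        names_by_meaning[meaning].add(local)
        meanings_by_name[local].add(meaning)
    # Synonyms: a single meaning appears under more than one local name.
    synonyms = {m: sorted(ls) for m, ls in names_by_meaning.items() if len(ls) > 1}
    # Homonyms: a single local name carries more than one meaning.
    homonyms = {n: sorted(ms) for n, ms in meanings_by_name.items() if len(ms) > 1}
    return synonyms, homonyms


synonyms, homonyms = find_naming_conflicts(local_names)
```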


A Schema Integration Taxonomy

[Figure: taxonomy diagram with three axes: Automation Level, Conflicts Resolved, and Transparency]

  • Automation Level: automatic (dynamic), automatic (static), semi-automatic, manual

  • Remaining figure labels: interschema, all, naming, NONE, structural, semantic, behavioral, both


Previous Work

  • semantic models:

    • Batini (86), canonical models, SDM, DAPLEX

  • schema re-engineering:

    • model mapping tools, schema transformations

  • metadata systems:

    • rule-based systems

  • object-oriented methods:

    • use as a canonical model, schema transformations

  • application-level integration:

    • language systems, MSQL, IDL, higher order views


Previous Work (cont.)

  • Interdatabase dependencies:

    • Sheth - relaxed consistency, integration rules

  • AI techniques:

    • Pegasus (spheres of knowledge), knowledge packets, Carnot project (Cyc knowledge base)

  • Lexical semantics:

    • Summary Schemas Model (Bright et al.) - user interface for imprecise queries

  • Industrial systems:

    • Interbase


RIM: Objective

  • The objective of the RIM model is to provide a system for automatically integrating diverse relational schemas into a multidatabase

  • Desirable properties:

    • individual mappings - information sources integrated one-at-a-time and independently

    • global view constructed for query transparency

    • handles schema conflicts - including semantic, structural, and naming conflicts

    • automated global integration - global view constructed efficiently and automatically


RIM: The Idea

  • The idea behind the RIM model is that most (and probably all) schema conflicts can be resolved if we:

    • eliminate all naming conflicts

    • define a language capable of determining schema equivalence and performing transformations

  • With these two properties, schema conflicts can be resolved automatically at the global level


RIM: The Plan

  • The first task is eliminating naming conflicts:

    • use a global thesaurus/dictionary like SSM

    • map local schema names into global counterparts

    • identical concepts can be identified by global name

  • The integration language must be defined:

    • RIM specifications - records capturing semantics of each LDBS in a machine-processable form

    • global names captured in RIM specs. to identify concepts stored in LDBS
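
The first task, mapping local schema names to global counterparts, might look like the following Python sketch. The LDBS names, the `amt_net` column, and the dictionary contents are assumptions made for illustration; only `net_amount` and its global name come from the examples in this presentation.

```python
# Hypothetical global dictionary: the set of agreed global names.
GLOBAL_DICTIONARY = {
    "[Claim] Net Amount",
    "[Claim] Claim date (received by system)",
}

# Hypothetical per-LDBS mappings, as a DBA might record them.
LOCAL_TO_GLOBAL = {
    "ldbs_a": {"net_amount": "[Claim] Net Amount"},
    "ldbs_b": {"amt_net": "[Claim] Net Amount"},
}


def to_global(ldbs: str, local_name: str) -> str:
    """Look up the global name for a local name, checking the dictionary."""
    global_name = LOCAL_TO_GLOBAL[ldbs][local_name]
    if global_name not in GLOBAL_DICTIONARY:
        raise ValueError(f"{global_name!r} is not in the global dictionary")
    return global_name


def same_concept(a: tuple, b: tuple) -> bool:
    """Two (ldbs, local name) pairs match if they share a global name."""
    return to_global(*a) == to_global(*b)


match = same_concept(("ldbs_a", "net_amount"), ("ldbs_b", "amt_net"))
```

Once both local names resolve to the same global name, the two columns can be treated as the same concept regardless of what each system calls them.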


RIM: The Plan (cont.)

  • Integrate RIM specifications:

    • To query the MDBS, the client downloads and integrates only RIM specs. of LDBSs accessed

    • Global view is constructed from RIM specs. by automatically combining them at client site using global names and semantic metadata they contain

    • Use of global names allows system to determine identical concepts even though structural representations may be different

    • Semantic information captured using metadata
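
A minimal Python sketch of this integration step follows, assuming a RIM spec. can be reduced to a list of (table, column, global name) triples. That reduction, the LDBS names, and the sample entries are all hypothetical; real specs carry far more metadata.

```python
from collections import defaultdict

# Hypothetical, heavily simplified RIM specs: (table, column, global name)
RIM_SPECS = {
    "ldbs_a": [("claim", "net_amount", "[Claim] Net Amount"),
               ("claim", "claim_date", "[Claim] Claim date (received by system)")],
    "ldbs_b": [("claims", "amt_net", "[Claim] Net Amount")],
    "ldbs_c": [("policy", "holder", "[Policy] Holder")],
}


def build_global_view(specs, accessed):
    """Combine only the specs of the accessed LDBSs, keyed by global name."""
    global_view = defaultdict(list)
    for ldbs in accessed:
        for table, column, global_name in specs[ldbs]:
            global_view[global_name].append((ldbs, table, column))
    return dict(global_view)


# The client only touches ldbs_a and ldbs_b, so ldbs_c is never integrated.
global_view = build_global_view(RIM_SPECS, ["ldbs_a", "ldbs_b"])
```

Grouping by global name is what lets two differently named columns end up under one entry in the global view.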


RIM: The Plan (cont.)

  • Querying the MDBS:

    • queries are posed to the MDBS through the global view at each client

    • translation from the GV back to the original RIM spec. for each LDBS is performed

    • the translated queries are sent to each LDBS, which transforms the query (specified using RIM) into a query for the LDBS

    • results are returned to the client, which integrates them based on its GV
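
The translation from the global view back to each LDBS can be sketched as below. The layout of the global view, the LDBS and table names, and the use of plain SELECT strings are assumptions for illustration only; the presentation does not specify the query form, and integrating the returned results is left out.

```python
# Hypothetical global view: global name -> [(ldbs, table, local column)]
GLOBAL_VIEW = {
    "[Claim] Net Amount": [("ldbs_a", "claim", "net_amount"),
                           ("ldbs_b", "claims", "amt_net")],
}


def translate(global_view, wanted):
    """Map requested global names to {ldbs: {table: [local columns]}}."""
    plan = {}
    for global_name in wanted:
        for ldbs, table, column in global_view.get(global_name, []):
            plan.setdefault(ldbs, {}).setdefault(table, []).append(column)
    return plan


def local_queries(plan):
    """Render one SELECT statement per (ldbs, table) pair."""
    return {
        (ldbs, table): f"SELECT {', '.join(columns)} FROM {table}"
        for ldbs, tables in plan.items()
        for table, columns in tables.items()
    }


queries = local_queries(translate(GLOBAL_VIEW, ["[Claim] Net Amount"]))
```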


RIM: Architecture

[Figure: RIM architecture diagram. Labels: RDBS (x4), RIM spec. (x4), RIM Integration (x2), Global View (x2), Client (x2)]

  • RIM Specifications:

    • constructed at each RDBS

    • local concepts mapped to global names

    • schema can be automatically extracted

  • RIM Integration:

    • uses needed RIM specs.

    • constructs global view

    • resolves conflicts by:

      • identifying concepts using global names

      • transforming concepts into a form consistent with the global view


RIM: Using Global Names

  • Global names attempt to capture semantics of data and its structure

  • Research has found that a single dictionary term is insufficient to capture all semantics of a given data item

  • Current proposed global name term:

    • [context term] [concept name] ([adjective phrases])

    • [adjective phrase] = [adjective] [preposition] ([context term] or [concept name])


RIM: Using Global Names (cont.)

  • Here are a few examples of using global names:

    • the database stores damage claim information

  • Example 1:

    • attribute of claim is called net_amount in system

    • GN: [Claim] Net Amount

  • Example 2:

    • attribute of claim is called claim_date in system

    • GN1: [Claim] Claim date (received by system)

    • GN2: [Claim] Claim date (received by company)

    • GN3: [Claim] Claim date (submitted by claimant)
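
The proposed global name format can be modeled as a small data structure. The Python sketch below rebuilds the examples above from their parts; the class names are hypothetical, and joining several adjective phrases with a comma is an assumption, since the examples only ever show one phrase.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AdjectivePhrase:
    """[adjective] [preposition] ([context term] or [concept name])"""
    adjective: str
    preposition: str
    term: str

    def __str__(self) -> str:
        return f"{self.adjective} {self.preposition} {self.term}"


@dataclass(frozen=True)
class GlobalName:
    """[context term] [concept name] ([adjective phrases])"""
    context: str
    concept: str
    phrases: tuple = ()

    def __str__(self) -> str:
        text = f"[{self.context}] {self.concept}"
        if self.phrases:
            # Assumption: multiple phrases are separated by commas.
            text += " (" + ", ".join(str(p) for p in self.phrases) + ")"
        return text


net_amount = GlobalName("Claim", "Net Amount")
gn1 = GlobalName("Claim", "Claim date", (AdjectivePhrase("received", "by", "system"),))
gn2 = GlobalName("Claim", "Claim date", (AdjectivePhrase("received", "by", "company"),))
```

Because the adjective phrase is part of the name, `gn1` and `gn2` compare as different concepts even though both come from a column called claim_date.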


RIM: The Global Dictionary

  • To match concepts across systems, a global dictionary is required. Global names are taken from this dictionary.

  • Dictionary currently chosen is WordNet developed at Princeton:

    • complete on-line dictionary with a browser interface

    • provides multiple definitions per term

    • has built-in hypernym and synonym searching and referencing features

  • Future work involves determining how to add locally defined terms into the dictionary if required


RIM: Basic Concepts

  • There are 3 basic modeling constructs in RIM:

    • entity - a concept whose existence does not depend on any other entities

    • relationship - a combination of two or more entities which does not exist without them

    • attribute - a characteristic of an entity or a relationship

  • All entities and attributes should be identifiable by a global name from the dictionary.


RIM: RIM Specifications

  • A RIM specification consists of two parts:

    • table headers - table-level information for each relation in database

    • table schemas - information at the attribute level of a database relation

  • Most of the information can be automatically extracted; however, the DBA must assign global names to local concepts manually


RIM: The Table Header

  • The table header provides table-level information for each relation and has fields:

    • name - unique table name (local)

    • record size and count

    • foreign key list and foreign key access list

    • record insert/delete/update mechanisms

    • record name - semantic name for a table record

    • record type - entity, relationship instance, ...

    • record grouping - why are records in the table?

    • record distinction/duplicates - primary key

    • table comment
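
As a rough illustration, the table header could be written as a record type. In this Python sketch the field list follows the slide, but the field types, the defaults, and the sample values are assumptions, since the presentation does not define a concrete format.

```python
from dataclasses import dataclass, field


@dataclass
class TableHeader:
    """Table-level part of a RIM specification (types are assumed)."""
    name: str                 # unique local table name
    record_size: int          # assumed to be in bytes
    record_count: int
    record_name: str          # semantic name for a table record
    record_type: str          # e.g. "entity" or "relationship instance"
    record_grouping: str      # why the records are in the table
    primary_key: list         # record distinction/duplicates
    foreign_keys: list = field(default_factory=list)
    foreign_key_access: list = field(default_factory=list)
    insert_mechanism: str = ""
    delete_mechanism: str = ""
    update_mechanism: str = ""
    comment: str = ""


# Hypothetical header for a table of damage claims.
claim_header = TableHeader(
    name="claim",
    record_size=128,
    record_count=5000,
    record_name="Claim",
    record_type="entity",
    record_grouping="all damage claims received",
    primary_key=["claim_id"],
    comment="one row per claim",
)
```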


RIM: The Table Schema

  • The table schema contains attribute-level information. Some fields include:

    • field name - database system name

    • semantic name - global name

    • field use:

      • attribute, key, categorization, summation, date/time, foreign key, logical, numeric, reference
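
The attribute-level fields can be sketched the same way. The enumeration below lists the nine field uses named on the slide; the class names, the `claim_id` column and its global name, and the choice of exactly one use per field are assumptions.

```python
from dataclasses import dataclass
from enum import Enum


class FieldUse(Enum):
    ATTRIBUTE = "attribute"
    KEY = "key"
    CATEGORIZATION = "categorization"
    SUMMATION = "summation"
    DATE_TIME = "date/time"
    FOREIGN_KEY = "foreign key"
    LOGICAL = "logical"
    NUMERIC = "numeric"
    REFERENCE = "reference"


@dataclass(frozen=True)
class FieldSchema:
    field_name: str       # database system name
    semantic_name: str    # global name
    field_use: FieldUse   # assumption: one use per field


# Hypothetical table schema for a claim table.
claim_table_schema = [
    FieldSchema("claim_id", "[Claim] Identifier", FieldUse.KEY),
    FieldSchema("net_amount", "[Claim] Net Amount", FieldUse.NUMERIC),
    FieldSchema("claim_date", "[Claim] Claim date (received by system)",
                FieldUse.DATE_TIME),
]


def fields_with_use(schema, use):
    """Return the local field names that have the given use."""
    return [f.field_name for f in schema if f.field_use is use]


date_fields = fields_with_use(claim_table_schema, FieldUse.DATE_TIME)
```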


RIM: Semantic Conflicts

  • There are 6 basic semantic conflicts in RIM:

    • attribute-entity conflict

    • attribute-relationship conflict

    • entity-relationship conflict

    • entity-entity conflict (not studied)

    • attribute-attribute conflict (not studied)

    • relationship-relationship conflict (not studied)

  • There are some basic ideas on how to automatically resolve the first 3 conflicts.

  • Conflict resolution is an area of future work.
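
Detecting the first three conflicts could start from something like the Python sketch below. It assumes each spec can be reduced to a mapping from global name to modeling construct and that a conflict means the same global name is modeled with different constructs in two specs. That reading, the simplified names, and the sample data are assumptions; the three same-construct conflicts are not handled, and no resolution is attempted.

```python
from typing import Optional

CONSTRUCTS = ["attribute", "entity", "relationship"]


def classify(kind_a: str, kind_b: str) -> Optional[str]:
    """Name the conflict between two constructs, or None if they agree."""
    if kind_a == kind_b:
        return None  # same-construct conflicts are outside this sketch
    first, second = sorted((kind_a, kind_b), key=CONSTRUCTS.index)
    return f"{first}-{second} conflict"


def find_conflicts(spec_a: dict, spec_b: dict) -> dict:
    """Compare the constructs used for global names present in both specs."""
    return {
        name: classify(spec_a[name], spec_b[name])
        for name in spec_a.keys() & spec_b.keys()
        if spec_a[name] != spec_b[name]
    }


# Hypothetical specs: global name -> construct used in that LDBS
spec_a = {"Claim": "entity", "Claimant": "attribute", "Policy": "entity"}
spec_b = {"Claim": "entity", "Claimant": "entity"}
conflicts = find_conflicts(spec_a, spec_b)
```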


Conclusions

  • Current integration methodologies are insufficient because they rely on manual intervention and do not resolve all types of conflicts

  • The RIM model may be able to integrate diverse relational schemas using a global dictionary, a systematic method for capturing data semantics, and automated procedures for performing client run-time integration


Future Work

  • Determining how the RIM specifications can be constructed and what information can be automatically extracted

  • Deciding the format for the global dictionary

  • Studying conflict resolution procedures and testing methodology on simple integration problems