MDBS Schema Integration: The Relational Integration Model


Ramon Lawrence, Ken Barker

University of Manitoba

TRLabs - Winnipeg


Outline

  • Introduction

  • The MDBS architecture and the Integration problem

  • A schema integration taxonomy

  • Previous Work

  • The RIM Architecture and the RIM Model

  • Future work and conclusions

Database Terminology

  • database system - a database and a system to manage the data

  • transaction - an atomic sequence of operations applied to the database

  • global transaction - a transaction spanning more than one database

  • schema integration - the process of combining local schemas into a global, integrated schema

  • multidatabase system (MDBS) - a collection of autonomous, local databases participating in a global database system to share data

MDBS Architecture

Global Transactions

  • Global Transaction Manager (GTM)

    • processes global transactions

    • ensures information in all LDBSs is consistent

    • submits subtransactions to the GTSs for each LDBS



  • Global Transaction Servers (GTSs)

    • one for each LDBS

    • converts subtransactions from the GTM into a form usable by the LDBS and vice versa

  • Local Database Systems (LDBSs)

    • databases combined into MDBS

    • remain unchanged, since they still process local transactions
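
The division of labor among the three components can be sketched as follows. This is a minimal illustration only: the class names, the string-based "translation", and the `claims`/`billing` databases are assumptions made for the example, and the sketch omits the consistency guarantees the GTM is responsible for.

```python
from dataclasses import dataclass, field


@dataclass
class LocalDatabase:
    """Stand-in for an autonomous LDBS; it only logs the operations it receives."""
    name: str
    log: list = field(default_factory=list)

    def execute(self, native_op: str) -> str:
        self.log.append(native_op)
        return f"{self.name}: ok"


@dataclass
class GlobalTransactionServer:
    """One GTS per LDBS: converts a subtransaction into the local form."""
    ldbs: LocalDatabase

    def run(self, subtransaction: str) -> str:
        # Hypothetical translation step: tag the operation with the local dialect.
        native_op = f"[{self.ldbs.name} dialect] {subtransaction}"
        return self.ldbs.execute(native_op)


class GlobalTransactionManager:
    """Routes each subtransaction of a global transaction to the matching GTS."""

    def __init__(self, servers: dict):
        self.servers = servers  # LDBS name -> GlobalTransactionServer

    def submit(self, global_txn: dict) -> dict:
        # global_txn maps an LDBS name to the subtransaction intended for it.
        return {name: self.servers[name].run(sub) for name, sub in global_txn.items()}


claims_db = LocalDatabase("claims")
billing_db = LocalDatabase("billing")
gtm = GlobalTransactionManager({
    "claims": GlobalTransactionServer(claims_db),
    "billing": GlobalTransactionServer(billing_db),
})
results = gtm.submit({"claims": "read claim 7", "billing": "read invoice 7"})
```

Local transactions would bypass the GTM and GTS entirely and call `LocalDatabase.execute` directly, which is why the LDBSs can stay unchanged.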

Local Transactions

The Integration Problem

  • Integrating diverse data sources is an important issue as organizations interconnect their operations and demand more from their database systems

  • Integration is a hard problem because structural and semantic conflicts exist

  • Two levels of integration:

    • schema integration

    • data integration

Schema Integration

  • Schema integration is the process of combining database schemas into a coherent global view

  • Integration problems include:

    • different data models

    • incompatible concept representations

    • different user or view perspectives

    • structural conflicts within a model

    • naming conflicts (homonym, synonym)

A Schema Integration Taxonomy

[Figure: taxonomy diagram. Recoverable labels: Automation Level; automatic (dynamic); automatic (static).]

Previous Work

  • semantic models:

    • Batini (86), canonical models, SDM, DAPLEX

  • schema re-engineering:

    • model mapping tools, schema transformations

  • metadata systems:

    • rule-based systems

  • object-oriented methods:

    • use as a canonical model, schema transformations

  • application-level integration:

    • language systems, MSQL, IDL, higher order views

Previous Work (cont.)

  • Interdatabase dependencies:

    • Sheth - relaxed consistency, integration rules

  • AI techniques:

    • Pegasus (spheres of knowledge), knowledge packets, Carnot project (Cyc knowledge base)

  • Lexical semantics:

    • Summary Schemas Model (Bright et al.) - user interface for imprecise queries

  • Industrial systems:

    • Interbase

RIM: Objective

  • The objective of the RIM model is to provide a system for automatically integrating diverse relational schemas into a multidatabase

  • Desirable properties:

    • individual mappings - information sources integrated one-at-a-time and independently

    • global view constructed for query transparency

    • handles schema conflicts - including semantic, structural, and naming conflicts

    • automated global integration - global view constructed efficiently and automatically

RIM: The Idea

  • The idea behind the RIM model is that most (and probably all) schema conflicts can be resolved if we:

    • eliminate all naming conflicts

    • define a language capable of determining schema equivalence and performing transformations

  • With these two properties, schema conflicts can be resolved automatically at the global level

RIM: The Plan

  • The first task is eliminating naming conflicts:

    • use a global thesaurus/dictionary like SSM

    • map local schema names into global counterparts

    • identical concepts can be identified by global name

  • The integration language must be defined:

    • RIM specifications - records capturing semantics of each LDBS in a machine-processable form

    • global names captured in RIM specs. to identify concepts stored in LDBS
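
A small sketch of the naming step, assuming two hypothetical LDBSs (`ldbs_a`, `ldbs_b`) whose DBAs have already mapped local column names to global names. The local names and the dictionary layout are illustrative assumptions, not part of the RIM definition.

```python
# Hypothetical per-LDBS mappings from local names to global names.
LOCAL_TO_GLOBAL = {
    "ldbs_a": {
        "claim.net_amount": "[Claim] Net Amount",
        "claim.claim_date": "[Claim] Claim date (received by company)",
    },
    "ldbs_b": {
        "claims.amt_net": "[Claim] Net Amount",
    },
}


def group_by_global_name(mappings):
    """Invert the mappings: global name -> list of (ldbs, local name) pairs."""
    grouped = {}
    for ldbs, names in mappings.items():
        for local_name, global_name in names.items():
            grouped.setdefault(global_name, []).append((ldbs, local_name))
    return grouped


same_concept = group_by_global_name(LOCAL_TO_GLOBAL)
```

Here `net_amount` and `amt_net` end up under the same key, so the synonym conflict disappears once both are expressed through the shared global name.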

RIM: The Plan (cont.)

  • Integrate RIM specifications:

    • To query the MDBS, the client downloads and integrates only RIM specs. of LDBSs accessed

    • Global view is constructed from RIM specs. by automatically combining them at client site using global names and semantic metadata they contain

    • Use of global names allows system to determine identical concepts even though structural representations may be different

    • Semantic information captured using metadata
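
The client-side integration step might look roughly like the sketch below. The record layout of a RIM spec, the `use` values, and the `[Policy] Holder` name are assumptions for illustration; the only point carried over from the slides is that the client merges just the specs it needs and matches concepts by global name.

```python
# Hypothetical RIM specs, one list of field records per LDBS.
RIM_SPECS = {
    "ldbs_a": [
        {"table": "claim", "field": "net_amount",
         "global_name": "[Claim] Net Amount", "use": "numeric"},
    ],
    "ldbs_b": [
        {"table": "claims", "field": "amt_net",
         "global_name": "[Claim] Net Amount", "use": "numeric"},
        {"table": "claims", "field": "recv_dt",
         "global_name": "[Claim] Claim date (received by company)", "use": "date/time"},
    ],
    "ldbs_c": [
        {"table": "policy", "field": "holder",
         "global_name": "[Policy] Holder", "use": "attribute"},
    ],
}


def build_global_view(rim_specs, accessed):
    """Merge only the specs of the accessed LDBSs, keyed by global name."""
    view = {}
    for ldbs in accessed:
        for record in rim_specs[ldbs]:
            entry = view.setdefault(record["global_name"], {"uses": set(), "sources": []})
            entry["uses"].add(record["use"])  # semantic metadata kept alongside
            entry["sources"].append((ldbs, record["table"], record["field"]))
    return view


global_view = build_global_view(RIM_SPECS, accessed=["ldbs_a", "ldbs_b"])
```

Because `ldbs_c` is not accessed, its spec is never downloaded or merged, which keeps the global view limited to what the query needs.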

RIM: The Plan (cont.)

  • Querying the MDBS:

    • queries are posed to the MDBS through the global view at each client

    • translation from the GV back to the original RIM spec. for each LDBS is performed

    • the translated queries are sent to each LDBS which transforms the query (specified using RIM) into a query for the LDBS

    • results are returned to the client which integrates them based on its GV
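
The query path can be sketched under strong simplifying assumptions: a global query names a single attribute, the global view is a plain dictionary, and translation goes straight to an SQL string (the slides describe an intermediate RIM form that each LDBS translates itself, which is collapsed here). All table and column names are hypothetical.

```python
# Hypothetical global view: global name -> (ldbs, table, column) sources.
GLOBAL_VIEW = {
    "[Claim] Net Amount": [
        ("ldbs_a", "claim", "net_amount"),
        ("ldbs_b", "claims", "amt_net"),
    ],
}


def translate(global_name, view):
    """Rewrite a one-attribute global query as one local query per LDBS."""
    return {
        ldbs: f"SELECT {column} FROM {table}"
        for ldbs, table, column in view[global_name]
    }


def integrate(global_name, per_ldbs_rows):
    """Relabel each local result with the global name and concatenate them."""
    return [
        {"source": ldbs, global_name: value}
        for ldbs, rows in per_ldbs_rows.items()
        for value in rows
    ]


local_queries = translate("[Claim] Net Amount", GLOBAL_VIEW)
# Pretend the LDBSs answered with these values.
merged = integrate("[Claim] Net Amount", {"ldbs_a": [100.0], "ldbs_b": [250.0, 75.5]})
```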

RIM: Architecture

[Figure: RIM architecture diagram. Recoverable label: Global View.]

  • RIM Specifications:

    • constructed at each RDBS

    • local concepts mapped to global names

    • schema can be automatically extracted

  • RIM Integration:

    • uses needed RIM specs.

    • constructs global view

    • resolves conflicts by:

      • identifying concepts using global names

      • transforming concepts into a form consistent with the global view

RIM: Using Global Names

  • Global names attempt to capture semantics of data and its structure

  • Research has found that a single dictionary term is insufficient to capture all semantics of a given data item

  • Current proposed global name term:

    • [context term] [concept name] ([adjective phrases])

    • [adjective phrase] = [adjective] [preposition] ([context term] or [concept name])
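
One way to make the proposed term format machine-checkable is a small parser. The sketch below assumes a concrete syntax in which the context term is written in square brackets and the adjective phrase in parentheses, as in `[Claim] Claim date (received by system)`; the regular expression and the `GlobalName` type are illustrative, not part of RIM, and the adjective phrase is kept as an unparsed string.

```python
import re
from typing import NamedTuple, Optional


class GlobalName(NamedTuple):
    context: str
    concept: str
    adjective_phrase: Optional[str]


# [context] concept (optional adjective phrase)
_PATTERN = re.compile(
    r"^\[(?P<context>[^\]]+)\]\s+(?P<concept>[^()]+?)(?:\s+\((?P<adj>[^()]+)\))?$"
)


def parse_global_name(text: str) -> GlobalName:
    """Split a global name into context term, concept name, and adjective phrase."""
    match = _PATTERN.match(text.strip())
    if match is None:
        raise ValueError(f"not a well-formed global name: {text!r}")
    return GlobalName(match["context"], match["concept"], match["adj"])


plain = parse_global_name("[Claim] Net Amount")
qualified = parse_global_name("[Claim] Claim date (received by system)")
```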

RIM: Using Global Names (cont.)

  • Here are a few examples of using global names:

    • assume the database stores damage claim information

  • Example 1:

    • attribute of claim is called net_amount in system

    • GN: [Claim] Net Amount

  • Example 2:

    • attribute of claim is called claim_date in system

    • GN1: [Claim] Claim date (received by system)

    • GN2: [Claim] Claim date (received by company)

    • GN3: [Claim] Claim date (submitted by claimant)

RIM: The Global Dictionary

  • To match concepts across systems, a global dictionary is required. Global names are taken from this dictionary.

  • Dictionary currently chosen is WordNet developed at Princeton:

    • complete on-line dictionary with a browser interface

    • defines multiple definitions per term

    • has built in hypernym and synonym searching and referencing features

  • Future work involves determining how to add locally defined terms into the dictionary if required

RIM: Basic Concepts

  • There are 3 basic modeling constructs in RIM:

    • entity - a concept whose existence does not depend on any other entities

    • relationship - a combination of two or more entities which does not exist without them

    • attribute - a characteristic of an entity or a relationship

  • All entities and attributes should be identifiable by a global name from the dictionary.
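
The three constructs could be represented along these lines. The class layout is an assumption made for illustration; in particular, giving relationships a global name of their own goes beyond what the slide states, and the `[Policy]` and `[Claim] Filed under` names are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Attribute:
    global_name: str  # a characteristic of an entity or a relationship


@dataclass
class Entity:
    global_name: str  # exists independently of other entities
    attributes: List[Attribute] = field(default_factory=list)


@dataclass
class Relationship:
    global_name: str  # assumption: relationships are named too
    entities: List[Entity]
    attributes: List[Attribute] = field(default_factory=list)

    def __post_init__(self):
        # A relationship cannot exist without at least two participating entities.
        if len(self.entities) < 2:
            raise ValueError("a relationship combines two or more entities")


claim = Entity("[Claim]", [Attribute("[Claim] Net Amount")])
policy = Entity("[Policy]")
filed_under = Relationship("[Claim] Filed under", [claim, policy])
```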

RIM: RIM Specifications

  • A RIM specification consists of two parts:

    • table headers - table-level information for each relation in database

    • table schemas - information at the attribute level of a database relation

  • Most of the information can be extracted automatically; however, the DBA must assign global names to local concepts manually

RIM: The Table Header

  • The table header provides table-level information for each relation and has fields:

    • name - unique table name (local)

    • record size and count

    • foreign key list and foreign key access list

    • record insert/delete/update mechanisms

    • record name - semantic name for a table record

    • record type - entity, relationship instance, ...

    • record grouping - why are records in the table?

    • record distinction/duplicates - primary key

    • table comment
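
As a rough sketch, the header fields listed above could be held in a record like the one below. The Python types, the field spellings, and the sample values for a `claim` table are all assumptions; the slides name the fields but do not fix their representation.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class TableHeader:
    name: str                      # unique local table name
    record_size: int               # assumed unit: bytes per record
    record_count: int
    record_name: str               # semantic name for one table record
    record_type: str               # e.g. "entity" or "relationship instance"
    record_grouping: str           # why the records are in this table
    primary_key: List[str]         # record distinction / duplicates
    foreign_keys: List[str] = field(default_factory=list)
    foreign_key_access: List[str] = field(default_factory=list)
    # insert/delete/update mechanisms, keyed by operation name
    modification_mechanisms: Dict[str, str] = field(default_factory=dict)
    comment: str = ""


claim_header = TableHeader(
    name="claim",
    record_size=64,
    record_count=1200,
    record_name="[Claim]",
    record_type="entity",
    record_grouping="all damage claims",
    primary_key=["claim_id"],
    comment="hypothetical example table",
)
```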

RIM: The Table Schema

  • The table schema contains attribute-level information. Some fields include:

    • field name - database system name

    • semantic name - global name

    • field use:

      • attribute, key, categorization, summation, date/time, foreign key, logical, numeric, reference
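
The attribute-level record can be sketched in the same spirit. The enumeration values come from the field-use list above; the class names and the helper `fields_with_use` are hypothetical.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List


class FieldUse(Enum):
    ATTRIBUTE = "attribute"
    KEY = "key"
    CATEGORIZATION = "categorization"
    SUMMATION = "summation"
    DATE_TIME = "date/time"
    FOREIGN_KEY = "foreign key"
    LOGICAL = "logical"
    NUMERIC = "numeric"
    REFERENCE = "reference"


@dataclass
class SchemaField:
    field_name: str       # name used by the database system
    semantic_name: str    # global name
    field_use: FieldUse


def fields_with_use(schema: List[SchemaField], use: FieldUse) -> List[str]:
    """Return the local names of all fields that have the given use."""
    return [f.field_name for f in schema if f.field_use is use]


claim_schema = [
    SchemaField("net_amount", "[Claim] Net Amount", FieldUse.NUMERIC),
    SchemaField("claim_date", "[Claim] Claim date (received by company)", FieldUse.DATE_TIME),
]
date_fields = fields_with_use(claim_schema, FieldUse.DATE_TIME)
```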

RIM: Semantic Conflicts

  • There are 6 basic semantic conflicts in RIM:

    • attribute-entity conflict

    • attribute-relationship conflict

    • entity-relationship conflict

    • entity-entity conflict (not studied)

    • attribute-attribute conflict (not studied)

    • relationship-relationship conflict (not studied)

  • There are some basic ideas on how to automatically resolve the first 3 conflicts.

  • Conflict resolution is an area of future work.
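
The six categories follow from pairing the three modeling constructs. The sketch below only names the category for two representations of the same global name and reports whether the slides mark it as studied; it does not resolve anything, and the function name is hypothetical. For same-kind pairs the category applies only when the two representations actually differ.

```python
KINDS = ("attribute", "entity", "relationship")


def conflict_category(kind_a: str, kind_b: str):
    """Name the conflict category for two constructs sharing a global name.

    Returns (category name, studied), where studied is True for the three
    mixed-kind categories and False for the three same-kind ones.
    """
    for kind in (kind_a, kind_b):
        if kind not in KINDS:
            raise ValueError(f"unknown construct kind: {kind!r}")
    first, second = sorted([kind_a, kind_b])
    return f"{first}-{second} conflict", first != second


# One LDBS stores the concept as an entity, another as a plain attribute.
mixed = conflict_category("entity", "attribute")
same = conflict_category("entity", "entity")
```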


Conclusions

  • Current integration methodologies are insufficient because they rely on manual intervention and do not resolve all types of conflicts

  • The RIM model may be able to integrate diverse relational schemas using a global dictionary, a systematic method for capturing data semantics, and automated procedures for performing client run-time integration

Future Work

  • Determining how the RIM specifications can be constructed and what information can be automatically extracted

  • Deciding the format for the global dictionary

  • Studying conflict resolution procedures and testing methodology on simple integration problems