Schema evolution
1 / 20

Schema Evolution - PowerPoint PPT Presentation

  • Uploaded on

Schema Evolution. Schema Evolution. Ref: Alex Borgida & K. E. Williamson. “Accommodating Exceptions in Databases, and Refining the Schema by Hearing From Them”. VLDB, 1985. SCHEMA User Describes the “domain of discoveries” of the database Check the correctness of the data entry DBMS

I am the owner, or an agent authorized to act on behalf of the owner, of the copyrighted work described.
Download Presentation

PowerPoint Slideshow about 'Schema Evolution' - xerxes

An Image/Link below is provided (as is) to download presentation

Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author.While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server.

- - - - - - - - - - - - - - - - - - - - - - - - - - E N D - - - - - - - - - - - - - - - - - - - - - - - - - -
Presentation Transcript

Schema evolution1
Schema Evolution

Ref: Alex Borgida & K. E. Williamson. “Accommodating Exceptions in Databases, and Refining the Schema by Hearing From Them”. VLDB, 1985.


  • User

    • Describes the “domain of discoveries” of the database

    • Check the correctness of the data entry

  • DBMS

    • Storage efficiency (store extra attributes and non-unique keys, attribute domains constrain violations)

    • Faster retrieval

    • Semantic query optimization

  • Integrity constraints - used to enforce additional conditions not captured by the restrictive language of class definition

    • All employees have a supervisor

    • Supervisor wage > Employee wage

Exceptions In Database

  • Some employees make more than their supervisor.

  • Some employees may earn a foreign degree, which is not even close to the American ones.

    How To Live With Exceptions

  • Storage and access efficiency.

  • Semantic computation (employee is a consultant).

  • Sharing (US dollars are now Swiss francs).

  • Validating of future updates.

    • Use object concept to represent exceptional information so that such information can be maintained

    • Alert the user when exceptional information is being manipulated as well as when constraints are violated

Learning From Exceptions

  • Modifying integrity constraints to apply in more restricted circumstances.

  • Adding new attributes to a class definition.

  • Defining a new class and placing it in the current sub-hierarchy.

    Using Empirical Generalization For Schema Refinement

  • DBA keeps track of violations and exceptions to constraints in the schema.

  • When sufficient evidence is accumulated, it suggests one or more alternative changes to the schema.

The generalization algorithm
The Generalization Algorithm

  • Introduce/expand range

  • Introduce/generalize class

    Supervisor: Accountant -> Supervisor: Employee

  • Drop attribute specification

    Research (age 30, Ph.D.) and Research (age 35, MS)

    -> Research (age 30-35, MS-Ph.D.)

Refinement to the generalization process
Refinement to the Generalization Process

  • If certain attribute specification is dropped, then the resulting description becomes overly generalized.

  • If no longer detects some of the errors it detects earlier.

  • Need heuristic for determining when an attribute constraint is essential.

  • Notion of RELEVANCE

    • Keep the attribute in the generalization if more relevant

    • Discard the attributes in the generalization if less relevant

  • Integrity constraints must be refined to disregard the contradictions raised by exception value.

  • Machine learning to find commonalities among the elements of a set of objects, such as exception to a rule encountered so far and characterize these in the form of a class description.

  • Use such characterization to adjust schema.

    • Changing restriction on property value range

    • Modifying integrity constraints

    • Define new classes


    • Heuristic nature of the RELEVANCE function.

    • Inability to handle negation and disjunction.

A proposal for automatic schema evolution
A Proposal For Automatic Schema Evolution contradictions raised by exception value.

  • Predefine Schema templates

  • Monitoring the exception frequency of constraints

  • Switch over to new schema when exception frequency reaches a certain pre-specified level

  • New schema is selected based on the exception characteristics

Ref: Q. Li and D. McLeod. “Object Flavor Evolution Through Learning in an Object-Oriented Database System”. Proceedings of the 2nd International Conference on Expert Database Systems, 1989, pp. 469-495.

PKM: Personal Knowledge Model

  • A simple object-oriented model

  • Based on a small number of simple concepts

  • Stepwise develop more complex and abstract database models

    Level 1 Atomic Object - Non-decomposable units of information at a certain stage (e.g., John Smith).

    Level 2 Open Atomic Set - A collection of atomic objects, members are not fixed (e.g., P names).

Closed Atomic Set Through Learning in an Object-Oriented Database System”. Proceedings of the 2 - A collection of atomic objects, members are fixed [e.g., sex, age (>= 0, integer)]

Mapping - Mapping from one object to another

- Domain & range mapping (e.g., has sender mapping from letter to person)

Social - Concept describes relationship (e.g., Jon has GPA 3.75)

Procedure - Operation/methods (e.g., method for creating or destroying objects)

Level 3 Open Social Set - classified a collection of objects (e.g., applicants)

Closed Social Set – (e.g., courses)

Composed Mapping - virtual objects derived by composing other objects

Composed Procedure - virtual object derived by grouping a collection of other procedure objects

Evolution system architecture
Evolution System Architecture objects (e.g., applicants)

ICE Intelligent Concept Evolver

  • How, what & when to evolve

    PKM Personal Knowledge Manager

    LFI Learning From Instruction (passive learning)

  • Initiated by user

  • Acquire knowledge from user

  • Dialogue-based

  • Involved some inference

    LFE Learning From Exceptions (passive learning)

    Learning is triggered by:

  • Violations of constraints

  • Definition of the object in the DB

  • Involved more inference than LFI

Evolution system architecture cont d
Evolution System Architecture (cont’d) objects (e.g., applicants)

LFO Learning From Observation (active learning)

  • Check if any evolution should be conducted

  • Discovery of target evolution

  • Creation of classification criteria for set object and similar tasks

  • LFO is automatic, more inference than LFI and LFE


1) Find GPA of all students in the objects (e.g., applicants) third world country. Since third world country is not defined in the schema, ICE conducts LFI (learning from instruction)

Acquire the concept of third world country given the member of the country and ask the user to identify if they are a third world country.

ICE learning all the third world countries in the DB introduces a new object “countries-class” with member “third world”, “second world”, and “first world”. Countries evolved to countries class.

LFO (learning from observation) is triggered to find mapping cardinality constraint of “has class”, also determines if it is a many-to-one mapping.

2) CS 10 -> CS 10a, 10b, 10c

CS 10 violation! ICE calls LFE and LFI

LFI learns there are 10a, 10b, 10c (same course name, unit number, instructor, and students)

Students of each section should be proper subset of original student set.


3) Honor student has GPA >= 3.75 objects (e.g., applicants)

  • Honor students can also be selected based on recommendation letters from Professors but GPA < 3.75

  • ICE’s LFO (learning from observation) learning that every instance satisfies a generalization condition GPA >= 3.70

  • ICE changes the honor student’s condition from GPA>=3.75 to 3.70

Planning scheduling initiative pi
Planning & Scheduling Initiative (PI) objects (e.g., applicants)

Sponsored by the Defense Advanced Research Project Agency (DARPA) Rome Laboratories (RL)

  • Goal:

    • Promote automated planning technology

    • Smooth transition of the planning technology to operational planning and scheduling problems

  • Approach:

    • Tier 1: Independent research effects

      develops automated reasoning capabilities that are relevant to PI

    • Tier 2: Integrated Feasibility Demos

      Merging efforts in Tier 1 into a single system

      and single operational problem

  • Sizes: Involved more than 20 contractors

Operational problems focus
Operational Problems Focus objects (e.g., applicants)

Transportation Planning:

  • Provide transportation planner with an ability to rapidly generate/evaluate/modify transportation plan

  • Initial version used in DESERT SHIELD


  • Interconnected with a large number of Heterogeneous DBs (thousands)

  • Distributed, Multi-Agent Architecture (delegation, negotiation)

  • Knowledge-based (representative, acquisition)

  • Dynamic

  • Common Prototype Environment

  • Visionary Demo

Database requirements
Database Requirements objects (e.g., applicants)

  • Distributed

  • Heterogeneous

  • Intelligent (KB/DB)

  • Temporal & Evolutionary

  • Multimedia

  • Security