Taking constraints out of constraint databases


Taking Constraints out of Constraint Databases. Dina Goldin, University of Connecticut. Applications of Constraint Databases, Paris, France, June 2004.


Presentation Transcript

Taking Constraints out of Constraint Databases

Dina Goldin, University of Connecticut

Applications of Constraint Databases

Paris, France, June 2004



Relational Databases

Codd[70] provided an additional level of abstraction between physical data and queries.

[Diagram labels: Logical Layer; Physical Layer; data layout for each application]

Advantages of Relational Model

  • Data model: Uniform table-based representation for all data at logical level

  • Data independence: Can modify physical layer without affecting queries

  • Simple set-of-points semantics; relational algebra and relational calculus are equivalent (RA = RC)

  • Efficient indexing methods

    A commercial success in the 1980s!

Object-Relational Databases

  • Disadvantages of RDBs:

    • only good for traditional, “administrative” data

  • OO technology corrects this:

    • encapsulate non-administrative data

    • provide methods to access it

  • Object-relational databases provide this technology within a relational framework.

    They are the latest commercial success.


Outline

  • Introduction

    • relational, OR data models

  • GIS systems:

    • CDB technology to the rescue

  • Constraint Databases:

    • it’s not just about constraints

    • one more level of abstraction

  • Constraint-backed databases:

    • practical considerations

    • getting constraint-backed technology right

Geographic Information Systems

  • Until recently, the leading commercial systems for spatial data

  • Not database systems per se

    • cannot manage non-geographic data

    • no ad-hoc querying (users perform built-in operations or execute predefined queries)

    • single-layered architecture (no data independence when writing queries)

    • in-memory (no index structures)

Newer Approaches to Managing Spatial Data

  • Marrying GIS and object-relational databases

    • Example: Oracle Spatial Data Option

    • Full power of a relational DB plus…

  • Spatial data

    • encapsulated as new data types within the OR framework

    • same data types as in ARC/Info (leading GIS system)

  • Spatial operations

    • as methods over the new data types

    • based on GIS operations

  • Spatial data access structures

    • based on bounding boxes

Data Separation in OR/GIS Databases

  • Spatial data stored in spatial relations

    • predefined set of spatial data types (point, region, etc…)

    • each relation is a set of spatial objects of one type, with a key

    • predefined set of operations over spatial objects

  • “Traditional” data stored in regular relations

    • Including thematic/descriptive data pertaining to spatial objects

  • Spatial & administrative data are logically separate

    • only keys of spatial objects to correlate between them

    • spatial data processing limited to predefined types and operators

  • Separation applies to query output as well

    • limited query expressiveness

      Can constraint databases offer a better solution?

Constraint Databases

  • Contribution of KKR[90,95]

  • Key idea: Allow relations that include infinitely many points

    • “Finite relations are generalized to finitely representable relations” [GK96]

  • Generalized: original term for tuples and relations with infinite semantics

    • We now prefer the term constraint for such tuples and relations

      Goal: next commercial success (for GIS applications)
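The key idea above, a finite representation standing for infinitely many points, can be illustrated with a minimal sketch. It assumes linear constraints over two real variables; the names `LinearConstraint`, `contains`, and `region` are hypothetical and not taken from the talk or from any CDB system.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class LinearConstraint:
    """One atomic constraint a*x + b*y <= c over the reals (assumed form)."""
    a: float
    b: float
    c: float

    def holds(self, x: float, y: float) -> bool:
        return self.a * x + self.b * y <= self.c


# A constraint tuple is a conjunction of atomic constraints;
# a constraint relation is a finite set (disjunction) of such tuples.
ConstraintTuple = List[LinearConstraint]
ConstraintRelation = List[ConstraintTuple]


def contains(relation: ConstraintRelation, x: float, y: float) -> bool:
    """Point (x, y) is in the relation if it satisfies every constraint of some tuple."""
    return any(all(c.holds(x, y) for c in tup) for tup in relation)


# Tuple 1: the unit square 0 <= x <= 1, 0 <= y <= 1 (infinitely many points).
unit_square = [
    LinearConstraint(-1, 0, 0),  # -x <= 0
    LinearConstraint(1, 0, 1),   #  x <= 1
    LinearConstraint(0, -1, 0),  # -y <= 0
    LinearConstraint(0, 1, 1),   #  y <= 1
]
# Tuple 2: the unbounded half-plane x + y <= -5.
half_plane = [LinearConstraint(1, 1, -5)]

region: ConstraintRelation = [unit_square, half_plane]
```

Five stored constraints describe a point set that no finite table of points could enumerate, which is the sense in which the relation is "finitely representable".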



Revisiting the Logical Layer

[Diagram labels: Logical Layer; Physical Layer]

  • Components of the logical database layer:

    • set-of-tuples data semantics

    • implementation-independent (logical) data representation

  • Relational databases

    • finite semantics

    • trivial one-to-one correspondence between the two components

  • Constraint databases:

    • infinite semantics

    • correspondence between data semantics and data representation no longer trivial

      The infinite semantics of finitely representable data imply an additional level of abstraction; we need to separate the logical layer into two.

Additional Level of Abstraction

RDB to CDB: from two layers to three

  • Relational databases

    • Logical Layer (queries defined over this layer): finite set-of-point semantics; table-based representation

    • Physical Layer: file-based data storage; indexing structures, data access methods; implementation-dependent

  • Constraint databases

    • Abstract Logical Layer (queries defined over this layer): infinite set-of-point semantics

    • Concrete Logical Layer: finite data representation

    • Physical Layer: file-based data storage; indexing structures, data access methods; implementation-dependent



Concrete Data Model in CDBs

  • Requirements for the concrete layer

    • clean set-of-point semantics

    • efficient (index-based) data access methods

    • not required to use constraints (queries are over the abstract layer, so actual choice of representation is transparent to user)

  • Pure Constraint Databases

    • concrete layer is constraint-based

    • examples: CDB/CQA (query algebra), MLPQ (logic programming)

  • Constraint-backed databases

    • concrete layer is not purely constraints

    • data may be represented geometrically
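The point that queries are posed over the abstract layer, so the concrete representation is transparent to the user, can be sketched as follows. This is an illustration under stated assumptions, not the design of any system named in the talk: `Region`, `ConstraintRegion`, `PolygonRegion`, and `select_inside` are hypothetical names, and the two representations are only guaranteed to agree away from the region boundary.

```python
from typing import List, Protocol, Sequence, Tuple

Point = Tuple[float, float]


class Region(Protocol):
    """Abstract view: a (possibly infinite) set of points with a membership test."""
    def contains(self, x: float, y: float) -> bool: ...


class ConstraintRegion:
    """Concrete representation 1: conjunction of half-planes a*x + b*y <= c."""
    def __init__(self, half_planes: Sequence[Tuple[float, float, float]]):
        self.half_planes = list(half_planes)

    def contains(self, x: float, y: float) -> bool:
        return all(a * x + b * y <= c for a, b, c in self.half_planes)


class PolygonRegion:
    """Concrete representation 2: outline stored as a sequence of vertices."""
    def __init__(self, vertices: Sequence[Point]):
        self.vertices = list(vertices)

    def contains(self, x: float, y: float) -> bool:
        # Ray casting: count outline edges crossed by a ray going right from (x, y).
        inside = False
        n = len(self.vertices)
        for i in range(n):
            x1, y1 = self.vertices[i]
            x2, y2 = self.vertices[(i + 1) % n]
            if (y1 > y) != (y2 > y):
                x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
                if x < x_cross:
                    inside = not inside
        return inside


def select_inside(region: Region, points: List[Point]) -> List[Point]:
    """A query written once against the abstract view."""
    return [p for p in points if region.contains(*p)]


square_as_constraints = ConstraintRegion([(-1, 0, 0), (1, 0, 2), (0, -1, 0), (0, 1, 2)])
square_as_polygon = PolygonRegion([(0, 0), (2, 0), (2, 2), (0, 2)])
probe_points = [(1, 1), (3, 1), (0.5, 1.5), (-1, -1)]
```

`select_inside` returns the same points for both objects, which is what lets a constraint-backed system swap the concrete layer without changing the query.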

Practical Considerations of GIS Applications

  • Data input/output is not based on constraints

    • data often obtained by digitization (generates points and segments)

    • geometrical, visual, some standard spatial format…

    • in pure CDBs, converted to constraints

  • Spatial features are never straight lines or convex polytopes

    • many short segments

    • frequent local change of direction

    • broken up into many constraint tuples (convex cells) per spatial object

  • Continuous (real time) data visualization

    • most users do NOT want to see constraints, but a GUI

    • visualization requires spatial outline (boundary points)

    • constraints need to be converted back to geometrical representation

    • conversions carry heavy performance penalty (not real-time)

  • Experience shows that practical systems are not pure

    • E.g. Dedale uses geometrical representations, explicitly translating to the constraint representation for the constraint engine [GSSG03]
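To make the conversion cost concrete, here is a minimal sketch of turning digitized outline points into constraint tuples. It assumes the spatial object has already been decomposed into convex cells whose vertices are listed counter-clockwise; the decomposition itself is not shown, and the function names and the L-shaped example are hypothetical.

```python
from typing import List, Tuple

Point = Tuple[float, float]
HalfPlane = Tuple[float, float, float]  # (a, b, c) meaning a*x + b*y <= c


def convex_cell_to_constraints(vertices: List[Point]) -> List[HalfPlane]:
    """Convert one convex cell (counter-clockwise vertices) to half-plane constraints.

    For a counter-clockwise edge (x1, y1) -> (x2, y2) the interior lies to the
    left, which rearranges to (y2 - y1)*x - (x2 - x1)*y <= (y2 - y1)*x1 - (x2 - x1)*y1.
    """
    constraints = []
    n = len(vertices)
    for i in range(n):
        x1, y1 = vertices[i]
        x2, y2 = vertices[(i + 1) % n]
        a = y2 - y1
        b = -(x2 - x1)
        c = a * x1 + b * y1
        constraints.append((a, b, c))
    return constraints


def feature_contains(cells: List[List[HalfPlane]], x: float, y: float) -> bool:
    """A feature is the union of its convex cells (one constraint tuple per cell)."""
    return any(all(a * x + b * y <= c for a, b, c in cell) for cell in cells)


# A non-convex, L-shaped parcel needs more than one constraint tuple.
l_shape_cells = [
    [(0, 0), (2, 0), (2, 1), (0, 1)],  # bottom bar
    [(0, 1), (1, 1), (1, 2), (0, 2)],  # upper-left block
]
l_shape = [convex_cell_to_constraints(cell) for cell in l_shape_cells]
```

Even this six-vertex outline becomes two tuples with eight constraints in total; a digitized coastline with many short segments would multiply that accordingly, and visualization would have to reverse the conversion.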

Geometric Data Representation

  • In the physical layer, need for geometry-based representations recognized early on

    • KKR90 suggested computational geometry algorithms as evaluation primitives

  • Examples of geometric representations:

    • Points

    • Polylines: for trajectories, regions

    • Triangulated Irregular Networks (TINs): for terrains (2.5 dimensional)

  • Efficient visualization

  • Efficient query evaluation

    • If region R(x,y) is stored as a sequence of points that outline it, the projection π_X(R) can be obtained by finding the extrema of the X-coordinates of these points.

    • Bounding boxes equally easy to compute.
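The two evaluation shortcuts above can be sketched directly on an outline. The sketch assumes the region is connected and bounded by the stored polyline, so that its projection onto X is a single interval; `project_x` and `bounding_box` are hypothetical names.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def project_x(outline: List[Point]) -> Tuple[float, float]:
    """Projection of the outlined region onto X, as the interval (x_min, x_max)."""
    xs = [x for x, _ in outline]
    return min(xs), max(xs)


def bounding_box(outline: List[Point]) -> Tuple[float, float, float, float]:
    """Axis-aligned bounding box (x_min, y_min, x_max, y_max) of the outline."""
    xs = [x for x, _ in outline]
    ys = [y for _, y in outline]
    return min(xs), min(ys), max(xs), max(ys)


outline = [(1, 2), (4, 0), (6, 3), (3, 5)]
```

Both operations are a single linear pass over the stored points, with no constraint elimination step.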

Role of Constraints in Constraint-Backed Databases

  • Define query semantics (abstract level)

    • for proving query correctness

    • to spare users from ad-hoc operators with arbitrary restrictions

  • Provide default data model (concrete level)

    • one of the available data representations

    • e.g. when data is truly multidimensional

  • For data integration

    • as intermediate representation between non-compatible systems


    • Not a pure constraint database

    • Nesting takes place at abstract level

      • Queries use nest and unnest operations explicitly

    • Geometric representation in the concrete layer

      • geom in Country is represented as a TIN

      • traj in Flight is represented as a set of sample points along the flight path

    • Data model does not separate spatial and administrative data

    • Nested relations:

      • LandUse(lname,geom[x,y])

      • Flight(fname,traj[t,x,y,a])

      • Country(cname,geom[x,y,h])

    • Flat relations:

      • LandUse(lname,x,y)

      • Flight(fname,t,x,y,a)

      • Country(cname,x,y,h)

    • Over which location were the airplanes flying at time t1?

      Nested: MAP λX [X.fname, π_{x,y}(σ_{t=t1}(X.traj))] (Flight)

      Flat: R0 := SELECT t=t1 from Flight

            R1 := PROJECT R0 on fname,x,y

    • Return the part of the parcels contained in rectangle Rect(x,y)

      Nested: MAP λX [X.lname, X.geom ∩ Rect] (LandUse)

      Flat: R0 := JOIN LandUse and Rect

    • Return all land parcels that have a point in Rect(x,y)

      Nested: π_{lname,geom}(MAP λX [X.lname, X.geom, σ_{(x,y) in Rect}(X.geom)] (LandUse))

      Flat: R0 := JOIN LandUse and Rect

            R1 := PROJECT R0 on lname

            R2 := JOIN R1 and LandUse

      Output limited to 2 spatiotemporal dimensions (3 in case of interpolated attributes)
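As an illustration of evaluating the first query over a geometric concrete layer, the sketch below answers "where was each airplane at time t1?" directly on sample points. It assumes each trajectory is a list of (t, x, y, a) samples sorted by t and that positions between samples are linearly interpolated; the interpolation rule, the function names, and the sample data are assumptions, not details given in the talk.

```python
from bisect import bisect_left
from typing import Dict, List, Optional, Tuple

Sample = Tuple[float, float, float, float]  # (t, x, y, a) as in Flight(fname, traj[t,x,y,a])


def position_at(traj: List[Sample], t1: float) -> Optional[Tuple[float, float]]:
    """Select t = t1 and project on (x, y), interpolating linearly between samples."""
    if not traj:
        return None
    times = [s[0] for s in traj]
    if t1 < times[0] or t1 > times[-1]:
        return None  # the flight was not in the air at t1
    i = bisect_left(times, t1)
    if times[i] == t1:
        return traj[i][1], traj[i][2]
    (t0, x0, y0, _), (t2, x2, y2, _) = traj[i - 1], traj[i]
    w = (t1 - t0) / (t2 - t0)
    return x0 + w * (x2 - x0), y0 + w * (y2 - y0)


def locations_at(flight: Dict[str, List[Sample]], t1: float) -> Dict[str, Tuple[float, float]]:
    """For every flight in the air at t1, return fname -> (x, y)."""
    result = {}
    for fname, traj in flight.items():
        pos = position_at(traj, t1)
        if pos is not None:
            result[fname] = pos
    return result


# Hypothetical data: two flights, each with two samples.
flight = {
    "AF1": [(0, 0.0, 0.0, 9000.0), (10, 100.0, 50.0, 9500.0)],
    "UA2": [(20, 5.0, 5.0, 8000.0), (30, 15.0, 5.0, 8000.0)],
}
```

The user-level query is still the select-then-project expression shown above; only the evaluation strategy depends on the sample-point representation.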

    Pure constraint DB not practical

Getting Constraint-Backed Systems Right

    • Clean semantics and full expressiveness of constraint databases

    • Geometrical representation issues not a user concern

      • though expert users may want to take more control

    • System support for three-tier architecture

      • More sophisticated than for pure constraint databases, or for current spatial databases

    • Query processing engine must

      • choose the best concrete representation for query output, among those supported by the system

      • select query evaluation strategies in the presence of a wider mix of possible representations and techniques

      • take into account storage and visualization

      • perhaps maintain multiple representations for the same data?

