Taking Constraints out of Constraint Databases


Dina Goldin, University of Connecticut

Applications of Constraint Databases

Paris, France, June 2004

Codd [70] provided an additional level of abstraction between physical data and queries

[Figure: without this abstraction, queries run directly against a customized data layout for each application; with it, queries run against a table-based logical layer above the physical layer]

- Data model: Uniform table-based representation for all data at logical level
- Data independence: Can modify physical layer without affecting queries
- Simple set-of-points semantics; relational algebra and relational calculus are equivalent (RA = RC)
- Efficient indexing methods
A commercial success in the 1980s!

- Disadvantages of RDBs:
- only good for traditional, “administrative” data

- OO technology corrects this:
- encapsulate non-administrative data
- provide methods to access it

- Object-relational databases provide this technology within a relational framework.
They are the latest commercial success.

- Introduction
- relational, OR data models

- GIS systems:
- CDB technology to the rescue

- Constraint Databases:
- it’s not just about constraints
- one more level of abstraction

- Constraint-backed databases:
- practical considerations
- getting constraint-backed technology right

- Until recently, leading commercial systems for spatial data
- Not database systems per se
- cannot manage non-geographic data
- no ad-hoc querying (users perform built-in operations or execute predefined queries)
- single-layered architecture (no data independence when writing queries)
- in-memory (no index structures)

- Marrying GIS and object-relational databases
- Example: Oracle Spatial Data Option
- Full power of a relational DB plus…

- Spatial data
- encapsulated as new data types within the OR framework
- same data types as in ARC/Info (leading GIS system)

- Spatial operations
- as methods over the new data types
- based on GIS operations

- Spatial data access structures
- based on bounding boxes

- Spatial data stored in spatial relations
- predefined set of spatial data types (point, region, etc…)
- each relation is a set of spatial objects of one type, with a key
- predefined set of operations over spatial objects

- “Traditional” data stored in regular relations
- Including thematic/descriptive data pertaining to spatial objects

- Spatial & administrative data are logically separate
- only keys of spatial objects to correlate between them
- spatial data processing limited to predefined types and operators

- Separation applies to query output as well
- limited query expressiveness
Can constraint databases offer a better solution?

- Contribution of KKR[90,95]
- Key idea: Allow relations that include infinitely many points
- “Finite relations are generalized to finitely representable relations” [GK96]

- Generalized: original term for tuples and relations with infinite semantics
- We now prefer the term constraint for such tuples and relations
Goal: next commercial success (for GIS applications)
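The key idea above can be illustrated with a minimal sketch, assuming linear constraints over two variables. The class names `LinearConstraint`, `ConstraintTuple`, and `ConstraintRelation` are hypothetical and do not come from KKR or any particular system. A constraint tuple is a finite conjunction of constraints that stands for the infinitely many points satisfying it; a constraint relation is a finite set of such tuples.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class LinearConstraint:
    """Hypothetical encoding of the constraint a*x + b*y <= c."""
    a: float
    b: float
    c: float

    def holds(self, x: float, y: float) -> bool:
        return self.a * x + self.b * y <= self.c


@dataclass(frozen=True)
class ConstraintTuple:
    """A conjunction of constraints: finite to store, infinite in meaning."""
    constraints: Tuple[LinearConstraint, ...]

    def contains(self, x: float, y: float) -> bool:
        return all(c.holds(x, y) for c in self.constraints)


@dataclass(frozen=True)
class ConstraintRelation:
    """A finite set of constraint tuples, read as the union of their point sets."""
    tuples: Tuple[ConstraintTuple, ...]

    def contains(self, x: float, y: float) -> bool:
        return any(t.contains(x, y) for t in self.tuples)


# The unit square 0 <= x <= 1, 0 <= y <= 1 as a single constraint tuple.
unit_square = ConstraintTuple((
    LinearConstraint(-1, 0, 0),  # -x <= 0
    LinearConstraint(1, 0, 1),   #  x <= 1
    LinearConstraint(0, -1, 0),  # -y <= 0
    LinearConstraint(0, 1, 1),   #  y <= 1
))
relation = ConstraintRelation((unit_square,))
```

Four stored constraints answer membership for every real-valued point, which is what "finitely representable" means here.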

[Figure: queries over a table-based logical layer, above the physical layer]

- Components of the logical database layer:
- set-of-tuples data semantics
- implementation-independent (logical) data representation

- Relational databases
- finite semantics
- trivial one-to-one correspondence between the two components

- Constraint databases:
- infinite semantics
- correspondence between data semantics and data representation no longer trivial
Infinite semantics of finitely representable data implies an additional level of abstraction: the logical layer must be separated into two.

RDB to CDB: from two layers to three

Relational databases (two layers):

- Logical Layer (queries defined over this layer):
- finite set-of-point semantics
- table-based representation
- implementation-independent

Constraint databases (three layers):

- Abstract Logical Layer (queries defined over this layer):
- infinite set-of-point semantics

- Concrete Logical Layer:
- finite data representation
- implementation-independent

Physical Layer (below the logical layers):

- file-based data storage
- indexing structures, data access methods
- implementation-dependent

- Requirements for the concrete layer
- clean set-of-point semantics
- efficient (index-based) data access methods
- not required to use constraints (queries are over the abstract layer, so actual choice of representation is transparent to user)
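The transparency requirement can be sketched as follows, assuming a hypothetical `Region` interface for the abstract layer and two hypothetical concrete representations of the same triangle: one constraint-based, one geometric (a vertex outline). The query function sees only the abstract interface, so either representation can back it.

```python
from typing import List, Protocol, Sequence, Tuple

Point = Tuple[float, float]


class Region(Protocol):
    """Abstract layer: a (possibly infinite) set of points."""
    def contains(self, x: float, y: float) -> bool: ...


class ConstraintRegion:
    """Concrete layer, option 1: conjunction of constraints a*x + b*y <= c."""
    def __init__(self, constraints: Sequence[Tuple[float, float, float]]):
        self.constraints = list(constraints)

    def contains(self, x: float, y: float) -> bool:
        return all(a * x + b * y <= c for a, b, c in self.constraints)


class PolygonRegion:
    """Concrete layer, option 2: outline stored as a list of vertices."""
    def __init__(self, vertices: Sequence[Point]):
        self.vertices = list(vertices)

    def contains(self, x: float, y: float) -> bool:
        # Ray casting; points exactly on the boundary are not handled specially.
        inside = False
        n = len(self.vertices)
        for i in range(n):
            x1, y1 = self.vertices[i]
            x2, y2 = self.vertices[(i + 1) % n]
            if (y1 > y) != (y2 > y):
                x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
                if x < x_cross:
                    inside = not inside
        return inside


def select_points(region: Region, candidates: Sequence[Point]) -> List[Point]:
    """A query written against the abstract layer only."""
    return [(x, y) for x, y in candidates if region.contains(x, y)]


# The same triangle with corners (0,0), (4,0), (0,4) in both representations.
as_constraints = ConstraintRegion([(-1, 0, 0), (0, -1, 0), (1, 1, 4)])
as_polygon = PolygonRegion([(0, 0), (4, 0), (0, 4)])
candidates = [(1, 1), (3, 3), (0.5, 2.5)]
```

Calling `select_points` with `as_constraints` or with `as_polygon` gives the same answer for interior and exterior points, which is the sense in which the choice of representation is hidden from the user.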

- Pure Constraint Databases
- concrete layer is constraint-based
- examples: CDB/CQA (query algebra), MLPQ (logic programming)

- Constraint-backed databases
- concrete layer is not purely constraints
- data may be represented geometrically

- Data input/output is not based on constraints
- data often obtained by digitization (generates points and segments)
- geometrical, visual, some standard spatial format…
- in pure CDBs, converted to constraints

- Spatial features are never straight lines or convex polytopes
- many short segments
- frequent local change of direction
- broken up into many constraint tuples (convex cells) per spatial object
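A rough sketch of the conversion cost, under stated assumptions: digitized input arrives as vertex lists, each convex cell is listed counter-clockwise, and the split of a non-convex object into convex cells is supplied by hand. The helper names `to_constraints` and `in_cells` are hypothetical. Each edge of a convex cell becomes one half-plane constraint a*x + b*y <= c, and a non-convex object needs one constraint tuple per cell.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]
Constraint = Tuple[float, float, float]  # (a, b, c) meaning a*x + b*y <= c


def to_constraints(cell: Sequence[Point]) -> List[Constraint]:
    """Convert a convex cell (vertices in counter-clockwise order) to half-planes."""
    result = []
    n = len(cell)
    for i in range(n):
        x1, y1 = cell[i]
        x2, y2 = cell[(i + 1) % n]
        # The interior lies to the left of the directed edge (x1,y1) -> (x2,y2).
        a = y2 - y1
        b = -(x2 - x1)
        c = a * x1 + b * y1
        result.append((a, b, c))
    return result


def in_cells(cells: Sequence[Sequence[Point]], x: float, y: float) -> bool:
    """Membership in a union of convex cells, each checked via its constraints."""
    return any(
        all(a * x + b * y <= c for a, b, c in to_constraints(cell))
        for cell in cells
    )


square = [(0, 0), (1, 0), (1, 1), (0, 1)]

# An L-shaped (non-convex) object, split by hand into two convex cells.
l_shape = [
    [(0, 0), (2, 0), (2, 1), (0, 1)],  # bottom bar
    [(0, 1), (1, 1), (1, 2), (0, 2)],  # upper-left block
]
```

Even this six-corner outline turns into two tuples and eight constraints; a digitized boundary with many short segments grows accordingly.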

- Continuous (real time) data visualization
- most users do NOT want to see constraints, but a GUI
- visualization requires spatial outline (boundary points)
- constraints need to be converted back to geometrical representation
- conversions carry heavy performance penalty (not real-time)

- Experience shows that practical systems are not pure
- E.g. Dedale uses geometrical representations, explicitly translating to the constraint representation for the constraint engine [GSSG03]

- In the physical layer, need for geometry-based representations recognized early on
- KKR90 suggested computational geometry algorithms as evaluation primitives

- Examples of geometric representations:
- Points
- Polylines: for trajectories, regions
- Triangulated Irregular Networks (TINs): for terrains (2.5-dimensional)

- Efficient visualization
- Efficient query evaluation
- If region R(x,y) is stored as a sequence of points that outline it, the projection π_X(R) can be obtained by finding the extrema of the X-coordinates of these points.
- Bounding boxes equally easy to compute.
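Both operations can be sketched in a few lines, assuming the region is a single connected area stored as a list of outline points. The function names `project_x` and `bounding_box` are hypothetical.

```python
from typing import Sequence, Tuple

Point = Tuple[float, float]


def project_x(outline: Sequence[Point]) -> Tuple[float, float]:
    """Projection of the region onto X, returned as the interval (xmin, xmax)."""
    xs = [x for x, _ in outline]
    return (min(xs), max(xs))


def bounding_box(outline: Sequence[Point]) -> Tuple[float, float, float, float]:
    """Axis-aligned bounding box as (xmin, ymin, xmax, ymax)."""
    xs = [x for x, _ in outline]
    ys = [y for _, y in outline]
    return (min(xs), min(ys), max(xs), max(ys))


outline = [(1, 2), (5, 1), (6, 4), (3, 6)]
```

Each function is a single pass over the outline, with no conversion to or from constraints.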

Define query semantics (abstract level)

- for proving query correctness
- to spare users from ad-hoc operators with arbitrary restrictions

- one of the available data representations
- e.g. when data is truly multidimensional

- as intermediate representation between non-compatible systems

- Not a pure constraint database
- Nesting takes place at abstract level
LandUse(lname,geom[x,y])

Flight(fname,traj[t,x,y,a])

Country(cname,geom[x,y,h])

- Queries use nest and unnest operations explicitly

- Geometric representation in the concrete layer
- geom in Country is represented as a TIN
- traj in Flight is represented as a set of sample points along the flight path

- Data model does not separate spatial and administrative data

- R0 := SELECT t=t1 from Flight
- R1 := PROJECT R0 on fname,x,y

- R0 := JOIN LandUse and Rect

- R0 := JOIN LandUse and Rect
- R1 := PROJECT R0 on lname
- R2 := JOIN R1 and LandUse

- LandUse(lname,geom[x,y])
- Flight(fname,traj[t,x,y,a])
- Country(cname,geom[x,y,h])

- LandUse(lname,x,y)
- Flight(fname,t,x,y,a)
- Country(cname,x,y,h)

- Over which location were the airplanes flying at time t1?
MAP λX [X.fname, π_{x,y}(σ_{t=t1}(X.traj))] (Flight)

- Return the part of the parcels contained in rectangle Rect(x,y)
MAP λX [X.lname, X.geom ∩ Rect] (LandUse)

- Return all land parcels that have a point in Rect(x,y)
π_{lname,geom}(MAP λX [X.lname, X.geom, σ_{(x,y) in Rect}(X.geom)] (LandUse))
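The three nested queries above can be mimicked with a small sketch. Several simplifications are assumed and are not part of the original model: trajectories are lists of sample points (t, x, y, a) with no interpolation, parcel geometries are axis-aligned boxes, and the third query is read as keeping only parcels whose selection is nonempty. All function names and the data are hypothetical.

```python
from typing import Dict, List, Optional, Tuple

Box = Tuple[float, float, float, float]  # (xmin, ymin, xmax, ymax)


def intersect(a: Box, b: Box) -> Optional[Box]:
    """Intersection of two boxes, or None when they do not meet."""
    xmin, ymin = max(a[0], b[0]), max(a[1], b[1])
    xmax, ymax = min(a[2], b[2]), min(a[3], b[3])
    if xmin > xmax or ymin > ymax:
        return None
    return (xmin, ymin, xmax, ymax)


flights: List[Dict] = [
    {"fname": "F1", "traj": [(0, 0.0, 0.0, 1000), (1, 1.0, 2.0, 1200)]},
    {"fname": "F2", "traj": [(0, 5.0, 5.0, 900), (1, 6.0, 5.0, 900)]},
]
land_use: List[Dict] = [
    {"lname": "farm", "geom": (0, 0, 4, 4)},
    {"lname": "forest", "geom": (10, 10, 12, 12)},
]


def locations_at(flights: List[Dict], t1: float):
    """MAP over Flight: select t = t1 inside traj, then project on x, y."""
    return [
        (f["fname"], [(x, y) for (t, x, y, a) in f["traj"] if t == t1])
        for f in flights
    ]


def clip_to(land_use: List[Dict], rect: Box):
    """MAP over LandUse: intersect each geometry with rect (None means empty)."""
    return [(p["lname"], intersect(p["geom"], rect)) for p in land_use]


def parcels_touching(land_use: List[Dict], rect: Box):
    """Whole parcels that have at least one point in rect."""
    return [
        (p["lname"], p["geom"])
        for p in land_use
        if intersect(p["geom"], rect) is not None
    ]
```

In each function the spatial attribute is processed inside the tuple that carries the name, so there is no join on keys between spatial and administrative data.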

Output limited to 2 spatiotemporal dimensions (3 in case of interpolated attributes)

Pure constraint DB not practical

- Clean semantics and full expressiveness of constraint databases
- Geometrical representation issues not a user concern
- though expert users may want to take more control

- System support for three-tier architecture
- More sophisticated than for pure constraint databases, or for current spatial databases

- Query processing engine must
- choose the best concrete representation for query output, among those supported by the system
- select query evaluation strategies in the presence of a wider mix of possible representations and techniques
- take into account storage and visualization
- perhaps maintain multiple representations for the same data?