### Taking Constraints out of Constraint Databases

Dina Goldin, University of Connecticut

Applications of Constraint Databases

Paris, France, June 2004

Relational Databases

Codd [70] provided an additional level of abstraction between physical data and queries.

[Figure: layer diagram with labels: queries; logical layer (table-based); physical layer; customized data layout for each application]
Advantages of Relational Model

- Data model: Uniform table-based representation for all data at logical level
- Data independence: Can modify physical layer without affecting queries
- Simple set-of-points semantics; relational algebra and relational calculus are equivalent (RA = RC)
- Efficient indexing methods

A commercial success in the 1980s!

Object-Relational Databases

- Disadvantages of RDBs:
  - only good for traditional, “administrative” data
- OO technology corrects this:
  - encapsulate non-administrative data
  - provide methods to access it
- Object-relational databases provide this technology within a relational framework.

They are the latest commercial success.

Outline

- Introduction
  - relational, OR data models
- GIS systems:
  - CDB technology to the rescue
- Constraint Databases:
  - it’s not just about constraints
  - one more level of abstraction
- Constraint-backed databases:
  - practical considerations
  - getting constraint-backed technology right

Geographic Information Systems

- Until recently, leading commercial systems for spatial data
- Not database systems per se
  - cannot manage non-geographic data
  - no ad-hoc querying (users perform built-in operations or execute predefined queries)
  - single-layered architecture (no data independence when writing queries)
  - in-memory (no index structures)

Newer Approaches to Managing Spatial Data

- Marrying GIS and object-relational databases
  - Example: Oracle Spatial Data Option
- Full power of a relational DB plus…
  - Spatial data
    - encapsulated as new data types within the OR framework
    - same data types as in ARC/Info (leading GIS system)
  - Spatial operations
    - as methods over the new data types
    - based on GIS operations
  - Spatial data access structures
    - based on bounding boxes

Data Separation in OR/GIS Databases

- Spatial data stored in spatial relations
  - predefined set of spatial data types (point, region, etc…)
  - each relation is a set of spatial objects of one type, with a key
  - predefined set of operations over spatial objects
- “Traditional” data stored in regular relations
  - including thematic/descriptive data pertaining to spatial objects
- Spatial & administrative data are logically separate
  - only keys of spatial objects to correlate between them
  - spatial data processing limited to predefined types and operators
- Separation applies to query output as well
  - limited query expressiveness

Can constraint databases offer a better solution?

Constraint Databases

- Contribution of KKR[90,95]
- Key idea: Allow relations that include infinitely many points
- “Finite relations are generalized to finitely representable relations” [GK96]
- Generalized: original term for tuples and relations with infinite semantics
- We now prefer the term constraint for such tuples and relations
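
To make "finitely representable" concrete, here is a minimal Python sketch. It assumes linear constraints of the form a·x + b·y ≤ c over two variables; the class names `LinearConstraint`, `ConstraintTuple`, and `ConstraintRelation` are hypothetical and are not taken from KKR or from any CDB system.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class LinearConstraint:
    """Hypothetical constraint a*x + b*y <= c over two real variables."""
    a: float
    b: float
    c: float

    def holds(self, x: float, y: float) -> bool:
        return self.a * x + self.b * y <= self.c


@dataclass(frozen=True)
class ConstraintTuple:
    """A conjunction of constraints: stored finitely, denotes infinitely many points."""
    constraints: Tuple[LinearConstraint, ...]

    def contains(self, x: float, y: float) -> bool:
        return all(c.holds(x, y) for c in self.constraints)


@dataclass(frozen=True)
class ConstraintRelation:
    """A finite set of constraint tuples, read as their union."""
    tuples: Tuple[ConstraintTuple, ...]

    def contains(self, x: float, y: float) -> bool:
        return any(t.contains(x, y) for t in self.tuples)


# The unit square 0 <= x <= 1, 0 <= y <= 1 as a single constraint tuple.
unit_square = ConstraintTuple((
    LinearConstraint(-1, 0, 0),  # -x <= 0
    LinearConstraint(1, 0, 1),   #  x <= 1
    LinearConstraint(0, -1, 0),  # -y <= 0
    LinearConstraint(0, 1, 1),   #  y <= 1
))
region = ConstraintRelation((unit_square,))
```

Four stored constraints stand for every point of the square; membership is decided by evaluating the constraints, not by enumerating points.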

Goal: next commercial success (for GIS applications)

Revisiting the Logical Layer

[Figure: table-based logical layer above the physical layer]

- Components of the logical database layer:
  - set-of-tuples data semantics
  - implementation-independent (logical) data representation
- Relational databases:
  - finite semantics
  - trivial one-to-one correspondence between the two components
- Constraint databases:
  - infinite semantics
  - correspondence between data semantics and data representation no longer trivial
Infinite semantics of finitely representable data imply an additional level of abstraction: the logical layer must be separated into two.

Additional Level of Abstraction

RDB to CDB: from two layers to three

- Logical Layer (RDB; queries defined over this layer): finite set-of-point semantics; table-based representation; implementation-independent
- Abstract Logical Layer (CDB; queries defined over this layer): infinite set-of-point semantics
- Concrete Logical Layer (CDB): finite data representation; implementation-independent
- Physical Layer: file-based data storage; indexing structures, data access methods; implementation-dependent


Concrete Data Model in CDBs

- Requirements for the concrete layer
  - clean set-of-point semantics
  - efficient (index-based) data access methods
  - not required to use constraints (queries are over the abstract layer, so actual choice of representation is transparent to user)
- Pure Constraint Databases
  - concrete layer is constraint-based
  - examples: CDB/CQA (query algebra), MLPQ (logic programming)
- Constraint-backed databases
  - concrete layer is not purely constraints
  - data may be represented geometrically
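
The transparency requirement can be sketched in Python as follows. This is only an illustration under stated assumptions: `Region`, `HalfPlaneRegion`, `ConvexPolygonRegion`, and `select_inside` are hypothetical names, and the geometric variant assumes a convex outline listed counter-clockwise.

```python
from abc import ABC, abstractmethod
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


class Region(ABC):
    """Abstract logical layer: a region is a (possibly infinite) set of points."""

    @abstractmethod
    def contains(self, p: Point) -> bool:
        ...


class HalfPlaneRegion(Region):
    """Concrete representation 1: conjunction of constraints a*x + b*y <= c."""

    def __init__(self, constraints: Sequence[Tuple[float, float, float]]):
        self.constraints = list(constraints)

    def contains(self, p: Point) -> bool:
        x, y = p
        return all(a * x + b * y <= c for a, b, c in self.constraints)


class ConvexPolygonRegion(Region):
    """Concrete representation 2: counter-clockwise outline of a convex polygon."""

    def __init__(self, vertices: Sequence[Point]):
        self.vertices = list(vertices)

    def contains(self, p: Point) -> bool:
        x, y = p
        n = len(self.vertices)
        for i in range(n):
            (x1, y1), (x2, y2) = self.vertices[i], self.vertices[(i + 1) % n]
            # A negative cross product means p lies to the right of this edge.
            if (x2 - x1) * (y - y1) - (y2 - y1) * (x - x1) < 0:
                return False
        return True


def select_inside(region: Region, points: Sequence[Point]) -> List[Point]:
    """A 'query' written only against the abstract layer."""
    return [p for p in points if region.contains(p)]


# The same triangle with corners (0,0), (4,0), (0,4) in both representations.
as_constraints = HalfPlaneRegion([(-1, 0, 0), (0, -1, 0), (1, 1, 4)])
as_outline = ConvexPolygonRegion([(0, 0), (4, 0), (0, 4)])
probes = [(1, 1), (3, 3), (0, 4), (-1, 0)]
```

`select_inside` returns the same answer for `as_constraints` and `as_outline`, which is the sense in which the choice of concrete representation stays invisible to the query.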

Practical Considerations of GIS Applications

- Data input/output is not based on constraints
  - data often obtained by digitization (generates points and segments)
  - geometrical, visual, some standard spatial format…
  - in pure CDBs, converted to constraints
- Spatial features are never straight lines or convex polytopes
  - many short segments
  - frequent local change of direction
  - broken up into many constraint tuples (convex cells) per spatial object
- Continuous (real-time) data visualization
  - most users do NOT want to see constraints, but a GUI
  - visualization requires spatial outline (boundary points)
  - constraints need to be converted back to geometrical representation
  - conversions carry heavy performance penalty (not real-time)
- Experience shows that practical systems are not pure
  - E.g. DEDALE uses geometrical representations, explicitly translating to the constraint representation for the constraint engine [GSSG03]
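
The outline-to-constraints conversion mentioned above can be sketched for a single convex cell. This is a minimal illustration, not the conversion used by any particular system: `outline_to_constraints` and `satisfies` are hypothetical names, and the input is assumed to be a convex outline listed counter-clockwise. A non-convex digitized feature would first have to be split into convex cells (not shown), each yielding its own constraint tuple.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]
Constraint = Tuple[float, float, float]  # (a, b, c) meaning a*x + b*y <= c


def outline_to_constraints(vertices: Sequence[Point]) -> List[Constraint]:
    """Turn a counter-clockwise convex outline into one half-plane per edge."""
    constraints = []
    n = len(vertices)
    for i in range(n):
        (x1, y1), (x2, y2) = vertices[i], vertices[(i + 1) % n]
        # Interior lies to the left of each edge:
        #   (x2-x1)*(y-y1) - (y2-y1)*(x-x1) >= 0
        # which rearranges to a*x + b*y <= c with:
        a = y2 - y1
        b = -(x2 - x1)
        c = a * x1 + b * y1
        constraints.append((a, b, c))
    return constraints


def satisfies(constraints: Sequence[Constraint], point: Point) -> bool:
    """Check a point against the conjunction of all constraints."""
    x, y = point
    return all(a * x + b * y <= c for a, b, c in constraints)


triangle = [(0, 0), (4, 0), (0, 4)]
cell = outline_to_constraints(triangle)
```

Each outline edge becomes one constraint, so a feature digitized as many short segments produces correspondingly many constraints and cells.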

Geometric Data Representation

- In the physical layer, need for geometry-based representations recognized early on
  - KKR90 suggested computational geometry algorithms as evaluation primitives
- Examples of geometric representations:
  - Points
  - Polylines: for trajectories, regions
  - Triangulated Irregular Networks (TINs): for terrains (2.5 dimensional)
- Efficient visualization
- Efficient query evaluation
  - If region R(x,y) is stored as a sequence of points that outline it, $\pi_X R$ can be obtained by finding extrema of X-coordinates for these points.
  - Bounding boxes equally easy to compute.
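
A minimal Python sketch of these two evaluations, assuming the region is a single connected polygon given by its outline points. The function names and the sample coordinates are made up for illustration.

```python
from typing import Sequence, Tuple

Point = Tuple[float, float]


def project_x(outline: Sequence[Point]) -> Tuple[float, float]:
    """Projection onto X as an interval: the extrema of the X-coordinates."""
    xs = [x for x, _ in outline]
    return (min(xs), max(xs))


def bounding_box(outline: Sequence[Point]) -> Tuple[float, float, float, float]:
    """Axis-aligned bounding box as (min_x, min_y, max_x, max_y)."""
    xs = [x for x, _ in outline]
    ys = [y for _, y in outline]
    return (min(xs), min(ys), max(xs), max(ys))


# Made-up outline of a five-sided region.
outline = [(2.0, 1.0), (5.0, 0.5), (6.0, 3.0), (3.5, 4.0), (1.0, 2.5)]
x_interval = project_x(outline)
box = bounding_box(outline)
```

Both results come from a single pass over the outline points, with no constraint solving involved.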

Role of Constraints in Constraint-Backed Databases

- Define query semantics (abstract level)
  - for proving query correctness
  - to spare users from ad-hoc operators with arbitrary restrictions
- Provide default data model (concrete level)
  - one of the available data representations
  - e.g. when data is truly multidimensional
- For data integration
  - as intermediate representation between non-compatible systems

DEDALE

- Not a pure constraint database
- Nesting takes place at abstract level
  - LandUse(lname,geom[x,y])
  - Flight(fname,traj[t,x,y,a])
  - Country(cname,geom[x,y,h])
- Queries use nest and unnest operations explicitly
- Geometric representation in the concrete layer
  - geom in Country is represented as a TIN
  - traj in Flight is represented as a set of sample points along the flight path
- Data model does not separate spatial and administrative data

Flat (unnested) form of the same relations:

- LandUse(lname,x,y)
- Flight(fname,t,x,y,a)
- Country(cname,x,y,h)

- Over which location were the airplanes flying at time $t_1$?

Nested: $\text{MAP } \lambda X\,[X.\text{fname},\ \pi_{x,y}(\sigma_{t=t_1}(X.\text{traj}))]\,(\text{Flight})$

Flat: R1 := PROJECT R0 on fname,x,y

- Return the part of the parcels contained in rectangle Rect(x,y)

Nested: $\text{MAP } \lambda X\,[X.\text{lname},\ X.\text{geom} \cap \text{Rect}]\,(\text{LandUse})$

Flat: R0 := JOIN LandUse and Rect

- Return all land parcels that have a point in Rect(x,y)

Nested: $\pi_{\text{lname},\text{geom}}\bigl(\text{MAP } \lambda X\,[X.\text{lname},\ X.\text{geom},\ \sigma_{(x,y) \in \text{Rect}}(X.\text{geom})]\,(\text{LandUse})\bigr)$

Flat: R0 := JOIN LandUse and Rect; R1 := PROJECT R0 on lname; R2 := JOIN R1 and LandUse
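
The first query (airplane locations at a given time) can be mimicked in plain Python to show how a MAP over a nested relation behaves. This is only a sketch, not DEDALE syntax or its engine: the helper names and the flight data are made up, and each trajectory is treated as a finite set of samples matched on exact time, so any interpolation between samples is not modeled.

```python
from typing import Callable, Dict, List, Set, Tuple

Sample = Tuple[float, float, float, float]  # (t, x, y, a)

# Made-up nested relation Flight(fname, traj[t,x,y,a]).
flight: List[Dict] = [
    {"fname": "F1", "traj": [(0, 0.0, 0.0, 9000), (1, 1.0, 2.0, 9500)]},
    {"fname": "F2", "traj": [(1, 5.0, 5.0, 10000), (2, 6.0, 7.0, 10000)]},
]


def map_relation(fn: Callable[[Dict], Tuple], relation: List[Dict]) -> List[Tuple]:
    """MAP: apply fn to every tuple of a nested relation."""
    return [fn(row) for row in relation]


def select_time(traj: List[Sample], t1: float) -> List[Sample]:
    """Selection on t = t1 inside the nested attribute (exact match only)."""
    return [s for s in traj if s[0] == t1]


def project_xy(traj: List[Sample]) -> Set[Tuple[float, float]]:
    """Projection of the nested attribute onto (x, y)."""
    return {(x, y) for (_, x, y, _) in traj}


def locations_at(t1: float, relation: List[Dict]) -> List[Tuple]:
    """For each flight, pair its name with its (x, y) locations at time t1."""
    return map_relation(
        lambda X: (X["fname"], project_xy(select_time(X["traj"], t1))),
        relation,
    )


result = locations_at(1, flight)
```

Selection and projection are applied inside each tuple's nested attribute, while `fname` is carried along unchanged, so spatial and descriptive data stay in one relation.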

Output limited to 2 spatiotemporal dimensions (3 in case of interpolated attributes)

Pure constraint DB not practical

Getting Constraint-Backed Systems Right

- Clean semantics and full expressiveness of constraint databases
- Geometrical representation issues not a user concern
  - though expert users may want to take more control
- System support for three-tier architecture
  - more sophisticated than for pure constraint databases, or for current spatial databases
- Query processing engine must
  - choose the best concrete representation for output queries, among those supported by system
  - select query evaluation strategies in the presence of a wider mix of possible representations and techniques
  - take into account storage and visualization
  - perhaps maintain multiple representations for the same data?
