- 669 Views
- Uploaded on

Download Presentation
## PowerPoint Slideshow about 'Spatial Databases: Lecture 2' - johana

Download Now**An Image/Link below is provided (as is) to download presentation**
Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author.While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server.

Download Now

Presentation Transcript

Topics

- Spatial Joins
- Spatial databases
- OGC
- Themes
- Indexing
- Spatial Database Architecture
- PostGIS

Spatial Join

A spatial join associates two tables based on a spatial relationship, rather than an the classic non-spatial relational attribute. A spatial join operation is used to combine two or more dataset with respect to a spatial predicate or spatial operation. Predicates can be a combination of directional, distance, and topological spatial relations (e.g. overlap, contains). In case of non-spatial join, the joining attributes must of the same type, but for spatial join they can be of different types.

Spatial Join Example1

Query: For all the rivers listed in the River table, find the counties through which they pass.

SELECT r.name, c.name

FROM river AS r, country AS c

WHERE crosses(r.the_geom,c.the_geom)=True

The spatial predicate “Cross” is used to join River and Country tables

To view this we would add asbinary(R.the_geom,C.the_geom).

Note this will not work with our Irish data set.

Spatial Joins

- In practice, spatial join operations are divided into a filter step and a refinement step to efficiently process complex spatial data types such as point collections in a row instance. In the filter step, the spatial objects are represented by simpler approximations such as their Minimum Bounding Rectangle or Box (MBR or MBB).

Spatial Joins Example 2

- A spatial join associate two tables based on a spatial relationship, rather than an attribute relationship. For example the query:

Summarize the election results by municipality.

- Could be answered using the following SQL:

SELECT m.name, sum(v.fiannfail) AS FF, sum(v.gp) AS Green, sum(v. finegael) AS FG, sum(v.vtotal) AS total FROM voting_areas v, municipality m, WHERE intersects(v.the_geom, m.the_geom) GROUP BY m.name ORDER BY m.name;

http://www.oireachtas.ie/documents/publications/dublin2.pdf

http://www.oireachtas.ie/viewdoc.asp?DocID=10838

Dynamic and Static Data

- Static non-spatial data is usually maintained in the table with the geometry (e.g. county name or unique identified). In some cases this data along with the geometry is considered immutable.
- Dynamic non-spatial data is often maintained in a separate table e.g. land ownership.
- There can be more than one dynamic table for a geometry table.
- Dynamic spatial can include moving objects or a changing world (temporality requires different treatment)

Traditional Joins

- Pure data tables can be joined with the geometry tables for querying purposes
- A primary key is used to relate the 2 tables together
- A primary key is a unique identifier for each row in a table

Primary Key

Spatial Join1

A typical example of spatial join is “Find all pair of rivers and cities that intersect”. The result of join between the set of rivers {R1, R2} and cities {C1, C2, C3, C4, C5} is { (R1, C1), (R2, C5)}.

Guting’s1 definition of a spatial database

- (1) A spatial database system is a database system
- (2) It offers spatial data types in its data model and query language
- (3) It supports spatial data types in its implementation, providing at least spatial indexing and efficient algorithms for spatial join2.

Why use a database for GIS?

- GIS are not database systems, although they can be connected to a DBMS.
- A GIS cannot efficiently manage large quantities of non-spatial data (e.g. at government department level).
- They lack ad hoc querying capability (they provide a restricted form of predefined queries)
- They lack indexing structures for fast external data access (they use in memory techniques).
- They lack a 'logic' (e.g. first order logic of the relational calculus)

Why use a database for GIS?

- Databases offer the following functions:

- Data independence
- Data Abstraction
- Self-describing
- Concurrency
- Distributed capabilities
- High performance
- Supports spatial data types using ADTs.
- Alternative: files

- Reliability
- Integrity: enforces consistency
- Security
- User views
- User interface
- Querying
- Updating
- DB theory has a Mathematical basis

Programming Languages

- The basic components of current languages are:
- Data types e.g. Integers, String, Polygon.
- Variables to refer to data types e.g. a = 2
- Operations on those data types e.g. area(polygon)
- Control structures e.g. sequence, iteration, and conditions.
- Logic is an important part of programming, but it is often implicit and external to the language. Some languages like SQL are quite close to logic.

Data types

- A data type represents a constraint placed upon the interpretation of data in a type system, describing representation, interpretation, legal operations and structure of values.
- Data types are a way to limit the kind of data that can be used by a particular program or stored in a database table. Types restrict the data to a certain set of values (e.g. 1,2,3,..for Integers).
- Data types also are restricted to certain operations on the type (e.g. addition for Integers). SQL comes with a range of standard data types that can be used to represent strings, integers, etc, PostgreSQL can be extended to have additional types e.g. spatial data types.

SQL supports data abstraction

Data Abstraction- allows users to ignore

unimportant details

View Level– a way of presenting data to a particular group of users

Logical Level– how data is interpreted when writing queries

Physical Level– how data is manipulated at storage level by a computer. Most users are not interested in the physical level.

Databases use high level declarative languages (SQL)

- Data Definition Language (DDL)
- Create, alter and delete data
- CREATE TABLE, CREATE INDEX
- Data Manipulation Language (DML)
- Retrieve and manipulate data
- SELECT, UPDATE, DELETE, INSERT
- Data Control Languages (DCL)
- Control security of data
- GRANT, CREATE USER, DROP USER

Spatial Types – OGC Simple Features for SQL

Composed

Sub Type

SpatialReferenceSystem

Geometry

Relationship

GeometryCollection

Point

Curve

Surface

Polygon

MultiSurface

MultiCurve

MultiPoint

LineString

Line

LinearRing

MultiPolygon

MultiLineString

OGC Simple Features for SQL1(*)

- The OGC SF (similar to ISO 19125-1) describes 2-D geometry with linear interpolation between vertices. The simple feature model consists of a root class Geometry and its subclasses Point, Curve, Surface, GeometryCollection. The class Geometry collection has the subclasses Multipoint, Multicurve, MultiSurface.

OGC Simple Features for SQL1 (*)

- The OGC does not include complexes, a third dimension, non-linear curves, `networking or topology (i.e. connectivity information).
- Because of it relative simplicity and its support in both the commercial & open source community SFSQL is widely used in DBMS and is supported in many Web applications.
- It is expected that newer more sophisticated standards such as ISO-19107 will gradually replace OGC SF.

OGC Simple Features for SQL (*)

- Brief description
- A simple feature is defined to have both spatial and non-spatial attributes. Spatial attributes are geometry valued, and simple features are based on 2D geometry with linear interpolation between vertices. Each feature is stored as a row in a database table. This course covers the OGC: GEOMETRY type with subtypes such as POINT, LINE, POLYLINE, POLYGON, and collections of these.

OGC Simple Features for SQL (*)

- Functionality can be described under the following headings.
- Basic Methods on Geometry
- Methods for testing Spatial Relations between geometric objects
- Methods that support Spatial Analysis
- Geometry Collection

OGC Simple Features for SQL (*)

- Basic Methods on Geometry
- Describes the dimensions and reference system (SRID) of the geometry.
- Operations include Dimension, GeometryType, , conversions AsText, AsBinary, tests on geometry include IsEmpty, IsSimple. Operations that return geometry Boundary, Envelope returns bounding box
- Methods for testing Spatial Relations between geometric objects
- These polymorphic methods check relations on the generic or super class GEOMETRY and usually return a Boolean. Main methods Equals, Disjoint, Intersects, Touches, Crosses, Within, Contains, Overlaps, Relate( testing for intersections between the Interior, Boundary and Exterior of the two geometries)
- Methods that support Spatial Analysis
- A set of geometric and ‘metric’ methods. Methods calculate distances and areas with respect to the spatial reference system of this Geometry. Methods include Distance, Buffer, ConvexHull, Intersection, Union, Difference, SymDifference.
- Geometry Collection
- A GeometryCollection is a geometry that is a collection of 1 or more geometries. All the elements in a GeometryCollection must be in the same Spatial Reference. Subclasses of GeometryCollection may restrict membership based on dimension and may also place other constraints on the degree of spatial overlap between elements. Methods
- NumGeometries( ):Integer—Returns the number of geometries in this GeometryCollection.
- GeometryN(N:integer):Geometry—Returns the Nth

OGC Spatial Relations

- Equals – same geometries
- Disjoint – geometries share common point
- Intersects – geometries intersect
- Touches – geometries intersect at common boundary
- Crosses – geometries overlap
- Within– geometry within
- Contains – geometry completely contains
- Overlaps – geometries of same dimension overlap
- Relate – intersection between interior, boundary or exterior

Semantics of geometry

Here are some of Euclid’s definitions from about 300 BC

A point is that which has no part.

A line is breadthless length.

The ends of a line are points.

A straight line is a line which lies evenly with the points on itself.

A surface is that which has length and breadth only.

The edges of a surface are lines.

A plane surface is a surface which lies evenly with the straight lines on itself.

A boundary is that which is an extremity of anything

A figure is that which is contained by any boundary or boundaries.

Are these definitions consistent with your understanding?

Is a line really composed of points? How does a computer draw a line?

Semantics of geometry

Most computer displays use raster data structures to store pixel information.

http://en.wikipedia.org/wiki/Raster_graphics

Drawing a line in JavaScript1

Note: Code use variable b for y-intercept.

The equation of a line is y = mx + c.

dx, dyare changes in x,y

m is the slope.

c is y-intercept.

Drawing a line in JavaScript1

The code on the previous slide is fine when the slope is less than or equal to 1 (dx >= dy).

If the slope greater than 1, the line will have gaps (dx < dy).

If dx = 0 then only a single point is plotted.

Most computer displays are pixel based.

Calculating Distance in Java

import java.util.Scanner;

public class DistanceApp {

public static void main(String[] args){

int x1, y1, x2, y2;

double distance;

Scanner scan = new Scanner (System.in);

System.out.print("Enter the x coordinate for point 1: ");

x1 = scan.nextInt();

System.out.println("The x coordinate for point 1: "+ x1);

System.out.print("Enter the y coordinate for point 1: ");

y1 = scan.nextInt();

System.out.println("The y coordinate for point 1: "+ y1);

System.out.print("Enter the x coordinate for point 2: ");

x2 = scan.nextInt();

System.out.println("The x coordinate for point 2: "+ y1);

System.out.print("Enter the y coordinate for point 2: ");

y2 = scan.nextInt();

System.out.println("The y coordinate for point 2: "+ y2);

distance = Math.sqrt( Math.pow((x2 - x1),2) + Math.pow((y2 - y1),2));

System.out.println("The distance between the two points is " + distance);

}}

Change to floating point input using nextDouble and change coord to double.

Contains Relation

Does the base geometry (small circles) contain the comparison geometry (big circles)?

For the base geometry to contain the comparison geometry it must be a superset of that geometry.

Geographic Information Systems and Science, ,Longley,Goodchild,Maguire,Rhind

Geographic Information Systems and Science , Longley,,Goodchild,,Maguire,Rhind

Touches RelationDoes the base geometry (small circles) touch the comparison geometry (big circles) ?

Two geometries touch when their boundaries intersect. Raises deep mathematical issues e.g. what is the boundary of a point?, what about tolerance + or - a metre?

Spatial Methods

- Distance – shortest distance
- Buffer – geometric buffer
- ConvexHull – smallest convex polygon geometry
- Intersection – points common to two geometries
- Union – all points in geometries
- Difference – points different between two geometries
- SymDifference – points in either, but not both of input geometries

Convex Hull

The convex hull of a set of points is the intersection of all convex sets which contain the points. A set of points is convex if and only if for every pair of points p,q in S, the line segment pq is completely contained in S.

Left is convex set and right non-convex set

Convex hulls constructed around objects.

Operations on themes (*)

- Theme projection1: ‘selecting’ some attributes from the countries theme. Get the Population and Geometry of European countries
- Theme selection: Name and population of European countries with a population of 50 million or more.
- Theme union : European countries with population less than 10 million joined with those over 10 million.
- Theme overlay: See example
- Theme merge : See example

Operations on themes (*)

- Theme overlay1: Generates a new theme and new geometry from the overlaid themes. We get the geometric intersection of spatial objects with the required themes. See European language example.
- Theme merge : The merge operation performs the geometric union of the spatial part of n geographic objects that belong to the same theme under a constraint condition supplied by the user. See East/West Germany example.

Projection on Theme (*)

Find the countries of western Europe with population greater than 50 million. This is a projection on the attribute population. Unlike a conventional database query we often want the query result and the original context, in this case Europe.

Theme Merge (*)

Merging two geographic objects in a selected theme (say country) into a single object.

Theme Overlay (*)

Anglo-Saxon

Latin languages

The lower map represents the overlay of European countries and languages.

Indexing

- Indexing is used to speed up queries and locate rows quickly
- Traditional RDBMS use 1-d indexing (B-tree)
- Spatial DBMS need 2-D, hierarchical indexing
- Grid
- Quadtree
- R-tree
- Others
- Multi-level queries often used for performance (MBR)

R-tree

Examples of R – Tree Index of polygons

Spatial enabled DB Summary

- Database – an integrated set of data on a particular subject. Can include spatial and non-spatial and possible temporal.
- Databases offer many advantages over files
- Relational databases dominate for non-spatial use, object-relational databases (ORDBMS) often used for spatial data.
- Databases address some limitations for GIS

Choice database for GIS?

- Choice of DBMS:
- Commercial (Oracle, DB2) or
- Open source (PostgreSQL, MySQL) .
- We will PostgreSQL with PostGIS spatial extensions.
- PostgreSQL is an Object Relational Database System (ORDBMS).

Database Architecture for GIS(*)

- PureRelational Approach
- Spatial data can be stored in a pure RDBMS. The coordinates for the spatial data can be stored in tables. Uses existing technologies, requires no additional software (for the pure DBMS perspective).
- Drawbacks
- It is difficult to represent and query complex spatial structures (such as a polygon with holes) or topological relationships ( network connectivity, polygon adjacency). No ordered lists.
- Violates independence principle, user must know about data storage. Change of geometric representation requires deep reorganization of the database and query formulation.
- Poor performance, requires a lot of processing of the relational tuples that represent the spatial information.
- Lack of user friendliness because users have to manipulate tables of points.
- Difficulty of defining new spatial types.
- The impossibility of expressing geometric computations such as adjacency tests, point query, or window query.

Database Architecture for GIS(*)

- Loosely Coupled
- Many current commercial such as ArcInfo use this approach.
- Uses a RDBMS to store 'attribute' or descriptive information e.g. the name of a road not its geometry.
- A specific module for spatial data management.
- Drawbacks
- The coexistence of heterogeneous data models, which implies difficulties in modeling use and integration.
- A partial loss of DBMS techniques e.g. recovery, querying, optimization.

Database Architecture for GIS(*)

- Loosely Coupled Architecture

Application Programs

Relational DBMS Geometric Processing

(GIS)

Database Files

Database Architecture for GIS(*)

- Integrated
- DBMS extensibility address many of the problems inherent in the RM and LC approaches.
- Most commercial databases that offer facilities to handle spatial data (PostgreSQL, Oracle, DB2) take this approach. The basic idea is to add new types and operations to the RM as follows:
- The query language is extended to manipulate spatial data as well as descriptive data. New spatial types (point, line, and polygon) are handled as basic types by the DBMS.
- Many other DBMS functions such as query optimization, are adapted in order to handle geo-spatial data efficiently.
- Drawback: Does not provide full GIS functionality (cartography). We must use additional software such as Geoserver to make (or render) an attractive map from the raw vectors stored in the DBMS.

What can PostGIS do?

- Many PostGIS functions available via SQL
- Compliant with OGC1 Simple Features Specification

- Crosses
- Overlaps
- Contains
- Area
- Length
- Point on surface
- Return geometry as SVG

- Coordinate transformation
- Identify (SRID)
- Buffer
- Touches
- Crosses
- Within
- Overlaps
- Contains

What can PostGIS do?

- PostGIS supports a geometry type which is compliant with the OGC standard for Simple Features.
- POINT( 50 100 )
- LINESTRING ( 10 10, 20 20 )
- POLYGON ( ( 0 0, 5 5, 5 0, 0 0 ) )
- MULTIPOINT ( ( 1 1 ), ( 0 0 ) )
- MULTILINESTRING ( … )
- MULTIPOLYGON ( … )

Internet

Editing

Mapping

GIS

Loading

WebClient

Database

Features

Analysis

Other

GIS

HOW Spatial Databases Fit into GISImage from Paul Ramsey Refractions Research

ProstgreSQL

- PostgreSQL itself provides the main features of a RDBMS. Includes other advanced features such as:
- Inheritance
- Functions
- Constraints
- Triggers
- Rules
- Transactional integrity
- Permits an ‘OO like’ style of programming

PostgreSQL/PostGIS

The data is stored in a relatively simple format with the attributes and geometry stored in a single table. It can be viewed as Well Known Text (WKT) or displayed graphically using the asBinary(the_geom) function.

Spatial reference number

Coordinates

Attribute Data

Data type

How does it work?

- Spatial data is stored using the coordinate system of a particular projection.
- That projection is referenced with a Spatial Reference Identification Number (SRID)
- This number relates to another table (spatial_ref_sys) which holds all of the spatial reference systems available.
- This allows the database to know what projection each table is in, and if need be, re-project from those tables for calculations or joining with other tables.

Coordinate Projection

SRID=3005;MULTILINESTRING((1004687.04355194594291.053764096,1004729.74799931 594258.821943696))

SRID=4326;MULTILINESTRING((125.934150.3640700000001,-125.9335 50.36378))

Coordinates of one table can be converted to those of another table. This permits the ‘geometry’ in each table to match. Relatively easy to do in PostGIS

Spatial Database Components

- The Geometry metadata table

spatial_ref_sys

- postgis=# \d spatial_ref_sys

Table "public.spatial_ref_sys"

Column | Type | Modifiers

-----------+-------------------------+-----------

srid | integer | not null

auth_name | character varying(256) |

auth_srid | integer |

srtext | character varying(2048) |

proj4text | character varying(2048) |

Indexes:

"spatial_ref_sys_pkey" PRIMARY KEY, btree (srid)

geometry_columns

- postgis=# \d geometry_columns

Table "public.geometry_columns"

Column | Type | Modifiers

-------------------+------------------------+-----------

f_table_catalog | character varying(256) | not null

f_table_schema | character varying(256) | not null

f_table_name | character varying(256) | not null

f_geometry_column | character varying(256) | not null

coord_dimension | integer | not null

srid | integer | not null

type | character varying(30) | not null

Indexes:

"geometry_columns_pk" PRIMARY KEY, btree (f_table_catalog, f_table_schema, f

_table_name, f_geometry_column)

Database Rules

- Rules help prevent human error when modifying a data set
- Rules are user defined
- Rules are such things as;
- “A fire hydrant must be located on a water line”
- Rivers should flow down hill.

Constraints

- Constraints are similar to rules, but are less assertive.
- Constraints are provided by the DBMS and are applied by the user
- A Constraint would be “Parcel_ID Not Null” - meaning a number ID has to be provided when a parcel is created.

Constraints

Constraint GIS examples

Constraints

How can we define in front of?

Temporal Example: roads, buildings, and regions

Consider a line. From the properties of metric spaces it has a length.

Temporal Example: roads, buildings, and regions

A

B

Lets call it a road. From graph theory we have a path

Temporal Example: roads, buildings, and regions

A

F1

B

Purple line segment represents both a road and a fence.

Lets add a field (F1) with an area and a topology.

Example: roads, buildings, and regions

A

F1

B

Lets add an administrative region (outer red rectangle) and some houses

Example: roads, buildings, and regions

A

F2

F3

B

Lets divide the field in two by inserting a new fence.

We need to delete the old area and add two new areas.

What about adjacency relation between fields?

Example: roads, buildings, and regions

Time1

Time2

A

A

F1

F2

F3

B

B

Imagine a picture of the world at Time1 and Time2.

Not only have some objects changed but some spatial relationships have changed.

An addition can induce a deletion and a deletion can induce an insertion.

Example of temporal queries

- Is there a route from A to B? (now is assumed)
- Was there a route from A to B in Time1?
- Does the route in query 1 pass through the administrative region?
- Does the route in query 1 pass touch the administrative region ?
- What fields were adjacent to F2 in Time2?

Download Presentation

Connecting to Server..