Spatial Databases ENVE/CE 424/524
Definitions • Database – an integrated set of data on a particular subject • Spatial database – database containing geographic data of a particular subject for a particular area • Database Management System (DBMS) – software to create, maintain and access databases • Functions: data load, editing, visualization, mapping, analysis, storage, indexing, security, query
GIS: old and new • GIS used to be monolithic systems: all-in-one, proprietary applications that stored, queried, and visualized data • New systems follow more of a tool-box approach: modularized applications that interoperate
Who can benefit from spatial data management? • Army Commander: Has there been any significant enemy troop movement in the past week? • Insurance Risk Manager: Which houses are most likely to be affected in the next great flood on the Mississippi? • Medical Doctor: Based on this patient’s MRI, have we treated somebody with a similar condition? • Molecular Biologist: Is the topology of the amino acid biosynthesis gene in the genome found in any other sequence feature map in the database? • Astronomer: Find all blue galaxies within 2 arcmin of quasars.
Three classes of users for spatial databases • Major database managers: specialized products for enterprise management • GIS users: analysis of data • Internet users: more generalized requirements
Advantages of Databases over Files • Avoids redundancy and duplication • Reduces data maintenance costs • Applications are separated from the data • Applications persist over time • Support multiple concurrent applications • Better data sharing • Security and standards can be defined and enforced
Disadvantages of Databases over Files • Expense • Complexity • Performance – especially with complex data types • Integration with other systems can be difficult
Types of DBMS Model • Hierarchical • Network • Relational – RDBMS • Object-oriented – OODBMS • Object-relational - ORDBMS
Characteristics of DBMS • Data model support for multiple data types • e.g., MS Access: Text, Memo, Number, Date/Time, Currency, AutoNumber, Yes/No, OLE Object, Hyperlink, Lookup Wizard • Load data from files, databases and other applications • Index for rapid retrieval • Query language – SQL • Security – controlled access to data • Multi-level groups • Controlled update using a transaction manager • Backup and recovery
Relational DBMS • Data stored as tuples (tup-el), conceptualized as tables • Table – data about a class of objects • Two-dimensional list (array) • Rows = objects • Columns = object states (properties, attributes)
Table • Table = object class • Row = object • Column = property • Object classes with geometry are called feature classes
Relational DBMS • Most popular type of DBMS • Over 95% of data in DBMS is in RDBMS • Commercial systems • IBM DB2 • Informix • Microsoft Access • Microsoft SQL Server • Oracle • Sybase
Spatial Database Example Land parcel with boundary id: 1050
Relational Database Example Four tables needed in the land parcel relational database
Relation Rules (Codd, 1970) • Only one value in each cell (intersection of row and column) • All values in a column are about the same subject • Each row is unique • No significance in column sequence • No significance in row sequence
SQL • Structured (Standard) Query Language – (pronounced SEQUEL) • Developed by IBM in 1970s • Now standard for accessing relational databases • Three types of usage • Stand alone queries • High level programming • Embedded in other applications (ArcGIS)
Types of SQL Statements • Data Definition Language (DDL) • Create, alter and delete data structures (tables, indexes) • CREATE TABLE, CREATE INDEX • Data Manipulation Language (DML) • Retrieve and manipulate data • SELECT, UPDATE, DELETE, INSERT • Data Control Language (DCL) • Control security of data • GRANT, CREATE USER, DROP USER
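The DDL and DML statement types can be tried out with Python's built-in sqlite3 module. This is a minimal sketch: the "parcel" table, its columns and the sample rows are hypothetical, chosen only to echo the land parcel example. SQLite does not implement DCL statements such as GRANT, so none appear here.

```python
import sqlite3

# In-memory database; the "parcel" table and its contents are hypothetical.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# DDL: define the structures that will hold the data.
cur.execute(
    "CREATE TABLE parcel (id INTEGER PRIMARY KEY, owner TEXT, area_m2 REAL)"
)
cur.execute("CREATE INDEX idx_parcel_owner ON parcel (owner)")

# DML: insert, update, delete and retrieve rows.
cur.executemany(
    "INSERT INTO parcel (id, owner, area_m2) VALUES (?, ?, ?)",
    [(1050, "Smith", 820.0), (1051, "Jones", 1240.5), (1052, "Smith", 610.0)],
)
cur.execute("UPDATE parcel SET owner = ? WHERE id = ?", ("Lee", 1051))
cur.execute("DELETE FROM parcel WHERE id = ?", (1052,))
rows = cur.execute(
    "SELECT id, owner, area_m2 FROM parcel ORDER BY id"
).fetchall()
conn.commit()
```

Using "?" placeholders rather than string formatting keeps values separate from the SQL text, which is the usual way to embed SQL in an application.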
Spatial Types – OGC Simple Features • Data model: a set of constructs for representing objects and processes in a digital environment • [Figure: geometry type hierarchy. Geometry, which references a SpatialReferenceSystem, has the subtypes Point, Curve, Surface and GeometryCollection; Curve → LineString → Line, LinearRing; Surface → Polygon; GeometryCollection → MultiPoint, MultiCurve → MultiLineString, MultiSurface → MultiPolygon]
Spatial Relations • Equals – are the geometries the same? • Disjoint – do the geometries share no common point? • Intersects – do the geometries intersect? • Touches – do the geometries intersect only at their boundaries? • Crosses – does one geometry pass through the other, as when a line crosses a polygon or another line? • Within – is one geometry within another? • Contains – does one geometry completely contain another? • Overlaps – do the geometries overlap? • Relate – are there intersections between the interior, boundary or exterior of the geometries?
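Several of these relations can be illustrated on the simplest possible geometry. The sketch below assumes axis-aligned, non-degenerate rectangles only; the `Rect` type and the function names are hypothetical and are not the OGC API, which handles arbitrary points, lines and polygons.

```python
from collections import namedtuple

# Hypothetical helper type: a closed, axis-aligned rectangle with
# xmin < xmax and ymin < ymax.
Rect = namedtuple("Rect", "xmin ymin xmax ymax")

def equals(a, b):
    return a == b

def disjoint(a, b):
    # True when the rectangles share no point at all.
    return (a.xmax < b.xmin or b.xmax < a.xmin or
            a.ymax < b.ymin or b.ymax < a.ymin)

def intersects(a, b):
    return not disjoint(a, b)

def interiors_intersect(a, b):
    # Strict inequalities: sharing only an edge or corner does not count.
    return (a.xmin < b.xmax and b.xmin < a.xmax and
            a.ymin < b.ymax and b.ymin < a.ymax)

def touches(a, b):
    # Boundaries meet but interiors do not.
    return intersects(a, b) and not interiors_intersect(a, b)

def within(a, b):
    # a lies inside b (boundaries may coincide).
    return (b.xmin <= a.xmin and a.xmax <= b.xmax and
            b.ymin <= a.ymin and a.ymax <= b.ymax)

def contains(a, b):
    return within(b, a)

def overlaps(a, b):
    # Interiors intersect, but neither rectangle lies inside the other.
    return interiors_intersect(a, b) and not within(a, b) and not within(b, a)

a = Rect(0, 0, 2, 2)
b = Rect(1, 1, 3, 3)          # partially covers a
c = Rect(2, 0, 4, 2)          # shares only the edge x = 2 with a
d = Rect(5, 5, 6, 6)          # far away from a
e = Rect(0.5, 0.5, 1.5, 1.5)  # entirely inside a
```

Crosses is omitted because it only applies when the geometries differ in dimension or are both lines, which rectangles alone cannot show.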
Spatial Methods • Distance – determines shortest distance between any two points in two geometries • Buffer – returns a geometry that represents all the points whose distance from the geometry is less than or equal to a user-defined distance • ConvexHull – returns a geometry representing the smallest convex polygon that can enclose another geometry, i.e. one without any concave areas • Intersection – returns a geometry that contains just the points common to both input geometries • Union – returns a geometry that contains all the points in either input geometry • Difference – returns a geometry containing the points of the first geometry that are not in the second • SymDifference – returns a geometry containing the points that are in either of the input geometries, but not both
Convex Hull and Difference Methods • [Figure panels: Convex Hull; Difference]
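As an illustration of the ConvexHull method, the sketch below computes the hull of a set of points with Andrew's monotone chain algorithm. This is one common technique, not necessarily what any particular spatial DBMS uses, and the function names are chosen for this example.

```python
def cross(o, a, b):
    # Z-component of (a - o) x (b - o); positive for a counter-clockwise turn.
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

def convex_hull(points):
    """Return the hull vertices in counter-clockwise order."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts

    # Build the lower hull left to right, dropping points that would
    # create a clockwise turn or lie on a straight line.
    lower = []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)

    # Build the upper hull right to left in the same way.
    upper = []
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)

    # The last point of each half is the first point of the other half.
    return lower[:-1] + upper[:-1]

# A square with one interior point and one point on an edge.
sample = [(0, 0), (2, 0), (2, 2), (0, 2), (1, 1), (1, 0)]
hull = convex_hull(sample)
```

The interior point (1, 1) and the edge point (1, 0) are dropped, leaving only the four corners of the square.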
Indexing • Used to locate rows quickly • Like a book index, it is a special representation of the content that adds order and makes finding items faster • RDBMS use simple 1-d indexing • Spatial DBMS needs 2-d, hierarchical indexing • Grid • Quadtree • R-tree • Multi-level queries are often used for performance: a coarse filter on minimum bounding rectangles (MBR) first, then an exact test on the remaining geometries
Grid Index (multi-level) • Overlay uniform grid • Assign objects a grid id • Multi-level grids are used for variable sized objects within a database
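A single-level version of the grid index can be sketched in a few lines. The `GridIndex` class below is hypothetical and assumes each object is represented by its bounding box `(xmin, ymin, xmax, ymax)`; a multi-level index would keep several such grids with different cell sizes.

```python
import math
from collections import defaultdict

class GridIndex:
    """Uniform grid: each cell id maps to the ids of objects touching it."""

    def __init__(self, cell_size):
        self.cell_size = cell_size
        self.cells = defaultdict(set)

    def _cells_for(self, xmin, ymin, xmax, ymax):
        # Yield the (column, row) id of every cell the box touches.
        c = self.cell_size
        for i in range(math.floor(xmin / c), math.floor(xmax / c) + 1):
            for j in range(math.floor(ymin / c), math.floor(ymax / c) + 1):
                yield (i, j)

    def insert(self, obj_id, bbox):
        for cell in self._cells_for(*bbox):
            self.cells[cell].add(obj_id)

    def query(self, bbox):
        # Returns candidates only: objects sharing a cell with the query
        # box still need an exact geometric test afterwards.
        found = set()
        for cell in self._cells_for(*bbox):
            found |= self.cells.get(cell, set())
        return found

index = GridIndex(cell_size=10)
index.insert("a", (1, 1, 2, 2))        # falls in one cell
index.insert("b", (25, 25, 27, 27))    # falls in one cell
index.insert("c", (8, 8, 12, 12))      # straddles four cells
```

Object "c" shows why variable sized objects are awkward for a single grid: the larger an object is relative to the cell size, the more cells must store its id.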
Point and Region Quadtree Indexing • Based on recursive division of space • [Figure panels: Region Quadtree; Point Quadtree]
R-tree • Uses minimum bounding rectangles (MBR), also called minimum bounding boxes (MBB) • A new object is added to the MBR that would expand the least to accommodate it
Minimum Bounding Rectangle • [Figure labels: Study Area; Minimum Bounding Rectangle]
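Recursive division of space can be sketched as follows. This hypothetical `QuadTree` class stores points and splits a cell into four equal quadrants when it is full, which is the region-style division; in a point quadtree the split lines would pass through the data points instead. The `capacity` and `max_depth` parameters are assumptions of this sketch.

```python
class QuadTree:
    """Stores points; a full cell is split into four equal quadrants."""

    def __init__(self, xmin, ymin, xmax, ymax, capacity=1, depth=0, max_depth=8):
        self.bounds = (xmin, ymin, xmax, ymax)
        self.capacity = capacity
        self.depth = depth
        self.max_depth = max_depth  # guards against endless splitting
        self.points = []
        self.children = None

    def _covers(self, x, y):
        # Half-open cell so that every point belongs to exactly one quadrant.
        xmin, ymin, xmax, ymax = self.bounds
        return xmin <= x < xmax and ymin <= y < ymax

    def insert(self, x, y):
        if not self._covers(x, y):
            return False
        if self.children is None:
            if len(self.points) < self.capacity or self.depth >= self.max_depth:
                self.points.append((x, y))
                return True
            self._split()
        return any(child.insert(x, y) for child in self.children)

    def _split(self):
        xmin, ymin, xmax, ymax = self.bounds
        xmid, ymid = (xmin + xmax) / 2, (ymin + ymax) / 2
        quadrants = [(xmin, ymin, xmid, ymid), (xmid, ymin, xmax, ymid),
                     (xmin, ymid, xmid, ymax), (xmid, ymid, xmax, ymax)]
        self.children = [
            QuadTree(*q, self.capacity, self.depth + 1, self.max_depth)
            for q in quadrants
        ]
        # Push the points held by this cell down into the new quadrants.
        old_points, self.points = self.points, []
        for px, py in old_points:
            any(child.insert(px, py) for child in self.children)

    def query(self, qxmin, qymin, qxmax, qymax):
        # Skip whole subtrees whose cell cannot meet the query window.
        xmin, ymin, xmax, ymax = self.bounds
        if qxmax < xmin or qxmin >= xmax or qymax < ymin or qymin >= ymax:
            return []
        hits = [(x, y) for x, y in self.points
                if qxmin <= x <= qxmax and qymin <= y <= qymax]
        for child in self.children or []:
            hits.extend(child.query(qxmin, qymin, qxmax, qymax))
        return hits

tree = QuadTree(0, 0, 100, 100)
for point in [(10, 10), (80, 80), (12, 14), (60, 20)]:
    tree.insert(*point)
```

A window query such as `tree.query(0, 0, 20, 20)` visits only the quadrants that meet the window, which is where the speed-up over scanning every point comes from.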
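The least-expansion rule for R-tree insertion can be sketched with a few helper functions. Rectangles are assumed to be `(xmin, ymin, xmax, ymax)` tuples, the function names are hypothetical, and breaking ties by the smaller existing area is an assumption of this sketch. Node splitting and the tree structure itself are left out.

```python
def mbr_of_points(points):
    # Minimum bounding rectangle of a collection of (x, y) points.
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    return (min(xs), min(ys), max(xs), max(ys))

def area(r):
    return (r[2] - r[0]) * (r[3] - r[1])

def union(a, b):
    # Smallest rectangle that encloses both a and b.
    return (min(a[0], b[0]), min(a[1], b[1]), max(a[2], b[2]), max(a[3], b[3]))

def enlargement(node_mbr, obj_mbr):
    # Extra area the node's MBR would need in order to cover the object.
    return area(union(node_mbr, obj_mbr)) - area(node_mbr)

def choose_node(node_mbrs, obj_mbr):
    # Index of the node whose MBR expands the least; ties go to the
    # node with the smaller current area.
    return min(
        range(len(node_mbrs)),
        key=lambda i: (enlargement(node_mbrs[i], obj_mbr), area(node_mbrs[i])),
    )

nodes = [(0, 0, 4, 4), (10, 10, 14, 14)]
new_object = (3, 3, 5, 5)

chosen = choose_node(nodes, new_object)
# After insertion the chosen node's MBR grows to cover the new object.
nodes[chosen] = union(nodes[chosen], new_object)
```

Here the first node needs 9 extra units of area and the second needs 105, so the object goes into the first node.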
Order Dependence of a Query • Query: Select all households within 3 km of a store that have an income greater than $100,000 • Option 1: Select all households with an income greater than $100,000; from this selected set, select all households within 3 km of a store • Option 2: Select all households within 3 km of a store; from this selected set, select all households with an income greater than $100,000
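Both orderings return the same households, but they do different amounts of work. The sketch below uses made-up households and a single made-up store, with coordinates assumed to be in kilometres, and counts how many spatial tests each ordering performs.

```python
import math

# Hypothetical data: coordinates in km, income in dollars.
households = [
    {"id": 1, "x": 0.0, "y": 0.0, "income": 120_000},
    {"id": 2, "x": 1.0, "y": 1.0, "income": 60_000},
    {"id": 3, "x": 10.0, "y": 10.0, "income": 150_000},
    {"id": 4, "x": 2.0, "y": 0.0, "income": 45_000},
    {"id": 5, "x": 0.5, "y": 2.0, "income": 200_000},
]
stores = [(0.0, 1.0)]

def near_store(household, counter, max_km=3.0):
    # The spatial test; counted because it is usually the costly step.
    counter["spatial_tests"] += 1
    return any(
        math.hypot(household["x"] - sx, household["y"] - sy) <= max_km
        for sx, sy in stores
    )

def has_high_income(household):
    return household["income"] > 100_000

def attribute_first():
    # Option 1: filter on income, then run the spatial test on the survivors.
    counter = {"spatial_tests": 0}
    selected = [h for h in households if has_high_income(h)]
    ids = [h["id"] for h in selected if near_store(h, counter)]
    return ids, counter["spatial_tests"]

def spatial_first():
    # Option 2: run the spatial test on everything, then filter on income.
    counter = {"spatial_tests": 0}
    selected = [h for h in households if near_store(h, counter)]
    ids = [h["id"] for h in selected if has_high_income(h)]
    return ids, counter["spatial_tests"]

result_1, tests_1 = attribute_first()
result_2, tests_2 = spatial_first()
```

With this data the attribute-first ordering needs 3 spatial tests and the spatial-first ordering needs 5. Which ordering is cheaper in practice depends on how selective each condition is and on whether a spatial index is available.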
Distributed Databases www.midcarb.org
References • Longley et al., Geographic Information Systems and Science, 2001, Chapter 11 • Guenther, Environmental Information Systems, 1998, Chapter 3
Final Few Weeks • Lecture: April 15, Metadata and Interoperability • Lab: April 17 (next Thursday), project/problem set work • I’ll spend a few minutes with each of you to get an update on your progress. • Article review due April 17 • Lab: April 22, project lab session • Lecture: April 24, GIS in decision-making • Project Presentation: May 8