2. Learning objectives. In this section you will learn: how the conceptual perception is already based on a model of space and how this conceptual perception determines the selection of an appropriate data model, and how the selection of a particular data model influences the appropriate data type
Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author.While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server.
1. 1 Conceptual models of space, data models and representation (data structures) Josef Fürst
2. 2 Learning objectives In this section you will learn:
how the conceptual perception is already based on a model of space and how this conceptual perception determines the selection of an appropriate data model, and
how the selection of a particular data model influences the appropriate data types to represent the phenomena of interest and the possible spatial analyses
Understanding of important geographical data structures and their relevance,
Importance of topology and its utility for spatial analysis,
Relevance and implementation of thematic information (attributes)
Recognition of thematic hierarchies and their robust and efficient representation
Advantages and disadvantages of vector and raster data structures
3. 3 Outline Introduction
Conceptual models of real world geographical phenomena
Coding the basic data models for input to the computer
Data organisation in vector data structures
Object oriented data structures
Data organisation in raster data structures
Attributes and topological consistency
Vector versus raster data structures
4. 4 Introduction ‘Models‘ are created from key features
What we regard as the respective key characteristics and how we build the model, depends on our cultural background and the purpose of the view.
‘A fisherman’s view of a river is different from an engineer’s view’
Should the Danube be seen as a polyline, a waterbody, a trench in the earths’s surface or a place to live for animals and plants? Is it an exactly defined and bounded object or rather given by the continuously varying field of the river bed’s elevation?
Conceptual models of space:
5. 5 Conceptual models of space 2 questions
What is present? ? house, river, ...
Where is it? ? coordinates
Collection of objects (entities): Definition and recognition (house, utility line, forest, river, etc.); Properties to describe boundaries and position
Continuous function of cartesian coordinates in 2, 3 or 4 (if time is included) dimensions
The variable is a „smooth“ function, continuously varying in space
6. 6 Levels of creating a model A view of reality; the conceptual model.
Human conceptualization leading to an analogue abstraction (analogue model).
A formalization of the analogue abstraction without any conventions or restrictions of implementation (spatial data model)
A representation of the data model that reflects how the data are recorded in the computer (database model)
A file structure (physical computational model)
Accepted axioms and rules for handling the data
Accepted rules and procedures for displaying and presenting spatial data (graphical model)
7. 7 From observation to standardized data models
8. 8 Entities or continuous field? Pragmatic decision, determined by the application,
In administration rather entities, in science more often continuous fields
9. 9 Entities or continuous fields? Representation of reality in a data model requires 3 steps:
Conceptual model of the „world“ (Entities ? ? continuous variation)
Data model: collection of discrete objects ? ? smooth continuous field
Entities: primitives, combinations of them or
Fields: a) discrete; b) differentiable, mathematical functions; c) non differentiable functions
administration: entity model preferred
sciences, dynamic processes: continuous fields (mostly raster)
10. 10 Geographical data models and geographical data primitives geographical data primitives : points, lines, polygons (vector model)
Pixels (raster model)
11. 11 Data models of entities Vector models
simple points, lines and polygons
complex points, lines, polygons and objects
More complex definitions of points, lines and polygons can be used to capture the internal structure of an entity; functional or descriptive.
E.g.: ‘city’ contains streets, houses and parks, each having different functionality and may respond differntly to queries or operations.
Object-oriented systems support a hierarchical construction of objects from simple building blocks and a framework for description of properties as well as behaviour.
Mainly integer grids with lookup table of categories
Loss of spatial resolution
12. 12 Data models of continuous fields Vector
Floating point grids
13. 13 Raster or vector model? Raster model can also represent points, lines and polygons
Loss of spatial resolution, because location is only expressed by multiples of pixel size
Contour maps: Entities or fields?
e.g: soil map
18. 18 Data modelling and spatial analysis There are links between the selected data model, the data types and the possible analyses
If location and shape of an entity are time-invariant and exactly known ? Vector model.
If attributes are fixed, but the entity may change shape but not position (e.g. drying up of a lake) ? Raster model
If attributes can vary, object changes position, but not shape ? behaviour can be represented by object-oriented models
If no clear entities can be discerned, then often a discretized, continuous field is preferable
19. 19 Examples for the use of data models Water bodies
Changing water levels in lakes, reservoirs or rivers can change their size, shape and position
During flood events, rivers can change their path (breakthrough of meanders, new reaches) ? change of geometry and topology!
Continuous fields in hydrological models
Continuous fields in GIS only spatially discretized (TIN, raster)
Dimension of time not represented
20. 20 Representation (data structures) Data models are independent of a specific implementation in a GIS. Also analogue maps are based on models of the area.
digital coding of the information, in several stages from the data model, data structures, data types up to their binary representation
We need efficient data structures, that
represent the selected data models completely and unambiguously,
efficiently support the desired analyses and
use storage space economically.
21. 21 Coding the basic data models for input to the computer discrete primitives of geographical data to create entities as well as continuous fields
Representation of position, relationships and attributes of different types
Points, lines and polygons are the spatial primitives in a vector model, pixels in raster models.
Spatial relationships between entities are defined by topology
22. 22 Data organisation in vector data structures Vector-based geographical databases are composing a complex theme of several layers, each of which combines a certain class of phenomena.
E.g. hydrological map: rivers, lakes, observation sites, land use, etc., each in a separate layer
Layer (Coverage) consists of entities of one type, with relationships, different attributes
Layers are handled independently from each other
E.g. data structure does not force a gauge to be on the river bank
Vector-GIS use implicit relations (tables) for storage
Software packages use different structures.
23. 23 Points Position of points is defined by a single pair of coordinates (X, Y)
Additional info: type of point, attributes
Layer of point entities created from simple table
E.g. event theme in ArcView
24. 24 Lines Sequence of (X, Y) coordinate pairs and connecting straight lines or curves
25. 25 Networks Information about connectivity with other line entities to represent street networks, utilities, rivers
topological information in the data structures ? connectivity tables
Topological terms: „node“ and „arc“
26. 26 Polygons Shape, neighbours, hierarchy
simple polygons: sequence of x,y coordinate pairs
Border lines between polygons are digitized and stored twice. Error: gaps, overlaps
No information on neighbourhood.
islands only graphically represented.
Difficult to validate
27. 27 Polygons Polygons by arc-node-topology
Underlying principle: free of redundancy
28. The shape file format (1) ESRI file based format for vector datasets
Proprietary, but open documentation ? „industry standard“
A shapefile consists of several files with different extensions, e.g. myshapes.shp, myshapes.dbf, …
A shape files holds only one type of entities: points OR lines OR polygons
myshapes.shp: the geometry in binary format
myshapes.dbf: the attributes, in dBase IV format
myshapes.shx: a spatial index
myshapes.sbx: index for joins
myshapes.prj: the projection
Myshapes.shp.xml Metadata for shapefile
29. The shape file format (2) Open source libraries to read and write
3D lines possible (but not widely supported!)
For details, see: http://www.esri.com/library/whitepapers/pdfs/shapefile.pdf 29
30. 30 Triangulation of continuous fields Node list and triangle list
Optimal TIN by Delaunay criteria
31. 31 Object oriented data structures Object oriented systems encapsulate data objects together with methods applicable to them. Access to objects is only done by the methods defined for them
Data structures get a defined behaviour
„Hydrant“ should delete itself, when the last pipe connecting it to the network is removed.
Structural object orientation: capability, to create composed objects (Arc Hydro)
Behavioural object orientation: behaviour of data types with specifically defined functions and procedures
CASE-Tools (Computer Aided Software Engineering) for design and implementation
32. 32 Object oriented data structures UML (unified modeling language) diagram of a part of a geodatabase
33. 33 Object oriented data structures Arc Hydro Framework
34. 34 Data organisation in raster data structures Raster resembles photo
3 ways to interprete a pixel
classification: a range of values is allocated to certain objects (gray pixels are roads, blue pixels are water surfaces,...).
measure the value: intensity of a colour, concentration, etc.
relative height over reference height.
35. 35 Raster data structure Position is represented by discrete cells
Types of raster maps
Nominal data like land use (forest, grassland, farmland, ...)
Continuous values, concentration, light intensity
relative measures like elevation.
36. 36 Raster data structure Entities also in raster model
Cell size determines resolution: cell size max. 50% of smallest recognized object
37. 37 Raster data structure Topology described implicitly by raster
Cell raster – point raster
38. 38 Images Pictures can be used as map displays (e.g. satellite image, orthophoto) or also as attribute information (e.g. pictures of the measuring instruments linked to the measuring points on a measuring point map, photo of the houses in the information system of a real estate agent).
39. 39 Storage of images Similar to raster maps, with some specific properties
Pixel (from picture element)
1, 8, 24 or 32 bit for coding of a colour value
Number of bits per pixel: colour depth
monochrome, grayscale, RGB, CMY, CMYK
Use of a lookup table
40. 40 Georeferencing of images Depending on the orientation of the coordinate system, objects equal in nature are represented differently in the raster model
If distortions of image are small (flat terrain) ?georeferencing by affine projection with >= 4 ground control points (polynomial transformation).
Pixel are re-computed by interpolation or areal averaging, according to the type of variable ? loss of information
41. 41 Storing vector and raster data in DBMS Efficient access to large volumes of data with complex relationships
Use of DBMS
Hybrid concept: geometry data in vendor-specific binary format, attributes in RDBMS (INFO, ORACLE, INGRES, INFORMIX, MS ACCESS)
Storage of attribute data independently from spatial data
extension, updating, deletion of attribute data do not influence spatial data
Commercial RDBMS ensure use of latest developments and standardisation
Use of standard query language like SQL
42. 42 Attribute information Geoinformation is based on two main elements
Geometry and topology (Question: Where?) and
Thematic information (attributes) (Question: What?).
Approach via thematic maps
Analogy of transparencies
43. 43 Attribute information in a raster model Geometry and topology determined by definition of the raster (origin, resolution)
Attribute information is additional dimension
Spatial query, thematic query, and combined query
44. 44 Thematic information in vector models A theme is assigned to each topological object (node, edge, polygon) by one or more attributes (often tied to a label point)
M:N relationship between different thematic layers ? resolve into n units with 1:M relationship
45. 45 Thematic attributes thematic attributes quantitatively classify objects, e.g., a land parcel is attributed by ID, area, prize, owner, address, etc.
Logically organized in tables
A key-field uniquely relates attributes and topological objects
required and optional attributes
computed attributes (area, length)
Hierarchical inheritance of attributes in object-oriented systems
46. 46 Thematic information and topological consistency Consistency of data is one of the most important criteria of an information system. It must be warranted when new data are added as well as after updates
47. 47 Thematic information and topological consistency Topology- rules in ArcGIS 8.3
48. 48 Thematic information and topological consistency Topology- rules in ArcGIS 8.3
49. 49 Thematic information and topological consistency Topology- rules in ArcGIS 8.3
50. 50 Vector versus raster data structures Decision depends on classes of represented objects
Linear phenomena are better handled in vector models
Raster model has advantages with areal data
If high positional accuracy is important, rasters need too much storage
Applications define the criteria
Coordinate transformation is easy in vector models
Coordinate transformation is more difficult for raster models, because input pixel generally do not have only a single output pixel ? irreversible process
51. 51 Data exchange, standardization De-facto-standards for exchange of geometry and attribute information
Topology is not so common
national, European and international standards
OpenGIS Consortium (OGC): Interoperability of GIS
52. 52 Summary Data structures should represent spatial phenomena completely and clearly and support efficient analysis
Discrete primitives for entities and continuous fields
Topology: Relationship between entities. Arc-node topology supports connectivity, definition of areas and connectivity
Vector-GIS generally implement a layer concept with basic elements of points, lines, networks or polygons
TIN is a vector data structure for continuous fields
Object oriented data structures encapsulate data objects together with methods for their behavour.
53. 53 Summary Raster data structures: generally grid of square cells, aligned with coordinate axes
Images are special raster data sets with high resolution
Thematic attributes describe „what“ an object is (semantics)
DBMS for robust storage of geo-data
Adherence to national and international standards for exchange of geo-data between different systems and institutions