geographic information and spatial data types n.
Skip this Video
Loading SlideShow in 5 Seconds..
Geographic Information and Spatial Data Types PowerPoint Presentation
Download Presentation
Geographic Information and Spatial Data Types

Loading in 2 Seconds...

play fullscreen
1 / 89

Geographic Information and Spatial Data Types - PowerPoint PPT Presentation

  • Uploaded on

GIS in the Sciences ERTH 4750 (38031). Geographic Information and Spatial Data Types. Xiaogang (Marshall) Ma School of Science Rensselaer Polytechnic Institute Tuesday, January 29, 2013. Review of last week’s work. In the lecture Definition of Geographic Information System

I am the owner, or an agent authorized to act on behalf of the owner, of the copyrighted work described.
Download Presentation

PowerPoint Slideshow about 'Geographic Information and Spatial Data Types' - sage

An Image/Link below is provided (as is) to download presentation

Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author.While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server.

- - - - - - - - - - - - - - - - - - - - - - - - - - E N D - - - - - - - - - - - - - - - - - - - - - - - - - -
Presentation Transcript
geographic information and spatial data types

GIS in the Sciences

ERTH 4750 (38031)

Geographic Information and Spatial Data Types

Xiaogang (Marshall) Ma

School of Science

Rensselaer Polytechnic Institute

Tuesday, January 29, 2013

review of last week s work
Review of last week’s work
  • In the lecture
    • Definition of Geographic Information System
    • Concepts of Data, Information, Geospatial data
    • GPS, RS, Database, GIS
  • In the lab
    • MapInfo Professional environment
    • Data Input: default formats, external file formats
    • Create points: geocode, coordinates
    • Questions?
  • This lecture is partly based on:
    • Augustijn, E.-W., 2009. Geographic information and spatial data types. E-lecture of the Distance Course Principles of GIS. ITC, Enschede, The Netherlands
    • Huisman, O., de By, R.A. (eds.), 2009. Principles of Geographic Information Systems. ITC Press, Enschede, The Netherlands
  • Geographic phenomena
  • Computer representations
  • Autocorrelation and topology
  • Representations for fields and objects
  • The temporal dimension
1 geographic phenomena
1 Geographic phenomena

Land use type


Soil type

  • Geographic phenomena are the study objects of a GIS.
    • The real world is complex. A certain spot contains many different phenomena.
    • Different phenomena require different digital representations and multiple representations are possible for a same phenomenon.

Water temperature

Surface rock type

Water body

Lake Mendota


Tourism site

Water quality

1 geographic phenomena1
1 Geographic phenomena
  • A digital representation is a model. It is not the real thing itself.
  • Our representation will never be perfect, some facts will not be found.

“Essentially, all models are wrong, but some are useful.”-- Empirical Model-Building and Response Surfaces (1987)

by George E. P. Box and Norman R. Draper

1 geographic phenomena2
1 Geographic phenomena
  • This lecture studies geographicphenomena more deeply, and looks into different types of computerrepresentations for them.
1 1 geographic phenomenon defined
1.1 Geographic phenomenon defined
  • A geographic phenomenon is something of interest that:
    • Can be named or described
    • Can be geo-referenced
    • Can be assigned a time (interval) at which it is/was present

Description: Winslow Building


LatLng(42.73091, -73.68425)

Built in: 1886

1 2 different types of phenomena
1.2 Different types of phenomena
  • There are two groups of geographic phenomena, fields and objects:
    • A (geographic) fieldis a geographic phenomenon for which, for every point in the study area, a value can be determined (e.g., temperature, barometric pressure and elevation)
    • (Geographic) objectspopulate the study area, and are usually well distinguishable, discrete, bounded entities. The space between them is potentially empty (e.g., buildings, rivers)
1 2 1 field
1.2.1 Field
  • Elevation is an example of field
    • You can measure the height everywhere
1 2 1 field1
1.2.1 Field
  • There are two types of geographic fields, discrete fields and continuous fields
    • In a continuous field, the underlying function is assumed to be continuous. Continuity means that all changes in field values are gradual. (for example elevation)
    • Discrete fields cut up the study space in mutually exclusive bounded parts, with all locations in one part having the same field value. (for example land use)
1 2 1 field2
1.2.1 Field
  • Continuous fields
    • Continuous means that all changes in field values are gradual
    • In a differentiable field we can measure the change
    • In the example on the left, we can measure the gradient (slope) as the change of elevation
1 2 1 field3
1.2.1 Field
  • Discrete fields
    • Discrete fields cut up the study space in subparts with a clear boundary, with all locations in one part having the same value
    • Typical examples are land classifications, geological classes, soil types, land use types, crop types or natural vegetation types
1 2 2 objects
1.2.2 Objects
  • Objects are discrete and bounded entities
    • The space between the objects is potentially ‘empty’ or ‘undetermined’

Three rocks (objects), in between no rocks (empty)

1 2 2 objects1
1.2.2 Objects
  • The position of an object in space is determined by a combination of one or more of the following parameters
    • Location (where is it?)
    • Shape (what form?)
    • Size (how big?)
    • Orientation (direction)

A bridge is an object, with a location, shape, size (length of the bridge) and a direction (maybe north –south)

1 2 2 objects2
1.2.2 Objects
  • We usually do not study objects in isolation (a single object) but whole collections of objects
  • Observe that collections of objects can be interesting phenomena at a higher aggregation level:
    • Forest plots form forests
    • Parcels form blocks and blocks form suburbs
    • Streams, brooks and rivers form a river drainage system

We can study each individual tree, or the combination of trees, as one object

1 3 boundaries
1.3 Boundaries
  • Both objects and discrete fields have boundaries
  • Two different types of boundaries:
    • Crisp boundaries
    • Fuzzy boundaries
1 3 1 crisp boundaries
1.3.1 Crisp boundaries
  • A crisp boundary is one that can be determined with almost arbitrary precision
    • As a general rule of thumb, crisp boundaries are more common in man-made phenomena
1 3 2 fuzzy boundaries
1.3.2 Fuzzy boundaries
  • Fuzzy boundaries contrast with crisp boundaries in that the boundary is not a precise line, but rather an area of transition
2 computer representations
2 Computer representations
  • Computer representations can be divided into two groups: tessellations and vector-based representations
    • The next step is to understand how the computer representations can be applied to represent geographic fields and objects
2 1 tessellations
2.1 Tessellations
  • A tessellation is a partition of space into mutually exclusive cells that together make up the complete study area
  • There are two groups of tessellations:
    • Regular tessellations, the cells are the same shape and size
    • Irregular tessellations, the cells vary in shape and size
2 1 1 regular tessellations
2.1.1 Regular tessellations
  • All regular tessellations have in common that the cells are of the same shape and size, and the field attribute value assigned to a cell is associated with the entire area occupied by the cell




2 1 1 regular tessellations1
2.1.1 Regular tessellations
  • The size of the area that a single raster cell represents is called the raster’s resolution
2 1 1 regular tessellations2
2.1.1 Regular tessellations
  • Some convention is needed to state which value prevails on cell boundaries
  • Lower and left boundaries belong to the cell
2 1 1 regular tessellations3
2.1.1 Regular tessellations
  • When we represent a continuous field, values are changing constantly
  • In a regular tessellation each cell has only one value, that represents the total area of a cell (e.g., average elevation)
  • There will be a continuity gap between adjacent cells
2 1 1 regular tessellations4
2.1.1 Regular tessellations

Make the cell size smaller

  • Two ways to improve on this continuity issue:
    • Make the cell size smaller
    • Assume that the cell value only represents one specific location and provide a good interpolation function for all other locations
2 1 1 regular tessellations5
2.1.1 Regular tessellations
  • A raster stores a long list of values, one for each cell preceded by a small list of extra data (called header file) that informs how to interpret the list of values
2 1 1 regular tessellations6
2.1.1 Regular tessellations

Dark blue line indicates the order of cell values in the cell value list

  • The values in the list can be ordered in different ways.
  • The header file will indicate which space filling schema has to be used.
  • Examples of space filling curves: row order, row-prime order, Morton (Z) and Peano-Hilbert order

(a) Row order, (b) row-prime order, (c) Morton (z) order, (d) Peano-Hilbert order

2 1 1 regular tessellations7
2.1.1 Regular tessellations
  • Disadvantages of regular tessellation:
    • Not adaptive to the spatial phenomenon we want to represent. No matter how many cells have the same value, it will store this value for every cell
  • Advantages of regular tessellation:
    • We know how they partition space
    • We can make computations specific to this partitioning
    • Fast algorithms
2 1 2 irregular tessellations
2.1.2 Irregular tessellations
  • Besides regular tessellations, there are also irregular tessellations:
    • Partition space into mutually disjoint cells
    • Cells vary in size and shape
    • Adapts to spatial phenomena
    • Example: quadtree






2 1 2 irregular tessellations1
2.1.2 Irregular tessellations

Only one value: quadrant is not split further

  • Quadtree:
    • The area is split into four quadrants until parts have only a single field value

Multiple values, continue splitting

2 1 2 irregular tessellations2
2.1.2 Irregular tessellations

Cell size:

100 x 100 m

  • Quadtrees have various interesting characteristics for example, square nodes at the same level represent equal area sizes. This allows quick computation

160000 m2

40000 m2

10000 m2

2 2 vector representations
2.2 Vector representations
  • Besides tessellations (raster representations) we can also store our geographical phenomena in a vector representation
    • We discuss the difference between raster and vector representations
    • We discuss different types of vector representations
2 2 vector representations1
2.2 Vector representations
  • In vector representations, a georeferenceis explicitly associate with the geographic phenomena
  • A georeferenceis a coordinate pair from some geographic space, also known as a vector








2 2 vector representations2
2.2 Vector representations


  • Tessellations do not explicitly store georeferences of the phenomena. They provide a georeference of the lower left corner and the resolution
  • The georeference of all other cells can be derived from this information


2 2 vector representations3
2.2 Vector representations
  • We will discuss the following vector representations:
    • Triangulated Irregular Networks (TIN)
    • Point representations
    • Line representations
    • Area representations
2 2 1 triangulated irregular networks
2.2.1 Triangulated Irregular Networks
  • A TIN is built from a set of measurements for example points of height
    • These points can be scattered unevenly over the study area, with areas of more change having more points
    • Triangles are fitted through three points to form planes
2 2 1 triangulated irregular networks1
2.2.1 Triangulated Irregular Networks

Delaunay triangulation

  • There are many triangulations possible for the same input dataset
  • The best triangulations are Delaunay triangulations which have the following properties:
    • Triangles are as equal sided as possible
    • The circumcircle through the anchor points of a triangle does not contain any other anchor points

Other triangulation

2 2 1 triangulated irregular networks2
2.2.1 Triangulated Irregular Networks

No value is stored for this plane

  • A Tin is a vector representation and not an irregular tessellation because:
    • Each anchor point has a stored georeference
    • The planes do not have a stored values (like raster cells have)

A georeference and value is stored for each anchor point

2 2 1 triangulated irregular networks3
2.2.1 Triangulated Irregular Networks
  • Each plane fitted through three anchor points has a fixed gradient (Slope)
2 2 1 triangulated irregular networks4
2.2.1 Triangulated Irregular Networks
  • Each plane fitted through three anchor points has a fixed aspect
  • An aspect is the orientation of the slope, for example Northwest or Southeast
2 2 2 points
2.2.2 Points
  • Another vector representation are points
  • Points are defined as single coordinate pairs (x,y) when we work in 2D or coordinate triplets (x,y,z) when we work in 3D
  • Points are best used to represent objects that are best described as shape- and sizeless, single-locality features

Points representing trees along a road

2 2 3 lines
2.2.3 Lines
  • Line representations:
    • Used to represent one-dimensional objects (e.g., roads, railroads, canals, rivers…)
    • Line is defined by 2 end nodes and 0-n internal nodes to define the shape of the line.
    • An internal nodeor vertexis like a point that only serves to define the line

Begin node


Line or arc

End node

2 2 4 areas
2.2.4 Areas
  • Area representations:
    • When area objects are stored using a vector approach, the usual technique is to apply a boundary model
    • The area is defined by the boundary of the area

You store the boundary of the area

2 2 4 areas1
2.2.4 Areas
  • A simple but naïve representation of area features would be to list for each polygon the list of lines that describes its boundary. This is called a polygon-by-polygon representation
  • Each line in the list would be a sequence that starts with a node and ends with one

Total boundary of the polygon

2 2 4 areas2
2.2.4 Areas
  • The reason why this is not a good representation is called data redundancy
  • This means that shared boundaries between polygons are stored double

When store the second boundary, some line segments are duplicated

2 2 4 areas3
2.2.4 Areas

Line 4

Line 3

  • The boundary model or topological data model is an improved representation of the polygon-by-polygon model
  • It stores parts of a polygon’s boundary as separate line segments

Line 2

Line 1

2 2 4 areas4
2.2.4 Areas
  • It also indicates which polygon is on the left and which is on the right of each arc






Line I

Line O


Line N


Line P




2 2 4 areas5
2.2.4 Areas
  • We can determine the left and the right polygon, because the line segment has a direction
  • The direction of the line segment is from the “From node” to the “To node”






Line I

Line O


Line N


Line P




From node 15



To node 13

3 autocorrelation and topology
3 Autocorrelation and topology
  • We have seen different types of geographic phenomena (objects, fields, discrete, continuous) and their computer representations (raster, vector)
  • We will now look at two important concepts:
    • Spatialautocorrelation
    • Topology
3 1 spatial autocorrelation
3.1 Spatial autocorrelation

Option 1: store as many points as possible

  • Continuous phenomena like elevation can be measured at any point, and each location will have a different value
  • When we want to represent continuous phenomena we can:
    • Try to store as many locations as possible
    • Try to find a symbolic representation

Option 2: find a symbolic


(3.0678x2 + 20.08x – 7.34y)

3 1 spatial autocorrelation1
3.1 Spatial autocorrelation

Finite number of locations with “Elevations”

  • In GISs we combine both approaches:
    • We store a finite (but intelligently chosen) set of locations with their elevation
    • The stored values are paired with an interpolation function that allows us to calculate a value for locations that are not stored

Interpolation function

3 1 spatial autocorrelation2
3.1 Spatial autocorrelation
  • The underlying principle is called spatial autocorrelation
  • Spatial autocorrelation means: locations that are close are more likely to have similar values than locations that are far apart
3 2 topology and spatial relationships
3.2 Topology and spatial relationships
  • Like autocorrelation, topology is also an important concept
  • Topology deals with spatial properties that do not change under transformation
  • When a transformation is performed a lot of properties change like:
    • The area of a polygon
    • The length of a line
  • These properties are not topological

Boundary of the polygon does not have the same length

3 2 topology and spatial relationships1
3.2 Topology and spatial relationships
  • Examples of topological relationships:
    • Neighborhood relationships are intact
    • Areas are still bounded by the same boundary (only the shapes and lengths of their perimeter have changed)
3 2 topology and spatial relationships2
3.2 Topology and spatial relationships

The striped area is the interior of feature A

  • We want to find all possible relationships between two polygons for two-dimensional space
  • We can use the topological properties of interior and boundary to define relationships between spatial features

Boundary of feature A

3 2 topology and spatial relationships3
3.2 Topology and spatial relationships

A disjoint B

A contains B








A meets B

A equal B





A overlap B

A covered by B






A covers B

A inside B

3 2 topology and spatial relationships4
3.2 Topology and spatial relationships

A meets B

  • We can mathematically define each of these relationships by considering all possible combinations of intersection (∩) between the boundary and the interior of A with region B and test if they are the empty set (Ø) or not.
    • An empty set means there is no intersection
    • Unequal to empty (≠Ø) means there is an intersection.

A meets B =

Interior (A) ∩ interior (B) = Ø


Boundary (A) ∩ Boundary (B) ≠ Ø


Interior (A) ∩ Boundary (B) = Ø


Boundary (A) ∩ Interior (B) = Ø



Only the boundaries intersect, that is the only expression that does not lead to an empty set

3 2 topology and spatial relationships5
3.2 Topology and spatial relationships
  • What is the use of topology?
    • Topological relationships can be used in queries. For example find all parcels that border (meet) a certain road
    • But there are also other techniques to find adjacency besides topology. There must be other advantages
3 2 topology and spatial relationships6
3.2 Topology and spatial relationships
  • What is the use of topology?
    • Topological relationships can be used to check your data for consistency. For example after digitizing ( make sure you don’t have overlapping polygons when this is data wise unacceptable)
3 2 topology and spatial relationships7
3.2 Topology and spatial relationships
  • Topological consistency : the set of rules that determines what are valid spatial arrangements.
  • Rules of topological consistency:
  • Every line must be bounded by two nodes

Rule 1

3 2 topology and spatial relationships8
3.2 Topology and spatial relationships

Rule 2

  • Every line borders two polygons, namely the left and the right polygon
  • Every polygon has a closed boundary consisting of an alternating (and cyclic) sequence of points and lines

Rule 3

3 2 topology and spatial relationships9
3.2 Topology and spatial relationships

Rule 4

  • Around every point exists an alternating (and cyclic) sequence of lines and polygons
  • Lines only intersect at their bounding nodes

Rule 5

3 2 topology and spatial relationships10
3.2 Topology and spatial relationships
  • For the previous example, rule number 2 is violated, this line segment is not bordered by two polygons.

Only has a polygon on

one side of the line

3 3 three dimensional case
3.3 Three-dimensional case
  • Topologies apply for 2-dimensional space
  • Some applications require elevation data, this is normally accommodated by a 2½-D data structure
  • One important aspect of 2½-D data is their association of an additional z-value with each node. Note that this is like a TIN
3 3 three dimensional case1
3.3 Three-dimensional case

3-D object, the same node has two

different z-values. This can not be

modeled in the most common GIS

soft-ware packages.

  • 2½-D is not the same as 3-D. It is not possible to have two different nodes with identical XY-coordinates but different z-values.
  • Solid representation is an important feature for mineral exploration and urban modeling, but requires true 3D objects

Two z-values

3 4 scale and resolution
3.4 Scale and resolution
  • Map scale can be defined as the ratio between distance on a paper map and distance of the same stretch in the terrain
  • Large-scale → much detail of a small area
  • Small-scale → few detail (world map)
  • 1:50,000
  • 1 cm on map = 50,000 cm in reality
  • 1 cm on map = 500 meters in reality
4 representations for fields and objects
4 Representations for fields and objects
  • We have looked at the various representation techniques. Now we will study which of them can be used to represent:
    • (Continuous) geographic fields
    • (Discrete) geographic fields
    • Geographic objects
4 1 representations for continuous fields
4.1 Representations for continuous fields
  • Continuous fields like elevation can be represented as a raster
  • The raster can be thought of as a long list of field values.
  • You will see that field values are normally “floating point” values, and values change gradually.
4 1 representations for continuous fields1
4.1 Representations for continuous fields
  • In a TIN the amount of data stored is less compared to a regular tessellation.
  • The quality of a TIN depends on the choice of anchor points, as well as the triangulation
  • Anchor points on elevation ridges are a guarantee for correct peaks and mountain slope faces
4 1 representations for continuous fields2
4.1 Representations for continuous fields
  • An isoline is a linear feature that connects the points with equal value
  • When the field is elevation we also speak of contour lines
4 2 representations for discrete fields
4.2 Representations for discrete fields
  • Discrete field can be represented as a raster
  • Raster cells have an integer value
  • Value is a classification, a numeric code representing for example a land use class
  • Area boundaries may appear ragged
4 2 representations for discrete fields1
4.2 Representations for discrete fields
  • Discrete fields can be represented as vector polygons
  • A field because they cover the full map extent (study area)
4 3 representations of objects
4.3 Representations of objects
  • Tessellation:
  • Area objects similar as in discrete fields, except “NoData” values may appear in between objects.
  • Lines and point objects are awkward to represent in raster.
4 3 representations of objects1
4.3 Representations of objects
  • Objects can be represented in a (polygon) vector representation
4 4 data organization
4.4 Data organization
  • The main principle of data organization is that of a spatial data layer
  • A spatial data layer is either a representation of a continuous or discrete field or a collection of objects of the same kind
5 the temporal dimension
5 The temporal dimension
  • Besides geometric, thematic and topological properties, geographic phenomena change over time; they have temporal characteristics
  • Spatiotemporal data structures are not discussed here, but we do discuss the main characteristics of time
  • Observe that time is inherently of a continuous nature, when we represent it, we have to make it discrete
5 1 time density
5.1 Time density
  • Time can be measured along a discrete or continuous scale
  • Discrete time is composed of discrete elements (seconds, hours)
  • In continuous time, no such elements exist
5 2 dimensions of time
5.2 Dimensions of time
  • Valid time (or world time) is the time when something happens
  • Transaction time or database time is when the event is stored in a computer
5 3 time order
5.3 Time order
  • Linear, extending from the past to the present and future
  • Cyclic, in repeating cycles like seasons, or days of the week
  • Branching, in which different time lines from a point onwards are possible (different scenario’s)
5 4 measures of time
5.4 Measures of time
  • Chronon– shortest non-decomposable unit of time
  • Granularity – precision of time value. In a cadastre time granularity can be as short as a day, the database is updated every day
5 5 time reference
5.5 Time reference
  • Absolute time, fixed, for example a hour, date plus year
  • Relative time, indicated relative to another point in time, for example “yesterday”
5 6 summary properties of the time dimension
5.6 Summary – Properties of the time dimension
  • Time density, discrete or continuous scale
  • Dimensions of time, valid time (world time) or transition time (database time)
  • Time order, linear time, branching time, cyclic time (seasons)
  • Measure of time, chronon (the shortest non-decomposable unit of time) granularity (precision of time value)
  • Time reference, absolute (fixed time) or relative (implied time)
reading for this week
Reading for this week
  • Check the course web page
next classes
Next classes
  • Friday class:
    • MapInfo Professional: View and analyzing data
  • In preparation:
    • Next Tuesday: Spatial Referencing