## Topic 4 – Geographical Data Analysis


- A – The Nature of Spatial Analysis
- B – Basic Spatial Analysis

### A – The Nature of Spatial Analysis

1. Spatial Analysis and its Purpose
2. Spatial Location and Reference
3. Spatial Patterns
4. Topological Relationships

#### 1. Spatial Analysis and its Purpose

- Conceptual framework
  - Search for order amid disorder.
  - Organize information in categories.
- Method
  - Inducing or deducing conclusions from spatially related information.
  - Deduction: deriving a conclusion from a model or a rule.
  - Induction: learning new concepts from examples.
- Spatial analysis as a decision-making tool
  - Helps the user make better decisions.
  - Often involves the allocation of resources.
- Requirements
  1. Information to be analyzed must be encoded in some way.
  2. Encoding implicitly requires a spatial language.
  3. Some media must support the encoded information.
  4. Qualitative and/or quantitative methods are needed to perform operations on the encoded information.
  5. There must be ways to present the results as an explicit message.

[Figure labels: Information, Encoding, Media, Methods, Message]

[Figure: Spatial analysis within geography. Physical Geography: geomorphology, climatology, biogeography, soils. Geographic Techniques: remote sensing, quantitative methods, cartography, GIS. Human Geography: historical, political, economic, behavioral, population.]

[Figure: Mapping deaths from cholera, London, 1854 (Snow study)]

- Data retrieval
  - Browsing; windowing (zoom-in and zoom-out).
  - Query window generation (retrieval of selected features).
  - Multiple map sheets observation.
  - Boolean logic functions (meeting specific rules).
- Map generalization
  - Line coordinate thinning of nodes.
  - Polygon coordinate thinning of nodes.
  - Edge-matching.
- Map abstraction
  - Calculation of centroids.
  - Visual editing and checking.
  - Automatic contouring from randomly spaced points.
  - Generation of Thiessen / proximity polygons.
  - Reclassification of polygons.
  - Raster to vector / vector to raster conversion.
- Map sheet manipulation
  - Changing scales.
  - Distortion removal / rectification.
  - Changing projections.
  - Rotation of coordinates.
- Buffer generation
  - Generation of zones around certain objects.
- Geoprocessing
  - Polygon overlay.
  - Polygon dissolve.
  - "Cookie cutting".
- Measurements
  - Points: total number, or number within an area.
  - Lines: distance along a straight or curvilinear line.
  - Polygons: area or perimeter.
- Raster / grid analysis
  - Grid cell overlay.
  - Area calculation.
  - Search radius.
  - Distance calculations.
- Digital terrain analysis
  - Visibility analysis of viewing points.
  - Insolation intensity.
  - Grid interpolation.
  - Cross-sectional viewing.
  - Slope / aspect analysis.
  - Watershed calculation.
  - Contour generation.

#### 3. Spatial Patterns

- Relativity of objects
  - Definition of an object in view of another.
  - Creates spatial patterns.
- Main patterns
  - Size.
  - Distribution / spacing: uniform, random and clustered.
  - Proximity.
  - Density: dense and dispersed.
  - Shape.
  - Orientation.
  - Scale.
- Spatial autocorrelation
  - Set of objects that are spatially associated.
  - Relationship in the process affecting the object.
  - Negative autocorrelation.
  - Positive autocorrelation.

[Figure labels: Size, Form, Orientation, Scale, Proximity]

[Figure labels: Uniform, Clustered, Positive autocorrelation, Random]

#### 4. Topological Relations

- Proximity
  - Qualitative expression of distance.
  - Links spatial objects by their mutual locations.
  - Nearest neighbors.
  - Buffer around a point or a line.
- Directionality
- Adjacency
  - Links contiguous entities.
  - Entities share at least one common boundary.
- Intersection
- Containment
  - Links entities to a higher order set.
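The proximity relations listed above (nearest neighbors and a buffer around a point) can be sketched in a few lines. This is only an illustrative sketch: the planar Euclidean distance, the function names, and the `wells` and `site` coordinates are all hypothetical and do not come from the slides.

```python
import math

def distance(a, b):
    """Planar Euclidean distance between two (x, y) points."""
    return math.hypot(a[0] - b[0], a[1] - b[1])

def nearest_neighbor(target, candidates):
    """Return the name of the candidate point closest to the target."""
    return min(candidates, key=lambda name: distance(target, candidates[name]))

def within_buffer(center, radius, candidates):
    """Return the names of candidates inside a circular buffer, sorted by name."""
    return sorted(
        name for name, xy in candidates.items() if distance(center, xy) <= radius
    )

# Hypothetical point features and a hypothetical query location.
wells = {"A": (1.0, 1.0), "B": (4.0, 5.0), "C": (2.0, 2.5)}
site = (2.0, 2.0)

closest = nearest_neighbor(site, wells)
in_zone = within_buffer(site, 1.5, wells)
print("nearest neighbor:", closest)
print("inside 1.5-unit buffer:", in_zone)
```

A GIS package would normally work with projected coordinates and spatial indexes; the brute-force search here is only meant to make the two relations concrete.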
[Figure labels: City A, City B; nodes 1 to 6]

- Connectivity
  - Adjacency applied to a network.
  - Must follow a path, which is a set of linked nodes.
  - Shortest path.
  - All possible paths.
- Intersection
  - What two geographical objects have in common.
- Union
  - Summation of two geographical objects.
- Complementarity
  - What is outside of the geographical object.

[Figure labels: Land, Arable land, Non arable land, Flat land, Suitable for agriculture]

### B – Elementary Spatial Analysis

1. Statistical Generalization
2. Data Distribution
3. Spatial Inference

#### 1. Statistical Generalization

- Maps and statistical information
  - It is important to display the underlying distribution of the data accurately.
  - Data is generalized to search for a spatial pattern.
  - If the data is not properly generalized, the message may be obscured.
  - A balance is needed between remaining true to the data and generalizing enough to reveal spatial patterns.
  - Thematic maps are a good example of the issue of statistical generalization.

[Figure: Data, Classification, Spatial Pattern. Data values: 15, 25, 88, 34, 56, 7, 92, 61, 45, 77, 39, 21. Classes: 0-30, 31-65, 65-.]

- Number of classes
  - Too few classes: the contours of the data distribution are obscured.
  - Too many classes: the map becomes confusing.
  - Most thematic maps have between 3 and 7 classes.
  - Eight shades of gray are generally the most that can be told apart.
- Classification methods
  - Thematic maps developed from the same data and with the same number of classes will convey a different message if the ranging method is different.
  - Each ranging method is suited to a particular data distribution.

#### 2. Data Distribution

- Histogram
  - The first step in producing a thematic map.
  - Shows how the data is distributed.
  - Use basic statistics such as the mean and standard deviation.
  - A histogram plots the value against the frequency.

[Figure axes: Frequency, Value]
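As a rough illustration of the histogram step, the sketch below tabulates frequencies and basic statistics for the twelve values that appear in the Statistical Generalization figure, using that figure's three classes. Reading the last class ("65-") as "above 65" is an assumption, as is the use of the population standard deviation; the function names are illustrative only.

```python
from statistics import mean, pstdev

# Values as they appear in the Statistical Generalization figure.
values = [15, 25, 88, 34, 56, 7, 92, 61, 45, 77, 39, 21]

# Classes from the same figure; "65-" is assumed to mean "above 65".
class_labels = ["0-30", "31-65", "65-"]
upper_bounds = [30, 65]  # inclusive upper bound of every class but the last

def classify(value, bounds):
    """Return the index of the first class whose upper bound covers the value."""
    for index, bound in enumerate(bounds):
        if value <= bound:
            return index
    return len(bounds)  # falls into the open-ended last class

def frequency_table(data, bounds):
    """Count how many observations fall into each class."""
    counts = [0] * (len(bounds) + 1)
    for value in data:
        counts[classify(value, bounds)] += 1
    return counts

counts = frequency_table(values, upper_bounds)

# Text histogram: one '#' per observation in the class.
for label, count in zip(class_labels, counts):
    print(f"{label:>6} | {'#' * count} ({count})")

print(f"mean = {mean(values):.2f}, standard deviation = {pstdev(values):.2f}")
```

Inspecting the counts and the spread in this way, before any map is drawn, is what guides the choice of a ranging method.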
[Figure labels: Uniform, Normal, Exponential]

- Equal interval
  - Each class has an equal range of values.
  - The class width is the difference between the highest value (H) and the lowest value (L), divided by the number of categories (C): (H-L)/C.
  - Easy to interpret.
  - Good for uniform distributions and continuous data.
  - Inappropriate if the data is clustered around a few values.

[Figure: frequency against value, range L to H divided into classes C1 to C4]

- Quantiles
  - Equal number of observations in each category.
  - n(C1) = n(C2) = n(C3) = n(C4).
  - Relevant for evenly distributed data.
  - Features with similar values may end up in different categories.
- Equal area
  - Classes are divided to have a similar area per class.
  - Similar to quantiles if the size of the units is the same.

[Figure: frequency against value, classes C1 to C4 with counts n(C1) to n(C4)]

- Standard deviation
  - The mean (X) and standard deviation (STD) are used to set cutpoints.
  - Good when the distribution is normal.
  - Displays features that are above and below average.
  - Very different (abnormal) elements are shown.
  - Does not show the values of the features, only their distance from the average.

[Figure: frequency against value, classes C1 to C4 around X with cutpoints at -1STD and +1STD]

- Arithmetic and geometric progressions
  - The width of the class intervals increases at a non-linear rate.
  - Good for J-shaped distributions.
- Natural breaks
  - Complex optimization method.
  - Minimizes the sum of the variance in each class.
  - Good for data that is not evenly distributed.
  - Statistically sound.
  - Difficult to compare with other classifications.
  - Difficult to choose the appropriate number of classes.
- User defined
  - The user is free to select the class intervals that best fit the data distribution.
  - Last resort method, because the choice of intervals is difficult to justify conceptually.
  - Analysts with experience are able to make a good choice.
  - Also used to get round numbers after using another type of classification method.
  - For example, $5,000 - $10,000 instead of $4,982 - $10,123.
- Using classification
  - Classification can be used to deliberately confuse or hide a message.

[Figure: the same data under three classifications. "No problems": equal steps. "There is a problem": quantiles. "Everything is within standards": standard deviation.]

#### 3. Spatial Inference

- Filling the gaps
  - Sampling shortens the time necessary to collect data.
  - Requires methods to "fill the gaps".
- Interpolation and extrapolation
  - Data at non-sampled locations can be predicted from sampled locations.
  - Interpolation: predict missing values when the bounding values are known.
  - Extrapolation: predict missing values outside the bounding area, where only one side is known.

[Figure: Interpolation and extrapolation. Panel 1: height against location, with samples, interpolation line and extrapolation line. Panel 2: delay at the traffic light against number of vehicles, with samples and interpolation line.]

[Figure: Best fit. Sex ratio against longitude; y = 0.1408x + 116.69, R² = 0.6779.]

- Aggregation
  - Data within a boundary can be aggregated.
  - Often done to form a new class.
- Conversion
  - Data from a sample set can be converted for a different sample set.
  - Changing the scale of the geographical unit.
  - Switching from one set of geographical units to another.

[Figure: Aggregation and conversion. Labels: Pine Trees, Poplar Trees, Boreal Forest; District A, District B, District B1, District B2.]
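The difference between interpolation and extrapolation described under Spatial Inference can be made concrete with a straight line through two samples. The sketch below is a minimal illustration: the two height samples are hypothetical, and a linear model is an assumption rather than something the slides prescribe. The last function simply evaluates the best-fit line printed on the sex ratio chart (y = 0.1408x + 116.69).

```python
def linear_estimate(x0, y0, x1, y1, x):
    """Estimate y at x from the straight line through (x0, y0) and (x1, y1)."""
    slope = (y1 - y0) / (x1 - x0)
    return y0 + slope * (x - x0)

def is_interpolation(x0, x1, x):
    """True when x lies between the two sampled locations (bounded on both sides)."""
    return min(x0, x1) <= x <= max(x0, x1)

# Hypothetical samples: height (m) measured at two locations (km along a transect).
x0, y0 = 10.0, 100.0
x1, y1 = 20.0, 140.0

inside = linear_estimate(x0, y0, x1, y1, 15.0)   # between the samples
outside = linear_estimate(x0, y0, x1, y1, 25.0)  # beyond the last sample

for x, estimate in ((15.0, inside), (25.0, outside)):
    kind = "interpolation" if is_interpolation(x0, x1, x) else "extrapolation"
    print(f"x = {x}: estimated height {estimate} ({kind})")

def best_fit_sex_ratio(longitude):
    """Evaluate the regression line shown on the best-fit chart."""
    return 0.1408 * longitude + 116.69

print(f"fitted sex ratio at longitude -100: {best_fit_sex_ratio(-100):.2f}")
```

Extrapolated values deserve more caution than interpolated ones, since only one side of the estimate is constrained by a sample.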