Winter 2011 GIS Institute

Presentation Transcript


  1. Winter 2011 GIS Institute

    Geocoding & Spatial Analysis. Winter GIS Institute
  2. Spatial data are special: the Modifiable Areal Unit Problem (MAUP), boundary problems, spatial sampling procedures, spatial autocorrelation, and the ecological fallacy. Rachel Franklin
  3. Modifiable Areal Unit Problem (MAUP): Our choice of spatial units (or zones) has a large influence on our analytical results; for example, median household income by county versus by state. Two sides of the MAUP to be aware of: the placement of boundaries for units of a given size, and the choice of unit size.
  4. Boundary problems: It’s important to keep in mind that activity just outside the boundary of our study area may also affect the study area (for example, when studying shopping behavior in Rhode Island). The size and shape of spatial units can also affect our analysis and results (example: Tennessee and migration). A possible solution in some cases: buffers.
  5. Spatial sampling procedures: How do we ensure that we sample in such a way that we have a representative and unbiased sample for the spatial units we’re interested in? In other words, we want an accurate representation of the earth’s surface without sampling each and every point. Random spatial sample – choosing x and y coordinates at random (or from a range). Stratified spatial sample – random sampling within each stratum. Systematic spatial sample – applying the spatial configuration of the random sample in one stratum to all other strata in the study area.
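The three sampling schemes on slide 5 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the study area is a rectangle, the strata are equal-sized grid cells, and the function names (`random_sample`, `stratified_sample`, `systematic_sample`) are made up for this example.

```python
import random

def random_sample(n, xmin, xmax, ymin, ymax, rng):
    """Random spatial sample: n independent (x, y) points in the rectangle."""
    return [(rng.uniform(xmin, xmax), rng.uniform(ymin, ymax)) for _ in range(n)]

def stratified_sample(rows, cols, per_cell, xmin, xmax, ymin, ymax, rng):
    """Stratified sample: fresh random points inside every grid cell (stratum)."""
    dx, dy = (xmax - xmin) / cols, (ymax - ymin) / rows
    points = []
    for r in range(rows):
        for c in range(cols):
            x0, y0 = xmin + c * dx, ymin + r * dy
            points.extend(random_sample(per_cell, x0, x0 + dx, y0, y0 + dy, rng))
    return points

def systematic_sample(rows, cols, per_cell, xmin, xmax, ymin, ymax, rng):
    """Systematic sample: draw the pattern once, then repeat it in every stratum."""
    dx, dy = (xmax - xmin) / cols, (ymax - ymin) / rows
    offsets = random_sample(per_cell, 0.0, dx, 0.0, dy, rng)
    return [(xmin + c * dx + ox, ymin + r * dy + oy)
            for r in range(rows) for c in range(cols) for ox, oy in offsets]

rng = random.Random(42)  # fixed seed so the sketch is repeatable
simple = random_sample(16, 0, 100, 0, 100, rng)
stratified = stratified_sample(4, 4, 1, 0, 100, 0, 100, rng)
systematic = systematic_sample(4, 4, 1, 0, 100, 0, 100, rng)
```

All three calls return 16 points, but only the stratified and systematic versions guarantee one point per grid cell.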
  6. Spatial autocorrelation: Tobler’s First Law of Geography: “Everything is related to everything else, but near things are more related than distant things.” A variable’s values are related to each other across space – they’re correlated. This means that observations are often not independent of each other. For example, house values: if I tell you how much a particular house is worth, does it affect your prediction of the neighboring house’s value? We distinguish between two types of autocorrelation: positive and negative.
  7. Ecological fallacy: Assuming that individuals in a group possess the average characteristics of the entire group. We risk doing this when we use aggregate data for spatial units to make inferences about individuals (e.g. median income and education levels). For example, in recent presidential elections, wealthier states have tended to vote Democratic and poorer states Republican, but at the individual level it’s the opposite.
  8. Geoprocessing – manipulating GIS data: This is what GIS is all about – analyzing the spatial relationships between and within features. Map overlay – combine layers to create a single output. Two categories: tools that do not combine layer attributes (clip & erase) and those that do (intersect & union).
  9. Extraction tools: Isolate a set of features from their larger group. Similar to queries, except queries can only isolate – or select – features in their entirety; clip and erase can isolate entire features or just parts of features. Clip – like a cookie cutter: cuts or clips one set of features based on the outline of another. Erase is the opposite of clip – keeps only features that fall outside the erase layer.
  10. [Figure: Clip and Erase. Graphic source: Price]
  11. Overlay with attributes tools: These essentially combine layers; both areas and attributes are affected. Similar to spatial joins. Union – combines polygon layers; creates all possible polygons from the combination of both layers; both input layers must contain polygons. Intersect – only keeps polygon areas that were common to both layers; makes it easier to identify locations where two conditions are in effect simultaneously (e.g. habitat identification); accepts points, lines, or polygons.
  12. [Figure: Intersect and Union. Graphic source: Price]
  13. Other common tools (found in ArcToolbox): Dissolve – groups features together, based on a common attribute. Buffer – identifies areas that fall within a certain distance of a set of features. Append and Merge – combine features from two or more layers; layers must be the same feature type and have the same coordinate system.
  14. Geoprocessing with ArcGIS: Geoprocessing tools are accessible via ArcToolbox, menus and tool bars, the command line, and ModelBuilder and scripts. Pay special attention to coordinate systems and projections, and to areas and lengths.
  15. Introduction to Spatial Analysis: Types of spatial analysis (Longley). Queries and reasoning – no changes are made to the database and no new information is produced; for example, how many cities are within 300 miles of Kansas City? Measurements – describing aspects of geographic data, like length, area, or shape; for example, calculating the size (or area) of a parcel. Transformations – changing or combining data to create new data, using logical, mathematical, or geometrical rules. Descriptive summaries – summary statistics for spatial data. Optimization – finding the best locations for a set of objects, given a set of criteria; for example, bus stop locations in Australia. Hypothesis testing – making generalizations about a population from a sample: could this spatial pattern have occurred by chance?
  16. Queries and Reasoning: We can query our spatial data in lots of ways: through perusing the “catalog” or file view; map view; table view; histogram or scatterplot view; and database queries, using SQL. Remember, “computers are generally uncomfortable with vagueness.” (Longley)
  17. Measurements: How far apart are two points? How large is a parcel’s area? Area. Distance or length – distance may be measured two ways: (1) straight-line or Pythagorean distance, also referred to as “as the crow flies”, which assumes a flat plane (for latitude and longitude we need to think of great circle distances); (2) Manhattan or network distance. Shape – for example, gerrymandering: S = P / (3.54 √A), where P is perimeter and A is area; 3.54 is twice the square root of π, so S = 1 for the most compact shape, a circle. Slope and aspect – Digital Elevation Models (DEMs) are rasters whose cells contain the elevation at that location.
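As a rough illustration of the measurements on slide 17, the sketch below computes straight-line, Manhattan, and great circle distances, plus the shape measure S. The haversine formula and the 6371 km mean Earth radius are assumptions chosen for this example; the slides do not say how great circle distance is computed.

```python
import math

EARTH_RADIUS_KM = 6371.0  # assumed mean Earth radius

def euclidean(p, q):
    """Straight-line ("as the crow flies") distance on a flat plane."""
    return math.hypot(q[0] - p[0], q[1] - p[1])

def manhattan(p, q):
    """Manhattan distance: travel along a rectangular street grid."""
    return abs(q[0] - p[0]) + abs(q[1] - p[1])

def great_circle_km(lat1, lon1, lat2, lon2):
    """Great circle distance between two lat/long points (haversine formula)."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))

def shape_index(perimeter, area):
    """S = P / (2 * sqrt(pi) * sqrt(A)); equals 1 for a circle, larger otherwise."""
    return perimeter / (2 * math.sqrt(math.pi) * math.sqrt(area))

circle = shape_index(2 * math.pi, math.pi)  # unit circle: S is 1 up to rounding
square = shape_index(4.0, 1.0)              # unit square: S is about 1.13
```

The less compact the shape, the further S rises above 1, which is why it is used to flag oddly drawn districts.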
  18. Transformations: Buffering – creates an area of a specific and constant width around a point, line, or polygon; this can be used to identify all objects falling within a certain distance of the original feature. Point in polygon – associates points with polygons: counts the number of points within a polygon, or attaches polygon characteristics to points or vice versa; points can lie in only one polygon (point-in-polygon algorithm). Polygon overlay – determining whether two polygons overlap, the extent of their overlap, and what new polygons are created by the overlap; spurious polygons or slivers (the coastline weave problem); tolerance. Spatial interpolation – “guessing” the value of a variable for locations where no measurement has occurred, for example rainfall, temperature, or elevation; methods include inverse distance weighting and kriging. Density estimation and potential – generates a surface from a set of discrete points.
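Two of the transformations on slide 18 are easy to sketch without a GIS. The example below assumes a ray-casting test for point in polygon and a distance exponent of 2 for inverse distance weighting; the slides name the techniques but not these particular choices, and the function names are illustrative only.

```python
import math

def point_in_polygon(x, y, polygon):
    """Ray casting: count how many polygon edges a ray going right from (x, y) crosses."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        if (y1 > y) != (y2 > y):  # edge straddles the horizontal line through y
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside

def idw(samples, x, y, power=2.0):
    """Inverse distance weighting: nearer samples get larger weights."""
    num = den = 0.0
    for sx, sy, value in samples:
        d = math.hypot(x - sx, y - sy)
        if d == 0.0:
            return value  # exactly on a sample point
        w = 1.0 / d ** power
        num += w * value
        den += w
    return num / den

parcel = [(0, 0), (4, 0), (4, 4), (0, 4)]
stations = [(0, 0, 10.0), (10, 0, 20.0)]   # (x, y, measured value)
midpoint_estimate = idw(stations, 5, 0)    # equal distances, so the plain average
```

Points that fall exactly on a polygon boundary are not handled specially here; a production tool would apply a tolerance.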
  19. Characterizing Spatial Relationships: Looking for patterns or anomalies. Descriptive summaries – center: Mean center. Centroid – summarizing an area (polygon) with a point, that is, making points from polygons; uses the average of the polygon’s vertices. Point of minimum aggregate travel (MAT) – the point that minimizes the total straight-line distance.
  20. Dispersion: Mean distance from the centroid. Spatial dependence: We can think of global and local measures of spatial dependence. The scale we use will determine, in large part, whether we find spatial dependence across a set of objects. Fragmentation – how broken up is the landscape into different pieces? Are these pieces large or small? Compact or spread out? One measure is simply the number of patches that exist, or we can use the shape measure discussed a few minutes ago: S = P / (3.54 √A).
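The center and dispersion summaries from slides 19 and 20 reduce to simple averages. A minimal sketch, assuming unweighted points unless weights are supplied (the optional `weights` argument is an addition for illustration):

```python
import math

def mean_center(points, weights=None):
    """Average x and average y of a set of points, optionally weighted."""
    if weights is None:
        weights = [1.0] * len(points)
    total = sum(weights)
    cx = sum(w * x for w, (x, _) in zip(weights, points)) / total
    cy = sum(w * y for w, (_, y) in zip(weights, points)) / total
    return cx, cy

def mean_distance(points, center):
    """Dispersion: average straight-line distance from each point to the center."""
    return sum(math.hypot(x - center[0], y - center[1]) for x, y in points) / len(points)

corners = [(0, 0), (4, 0), (4, 4), (0, 4)]
center = mean_center(corners)            # vertex average, as on the centroid slide
spread = mean_distance(corners, center)  # each corner is sqrt(8) from the center
```

Applied to a polygon's vertices, `mean_center` gives the vertex-average centroid the slide describes, which can differ from the area-weighted centroid for irregular shapes.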
  21. Optimization: Best location for a set of points. “p-median problem” – seeking the best locations for a set of p facilities, such that the total distance from each point to its closest facility is minimized (e.g. school location). “Coverage problem” – seeking to minimize the furthest distance traveled (e.g. fire station location). “Location-allocation” – we’re not only trying to locate facilities, but also to allocate demand to each facility.
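For a handful of candidate sites, both problems on slide 21 can be solved by trying every combination. The sketch below is a brute-force illustration only; the `locate` function and its `aggregate` parameter are hypothetical names, and real location-allocation tools rely on heuristics for larger problems.

```python
import math
from itertools import combinations

def locate(demand, candidates, p, aggregate=sum):
    """Pick p candidate sites minimizing an aggregate of nearest-facility distances.

    aggregate=sum gives the p-median objective (total distance);
    aggregate=max gives the coverage-style objective (furthest distance).
    """
    best_sites, best_cost = None, math.inf
    for sites in combinations(candidates, p):
        cost = aggregate(min(math.dist(d, s) for s in sites) for d in demand)
        if cost < best_cost:
            best_sites, best_cost = sites, cost
    return best_sites, best_cost

homes = [(0, 0), (1, 0), (10, 0), (11, 0)]
sites = [(0, 0), (5, 0), (10, 0)]
schools, total_distance = locate(homes, sites, p=2)                    # p-median
stations, worst_distance = locate(homes, sites, p=2, aggregate=max)    # coverage
```

Each home is implicitly allocated to its closest chosen site, which is the allocation half of location-allocation.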
  22. Optimization, continued: Routing on a network. “Shortest path” – the best path through a network that minimizes distance or travel time (e.g. Google Maps directions). “Traveling Salesman Problem” (TSP) – seeks the best ordering of a set of stops to minimize total distance traveled (e.g. my milkman). If there are n places to be visited, including home base, then there are (n-1)! possible tours to choose from; or, really, (n-1)!/2, since it doesn’t matter if a given tour is done forwards or backwards. Large-n problems and the use of heuristics.
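The tour count on slide 22 explains why heuristics are needed. The sketch below counts tours, solves a tiny TSP exhaustively, and compares it with a nearest-neighbor heuristic. Nearest neighbor is one common heuristic chosen here for illustration; the slides do not name a specific one, and straight-line distances stand in for network distances.

```python
import math
from itertools import permutations

def tour_count(n):
    """Distinct round trips through n places (n >= 3), ignoring direction: (n-1)!/2."""
    return math.factorial(n - 1) // 2

def tour_length(tour):
    return sum(math.dist(a, b) for a, b in zip(tour, tour[1:]))

def brute_force_tsp(points):
    """Try every ordering of the stops after home base (points[0])."""
    home, rest = points[0], points[1:]
    tours = ((home, *order, home) for order in permutations(rest))
    best = min(tours, key=tour_length)
    return tour_length(best), best

def nearest_neighbor_tsp(points):
    """Heuristic: always travel to the closest unvisited stop, then return home."""
    tour, unvisited = [points[0]], list(points[1:])
    while unvisited:
        nxt = min(unvisited, key=lambda p: math.dist(tour[-1], p))
        unvisited.remove(nxt)
        tour.append(nxt)
    tour.append(points[0])
    return tour_length(tour), tour

stops = [(0, 0), (0, 1), (1, 1), (1, 0)]
exact_length, exact_tour = brute_force_tsp(stops)
greedy_length, greedy_tour = nearest_neighbor_tsp(stops)
```

With 10 places there are already 181,440 distinct tours, and the count grows factorially from there, so exhaustive search stops being practical very quickly.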
  23. Optimization, continued: Optimum paths – best paths in continuous space; for example, locating highways or power lines, or routing airplane flights. These are often solved using a raster, where each cell contains a friction value – the cost or time associated with crossing the cell; the GIS then finds the least-cost path. We can differentiate between optimal locations within a network or just in continuous space.
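A least-cost path over a friction raster can be found with Dijkstra's algorithm. The sketch below makes several simplifying assumptions not stated in the slides: moves are rook-style (up, down, left, right), the cost of a path is the sum of the friction values of every cell it touches, and the raster is a plain list of lists.

```python
import heapq

def least_cost_path(friction, start, goal):
    """Dijkstra over raster cells; returns (total cost, list of (row, col) cells)."""
    rows, cols = len(friction), len(friction[0])
    best = {start: friction[start[0]][start[1]]}
    previous = {}
    queue = [(best[start], start)]
    while queue:
        cost, cell = heapq.heappop(queue)
        if cell == goal:
            break
        if cost > best[cell]:
            continue  # stale queue entry
        r, c = cell
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if 0 <= nr < rows and 0 <= nc < cols:
                new_cost = cost + friction[nr][nc]
                if new_cost < best.get((nr, nc), float("inf")):
                    best[(nr, nc)] = new_cost
                    previous[(nr, nc)] = cell
                    heapq.heappush(queue, (new_cost, (nr, nc)))
    path = [goal]
    while path[-1] != start:
        path.append(previous[path[-1]])
    return best[goal], path[::-1]

# A ridge of high-friction cells (9) forces the route around the bottom row.
friction = [
    [1, 9, 1],
    [1, 9, 1],
    [1, 1, 1],
]
cost, path = least_cost_path(friction, (0, 0), (0, 2))
```

The direct route across the ridge is shorter in cells but costs 11, while the detour through the bottom row costs 7.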
  24. Quantifying Spatial Relationships: Point patterns – is the distribution of points random? Uniform? Can we identify clusters? Measures of spatial association: Global – do we see positive or negative autocorrelation across our study area? Very dependent on scale. Local – are values correlated with local neighbors? For example, house values or crime.
  25. Spatial Association: All measures of spatial association depend on scale. How do we define neighbors? Neighborhoods can be defined based on distance or contiguity. Distance: my neighbors are those who live within a mile of me, for example. Contiguity: refers to polygons; my neighbors are those I share a border with. Queen’s case: shared borders and corners count for contiguity. Rook’s case: only shared borders count for contiguity. 1st order versus 2nd order, etc.: we could choose our immediate neighbors, or those that are neighbors of our neighbors.
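Rook and queen contiguity are easiest to see on a regular grid, where every cell is a square polygon. The sketch below assumes such a grid; real polygon layers need a GIS to test shared borders, and the function names are illustrative.

```python
def neighbors(cell, rows, cols, case="rook"):
    """First-order neighbors of a (row, col) cell on a rows x cols grid."""
    r, c = cell
    if case == "rook":       # shared borders only
        steps = [(-1, 0), (1, 0), (0, -1), (0, 1)]
    elif case == "queen":    # shared borders and corners
        steps = [(dr, dc) for dr in (-1, 0, 1) for dc in (-1, 0, 1) if (dr, dc) != (0, 0)]
    else:
        raise ValueError("case must be 'rook' or 'queen'")
    return {(r + dr, c + dc) for dr, dc in steps
            if 0 <= r + dr < rows and 0 <= c + dc < cols}

def second_order(cell, rows, cols, case="rook"):
    """Neighbors of neighbors, excluding the cell itself and its first-order neighbors."""
    first = neighbors(cell, rows, cols, case)
    reached = set()
    for n in first:
        reached |= neighbors(n, rows, cols, case)
    return reached - first - {cell}

rook_center = neighbors((1, 1), 3, 3, "rook")    # 4 cells
queen_center = neighbors((1, 1), 3, 3, "queen")  # 8 cells
```

Edge and corner cells have fewer neighbors than interior cells, which is one reason weights are often row standardized.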
  26. Neighbors: When we define our neighborhood, this is implemented using a “weights matrix”, usually 1s and 0s that indicate yes or no for whether a spatial unit is my neighbor. This is then often “row standardized” – values are constrained to sum to 1 across each row. Units are not considered neighbors of themselves. These matrices are generally symmetric before row standardization – if I’m your neighbor, then you’re my neighbor.
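A weights matrix and its row standardization can be sketched as below. To show the matrix in use, the example also computes global Moran's I, a widely used measure of spatial autocorrelation; note that the slides do not name Moran's I, so treat it as an assumed illustration rather than the measure used in the presentation.

```python
def weights_matrix(neighbor_lists):
    """Binary weights: w[i][j] = 1 if unit j is a neighbor of unit i, else 0."""
    n = len(neighbor_lists)
    w = [[0.0] * n for _ in range(n)]
    for i, nbrs in enumerate(neighbor_lists):
        for j in nbrs:
            if j != i:  # units are not neighbors of themselves
                w[i][j] = 1.0
    return w

def row_standardize(w):
    """Scale each row so its weights sum to 1 (rows with no neighbors stay 0)."""
    return [[v / sum(row) if sum(row) else 0.0 for v in row] for row in w]

def morans_i(values, w):
    """Global Moran's I: positive when neighbors are alike, negative when unlike."""
    n = len(values)
    mean = sum(values) / n
    z = [v - mean for v in values]
    s0 = sum(sum(row) for row in w)
    cross = sum(w[i][j] * z[i] * z[j] for i in range(n) for j in range(n))
    return (n / s0) * cross / sum(d * d for d in z)

# Four units in a row: 0 - 1 - 2 - 3
chain = [[1], [0, 2], [1, 3], [2]]
binary = weights_matrix(chain)
standardized = row_standardize(binary)
trend = morans_i([1, 2, 3, 4], standardized)        # smooth gradient: positive
checker = morans_i([1, -1, 1, -1], standardized)    # alternating values: negative
```

The binary matrix here is symmetric, but the row-standardized one is not, because end units have one neighbor and middle units have two.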
  27. Hot Spots: Local Indicators of Spatial Association (LISA) indicate the presence or absence of significant spatial clusters or outliers for each location. A randomization approach is used to generate a spatially random reference distribution to assess statistical significance.
  28. Hot Spots, continued: Getis-Ord Gi* statistic. The resultant Z score tells you where features with either high or low values cluster spatially. This tool works by looking at each feature within the context of neighboring features. A feature with a high value is interesting, but may not be a statistically significant hot spot. To be a statistically significant hot spot, a feature will have a high value and be surrounded by other features with high values as well.
  29. Getting Data into a GIS: A few options. Best-case scenario: data are already in shapefile format, or similar. Or you join, e.g., Excel data to shapefile data. You collect or create your data yourself: ArcGIS converts X,Y (longitude, latitude) coordinates into point data. Or, very commonly, we geocode.
  30. Geocoding – What’s that? Along with mapping, geocoding is one of the most commonly used GIS applications. When we geocode, we attach location information to tabular geographic information; for example, the addresses of all grocery stores in Providence, or the locations of all capital cities in the world. We can think of a location-specificity continuum from general (e.g. cities) to specific (e.g. exact addresses).
  31. Geocoding – What’s that?
  32. The more specific we are in terms of location, the more geographic information we need. Also, depending on the use of the geocoded information, exact location may be very important – for example, 911 calls. Locating cities requires a reference file with city locations. Locating addresses in Providence requires street name and street number, at a minimum. Locations can be attached to polygons or points, but the most challenging is attaching to addresses, or lines.
  33. What’s it used for? Emergency services, GPS, driving directions, Google Maps, crime analysis, marketing.
  34. How does it work? Tabular data are compared to a spatial reference layer; this is what ArcMap uses to match addresses. This happens in a few steps. To work best, addresses need to be recognizable to the computer, or standardized. Then the standardized addresses in our table of locations (say, J. Crew stores) are compared to our reference layer. To understand this, think about the standard components of a street address: prefix direction, street name, street type, number, and suffix direction.
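Standardization means splitting an address into the components listed on slide 34 and normalizing each one. The sketch below is a deliberately simple, hypothetical parser: the lookup tables, the `standardize` function, and its rules are invented for illustration and are not how ArcMap standardizes addresses.

```python
# Hypothetical lookup tables; a real geocoder uses far larger ones.
DIRECTIONS = {"N": "N", "S": "S", "E": "E", "W": "W",
              "NE": "NE", "NW": "NW", "SE": "SE", "SW": "SW",
              "NORTH": "N", "SOUTH": "S", "EAST": "E", "WEST": "W"}
STREET_TYPES = {"STREET": "ST", "ST": "ST", "AVENUE": "AVE", "AVE": "AVE",
                "ROAD": "RD", "RD": "RD", "BOULEVARD": "BLVD", "BLVD": "BLVD"}

def standardize(address):
    """Split a street address into number, directions, name, and street type."""
    tokens = address.upper().replace(".", "").replace(",", "").split()
    parts = {"number": None, "prefix_direction": None, "street_name": None,
             "street_type": None, "suffix_direction": None}
    if tokens and tokens[0].isdigit():
        parts["number"] = int(tokens.pop(0))
    if len(tokens) > 1 and tokens[0] in DIRECTIONS:
        parts["prefix_direction"] = DIRECTIONS[tokens.pop(0)]
    if len(tokens) > 1 and tokens[-1] in DIRECTIONS:
        parts["suffix_direction"] = DIRECTIONS[tokens.pop()]
    if len(tokens) > 1 and tokens[-1] in STREET_TYPES:
        parts["street_type"] = STREET_TYPES[tokens.pop()]
    parts["street_name"] = " ".join(tokens)
    return parts

george = standardize("112 George Street")
main = standardize("100 N. Main St. SW")
```

Once both the address table and the reference layer are reduced to the same components, "112 George Street" and "112 GEORGE ST" compare as equal.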
  35. Spatial Reference Layer: The spatial reference layer includes the spatial information that will help locate our list of places in space: the street name, obviously, if we’re geocoding addresses, or city and state. Names of streets are attached to line segments, or polylines. Each line segment is associated with a range of street numbers. These are tabulated as “from address” to “to address” – allowing us to increase house numbers from the bottom of the line segment to the top, since we know the beginning and end numbers. What we don’t know is where, exactly, a building lies on that line segment, so geocoding always has an element of approximation to it. [Figure: line segment labeled 100 to 200]
  36. Address Geocoding: One-range method – a single address range for each chunk of street. Two-range method – an address range for each side of the street; obviously more desirable, but not always possible, since this information needs to be coded into the reference layer. ArcMap allows us to include an “offset” in this case. In both cases, addresses are assigned to a place on the line in proportion to the starting and ending address on the line itself. So if the polyline starts at 100 Main St. and ends at 200 Main St., an address of 150 Main St. goes right in the middle.
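The proportional placement described on slide 36 is linear interpolation along the segment. A minimal sketch, assuming a straight segment with known end coordinates and a single address range (the function name and the coordinates are made up for the example):

```python
def interpolate_address(number, from_addr, to_addr, start_xy, end_xy):
    """Place a house number along a segment in proportion to its address range."""
    low, high = min(from_addr, to_addr), max(from_addr, to_addr)
    if not low <= number <= high:
        raise ValueError("house number outside the segment's address range")
    if to_addr == from_addr:
        fraction = 0.0
    else:
        fraction = (number - from_addr) / (to_addr - from_addr)
    x = start_xy[0] + fraction * (end_xy[0] - start_xy[0])
    y = start_xy[1] + fraction * (end_xy[1] - start_xy[1])
    return x, y

# Main St. runs from (0, 0) at number 100 to (100, 0) at number 200.
location = interpolate_address(150, 100, 200, (0.0, 0.0), (100.0, 0.0))
```

The result is an estimate: 150 Main St. lands at the midpoint whether or not the building actually sits there, which is the element of approximation the previous slide mentions.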
  37. Types of geocoding styles: Single field – zip code, state name, power stations. Alphanumeric Ranges – helps narrow the search range for address identification, since ArcMap only has to look in that quadrant. US Cities and states – locates cities, given city and state names. US One Address – matches addresses to points or polygons. US One Range – matches addresses to one range of street values. US Streets – matches addresses to a range of street values for both sides of the street. World City and country – locates cities within countries on a world map. Zip code – matches zip codes to a point or polygon reference layer. Zone option – additional pieces of information (zip, state, city) that allow us to match over larger areas.
  38. Why it’s important to know your study location: Quirky address styles (Queens, NY; Washington, DC; Phoenix, AZ). Quickly growing locations. Spelling quirks: Saint and St. / Sainte and Ste. Value of “alias tables”: Maxcy Hall v. 112 George Street.
  39. How geocoding works in ArcGIS: First, load your address table and reference layer into ArcMap. Then we need to set up an address locator (done in ArcCatalog). This assembles the pieces of information we need in order to geocode: What is our reference layer? What are the key fields we’ll use to locate addresses? A “snapshot” of the reference layer is taken at this time – important to remember. Geocoding can be done interactively or in batch mode; usually we do a combination of both. The output is a new shapefile or feature class.