Geocoding Public Health Data. Lecture 5: Locating Street Addresses and Global Positioning System. GIS and RS in Public Health. Edmund Seto, Ph.D., School of Public Health, University of California, Berkeley. Spatial Data.
In previous lectures we talked about the wide availability of spatial data.
Public Health data are often inherently spatial:
Vital statistics records include residential street addresses
A cohort study of exposure to air pollution might consider both residence and work addresses
The problem is how to get these locations onto a map (i.e., into a format that is readily usable within a GIS).
The process of getting such data placed onto a map or within a GIS is known as Geocoding.
For example: A table of data that is grouped at the county level… How do we match this up with a GIS map of counties?
A GIS is based on the concept of relational databases, which allow us to match geographic features with the corresponding attribute data.
In exercises 1 and 2, we saw that a table of attributes can be “joined” with a table of geographic features in a GIS, based on a common identifier.
Where common identifiers might be:
country name, county name, postal code, etc.
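As a rough illustration of the join described above, the following Python sketch matches a table of attributes to a table of geographic features on a common identifier. It is not how any particular GIS implements joins; the table contents, the field names (`county`, `polygon_id`, `rate_per_100k`), and the function `join_on_key` are all hypothetical and made up for the example.

```python
# Hypothetical geographic feature table (one row per county polygon).
feature_table = [
    {"county": "Alameda", "polygon_id": 1},
    {"county": "Contra Costa", "polygon_id": 2},
    {"county": "Marin", "polygon_id": 3},
]

# Hypothetical attribute table; the rates are invented for illustration only.
attribute_table = [
    {"county": "Alameda", "rate_per_100k": 210.5},
    {"county": "Contra Costa", "rate_per_100k": 198.2},
]


def join_on_key(features, attributes, key):
    """Attach attribute rows to feature rows that share the same key value.

    Returns (joined_rows, unmatched_keys) so that features without
    attribute data are reported rather than silently dropped.
    """
    lookup = {row[key]: row for row in attributes}
    joined, unmatched = [], []
    for feature in features:
        match = lookup.get(feature[key])
        if match is None:
            unmatched.append(feature[key])
        else:
            joined.append({**feature, **match})
    return joined, unmatched


joined, unmatched = join_on_key(feature_table, attribute_table, "county")
for row in joined:
    print(row)
print("No attribute data for:", unmatched)
```

The join only works when the identifier is spelled identically in both tables, which is why checking the unmatched list is worthwhile before mapping the result.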
Example map: rates age-adjusted to the year 2000 population, ICD codes I00-I99
Beware! Your choice (or lack of choice) of scale or area-based measure (individual address vs. census tract vs. block vs. ZIP code, etc.) can affect the results of your study.
Modifiable Areal Unit Problem
Openshaw, S., and P. Taylor, 1979: A million or so correlation coefficients: Three experiments on the modifiable areal unit problem, in Statistical Applications in the Spatial Sciences, ed. N. Wrigley, (London: Pion), 127-144.
Nancy Krieger, Jarvis T. Chen, Pamela D. Waterman, Mah-Jabeen Soobader, S. V. Subramanian, and Rosa Carson.
Geocoding and Monitoring of US Socioeconomic Inequalities in Mortality and Cancer Incidence: Does the Choice of Area-based Measure and Geographic Level Matter? The Public Health Disparities Geocoding Project. Am J Epidemiol 2002; 156:471-482.
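To see how the choice of areal units can change a result, the following Python sketch aggregates the same six individuals into zones in two different ways and compares the correlation of the zone means. The exposure and outcome values and both zoning schemes are invented purely for illustration; they are not taken from the studies cited above.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def zone_means(values, zones):
    """Average the individual values that fall inside each zone."""
    return [sum(values[i] for i in zone) / len(zone) for zone in zones]


# Invented individual-level data (index = person).
exposure = [1, 2, 3, 4, 5, 6]
outcome = [1, 3, 2, 6, 4, 5]

# Two hypothetical ways of drawing three zones around the same six people.
zoning_a = [(0, 1), (2, 3), (4, 5)]
zoning_b = [(0, 2), (1, 3), (4, 5)]

r_individual = pearson(exposure, outcome)
r_a = pearson(zone_means(exposure, zoning_a), zone_means(outcome, zoning_a))
r_b = pearson(zone_means(exposure, zoning_b), zone_means(outcome, zoning_b))

print(f"individual level: r = {r_individual:.3f}")
print(f"zoning A:         r = {r_a:.3f}")
print(f"zoning B:         r = {r_b:.3f}")
```

The underlying people and measurements never change, yet the strength of the association depends on where the zone boundaries are drawn.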
For example: A table of individual street addresses… How do we match this up with a GIS map of streets?
This is known as address matching.
Street geography layer:
Street: name, starting & ending address
1234 University Ave
Coordinates for the address
The US Census Bureau’s TIGER files include street address information.
ArcView provides a tool known as Geocoding Services that lets us geocode data, street addresses in particular.
For address matching, Geocoding Services works on the same principle we have just discussed: it relies on street geography and interpolates the address numbers.
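The interpolation idea can be sketched in a few lines of Python. This is a simplified illustration, not the Geocoding Services algorithm: it assumes a straight street segment with a single address range, and the class `StreetSegment`, the function `interpolate_address`, and the coordinates are all hypothetical. Real street files typically also distinguish the left and right sides of the street.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class StreetSegment:
    name: str
    from_addr: int               # address number at the start of the segment
    to_addr: int                 # address number at the end of the segment
    start: Tuple[float, float]   # (x, y) of the segment start
    end: Tuple[float, float]     # (x, y) of the segment end


def interpolate_address(number: int, segment: StreetSegment) -> Optional[Tuple[float, float]]:
    """Place an address number proportionally along a straight segment.

    Returns None when the number falls outside the segment's address range.
    """
    low, high = sorted((segment.from_addr, segment.to_addr))
    if not low <= number <= high:
        return None
    span = segment.to_addr - segment.from_addr
    fraction = 0.0 if span == 0 else (number - segment.from_addr) / span
    x = segment.start[0] + fraction * (segment.end[0] - segment.start[0])
    y = segment.start[1] + fraction * (segment.end[1] - segment.start[1])
    return (x, y)


# Hypothetical segment: addresses 1200-1300 along a 100-unit stretch of street.
block = StreetSegment("UNIVERSITY AVE", 1200, 1300, (0.0, 0.0), (100.0, 0.0))
print(interpolate_address(1234, block))
print(interpolate_address(1500, block))  # outside the range, so no match
```

Because the position is estimated from the address range rather than measured, the result is only as good as the assumption that addresses are evenly spaced along the block.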
ArcView comes with a license for StreetMap USA.
For the following example, however, we will rely on TIGER files for our Geocoding Service.
From the Yellow Pages, I created a table of Berkeley Clinics and their addresses.
We will create a Geocoding Service in ArcView for geocoding these addresses. The Geocoding Service will be based on the Berkeley streets file that we clipped out of the TIGER data in exercise 2.
1. Start up ArcCatalog. Under Geocoding Services, select “Create New Geocoding Service”.
Addresses can be formatted in a number of different ways, and here you can choose the style that fits the data that you’re using. For TIGER data we will use: US Streets (File-based)
5. In the Geocoding Services Manager, “Add” the service we just created.
Address Matching isn’t as easy as it seems. Even in our little example, we only had good matches for around 50% of our addresses. And we only tried 18 addresses in Berkeley!
Not all mailing addresses correspond to street addresses:
140 Warren Hall
Newly developed areas lack street maps for geocoding
Data quality problems: poorly formatted address data and/or errors in the street geography data
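Some formatting problems can be reduced by cleaning the address strings before matching. The Python sketch below is one minimal way to do that, under the assumption that an address starts with a house number followed by a street name; the function `normalize_address` and the short suffix table are hypothetical and far from complete.

```python
import re

# Small, illustrative suffix table; a real one would be much longer.
SUFFIXES = {"AVENUE": "AVE", "STREET": "ST", "BOULEVARD": "BLVD", "ROAD": "RD"}


def normalize_address(raw):
    """Split a raw address into (house_number, STREET NAME).

    Returns None when the text does not start with a house number,
    so that such records can be set aside for manual review.
    """
    cleaned = re.sub(r"[^\w\s]", " ", raw.upper())  # drop punctuation
    tokens = cleaned.split()
    if len(tokens) < 2 or not tokens[0].isdigit():
        return None
    street = " ".join(SUFFIXES.get(token, token) for token in tokens[1:])
    return int(tokens[0]), street


for raw in ["1234 University Avenue.", "1234 university ave", "P.O. Box 12", "140 Warren Hall"]:
    print(repr(raw), "->", normalize_address(raw))
```

Note that "140 Warren Hall" still parses cleanly here; it fails later, at the matching step, because no street by that name exists in the street layer. Cleaning helps with formatting, not with mailing addresses that are not street addresses.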
Texas DOH Guideline for Geocoding
New Jersey Geocoding problems
Jane McElroy’s talk - Univ of Wisc.
Geocoding addresses from a large population-based study: Lessons learned and applied
For example: Mapping data that cannot be easily located on existing maps.
Residential locations in rural villages
Environmental sampling sites
Infectious disease vector breeding & control sites
Distance from each satellite? (Figure label: d3.)
(Figure labels: radio link; at known location; at unknown locations.)
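The idea of fixing a position from the distance to each satellite can be illustrated with a simplified two-dimensional sketch in Python. This is an assumption-laden toy, not how a GPS receiver actually computes its position: real receivers work in three dimensions and must also solve for their clock error. The function name `trilaterate_2d` and the reference points are hypothetical.

```python
from math import hypot


def trilaterate_2d(p1, d1, p2, d2, p3, d3):
    """Find (x, y) from three known points and the distance to each.

    Subtracting the circle equations pairwise gives two linear
    equations in x and y, which are solved with Cramer's rule.
    """
    (x1, y1), (x2, y2), (x3, y3) = p1, p2, p3
    a = 2 * (x2 - x1)
    b = 2 * (y2 - y1)
    c = d1**2 - d2**2 - x1**2 + x2**2 - y1**2 + y2**2
    d = 2 * (x3 - x2)
    e = 2 * (y3 - y2)
    f = d2**2 - d3**2 - x2**2 + x3**2 - y2**2 + y3**2
    det = a * e - b * d
    if det == 0:
        raise ValueError("reference points are collinear; position is ambiguous")
    return ((c * e - b * f) / det, (a * f - c * d) / det)


# Hypothetical reference points and a made-up true position.
references = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0)]
true_position = (3.0, 4.0)
distances = [hypot(true_position[0] - x, true_position[1] - y) for x, y in references]

estimate = trilaterate_2d(references[0], distances[0],
                          references[1], distances[1],
                          references[2], distances[2])
print(estimate)
```

With exact distances the estimate recovers the true position; with noisy distances it will be off, which is the kind of error that a correction from a station at a known location is meant to reduce.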
Irrigation ditch exposure
Irrigation ditch habitat
Kai Elgethun, Richard A. Fenske, Michael G. Yost, and Gary J. Palcisko
Time-Location Analysis for Exposure Assessment Studies of Children Using a Novel Global Positioning System Instrument
Environmental Health Perspectives Volume 111, Number 1, January 2003
Department of Environmental Health, School of Public Health and Community Medicine, University of Washington
Modelling Concentrations of and Human Exposure to Air Pollution in Danish Cities
Contribution to subproject SATURN
O. Hertel, S. S. Jensen, R. Berkowicz, J. Brandt and J. Christensen
National Environmental Research Institute (NERI), Frederiksborgvej 399, P. O. Box 358, DK-4000 Denmark