Socio-economic status of the counties in the US

By Jean Eric Rakotoarisoa • GIS project / spring 2002

Background information • GIS has been known as a system that allows storage and retrieval, analysis, and display of spatial data • GIS is often used to assist in conducting socio-economic studies • In these studies, attributes come from geographic areas which are the units and levels of the study (e.g. county, state, or country)

Objective • Identifying the most prosperous counties in the US • Successful • Flourishing

Understanding the question Defining the key word: “prosperous” by • Per capita status • Income • Education • Social status of the county • Crime • Unemployment • Health care facilities

Methods Data • Source: ArcUSA 1:2M, published by ESRI in 1997 • Characteristics • 1:2,000,000-scale data • Albers Equal-Area Conic projection • Lat / long

Criteria for choosing variables • Standardized variables (to avoid the effect of area and population size; e.g. income per capita) • Variables that show enough variation (descriptive statistics) • Variables that can be seen as surrogates of other related variables (i.e. cause-and-effect relationship and simple correlation; for extrapolation of the results) • Ideally, data from the same year (some variables may be time sensitive)
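The first criterion, standardization, can be illustrated with a minimal Python sketch. The county names and figures below are hypothetical and are not taken from ArcUSA; the helper names `per_capita` and `rate_per` are made up for illustration.

```python
def per_capita(total, population):
    """Total divided by population; None when population is missing or zero."""
    if not population:
        return None
    return total / population


def rate_per(count, population, base=1000):
    """Count per `base` residents (e.g. hospital beds per 1,000 population)."""
    if not population:
        return None
    return count * base / population


# Hypothetical counties: (name, total income in dollars, hospital beds, population)
counties = [
    ("County A", 500_000_000, 120, 40_000),
    ("County B", 90_000_000, 30, 6_000),
]

for name, income, beds, pop in counties:
    print(name, per_capita(income, pop), rate_per(beds, pop))
```

In this made-up example County B has far less total income than County A but a higher income per capita, which is the kind of population-size effect that standardization is meant to remove.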

Variables • Income: money per capita in 1985 • Education: percentage of people > 25 years old with 12 years or more education in 1980 • Unemployment: unemployment rate of civilian labor force in 1986 • Crime: serious crimes known to police per 100,000 population in 1985 • Health care facilities: number of hospital beds per 1,000 population in 1985

Understanding each variable • Distribution (normal, skewed): information necessary for reclassification (i.e. equal interval, quantile, SD) • Degree of variation • Mean, min, max, variance, SD • The scale to be used was chosen as a function of both the degree of variation of the variables and the desired resolution of the output theme (coarse or high resolution)
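Why the distribution matters for reclassification can be seen in a small Python sketch that compares equal-interval and quantile classes on a skewed variable. The values are invented, and the functions are simplified stand-ins for the reclassification options of a GIS package, not its actual implementation.

```python
import statistics


def describe(values):
    """Descriptive statistics listed on the slide (population variance and SD)."""
    return {
        "min": min(values),
        "max": max(values),
        "mean": statistics.mean(values),
        "variance": statistics.pvariance(values),
        "sd": statistics.pstdev(values),
    }


def equal_interval_classes(values, n_classes=5):
    """Classes 1..n_classes from equal-width bins spanning min..max."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [1] * len(values)
    width = (hi - lo) / n_classes
    return [min(int((v - lo) / width) + 1, n_classes) for v in values]


def quantile_classes(values, n_classes=5):
    """Classes 1..n_classes by rank, so each class holds about the same count."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    classes = [0] * len(values)
    for rank, i in enumerate(order):
        classes[i] = rank * n_classes // len(values) + 1
    return classes


# Hypothetical, strongly skewed variable (one extreme county)
values = [1, 2, 3, 4, 5, 6, 7, 8, 9, 100]
print(describe(values))
print(equal_interval_classes(values))  # the outlier pushes the rest into class 1
print(quantile_classes(values))        # two counties in each class
```

With a skewed variable, equal intervals leave most counties in a single class, while quantiles spread them evenly but hide the size of the gap to the extreme value. This is one reason to inspect the distribution before choosing a method.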

Relationship among variables • To classify variables: “primary” (cause) and “secondary” (effects) • Important when assigning weight (overlay)

[Diagram: relationship among variables. Primary: Education, Income. Intermediate: Unemployment. Secondary: Crime, Health care facilities.]

GIS operations • Extract the data • Convert to grid themes • Construct the model (reclassify and weighted overlay)

Characteristics of the model • Index model designed for unequal contribution of each variable • Scale of 1 to 5 with 1 being the worst and 5 the best • Assigning scale for each variable • Income: highest is given 5 • Education: highest is given 5 • Unemployment: highest is given 1 • Crime: highest is given 1 • Health care facilities: highest is given 5
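The scale assignment above can be sketched in Python as follows. The sketch assumes each variable has already been reclassified into classes 1 to 5, with class 5 holding the highest raw values. The dictionary keys and the function name `prosperity_scale` are hypothetical.

```python
# Direction of each variable, as stated on the slide:
# True  -> highest raw values are given 5
# False -> highest raw values are given 1
HIGHER_IS_BETTER = {
    "income": True,
    "education": True,
    "unemployment": False,
    "crime": False,
    "health_care": True,
}


def prosperity_scale(variable, raw_class, n_classes=5):
    """Map a raw class (n_classes = highest values) to the 1 (worst)..5 (best) scale."""
    if not 1 <= raw_class <= n_classes:
        raise ValueError("raw_class must be between 1 and n_classes")
    if HIGHER_IS_BETTER[variable]:
        return raw_class
    return n_classes + 1 - raw_class  # invert: highest raw class becomes 1


# A county in the highest class for both income and crime
print(prosperity_scale("income", 5))  # 5
print(prosperity_scale("crime", 5))   # 1
```

Inverting the scale for unemployment and crime keeps the meaning of the final index consistent: a higher value is better for every input theme.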

Weight • Income: 30% (Primary variable) • Education: 30% (Primary variable) • *Crime: 20% (Secondary) • *Unemployment: 15% (Intermediate) • Health care facilities: 5% (secondary, not a very good variable) * Strong relationship which implies additive effects of weight
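A weighted overlay with these weights reduces, for a single county (or grid cell), to a weighted sum of the five reclassified scores. The Python sketch below uses the weights from the slide. The example scores are hypothetical, and rounding the result to a whole level is an assumption about how the GIS tool reports its output.

```python
# Weights from the slide; they sum to 100%.
WEIGHTS = {
    "income": 0.30,
    "education": 0.30,
    "crime": 0.20,
    "unemployment": 0.15,
    "health_care": 0.05,
}


def weighted_overlay(scores, weights=WEIGHTS):
    """Weighted sum of 1..5 scores for one county; the weights must sum to 1."""
    if abs(sum(weights.values()) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    return sum(weights[name] * scores[name] for name in weights)


# Hypothetical reclassified scores for one county (1 = worst, 5 = best)
county_scores = {
    "income": 4,
    "education": 4,
    "crime": 3,
    "unemployment": 3,
    "health_care": 2,
}

index = weighted_overlay(county_scores)
print(index)         # approximately 3.55
print(round(index))  # assumed rounding to a whole prosperity level
```

Because income and education together carry 60% of the weight, the index is driven mostly by those two variables, which matches the limitation noted later in the discussion.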

Expected output: counties that have a higher income per capita, a higher percentage of people who have received at least 12 years of education, a lower crime rate, a lower rate of unemployment, and more health care facilities.

Flowchart of the model [Flowchart: Income, Education, Crime, Unemployment, Health care facilities → Reclassify (one per variable) → Rec income, Rec education, Rec crime, Rec unemployment, Rec health facilities → Weighted overlay (weight labels as extracted from the figure: 20%, 15%, 20%, 5%, 30%) → Final map]

Results Map of the county prosperity in the US [Map legend: Level of prosperity: Restricted, 1, 2, 3, 4, 5, No Data. Scale bar: 0 to 500 miles]

What is revealed by the map? • Many counties meet our criteria • Distribution of these counties follows a regional pattern • The most prosperous regions are: New England, Upper Midwest, Great Plains, western states (Arizona, Nevada, Colorado) • There is not a huge difference between counties in terms of prosperity (based on our criteria) across the US (there are very few extreme values such as 1, and no 5; most counties fall into levels 3 and 4)

Verifying the model • Study question: Randomly chosen counties should belong to the level of prosperity assigned by the model • GIS aspect: State-based study and county-based study should show the same pattern

Verifying the model (cont.) Map of the county prosperity in the US Scale: 1:2M Map of the state prosperity in the US Scale: 1:2M

Discussions • Resolution • County data were indeed appropriate given the fact that these variables are probably more uniform within counties than within states as shown by the map (e.g. income, rate of unemployment) • Source of error • GIS • Defining the extent of the output (decreases accuracy) • Label (misleading) • Study question • Data do not come from the same year

Discussions (cont.) • Limitations of the results • Despite the number of variables used, the output mainly refers to counties that have higher income and a higher proportion of people who graduated from high school (weighted overlay) • High income does not necessarily imply a better standard of living (e.g. need to look into cost of living)

Discussions (cont.) • Did I have to use GIS? • No! • Simple equation: Y = aX1 + bX2 + cX3 + dX4 + eX5 • Y = prosperity index of a county • Xi = variables (attributes) • a, b, c, d, e = weights • GIS was mostly used for visual purposes (e.g. distribution of the counties)
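Written out with the weights from the Weight slide, the equation takes the form below. The assignment of X1 to X5 to income, education, crime, unemployment, and health care facilities is an assumption made here for illustration, and the scores in the worked line are hypothetical.

```latex
% X_1..X_5: reclassified scores (1 = worst, 5 = best); index order is assumed
Y = 0.30\,X_{\text{income}} + 0.30\,X_{\text{education}} + 0.20\,X_{\text{crime}}
  + 0.15\,X_{\text{unemployment}} + 0.05\,X_{\text{health}}

% Worked example with hypothetical scores (4, 4, 3, 3, 2):
Y = 0.30(4) + 0.30(4) + 0.20(3) + 0.15(3) + 0.05(2)
  = 1.20 + 1.20 + 0.60 + 0.45 + 0.10 = 3.55
```

Since the weights sum to 1, Y always stays within the 1 to 5 range of the input scores.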

Discussions (cont.) • What can be improved? • Adding more variables to better characterize the feature of interest (e.g. number of doctors, nursing centers and hospitals) • Investigating relationships among variables (using inferential statistics) • Adding other parameters (e.g. cost of living)

Discussions (cont.) • Difficulties - Theoretical background: choosing variables, understanding their behavior - GIS operations: understanding effects of different choices whenever options are being presented (e.g. equal interval, quantile, SD used for reclassification) • Advice - At every step of the analysis, try to understand the assumptions behind each option (e.g. defining scales) and always relate them to the objective, i.e. how each option (such as the choice of a particular scale) will affect it

Conclusions • Study of interest: based on our criteria, the most prosperous counties in the US are in New England, the Upper Midwest, the Great Plains, and the western states • GIS is only a tool. A good understanding of the study phenomenon is crucial before any GIS operations can be undertaken • A good understanding of the different options given through the GIS operations is important • Poor knowledge of the study phenomenon or misuse of GIS only results in artifacts
