Geospatial Analysis in Public Health: Spatial Cluster Detection. M.J. College, Jalgaon, India, September 22-26, 2008. Glen D. Johnson


**Geospatial Analysis in Public Health**
Spatial Cluster Detection
M.J. College, Jalgaon, India
September 22-26, 2008
Glen D. Johnson
New York State Department of Health and The University at Albany School of Public Health, Department of Environmental Health Sciences

**Acknowledgement:** Some of the following graphics on cluster detection are courtesy of Tom Talbot, MSPH, of the New York State Department of Health, who co-teaches “GIS in Public Health” with Glen Johnson and Frank Boscoe at the University at Albany, S.U.N.Y.

**Cluster**
• A number of similar things grouped closely together (Webster’s Dictionary)
• Concentrations of health events in space and/or time (public health definition)

**Clustering of health outcomes may be caused by a number of community-level factors:**
• Occupation mix
• Demographic mix (i.e., race, age, sex)
• Socioeconomic status
• Cultural/behavioral factors
• Environmental exposure (always a big question)
• Time and/or space (captures unexplained factors that co-vary with the outcome)

**Cluster detection is influenced by scaling and zoning effects**
As with all spatial statistics and mapping/visualization, the Modifiable Areal Unit Problem (MAUP) must be considered.

**Different scale of observational units:** Coarser aggregation

**Different zonation:** Grid shift

**Cluster Questions**
• Does a disease cluster in space?
• Does a disease cluster in both time and space?
• Where is the most likely cluster?

**More Cluster Questions**
• At what geographic or population scale do clusters appear?
• Are cases of disease clustered in areas of high exposure? Or, more generally, “Can the cluster be explained as being associated with something other than chance?”

**Nearest Neighbor Analysis: Cuzick & Edwards Method**
• Count the number of cases whose nearest neighbors are cases and not controls.
• When cases are clustered, the nearest neighbor to a case will tend to be another case, and the test statistic will be large.

**Advantages**
• Accounts for the geographic variation in population density
• Accounts for confounders through judicious selection of controls
• Can detect clustering with many small clusters

**Disadvantages**
• Must have spatial locations of cases and controls
• Doesn’t show the location of the clusters

**Knox Method: test for space-time interaction**
• When space-time interaction is present, cases near in space will also be near in time, and the test statistic will be large.
• Test statistic: the number of pairs of cases that are near in both time and space.
• The p-value is calculated through random simulations of the time values of the cases.
• Critical space and time distances must be defined, i.e., what counts as “near”?

**Advantages**
• Do not have to map controls
• Determines if there is a space-time interaction
• Can detect space-time clustering even when the overall disease rate has remained the same over time

**Disadvantages**
• Computationally time-consuming with a large number of cases
• Does not determine the areas or time periods where clusters occur

**Spatial Scan Statistic**
Martin Kulldorff, http://www.satscan.org/
• Determines locations with elevated rates that are statistically significant.
• Adjusts for multiple testing of the many possible locations and area sizes of clusters.
• Hypothesis testing is based on Monte Carlo simulations of the null, completely random, spatial distribution.

Following is an example of how the scan statistic algorithm delineates all possible circular clusters, based on census blocks in the city of Albany …

A likelihood ratio is then computed for every circular window, where each window represents a potential spatial cluster.
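The enumeration of circular windows might be sketched in Python as follows. This is only an illustration under stated assumptions, not the SaTScan implementation: the function name `circular_windows` and the parameter `max_pop_fraction` are hypothetical, areas are represented by centroid coordinates, and the cap of 50% of the total population on window size is an assumption that the slides do not state.

```python
import math


def circular_windows(centroids, populations, max_pop_fraction=0.5):
    """Enumerate candidate circular windows as sets of area indices.

    Each window is centered on one area centroid and grown outward one
    nearest neighbor at a time, stopping before the enclosed population
    exceeds max_pop_fraction of the total (an assumed cap).
    """
    total_pop = sum(populations)
    windows = set()
    for cx, cy in centroids:
        # Order all areas by distance from this center; the center comes first.
        order = sorted(
            range(len(centroids)),
            key=lambda j: math.hypot(centroids[j][0] - cx, centroids[j][1] - cy),
        )
        members, pop = [], 0
        for j in order:
            pop += populations[j]
            if pop > max_pop_fraction * total_pop:
                break
            members.append(j)
            # frozenset removes duplicates reached from different centers.
            windows.add(frozenset(members))
    return windows
```

Each returned set of areas is one candidate window for which a likelihood ratio would then be computed.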
For example, assuming a Poisson distribution of counts, the likelihood ratio is proportional to … for observed cases c and expected cases E[c] inside the search window, and C total observed cases throughout the region, including within the search window.

The circle with the maximum likelihood ratio is then identified as the most likely cluster, and all others are rank-ordered below the maximum. A null distribution of maximum likelihood ratios is obtained by repeating the analysis on a randomized version of the data, recording the maximum likelihood ratio, and repeating this exercise, say, 999 times. A p-value is obtained for each circle by comparing its likelihood ratio to the simulated null distribution. So, for a likelihood ratio whose rank is R within the simulated null values, the p-value = R/(# simulations + 1).

Note that E[c] = n*C/N for population n in the circle, where C and N are the total number of cases and the total population, respectively. Alternatively, the same quantities can be computed within each covariate category i and summed over categories (an “indirect standardization”), E[c] = Σ_i n_i*C_i/N_i, or E[c] may even be predicted from a regression model.

**Recent advancements in the spatial scan statistic aimed at overcoming the restriction of the rather arbitrary circular shape of clusters:**
• Patil GP, Taillie C. Upper level set scan statistic for detecting arbitrarily shaped hotspots. Environ Ecol Stat 2004; 183-197.
• Duczmal L, Assuncao RM. A simulated annealing strategy for the detection of arbitrarily shaped spatial clusters. Comp Stat Data Anal 2004; 45:269-286.
• Tango T, Takahashi K. A flexibly shaped spatial scan statistic for detecting clusters. Int J Health Geographics 2005; 4:11.

**Regression Analysis**
• Control for known risk factors before analyzing for spatial clustering.
• Analyze for unexplained clusters.
• Follow up in areas with large regression residuals with traditional case-control or cohort studies.
• Obtain additional risk factor data to account for the large residuals.
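To tie the scan statistic steps together, here is a minimal Python sketch of the likelihood ratio and the Monte Carlo p-value. It rests on several assumptions that are not in the slides. The likelihood ratio uses the commonly cited Kulldorff Poisson form, (c/E[c])^c * ((C-c)/(C-E[c]))^(C-c) when c > E[c] and 1 otherwise, which may differ from the formula shown on the original slide. The null data are generated by scattering the C cases across areas in proportion to population. The function names are hypothetical. Expected counts use E[c] = n*C/N; values adjusted for known risk factors, for example by indirect standardization or a regression model, could be substituted.

```python
import math
import random


def poisson_llr(c, e, C):
    """Log likelihood ratio for one window (assumed Kulldorff Poisson form).

    c: observed cases inside, e: expected cases inside, C: total cases.
    Returns 0 unless the window holds more cases than expected.
    """
    if c <= e:
        return 0.0
    inside = c * math.log(c / e)
    outside = (C - c) * math.log((C - c) / (C - e)) if C > c else 0.0
    return inside + outside


def max_llr(cases, populations, windows):
    """Largest log likelihood ratio over all windows, with E[c] = n*C/N."""
    C, N = sum(cases), sum(populations)
    best, best_window = 0.0, None
    for w in windows:
        c = sum(cases[i] for i in w)
        e = sum(populations[i] for i in w) * C / N
        llr = poisson_llr(c, e, C)
        if llr > best:
            best, best_window = llr, w
    return best, best_window


def monte_carlo_p(cases, populations, windows, n_sim=999, seed=0):
    """Most likely cluster, its log likelihood ratio, and p = R/(n_sim + 1)."""
    rng = random.Random(seed)
    observed, window = max_llr(cases, populations, windows)
    C = sum(cases)
    areas = list(range(len(cases)))
    exceed = 0
    for _ in range(n_sim):
        # Null data: assign each case to an area with probability
        # proportional to that area's population.
        sim = [0] * len(cases)
        for i in rng.choices(areas, weights=populations, k=C):
            sim[i] += 1
        if max_llr(sim, populations, windows)[0] >= observed:
            exceed += 1
    rank = exceed + 1  # rank R of the observed maximum among the null values
    return window, observed, rank / (n_sim + 1)
```

Here `windows` is any collection of sets of area indices, none of which covers the whole region. Working with the logarithm of the ratio leaves the ranking unchanged, so the p-value is the same as it would be for the ratio itself.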