
Scan Statistics via Permutation Tests

David Madigan


Presentation Transcript


  1. Scan Statistics via Permutation Tests David Madigan

  2. [Figure: a curve representing a road, with an “x” at each pull-over] The curve represents a road. Each “x” marks a police pull-over. Red “x” means the police issued a ticket. Black “x” means no ticket. Is there a stretch of road where the police issue an unusually large number of tickets?

  3. Scan with Fixed Window • If we know the length of the “stretch of road” that we seek, e.g., [window shown in figure], we could slide this window along the road and find the most “unusual” window location
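As a minimal sketch of the sliding step, the snippet below enumerates windows of a fixed length and counts the tickets and pull-overs inside each. It assumes each pull-over has a one-dimensional position along the road, and that it is enough to start a window at each pull-over; the function name `window_counts` and both assumptions are illustrative, not taken from the slides.

```python
import numpy as np

def window_counts(positions, tickets, length):
    """For each window [s, s + length) starting at a pull-over position s,
    return (s, tickets inside, pull-overs inside)."""
    positions = np.asarray(positions, dtype=float)
    tickets = np.asarray(tickets, dtype=int)
    order = np.argsort(positions)
    pos, tix = positions[order], tickets[order]
    # Prefix sums: a window's ticket count is a difference of two entries.
    cum = np.concatenate(([0], np.cumsum(tix)))
    out = []
    for s in pos:
        lo = np.searchsorted(pos, s, side="left")            # first event at s
        hi = np.searchsorted(pos, s + length, side="left")   # first event at or beyond s + length
        out.append((float(s), int(cum[hi] - cum[lo]), int(hi - lo)))
    return out

counts = window_counts([0, 1, 2, 3, 4, 10], [0, 1, 1, 1, 0, 0], length=2.5)
```

Each triple gives the inside counts for one window; the outside counts follow by subtracting from the totals.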

  4. How Unusual is a Window? • Let p_W and p_¬W denote the true probability of being red inside and outside the window respectively. Let (x_W, n_W) and (x_¬W, n_¬W) denote the corresponding counts • Use the GLRT (generalized likelihood ratio test) for comparing H0: p_W = p_¬W versus H1: p_W ≠ p_¬W • λ, the likelihood ratio, measures how unusual a window is: the smaller λ, the more unusual • -2 log λ here has an asymptotic chi-square distribution with 1 df
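The transcript does not preserve the formula for λ, so the following is a sketch of the standard Bernoulli likelihood ratio under two assumptions: x counts red x's and n counts all x's in a region, and λ is the maximised likelihood under H0 divided by the maximised likelihood under H1. The names `max_loglik` and `glrt_lambda` are illustrative.

```python
import math

def max_loglik(x, n):
    """Bernoulli log-likelihood at its maximum p_hat = x / n,
    using the convention 0 * log(0) = 0."""
    ll = 0.0
    if x > 0:
        ll += x * math.log(x / n)
    if n - x > 0:
        ll += (n - x) * math.log((n - x) / n)
    return ll

def glrt_lambda(x_in, n_in, x_out, n_out):
    """Likelihood ratio: one common p (H0) versus separate p inside/outside (H1)."""
    log_lam = (max_loglik(x_in + x_out, n_in + n_out)
               - max_loglik(x_in, n_in)
               - max_loglik(x_out, n_out))
    return math.exp(log_lam)

lam_equal = glrt_lambda(5, 10, 10, 20)   # same ticket rate inside and outside
lam_split = glrt_lambda(3, 3, 0, 3)      # all tickets inside the window
```

With equal rates inside and outside, λ is 1 (nothing unusual). When the window separates the colors perfectly, as in the second call, λ reduces to the null likelihood alone, here 0.5^6.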

  5. Permutation Test • Since we look at the smallest λ over all window locations, we need to find the distribution of smallest-λ under the null hypothesis that there are no clusters • Look at the distribution of smallest-λ over, say, 999 random relabellings of the colors of the x’s, e.g., smallest-λ values of 0.376, 0.233, 0.412, 0.222, … for successive relabellings • Look at the position of the observed smallest-λ in this distribution to get the scan statistic p-value (e.g., if the observed smallest-λ is 5th smallest among the 1000 values, i.e., the 999 relabellings plus the observed labelling, the p-value is 5/1000 = 0.005)
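The procedure on this slide can be sketched as follows. This is an illustration under stated assumptions rather than the presenter's code: windows are taken to be runs of `k` consecutive pull-overs, λ is the standard Bernoulli likelihood ratio (null over alternative), and ties with the observed value are counted against it. The function names are hypothetical.

```python
import math
import numpy as np

def max_loglik(x, n):
    """Maximised Bernoulli log-likelihood, with 0 * log(0) = 0."""
    ll = 0.0
    if x > 0:
        ll += x * math.log(x / n)
    if n - x > 0:
        ll += (n - x) * math.log((n - x) / n)
    return ll

def smallest_lambda(labels, k):
    """Smallest lambda over all windows of k consecutive pull-overs."""
    labels = np.asarray(labels, dtype=int)
    n, x = len(labels), int(labels.sum())
    cum = np.concatenate(([0], np.cumsum(labels)))
    null_ll = max_loglik(x, n)
    best = 1.0
    for i in range(n - k + 1):
        x_in = int(cum[i + k] - cum[i])
        log_lam = null_ll - max_loglik(x_in, k) - max_loglik(x - x_in, n - k)
        best = min(best, math.exp(log_lam))
    return best

def scan_p_value(labels, k, n_perm=999, seed=0):
    """Rank the observed smallest-lambda among n_perm random relabellings."""
    rng = np.random.default_rng(seed)
    labels = np.asarray(labels, dtype=int)
    observed = smallest_lambda(labels, k)
    null = [smallest_lambda(rng.permutation(labels), k) for _ in range(n_perm)]
    rank = 1 + sum(v <= observed for v in null)  # observed labelling counts as one draw
    return observed, rank / (n_perm + 1)

clustered = [0] * 16 + [1] * 8 + [0] * 16      # 8 tickets in a row
obs, p = scan_p_value(clustered, k=8)
_, p_flat = scan_p_value([0] * 40, k=8)        # no tickets at all
```

Shuffling the labels keeps the positions and the total number of tickets fixed, which is what "no clusters" means here. The same window search must be rerun on every relabelling so that the observed and null values are comparable.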

  6. Variable Length Window • No need to use a fixed-length window. Examine all possible windows up to, say, half the length of the entire road
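A brute-force sketch of the variable-length search, assuming windows are runs of consecutive pull-overs and that "half the length of the road" is approximated by half the number of pull-overs. The λ used is the standard Bernoulli likelihood ratio; `best_window` and `max_frac` are illustrative names.

```python
import math
import numpy as np

def max_loglik(x, n):
    """Maximised Bernoulli log-likelihood, with 0 * log(0) = 0."""
    ll = 0.0
    if x > 0:
        ll += x * math.log(x / n)
    if n - x > 0:
        ll += (n - x) * math.log((n - x) / n)
    return ll

def best_window(labels, max_frac=0.5):
    """Return (lambda, start index, size) of the most unusual window,
    searching every start and every size up to max_frac of the data."""
    labels = np.asarray(labels, dtype=int)
    n, x = len(labels), int(labels.sum())
    cum = np.concatenate(([0], np.cumsum(labels)))
    null_ll = max_loglik(x, n)
    best = (1.0, 0, 0)
    for k in range(1, int(n * max_frac) + 1):
        for i in range(n - k + 1):
            x_in = int(cum[i + k] - cum[i])
            lam = math.exp(null_ll - max_loglik(x_in, k) - max_loglik(x - x_in, n - k))
            if lam < best[0]:
                best = (lam, i, k)
    return best

lam, start, size = best_window([0] * 10 + [1] * 5 + [0] * 15)
```

The double loop costs on the order of n² window evaluations per scan, and a permutation test repeats it for every relabelling, which is one reason efficient algorithms matter for larger data sets.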

  7. Spatial Scan Statistics • Spatial scan statistic uses, e.g., circles instead of line segments

  8. Spatial-Temporal Scan Statistics • Spatial-temporal scan statistic uses cylinders, where the height of the cylinder represents a time window

  9. Other Issues • Poisson model is also common (instead of the Bernoulli model) • Covariate adjustment • Andrew Moore’s group at CMU: efficient algorithms for scan statistics

  10. Software: SaTScan + others http://www.satscan.org http://www.phrl.org http://www.terraseer.com
