
Spatio-Temporal Outlier Detection in Precipitation Data

SensorKDD 2008, Sunday, 24th August, 2008. Spatio-Temporal Outlier Detection in Precipitation Data. Elizabeth Wu, Wei Liu, Sanjay Chawla, The University of Sydney, Australia. Outline: What is a spatio-temporal outlier? Motivation. Previous Work. Contributions. Our Approach. Future Work.


Presentation Transcript


  1. SensorKDD 2008, Sunday, 24th August, 2008 • Spatio-Temporal Outlier Detection in Precipitation Data • Elizabeth Wu, Wei Liu, Sanjay Chawla • The University of Sydney, Australia

  2. Outline • What is a spatio-temporal outlier? • Motivation • Previous Work • Contributions • Our Approach • Future Work

  3. What is a Spatio-Temporal Outlier? • “A spatio-temporal object whose thematic attribute values are significantly different from those of other spatially and temporally referenced objects in its spatial and/or temporal neighborhoods.” – Cheng and Li (2006) [Figure: five 5 x 5 grids, one per time step, t=1 to t=5]

  4. What is a spatio-temporal object? • “A time-evolving spatial object whose evolution or ‘history’ is represented by a set of instances (o_id, s_i, t_i) where the spacestamp s_i is the location of object o_id at timestamp t_i.” – Theodoris et al. (1999) • Simply put, • A point becomes a line • A 2D region becomes a 3D region [Figure: two plots with axes x co-ordinate, y co-ordinate and time]
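The quoted definition describes a simple data structure: a set of (o_id, s_i, t_i) records per object. A minimal sketch of how such records might be held and ordered into a history is given here; the class name `Instance`, the helper `history`, the 2D location tuple and the sample identifiers are all illustrative assumptions, not taken from the slides.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Instance:
    """One (o_id, s_i, t_i) record: where object o_id is at timestamp t_i."""
    o_id: str
    location: Tuple[float, float]  # spacestamp s_i as (x, y); 2D is an assumption
    timestamp: int                 # timestamp t_i


def history(instances: List[Instance], o_id: str) -> List[Instance]:
    """Return the instances of one object ordered by time (its 'history')."""
    return sorted((i for i in instances if i.o_id == o_id),
                  key=lambda i: i.timestamp)


# Hypothetical sample records for two objects.
records = [
    Instance("storm-1", (2.0, 3.0), 2),
    Instance("storm-1", (1.0, 3.0), 1),
    Instance("storm-2", (4.0, 1.0), 1),
]
track = history(records, "storm-1")  # the point traced through time as a line
```

Sorting by timestamp is what turns the unordered set of instances into the "line" that a moving point traces through the space-time volume.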

  5. Data • South American precipitation data (NOAA) • 10 years (1995-2004) • 2.5° x 2.5° grid cells • 31 latitude x 23 longitude divisions • 713 grid cells in total • 2,609,580 possible data values • Missing data, both spatially and temporally • El Niño Southern Oscillation data (NOAA) • Southern Oscillation Index (SOI) • Measures the difference in sea-level air pressure between Tahiti and Darwin • The lower the score, the more intense an El Niño event [Figure: Stations used to produce gridded precipitation fields]

  6. Motivation • Why would we be interested in moving outlier regions in precipitation data? • Knowing the location, time and duration of past extreme precipitation events helps us to understand and prepare for future events. • We can analyse how different phenomena interact, e.g. ENSO and precipitation.

  7. Previous Work • Spatial Scan Statistics • Used to find spatial outliers • Cluster detection using the spatial scan statistic in spatio-temporal point data (Iyengar, 2004) • Exact-Grid and Approx-Grid (Agarwal et al., 2006) • Uses the Kulldorff Spatial Scan Statistic • Finds the highest discrepancy region (by location and size) in a spatial grid dataset. • Spatio-temporal outlier detection (Birant and Kut, 2006) • Limited to finding outliers over a single time period. [Figure: plot with axes x co-ordinate, y co-ordinate and time]

  8. Contributions • Extended Exact-Grid and Approx-Grid to find the top-k outliers in a single time period. • Developed the Outstretch and RecurseNodes algorithms to find outliers that repeatedly appear over several time periods. • Applied them to South American precipitation data. • Analysed the behaviour of the outliers against the El Niño Southern Oscillation (ENSO).

  9. Our Approach • Find the top-k outliers in a spatial grid for each time period • Extend Exact-Grid and Approx-Grid algorithms • Use Outstretch to find spatial outliers which extend over several time periods. • Use RecurseNodes to extract the sequences from the Outstretch tree.
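The transcript does not spell out how Outstretch builds its tree or how RecurseNodes walks it, so the following is only a rough sketch of one plausible reading, not the authors' algorithm. It assumes that a region in one time period is linked to a region in the next period when the later region lies inside the earlier one stretched by `r` cells, and that sequences are then read off recursively. The linking rule, the parameter `r`, the region encoding and every function name here are assumptions made for illustration.

```python
from typing import Dict, List, Tuple

Region = Tuple[int, int, int, int]  # (left, bottom, right, top), inclusive cell indices
Node = Tuple[int, int]              # (time period index, region index in that period)


def stretch(region: Region, r: int) -> Region:
    """Grow a rectangle by r cells on every side (assumed linking tolerance)."""
    left, bottom, right, top = region
    return (left - r, bottom - r, right + r, top + r)


def inside(inner: Region, outer: Region) -> bool:
    """True if rectangle `inner` lies entirely within rectangle `outer`."""
    return (outer[0] <= inner[0] and outer[1] <= inner[1]
            and inner[2] <= outer[2] and inner[3] <= outer[3])


def link_periods(periods: List[List[Region]], r: int) -> Dict[Node, List[Node]]:
    """Link each region to the regions of the next period that fit inside it once stretched."""
    children: Dict[Node, List[Node]] = {}
    for t, regions in enumerate(periods):
        following = periods[t + 1] if t + 1 < len(periods) else []
        for i, region in enumerate(regions):
            grown = stretch(region, r)
            children[(t, i)] = [(t + 1, j) for j, cand in enumerate(following)
                                if inside(cand, grown)]
    return children


def sequences(node: Node, children: Dict[Node, List[Node]]) -> List[List[Node]]:
    """Recursively list every path from `node` down to a node with no children."""
    if not children[node]:
        return [[node]]
    return [[node] + tail
            for child in children[node]
            for tail in sequences(child, children)]


# Hypothetical top-k regions for three consecutive time periods.
periods = [
    [(2, 2, 3, 3)],
    [(1, 2, 3, 4), (8, 8, 9, 9)],
    [(1, 1, 2, 4)],
]
children = link_periods(periods, r=1)
paths = sequences((0, 0), children)
```

In this toy input the far-away region in the second period is never linked, so a single three-period sequence is recovered from the first region.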

  10. Finding the top-k outliers • Find every possible region size and shape in the grid. • Get each region’s discrepancy value to determine which is a more significant outlier. • Our extension keeps track of the top-k regions rather than just the top-1. [Figure: grid with left, right, top and bottom boundary lines]
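Keeping the top-k regions instead of a single best one can be done with a bounded min-heap, so that only k candidates are stored while regions are scanned. The sketch below is one common way to do this and is not taken from the slides; the function name `top_k`, the tuple layout and the sample scores are illustrative assumptions.

```python
import heapq
from typing import Iterable, List, Tuple

# (discrepancy, (left, right, bottom, top)); the tuple layout is an assumption
Scored = Tuple[float, Tuple[int, int, int, int]]


def top_k(scored_regions: Iterable[Scored], k: int) -> List[Scored]:
    """Keep the k highest-scoring regions seen so far in a bounded min-heap."""
    heap: List[Scored] = []
    for item in scored_regions:
        if len(heap) < k:
            heapq.heappush(heap, item)
        elif item > heap[0]:
            # The new region beats the weakest kept region, so swap it in.
            heapq.heapreplace(heap, item)
    return sorted(heap, reverse=True)


# Hypothetical discrepancy scores for four candidate regions.
candidates = [
    (0.10, (0, 0, 0, 0)),
    (0.38, (0, 1, 0, 1)),
    (0.05, (1, 1, 1, 1)),
    (0.22, (2, 3, 0, 1)),
]
best_two = top_k(candidates, 2)
```

With k = 1 this reduces to tracking only the single highest-discrepancy region.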

  11. Kulldorff Scan Statistic • Uses two values: • Measurement – Number of incidences of an event • E.g. In how many cells is precipitation extreme? • $M$ – for the whole dataset • $m(p)$ – for the cell $p$ • $m_R = \sum_{p \in R} m(p) / M$ • Baseline – Total population at risk • I.e. How many cells have we recorded values for? • $B$ – for the whole dataset • $b(p)$ – for the cell $p$ • $b_R = \sum_{p \in R} b(p) / B$ • We find the discrepancy for local region $R$ by substitution into: • When $m_R > b_R$: $d(m_R, b_R) = m_R \log(m_R / b_R) + (1 - m_R) \log((1 - m_R) / (1 - b_R))$ • Otherwise: $d(m_R, b_R) = 0$
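The discrepancy formula on this slide translates directly into a small function. The sketch below assumes the natural logarithm (which is consistent with the value 0.3836 quoted in the example on slide 12) and treats the second term as zero when m_R = 1; the function name `kulldorff_discrepancy` is an illustrative choice.

```python
import math


def kulldorff_discrepancy(m_r: float, b_r: float) -> float:
    """Discrepancy d(m_R, b_R) for measurement fraction m_R and baseline fraction b_R."""
    if m_r <= b_r:
        return 0.0  # the slide defines d as 0 unless m_R > b_R
    d = m_r * math.log(m_r / b_r)  # natural log is an assumption
    if m_r < 1.0:
        # Skip this term when m_R == 1, where (1 - m_R) * log(...) tends to 0.
        d += (1.0 - m_r) * math.log((1.0 - m_r) / (1.0 - b_r))
    return d


# Region holding 4 of the 6 extreme cells and 4 of the 16 recorded cells.
d_example = kulldorff_discrepancy(4 / 6, 4 / 16)
```

The function expects fractions that come from the same region, so that b_R is positive whenever m_R is.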

  12. Kulldorff Scan Statistic: Example • $M = 6$ = total # cells with “1” in entire grid • $\sum_{p \in R} m(p) = 4$ = total # cells with “1” in $R$ • $m_R = \sum_{p \in R} m(p) / M = 0.67$ • $B = 16$ = total # cells in entire grid • $\sum_{p \in R} b(p) = 4$ = total # cells in $R$ • $b_R = \sum_{p \in R} b(p) / B = 0.25$ • Result: $d(m_R, b_R) = 0.3836$ [Figure: two 4 x 4 example grids]
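The result on this slide can be checked by substituting the exact fractions m_R = 4/6 = 2/3 and b_R = 4/16 = 1/4 into the discrepancy formula. The working below assumes the natural logarithm, which is the base that reproduces the quoted value.

```latex
\begin{aligned}
d(m_R, b_R) &= m_R \ln\frac{m_R}{b_R} + (1 - m_R)\ln\frac{1 - m_R}{1 - b_R} \\
            &= \tfrac{2}{3}\ln\frac{2/3}{1/4} + \tfrac{1}{3}\ln\frac{1/3}{3/4} \\
            &= \tfrac{2}{3}\ln\tfrac{8}{3} + \tfrac{1}{3}\ln\tfrac{4}{9} \\
            &\approx 0.6539 - 0.2703 \\
            &\approx 0.3836
\end{aligned}
```

The first term rewards the region for holding a large share of the extreme cells; the second term is negative because the rest of the grid holds fewer extreme cells than its share of the baseline.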

  13-39. Finding the top-k outliers: Exact-Grid [Figure sequence: grid with left, right, top and bottom boundary lines]

  40. Finding the top-k outliers: Exact-Grid • Keeps moving the top and bottom lines until all regions between the left and right lines have been examined… [Figure: grid with left, right, top and bottom boundary lines]
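The sweep described on this slide (fix the left and right lines, then move the top and bottom lines over every position) can be sketched as an exhaustive scan over all axis-aligned rectangles. This is a simplified illustration rather than the published Exact-Grid code: it returns only the single best region, it assumes natural logarithms, it calls row 0 the "bottom", and the 4 x 4 grid is a hypothetical one built to match the counts on slide 12 (M = 6, B = 16, four extreme cells in a four-cell region).

```python
import math
from typing import List, Optional, Tuple

Rect = Tuple[int, int, int, int]  # (left, right, bottom, top), inclusive indices


def discrepancy(m_r: float, b_r: float) -> float:
    """Kulldorff discrepancy; 0 unless m_R > b_R (natural log assumed)."""
    if m_r <= b_r:
        return 0.0
    d = m_r * math.log(m_r / b_r)
    if m_r < 1.0:
        d += (1.0 - m_r) * math.log((1.0 - m_r) / (1.0 - b_r))
    return d


def best_region(m: List[List[int]], b: List[List[int]]) -> Tuple[float, Optional[Rect]]:
    """Scan every rectangle and return (highest discrepancy, its rectangle)."""
    rows, cols = len(m), len(m[0])
    total_m = sum(map(sum, m))  # M, must be > 0
    total_b = sum(map(sum, b))  # B, must be > 0
    best: Tuple[float, Optional[Rect]] = (0.0, None)
    for left in range(cols):
        # Per-row sums over the columns between the left and right lines.
        row_m = [0] * rows
        row_b = [0] * rows
        for right in range(left, cols):
            for y in range(rows):
                row_m[y] += m[y][right]
                row_b[y] += b[y][right]
            for bottom in range(rows):
                sum_m = sum_b = 0
                for top in range(bottom, rows):
                    sum_m += row_m[top]
                    sum_b += row_b[top]
                    d = discrepancy(sum_m / total_m, sum_b / total_b)
                    if d > best[0]:
                        best = (d, (left, right, bottom, top))
    return best


# Hypothetical grid: 1 marks an extreme cell; every cell has a recorded value.
m_grid = [
    [0, 0, 0, 1],
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [1, 0, 0, 0],
]
b_grid = [[1] * 4 for _ in range(4)]
score, rect = best_region(m_grid, b_grid)
```

Accumulating the per-row sums as the right line moves avoids recounting cells for every rectangle; the top-k extension mentioned on slide 10 would replace the single `best` value with a bounded collection of the k highest-scoring rectangles.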

  41-50. Finding the top-k outliers: Exact-Grid [Figure sequence: grid with left, right, top and bottom boundary lines]
