Overview of Climate Data Analyses. Vipin Kumar, Army High Performance Computing Research Center, Department of Computer Science, University of Minnesota (http://www.cs.umn.edu/~kumar). This work was partially funded by NASA and the Army High Performance Computing Center.

Presentation Transcript


  1. Overview of Climate Data Analyses. Vipin Kumar, Army High Performance Computing Research Center, Department of Computer Science, University of Minnesota (http://www.cs.umn.edu/~kumar). This work was partially funded by NASA and the Army High Performance Computing Center.

  2. Overview • Discovery of Patterns in the Global Climate System using Data Mining • Clustering for zone formation • Preprocessing • Discovery of Ocean Climate Indices • Discovery of association patterns • Other Climate Analyses • Gradient analysis • Trajectory analysis • Animation of Weather Data

  3. Research Goals • Find global climate patterns of interest to Earth scientists. A key interest is finding connections between the ocean and the land. • The data are global snapshots of values for a number of variables on land surfaces or water. • Snapshots are monthly, over a range of 10 to 50 years.

  4. Patterns of Interest • Zone Formation • Find regions of the land or ocean which have similar behavior. • Teleconnections • Teleconnections are the simultaneous variation in climate and related processes over widely separated points on the Earth. • Associations • Find relations between climate events and land cover. • River Discharge • Relationship between water discharged from a river and precipitation, climate, and man.

  5. Clustering for Zone Formation • Interested in relationships between regions, not “points.” • For ocean, clustering is based on SST (Sea Surface Temperature) or SLP (Sea Level Pressure). • For land, clustering is based on NPP or other variables, e.g., precipitation, temperature. • Typically we work with the points. • When “raw” NPP and SST are used, clustering can find seasonal patterns. • Anomalous regions have plant growth patterns that are reversed from those typically observed in the hemisphere in which they reside, and are easy to spot.

  6. K-Means Clustering of Raw NPP and Raw SST (Num clusters = 2)

  7. K-Means Clustering of Raw NPP and Raw SST (Num clusters = 2) Land Cluster Cohesion: North = 0.78, South = 0.59 Ocean Cluster Cohesion: North = 0.77, South = 0.80

  8. Preprocessing • Time series preprocessing issues • Need to remove seasonality • Earth scientists are mostly interested in anomalies • Need to remove most of the autocorrelation • Statistical tests are affected • Need to remove trends • Normally want to detect patterns and trends separately • Normally interested in similarity once differences in means and scale have been considered. • Pearson’s correlation coefficient has this property
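The last point on the preprocessing slide, that Pearson's correlation ignores differences in mean and scale, can be checked with a small sketch. The helper `pearson` and the sample values are illustrative assumptions, not part of the original slides.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

# Made-up sample values, for illustration only
x = [1.0, 3.0, 2.0, 5.0, 4.0]
y = [2.0, 2.5, 1.0, 4.0, 3.5]

# Shift y by a constant and rescale it by a positive factor
y_scaled = [10.0 * v + 100.0 for v in y]

r_original = pearson(x, y)
r_scaled = pearson(x, y_scaled)  # same value as r_original, up to rounding
```

Because the coefficient centers each series and divides by its spread, any positive linear rescaling of either series leaves it unchanged.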

  9. Sample NPP Time Series [Figure: NPP time series for Minneapolis, Atlanta, and Sao Paulo]

     Correlations between time series:

                    Minneapolis    Atlanta    Sao Paulo
     Minneapolis         1.0000     0.7591      -0.7581
     Atlanta             0.7591     1.0000      -0.5739
     Sao Paulo          -0.7581    -0.5739       1.0000

  10. Seasonality Accounts for Much Correlation. Normalized using monthly Z score: subtract off the monthly mean and divide by the monthly standard deviation.

     Correlations between time series:

                    Minneapolis    Atlanta    Sao Paulo
     Minneapolis         1.0000     0.0492       0.0906
     Atlanta             0.0492     1.0000      -0.0154
     Sao Paulo           0.0906    -0.0154       1.0000
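The monthly Z score described on this slide can be sketched as follows. This is a minimal illustration, assuming a monthly series that starts in January and using the population standard deviation; the function name `monthly_zscore` and the synthetic data are hypothetical.

```python
import math

def monthly_zscore(series):
    """Remove seasonality: for each calendar month, subtract that month's
    mean and divide by that month's standard deviation.
    Assumes monthly values with series[0] being a January value."""
    out = [0.0] * len(series)
    for m in range(12):
        idx = range(m, len(series), 12)
        vals = [series[i] for i in idx]
        mean = sum(vals) / len(vals)
        std = math.sqrt(sum((v - mean) ** 2 for v in vals) / len(vals))
        for i in idx:
            # Guard against a month with no variation across years
            out[i] = (series[i] - mean) / std if std > 0 else 0.0
    return out

# Three years of synthetic data: a strong seasonal cycle plus a small
# year-to-year increase (illustrative values only)
series = [10.0 * math.sin(2 * math.pi * t / 12) + 0.5 * (t // 12)
          for t in range(36)]
anomalies = monthly_zscore(series)
```

After this step each calendar month has zero mean and unit spread, so the seasonal cycle no longer dominates correlations between locations.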

  11. Removing Seasonality Removes Most Autocorrelation

  12. Preprocessing: Removing Trends A slight linear trend added to two random time series increases their correlation dramatically, from 0.01 to 0.17.
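A rough sketch of the effect described on this slide: two independent random series gain a shared linear trend, and a least-squares line is then removed from each. The seed, series length, and trend slope are arbitrary assumptions, so the resulting correlations will not match the 0.01 and 0.17 reported on the slide.

```python
import math
import random

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def detrend(series):
    """Subtract the least-squares straight line from a time series."""
    n = len(series)
    t_mean = (n - 1) / 2
    y_mean = sum(series) / n
    num = sum((t - t_mean) * (y - y_mean) for t, y in enumerate(series))
    den = sum((t - t_mean) ** 2 for t in range(n))
    slope = num / den
    return [y - (y_mean + slope * (t - t_mean)) for t, y in enumerate(series)]

random.seed(0)  # arbitrary seed, for repeatability only
n = 200
a = [random.gauss(0, 1) for _ in range(n)]
b = [random.gauss(0, 1) for _ in range(n)]

# Add the same slight linear trend to both series
a_trend = [v + 0.01 * t for t, v in enumerate(a)]
b_trend = [v + 0.01 * t for t, v in enumerate(b)]

r_raw = pearson(a, b)
r_trended = pearson(a_trend, b_trend)
r_detrended = pearson(detrend(a_trend), detrend(b_trend))
```

Linear detrending removes any added straight line exactly, so `r_detrended` equals the correlation of the detrended original series regardless of the slope that was added.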

  13. Ocean Climate Indices: Connecting the Ocean and the Land • An OCI is a time series of temperature or pressure • Based on Sea Surface Temperature (SST) or Sea Level Pressure (SLP) • OCIs are important because • They distill climate variability at a regional or global scale into a single time series. • They are well-accepted by Earth scientists. • They are related to well-known climate phenomena such as El Niño.

  14. Ocean Climate Indices – ANOM 1+2 • ANOM 1+2 is associated with El Niño and La Niña. • Defined as the Sea Surface Temperature (SST) anomalies in a region off the coast of Peru • El Niño is associated with • Droughts in Australia and Southern Africa • Heavy rainfall along the western coast of South America • Milder winters in the Midwest [Figure: El Niño events]

  15. Connection of ANOM 1+2 to Land Temp OCIs capture teleconnections, i.e., the simultaneous variation in climate and related processes over widely separated points on the Earth.

  16. Ocean Climate Indices - NAO • The North Atlantic Oscillation (NAO) is associated with climate variation in Europe and North America. • Defined as the normalized pressure difference between Ponta Delgada, Azores and Stykkisholmur, Iceland. • Associated with warm and wet winters in Europe and with cold and dry winters in northern Canada and Greenland • The eastern US experiences mild and wet winter conditions. [Figure: map marking Iceland and the Azores]
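A two-station index of this kind can be sketched as the difference of standardized pressure series. This is only an illustration: the slide does not give the exact normalization or the sign convention, so subtracting the northern station from the southern one is an assumption here, and the pressure values are made up.

```python
import math

def standardize(values):
    """Subtract the mean and divide by the (population) standard deviation."""
    mean = sum(values) / len(values)
    std = math.sqrt(sum((v - mean) ** 2 for v in values) / len(values))
    return [(v - mean) / std for v in values]

def pressure_difference_index(south, north):
    """Difference of the two standardized station series (south minus north).
    The ordering is an assumed convention, not taken from the slides."""
    return [s - n for s, n in zip(standardize(south), standardize(north))]

# Made-up sea level pressure values in hPa, for illustration only
azores = [1021.0, 1024.0, 1019.0, 1026.0]
iceland = [1002.0, 997.0, 1006.0, 995.0]

index = pressure_difference_index(azores, iceland)
```

With this convention the index is large when pressure is unusually high at the southern station and unusually low at the northern one.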

  17. Connection of NAO to Land Temp

  18. Influence of OCI on Land – Area Weighted Correlation • Correlation of an OCI with a land variable is a standard way to evaluate its “influence.” • Correlation does not imply causality. • Temperature and precipitation are the typical land variables. • If relatively many land points have a relatively high correlation, then an OCI is influential. • To evaluate whether clusters (or pairs) are potential OCIs we compute their area weighted correlation. • Weighted average of the correlation with land points, where weight is based on area. • May exclude points whose correlation is low and then calculate area weighted correlation.
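The area weighted correlation described above can be sketched as a weighted average over land points. The slides do not say how area is computed or whether the sign of the correlation is kept; this sketch assumes a regular latitude-longitude grid (so cell area is proportional to the cosine of latitude) and averages absolute correlations. The function name and threshold handling are hypothetical.

```python
import math

def area_weighted_correlation(correlations, latitudes_deg, threshold=0.0):
    """Weighted average of |correlation| over land points.

    correlations  -- correlation of the OCI with each land point
    latitudes_deg -- latitude of each land point, in degrees
    threshold     -- points with |correlation| below this are excluded
    Weights use cos(latitude), an assumption valid for regular grids.
    """
    total = 0.0
    total_weight = 0.0
    for r, lat in zip(correlations, latitudes_deg):
        if abs(r) < threshold:
            continue
        weight = math.cos(math.radians(lat))
        total += weight * abs(r)
        total_weight += weight
    return total / total_weight if total_weight > 0 else 0.0

# One equatorial point with perfect correlation, one at 60N with none
score_all = area_weighted_correlation([1.0, 0.0], [0.0, 60.0])
score_thresholded = area_weighted_correlation([1.0, 0.0], [0.0, 60.0],
                                              threshold=0.5)
```

Raising the threshold drops weakly correlated points from both the numerator and the denominator, which is one reading of the slide's "may exclude points whose correlation is low".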

  19. Evaluation of Known OCIs via Area Weighted Correlation Area Weighted Correlation of Known OCIs to Land Temp Overlapping, threshold = 0

  20. Discovery of Ocean Climate Indices • Use clustering to find areas of the oceans that have high density, i.e., relatively homogeneous behavior. • Cluster centroids are potential OCIs. • For SLP, pairs of cluster centroids are potential OCIs. • Evaluate the “influence” of potential OCIs on land points. • Determine if the potential OCI matches a known OCI. • For potential OCIs that are not well-known, conduct further evaluation. • Are there land points that have higher correlation for the potential OCI than for known indices?

  21. SST Clusters

  22. Evaluating Cluster Centroids as Potential OCIs • Evaluation will be based on area weighted correlation • Ignore clusters whose area weighted correlation is low. • Three cases: • Clusters are highly similar to known OCIs (corr > 0.4) • May represent a known OCI • Clusters may be “better,” i.e., higher coverage • Clusters may cover a different area, i.e., some points for which the new OCI is a better predictor • Clusters are moderately similar to known OCIs (0.25 < corr < 0.4) • Again, new OCIs may be better predictors for some points. • Clusters are not similar to known OCIs (corr < 0.25) • These clusters may represent as yet undiscovered Earth Science phenomena.
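The three cases above amount to a threshold rule on a cluster centroid's correlation with known OCIs. A minimal sketch follows; taking the strongest absolute correlation across all known indices, and the treatment of values exactly at 0.25 or 0.4, are assumptions the slides leave open.

```python
def similarity_case(corr_with_known_indices):
    """Classify a cluster centroid by its best match among known OCIs.

    corr_with_known_indices -- correlations of the centroid time series
    with each known index. Uses the largest absolute value (an assumption).
    """
    best = max(abs(c) for c in corr_with_known_indices)
    if best > 0.4:
        return "high"      # may represent a known OCI
    if best > 0.25:
        return "moderate"  # may still predict some points better
    return "low"           # candidate for a new phenomenon

# Hypothetical correlations, for illustration only
case_a = similarity_case([0.93, 0.10])
case_b = similarity_case([0.30, -0.10])
case_c = similarity_case([0.10, -0.20])
```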

  23. SST Clusters Highly Correlated to Known Indices Area Weighted Correlation of Cluster Centroids to Land Temp Overlapping, threshold = 0

  24. SST Clusters that Correspond to El Niño Climate Indices [Figure: SST clusters labeled 75, 78, 67, and 94, shown with the El Niño regions defined by Earth scientists] SNN clusters of SST that are highly correlated with El Niño indices, ~ 0.93 correlation.

  25. SST Clusters Highly Correlated to Known Indices … Examples of some SST clusters that are highly correlated to known OCIs and have high area weighted correlation with land temperature. These indices have a significant correlation with El Nino indices.

  26. SST Clusters Highly Correlated to Known Indices However, there are areas (yellow) where these clusters correlate better.

  27. SST Clusters Highly Correlated to Known Indices

  28. SST Cluster Moderately Correlated to Known Indices

  29. Mining Associations in Earth Science Data • First, transform Earth Science data into transactions. • Find patterns using association discovery algorithms. Example rules:
      1. FPAR-HI PET-HI PREC-HI SOLAR-HI TEMP-HI ==> NPP-HI (support count = 145, confidence = 100%)
      2. FPAR-HI PET-HI PREC-HI TEMP-HI ==> NPP-HI (support count = 933, confidence = 99.3%)
      3. FPAR-HI PET-HI PREC-HI ==> NPP-HI (support count = 1655, confidence = 98.8%)
      4. FPAR-HI PET-HI PREC-HI SOLAR-HI ==> NPP-HI (support count = 268, confidence = 98.2%)
      …
      75. FPAR-HI ==> NPP-HI (support count = 216924, confidence = 55.7%)
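The support count and confidence attached to each rule can be computed directly once the data are in transaction form. The sketch below assumes each transaction is a set of event labels for one location and time; the tiny transaction list is invented for illustration and the function name `rule_stats` is hypothetical.

```python
def rule_stats(transactions, antecedent, consequent):
    """Return (support count, confidence) for the rule antecedent ==> consequent.

    support count -- number of transactions containing both sides
    confidence    -- support count divided by the number of transactions
                     containing the antecedent
    """
    antecedent = set(antecedent)
    both = antecedent | set(consequent)
    n_antecedent = sum(1 for t in transactions if antecedent <= t)
    n_both = sum(1 for t in transactions if both <= t)
    confidence = n_both / n_antecedent if n_antecedent else 0.0
    return n_both, confidence

# Invented transactions, for illustration only
transactions = [
    {"FPAR-HI", "PREC-HI", "NPP-HI"},
    {"FPAR-HI", "NPP-HI"},
    {"FPAR-HI", "TEMP-HI"},
    {"PREC-HI"},
]

support_count, confidence = rule_stats(transactions, {"FPAR-HI"}, {"NPP-HI"})
```

Here FPAR-HI appears in three transactions and is accompanied by NPP-HI in two of them, giving a support count of 2 and a confidence of 2/3.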

  30. Example of Interesting Association Rules FPAR-Hi ==> NPP-Hi (sup=5.9%, conf=55.7%) Shrubland areas

  31. Shrublands/ Land Cover Types

  32. Example of Interesting Association Rules… [Figure: support count by land cover] • Temp-Hi ==> NPP-Hi tends to occur in the forest and cropland regions in the northern hemisphere (Forests 33.5%, Grassland 8.7%, Cropland 24.5%, Desert 0.4%).

  33. Gradient Analysis of SLP Data SLP in June, 1992

  34. Trajectory Analysis of SST Data. We chose a bounding box around the equatorial Pacific east of the dateline (longitude range: 80W to 180W; latitude range: 50N to 15S). We then calculated the centroid locations of the top 20% SST cluster in that region and plotted the trajectory albums of the centroid movements over one year.
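One possible reading of this procedure is sketched below: within the bounding box, take the warmest 20% of grid points in each month, average their coordinates, and collect the monthly centroids into a trajectory. Interpreting "top 20% SST cluster" as the warmest 20% of points is an assumption, as are the function name, the unweighted averaging, and the synthetic fields. Longitudes are given as negative degrees east so the 180W to 80W box does not wrap.

```python
def warm_centroid(points, top_fraction=0.2):
    """Centroid (lon, lat) of the warmest fraction of grid points.

    points -- list of (lon, lat, sst) tuples inside the bounding box
    """
    ranked = sorted(points, key=lambda p: p[2], reverse=True)
    k = max(1, round(len(ranked) * top_fraction))
    top = ranked[:k]
    lon = sum(p[0] for p in top) / k
    lat = sum(p[1] for p in top) / k
    return lon, lat

# Two synthetic monthly fields along the equator (illustrative values only):
# in the first the warmest water is in the east, in the second in the west
month_1 = [(-170.0 + 10 * i, 0.0, 20.0 + i) for i in range(10)]
month_2 = [(-170.0 + 10 * i, 0.0, 29.0 - i) for i in range(10)]

trajectory = [warm_centroid(field) for field in (month_1, month_2)]
```

Plotting the sequence of centroids for twelve consecutive months would give one yearly trajectory of the kind the slide describes.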

  35. Weather Data • Obtained from Barbara Broome. • Data • One day at 6 time periods • Grid is 51 x 51 units • 9.66 by 6.76 • Air temperature, pressure and two types of wind data

  36. Pressure

  37. Air Temperature

  38. Winds 9

  39. Winds 10
