Overview of Climate Data Analyses

Vipin Kumar

Army High Performance Computing Research Center

Department of Computer Science

University of Minnesota http://www.cs.umn.edu/~kumar

This work was partially funded by NASA and Army High Performance Computing Center


Overview

  • Discovery of Patterns in the Global Climate System using Data Mining

    • Clustering for zone formation

    • Preprocessing

    • Discovery of Ocean Climate Indices

    • Discovery of association patterns

  • Other Climate Analyses

    • Gradient analysis

    • Trajectory analysis

  • Animation of Weather Data


Research Goals

  • Find global climate patterns of interest to Earth Scientists

A key interest is finding connections between the ocean and the land.

  • The data are global snapshots of values for a number of variables on land surfaces or water.

  • Snapshots are monthly, over a range of 10 to 50 years.


Patterns of Interest

  • Zone Formation

    • Find regions of the land or ocean which have similar behavior.

  • Teleconnections

    • Teleconnections are the simultaneous variation in climate and related processes over widely separated points on the Earth.

  • Associations

    • Find relations between climate events and land cover.

  • River Discharge

    • Relationship between the water discharged from a river and precipitation, climate, and human activity.


Clustering for Zone Formation

  • Interested in relationships between regions, not “points.”

  • For ocean, clustering based on SST (Sea Surface Temperature) or SLP (Sea Level Pressure).

  • For land, clustering based on NPP (Net Primary Production) or other variables, e.g., precipitation, temperature.

    • Typically we work with the points.

  • When “raw” NPP and SST are used, clustering can find seasonal patterns.

    • Anomalous regions have plant growth patterns that are reversed from those typically observed in the hemisphere in which they reside, and are easy to spot.



K-Means Clustering of Raw NPP and Raw SST (Num clusters = 2)

Land Cluster Cohesion: North = 0.78, South = 0.59

Ocean Cluster Cohesion: North = 0.77, South = 0.80
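As a rough illustration of why two clusters on raw seasonal data split by hemisphere, the sketch below runs a plain k-means on made-up monthly series whose annual cycles are six months out of phase. The data, the helper names, and the farthest-point initialisation are assumptions for illustration only; the cohesion measure reported on the slide is not reproduced here.

```python
import math
import random

def kmeans(points, k, iters=20):
    """Plain Lloyd's k-means on lists of floats; returns one label per point."""
    # Deterministic farthest-point initialisation (an illustrative choice).
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(
            max(points, key=lambda p: min(math.dist(p, c) for c in centroids)))
    labels = [0] * len(points)
    for _ in range(iters):
        # Assign every point to its nearest centroid.
        labels = [min(range(k), key=lambda j: math.dist(p, centroids[j]))
                  for p in points]
        # Move every centroid to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels

def seasonal_series(peak_month, rng, years=3):
    """Synthetic monthly series with an annual cycle peaking at peak_month (0-11)."""
    return [math.cos(2 * math.pi * (m - peak_month) / 12) + rng.gauss(0, 0.1)
            for m in range(12 * years)]

rng = random.Random(0)
north = [seasonal_series(6, rng) for _ in range(5)]  # cycle peaks mid-year
south = [seasonal_series(0, rng) for _ in range(5)]  # cycle peaks around January
labels = kmeans(north + south, k=2)
print(labels)
```

A point whose label disagrees with the rest of its hemisphere would correspond to the "anomalous regions" mentioned on the previous slide.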


Preprocessing

  • Time series preprocessing issues

    • Need to remove seasonality

      • Earth scientists are mostly interested in anomalies

    • Need to remove most of the autocorrelation

      • Statistical tests are affected by autocorrelation

    • Need to remove trends

      • Normally want to detect patterns and trends separately

    • Normally interested in similarity once differences in means and scale have been considered.

      • Pearson’s correlation coefficient has this property
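The last point can be checked directly: shifting and rescaling one series leaves Pearson's correlation unchanged. The snippet below is a small sketch with made-up numbers; the `pearson` helper is written out by hand rather than taken from any particular library.

```python
import math

def pearson(x, y):
    """Pearson's correlation coefficient of two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

x = [1.0, 3.0, 2.0, 5.0, 4.0]
y = [2.0, 1.0, 4.0, 3.0, 6.0]

# Same series as y, but with a different mean and a different scale.
y_shifted_scaled = [10.0 * v + 100.0 for v in y]

r_original = pearson(x, y)
r_transformed = pearson(x, y_shifted_scaled)
print(r_original, r_transformed)
```

The two printed values agree up to floating-point rounding, which is why differences in means and scale between locations do not need to be removed separately before correlating.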


Sample NPP Time Series

Correlations between time series

              Minneapolis   Atlanta   Sao Paulo
Minneapolis        1.0000    0.7591     -0.7581
Atlanta            0.7591    1.0000     -0.5739
Sao Paulo         -0.7581   -0.5739      1.0000


Seasonality Accounts for Much Correlation

Normalized using monthly Z score: subtract off the monthly mean and divide by the monthly standard deviation.

Correlations between time series

              Minneapolis   Atlanta   Sao Paulo
Minneapolis        1.0000    0.0492      0.0906
Atlanta            0.0492    1.0000     -0.0154
Sao Paulo          0.0906   -0.0154      1.0000
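The monthly Z score described above can be sketched as follows. The two series are synthetic (a shared annual cycle plus independent noise), not NPP data, and the function names are illustrative. The series is assumed to start in the same calendar month every year and to span at least two years with some variation in every month.

```python
import math
import random
import statistics

def pearson(x, y):
    """Pearson's correlation coefficient of two equal-length sequences."""
    mx, my = statistics.mean(x), statistics.mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return cov / math.sqrt(sum((a - mx) ** 2 for a in x)
                           * sum((b - my) ** 2 for b in y))

def monthly_zscore(series):
    """Subtract each calendar month's mean and divide by its standard deviation."""
    out = [0.0] * len(series)
    for month in range(12):
        idx = range(month, len(series), 12)
        vals = [series[i] for i in idx]
        mu, sigma = statistics.mean(vals), statistics.pstdev(vals)
        for i in idx:
            out[i] = (series[i] - mu) / sigma
    return out

def synthetic_series(rng, years=20):
    """Shared annual cycle plus independent noise (made-up data)."""
    return [10 * math.cos(2 * math.pi * m / 12) + rng.gauss(0, 1)
            for m in range(12 * years)]

rng = random.Random(42)
a, b = synthetic_series(rng), synthetic_series(rng)
raw_corr = pearson(a, b)
anomaly_corr = pearson(monthly_zscore(a), monthly_zscore(b))
print(f"raw: {raw_corr:.3f}, after monthly z-score: {anomaly_corr:.3f}")
```

On data like this the raw correlation is dominated by the shared seasonal cycle, and most of it disappears once each month is normalized against its own history.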




Preprocessing: Removing Trends

A slight linear trend added to two random time series increases their correlation dramatically, from 0.01 to 0.17.
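One simple way to remove a linear trend is to fit a least-squares line against the time index and keep the residuals. The sketch below assumes that approach (the slides do not say which detrending method was used) and runs it on synthetic noise with a hypothetical shared drift.

```python
import math
import random

def pearson(x, y):
    """Pearson's correlation coefficient of two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return cov / math.sqrt(sum((a - mx) ** 2 for a in x)
                           * sum((b - my) ** 2 for b in y))

def detrend(series):
    """Remove the ordinary least-squares line fitted against the time index."""
    n = len(series)
    t_mean = (n - 1) / 2
    y_mean = sum(series) / n
    slope = (sum((t - t_mean) * (y - y_mean) for t, y in enumerate(series))
             / sum((t - t_mean) ** 2 for t in range(n)))
    return [y - (y_mean + slope * (t - t_mean)) for t, y in enumerate(series)]

rng = random.Random(1)
n = 500
x = [rng.gauss(0, 1) for _ in range(n)]
y = [rng.gauss(0, 1) for _ in range(n)]
trend = [1.5 * t / n for t in range(n)]  # slight upward drift (hypothetical size)
x_trended = [a + d for a, d in zip(x, trend)]
y_trended = [b + d for b, d in zip(y, trend)]

print("random series:     ", round(pearson(x, y), 3))
print("with shared trend: ", round(pearson(x_trended, y_trended), 3))
print("after detrending:  ", round(pearson(detrend(x_trended), detrend(y_trended)), 3))
```

Because the fit is linear, detrending the drifted series gives the same residuals as detrending the original noise, so the correlation contributed by the shared trend is removed.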



© V. Kumar Discovery of Patterns in the Global Climate System using Data Mining 13

Ocean Climate Indices: Connecting the Ocean and the Land

  • An OCI is a time series of temperature or pressure

    • Based on Sea Surface Temperature (SST) or Sea Level Pressure (SLP)

  • OCIs are important because

    • They distill climate variability at a regional or global scale into a single time series.

    • They are well-accepted by Earth scientists.

    • They are related to well-known climate phenomena such as El Niño.


Ocean Climate Indices – ANOM 1+2

  • ANOM 1+2 is associated with El Niño and La Niña.

  • Defined as the Sea Surface Temperature (SST) anomalies in a region off the coast of Peru.

  • El Niño is associated with

    • Droughts in Australia and Southern Africa

    • Heavy rainfall along the western coast of South America

    • Milder winters in the Midwest

El Niño Events


Connection of ANOM 1+2 to Land Temp

OCIs capture teleconnections, i.e., the simultaneous variation in climate and related processes over widely separated points on the Earth.


Ocean Climate Indices - NAO

  • The North Atlantic Oscillation (NAO) is associated with climate variation in Europe and North America.

  • Defined as the normalized pressure difference between Ponta Delgada, Azores and Stykkisholmur, Iceland.

  • Associated with warm, wet winters in Europe and cold, dry winters in northern Canada and Greenland.

  • The eastern US experiences mild and wet winter conditions.

[Map labels: Iceland, Azores]
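A station-based index of this kind can be sketched as the difference between two standardized pressure series. The code below assumes that reading of "normalized pressure difference" (standardize each station, then subtract the northern station from the southern one); the pressure values are made up and the function names are hypothetical.

```python
import statistics

def standardize(series):
    """Convert a series to zero mean and unit standard deviation."""
    mu, sigma = statistics.mean(series), statistics.pstdev(series)
    return [(v - mu) / sigma for v in series]

def pressure_difference_index(south_slp, north_slp):
    """Standardized southern-station SLP minus standardized northern-station SLP."""
    return [s - n for s, n in zip(standardize(south_slp), standardize(north_slp))]

# Made-up sea level pressure values in hPa, one per time step.
azores = [1021.0, 1025.0, 1018.0, 1023.0, 1020.0, 1026.0]
iceland = [1004.0, 998.0, 1009.0, 1001.0, 1006.0, 996.0]

nao_like = pressure_difference_index(azores, iceland)
print([round(v, 2) for v in nao_like])
```

The index is positive when pressure is unusually high at the southern station and unusually low at the northern one, and negative in the opposite case.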


Connection of NAO to Land Temp


Influence of OCI on Land – Area Weighted Correlation

  • Correlation of an OCI with a land variable is a standard way to evaluate its “influence.”

    • Correlation does not imply causality.

    • Temperature and precipitation are the typical land variables.

  • If relatively many land points have a relatively high correlation, then an OCI is influential.

  • To evaluate whether clusters (or pairs) are potential OCIs we compute their area weighted correlation.

    • Weighted average of the correlation with land points, where weight is based on area.

    • May exclude points whose correlation is low and then calculate area weighted correlation.
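The area weighted correlation described above could be computed along these lines. This is a sketch under stated assumptions: grid cells are weighted by the cosine of their latitude (a common proxy for cell area on a regular latitude-longitude grid), the magnitude of each correlation is averaged, and the function name and input layout are hypothetical. The slides do not specify these details.

```python
import math

def area_weighted_correlation(points, threshold=0.0):
    """Weighted average of |correlation| over land grid cells.

    points: iterable of (latitude_degrees, correlation) pairs.
    threshold: cells with |correlation| below this value are excluded.
    """
    num = den = 0.0
    for lat, r in points:
        if abs(r) < threshold:
            continue
        weight = math.cos(math.radians(lat))  # assumed area proxy
        num += weight * abs(r)
        den += weight
    return num / den if den else 0.0

# Two made-up land cells: one on the equator, one at 60 degrees north.
cells = [(0.0, 0.5), (60.0, -0.2)]
awc_all = area_weighted_correlation(cells)
awc_strong = area_weighted_correlation(cells, threshold=0.3)
print(awc_all, awc_strong)
```

With `threshold=0.3` only the equatorial cell survives, which mirrors the option of excluding weakly correlated points before averaging.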


Evaluation of Known OCIs via Area Weighted Correlation

Area Weighted Correlation of Known OCIs to Land Temp

Overlapping, threshold = 0


Discovery of Ocean Climate Indices

  • Use clustering to find areas of the oceans that have high density, i.e., relatively homogeneous behavior.

    • Cluster centroids are potential OCIs.

    • For SLP, pairs of cluster centroids are potential OCIs.

  • Evaluate the “influence” of potential OCIs on land points.

  • Determine if the potential OCI matches a known OCI.

  • For potential OCIs that are not well-known, conduct further evaluation.

    • Are there land points that have higher correlation for the potential OCI than for known indices?


SST Clusters


Evaluating Cluster Centroids as Potential OCIs

  • Evaluation will be based on area weighted correlation

    • Ignore clusters whose area weighted correlation is low.

  • Three cases:

    • Clusters are highly similar to known OCIs (corr > 0.4)

      • May represent a known OCI

      • Clusters may be “better,” i.e., higher coverage

      • Clusters may cover different area, i.e., some points for which the new OCI is a better predictor

    • Clusters are moderately similar to known OCIs ( 0.25 < corr < 0.4 )

      • Again, new OCIs may be better predictors for some points.

    • Clusters are not similar to known OCIs (corr < 0.25)

      • These clusters may represent as yet undiscovered Earth Science phenomena.
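The three cases can be expressed as a small classification step. In the sketch below, a cluster centroid is compared with each known index and the strongest correlation decides the case. Taking the magnitude of the correlation and assigning the exact boundary values 0.25 and 0.4 to the moderate case are assumptions, since the slide leaves both open; the function name and example values are hypothetical.

```python
def classify_cluster(correlations_with_known):
    """Classify a candidate cluster by its best match among known OCIs.

    correlations_with_known: dict mapping a known index name to the
    correlation between that index and the cluster centroid.
    """
    best = max(abs(r) for r in correlations_with_known.values())
    if best > 0.4:
        return "high"       # may represent, or improve on, a known OCI
    if best >= 0.25:
        return "moderate"   # may still predict some points better
    return "low"            # candidate for an as yet unknown phenomenon

# Made-up correlations for three hypothetical clusters.
case_a = classify_cluster({"ANOM 1+2": 0.93, "NAO": 0.10})
case_b = classify_cluster({"ANOM 1+2": 0.05, "NAO": -0.30})
case_c = classify_cluster({"ANOM 1+2": 0.12, "NAO": 0.08})
print(case_a, case_b, case_c)
```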


SST Clusters Highly Correlated to Known Indices

Area Weighted Correlation of Cluster Centroids to Land Temp

Overlapping, threshold = 0


SST Clusters that Correspond to El Niño Climate Indices

75 78 67 94

El Niño Regions Defined by Earth Scientists

SNN (shared nearest neighbor) clusters of SST that are highly correlated with El Niño indices (correlation ~0.93).


SST Clusters Highly Correlated to Known Indices …

Examples of some SST clusters that are highly correlated to known OCIs and have high area weighted correlation with land temperature. These indices have a significant correlation with El Niño indices.


SST Clusters Highly Correlated to Known Indices

However, there are areas (yellow) where these clusters correlate better.


SST Clusters Highly Correlated to Known Indices


SST Cluster Moderately Correlated to Known Indices


Mining Associations in Earth Science Data

  • First, transform Earth Science data into transactions.

  • Find patterns using association discovery algorithms.

1 FPAR-HI PET-HI PREC-HI SOLAR-HI TEMP-HI ==> NPP-HI (support count=145, confidence=100%)

2 FPAR-HI PET-HI PREC-HI TEMP-HI ==> NPP-HI (support count=933, confidence=99.3%)

3 FPAR-HI PET-HI PREC-HI ==> NPP-HI (support count=1655, confidence=98.8%)

4 FPAR-HI PET-HI PREC-HI SOLAR-HI ==> NPP-HI (support count=268, confidence=98.2%)

75 FPAR-HI ==> NPP-HI (support count = 216924, confidence = 55.7%)
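The support count and confidence attached to each rule above follow the usual association-rule definitions: the support count is the number of transactions containing every item in the rule, and the confidence is that count divided by the number of transactions containing the left-hand side. The sketch below computes both on a handful of made-up transactions; the item names echo the slide, but the data and the function name are illustrative.

```python
def rule_stats(transactions, antecedent, consequent):
    """Return (support_count, confidence) for the rule antecedent ==> consequent."""
    antecedent, consequent = set(antecedent), set(consequent)
    n_antecedent = sum(1 for t in transactions if antecedent <= t)
    n_both = sum(1 for t in transactions if (antecedent | consequent) <= t)
    confidence = n_both / n_antecedent if n_antecedent else 0.0
    return n_both, confidence

# Each transaction is the set of events observed at one location and time.
transactions = [
    {"FPAR-HI", "PREC-HI", "NPP-HI"},
    {"FPAR-HI", "NPP-HI"},
    {"FPAR-HI"},
    {"PREC-HI", "TEMP-HI"},
    {"FPAR-HI", "PREC-HI", "NPP-HI"},
]

general_rule = rule_stats(transactions, ["FPAR-HI"], ["NPP-HI"])
specific_rule = rule_stats(transactions, ["FPAR-HI", "PREC-HI"], ["NPP-HI"])
print(general_rule, specific_rule)
```

As in the rules listed on the slide, adding conditions to the left-hand side lowers the support count while the confidence can rise.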


Example of Interesting Association Rules

FPAR-Hi ==> NPP-Hi (sup=5.9%, conf=55.7%)

Shrubland areas


Land Cover Types


Example of Interesting Association Rules…

[Plot: Support Count by Land Cover]

  • Temp-Hi ==> NPP-Hi tends to occur in the forest and cropland regions in the northern hemisphere (Forests (33.5%), Grassland (8.7%), Cropland (24.5%), Desert (0.4%)).


Gradient Analysis of SLP Data

SLP in June, 1992


Trajectory Analysis of SST Data

We chose a bounding box around the equatorial Pacific east of the dateline: longitude range 80W to 180W, latitude range 50N to 15S.

Then we calculated the centroid locations of the top 20% SST cluster in the given region and plotted the trajectory albums of the centroid movements over one year.
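One possible reading of this procedure is sketched below: keep the grid cells inside the bounding box, take the warmest 20% of them, and average their coordinates. This interpretation of "top 20% SST cluster", the unweighted centroid, the use of negative degrees for west and south, and all names and data are assumptions for illustration.

```python
def warm_centroid(cells, lon_range=(-180.0, -80.0), lat_range=(-15.0, 50.0),
                  top_fraction=0.2):
    """Centroid (lon, lat) of the warmest cells inside the bounding box.

    cells: list of (longitude, latitude, sst) tuples for one time step,
    with west longitudes and south latitudes given as negative degrees.
    """
    in_box = [c for c in cells
              if lon_range[0] <= c[0] <= lon_range[1]
              and lat_range[0] <= c[1] <= lat_range[1]]
    if not in_box:
        raise ValueError("no cells fall inside the bounding box")
    in_box.sort(key=lambda c: c[2], reverse=True)
    k = max(1, round(top_fraction * len(in_box)))
    warmest = in_box[:k]
    return (sum(c[0] for c in warmest) / k, sum(c[1] for c in warmest) / k)

# Made-up month: ten equatorial cells warming toward the east, plus one
# very warm cell outside the box that must be ignored.
month = [(-170.0 + 10.0 * i, 0.0, 20.0 + i) for i in range(10)]
month.append((100.0, 0.0, 35.0))

centroid = warm_centroid(month)
trajectory = [warm_centroid(m) for m in [month, month]]  # one centroid per month
print(centroid)
```

Repeating the calculation for each month of a year gives the sequence of centroid positions that a trajectory plot would connect.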


Weather Data

  • Obtained from Barbara Broome.

  • Data

    • One day at 6 time periods

    • Grid is 51 x 51 units

    • 9.66 by 6.76

    • Air temperature, pressure and two types of wind data


Pressure


Air Temperature


Winds 9


Winds 10