
Radan HUTH, Monika CAHYNOVÁ Institute of Atmospheric Physics, Prague, Czech Republic

Classifications of circulation patterns from the COST733 database: An assessment of synoptic-climatological applicability by two-sample Kolmogorov-Smirnov test. Radan HUTH, Monika CAHYNOVÁ, Institute of Atmospheric Physics, Prague, Czech Republic, huth@ufa.cas.cz.


Presentation Transcript


  1. Classifications of circulation patterns from the COST733 database: An assessment of synoptic-climatological applicability by two-sample Kolmogorov-Smirnov test Radan HUTH, Monika CAHYNOVÁ Institute of Atmospheric Physics, Prague, Czech Republic huth@ufa.cas.cz

  2. COST733 database (collection) • COST733 Action – “Harmonization and Applications of Weather Types Classifications for European Regions” • (very) large number of classifications produced • on unified data • SLP at 12 UTC • ERA40 (Sep 1957 – Aug 2002) • ~9, ~18, ~27 types wherever possible • 12 European domains

  3. COST733 database (collection) • version 2.0 of the database • released this spring • 18 methods for each domain • threshold-based: GWT (Beck), Litynski, Lamb (Jenkinson-Collison), P27 (Kruizinga), WLK • leader algorithm: Lund, Kirchhofer, Erpicum • PCA-based: T-mode PCA • optimization algorithms: CKMEANS, PCACA (k-means), Petisco, PCAXTRKMS, SANDRA, SANDRA-S, NNW (SOMs), PCAXTR • pseudo-random: random centroids • plus 7 subjective and objectivized classifications not attributable to any domain • ignored today

  4. COST733 database (collection) • different attributes of classifications • number of types (9 x 18 x 27) • sequencing (no vs. 4-day sequences) • seasonal vs. year-round definition • variable: all based on SLP, several additional variables used

  5. GOAL • assess the synoptic-climatological applicability of classifications • i.e., how well they stratify surface weather (climate) conditions • demonstrate effect of • sequencing • seasonal vs. annual definition • adding more variables • 500 hPa height • 500 hPa vorticity • 850/500 hPa thickness • number of types

  6. Classifications examined • 11 methods • 30 classifications available for each of them • differing in • sequencing (no x 4 days) • additional variables (Z500, THICK850/500, VOR500, all together) • number of types (9, 18, 27) • 5 methods • additional 6 classifications available • differing in • seasonality of definition (year-round x seasonal)

  7. TOOL • 2-sample Kolmogorov-Smirnov test • tests the equality of the distributions of the climate element under one type and under all the other types

  8. TOOL • at each station • types for which the K-S test rejects the equality of distributions are counted • the larger the count, the better the stratification, the better the synoptic-climatological applicability
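
The counting procedure on slides 7 and 8 can be sketched as follows. This is a minimal illustration, not the authors' code: the function names, the 5 % significance level, and the asymptotic p-value approximation (the Kolmogorov series with a small-sample correction factor) are all assumptions, since the slides do not state how the test was implemented.

```python
import math

def ks_statistic(a, b):
    """Two-sample K-S statistic: largest gap between the two empirical CDFs."""
    a, b = sorted(a), sorted(b)
    i = j = 0
    d = 0.0
    while i < len(a) and j < len(b):
        x = min(a[i], b[j])
        # Step both empirical CDFs past the current value x.
        while i < len(a) and a[i] <= x:
            i += 1
        while j < len(b) and b[j] <= x:
            j += 1
        d = max(d, abs(i / len(a) - j / len(b)))
    return d

def ks_pvalue(d, n1, n2):
    """Asymptotic p-value (assumed approximation, not taken from the slides)."""
    ne = n1 * n2 / (n1 + n2)
    lam = (math.sqrt(ne) + 0.12 + 0.11 / math.sqrt(ne)) * d
    if lam < 0.2:
        # The alternating series converges too slowly here; p is effectively 1.
        return 1.0
    s = sum((-1) ** (k - 1) * math.exp(-2.0 * k * k * lam * lam)
            for k in range(1, 101))
    return min(1.0, max(0.0, 2.0 * s))

def count_separated_types(values, types, alpha=0.05):
    """Count types whose distribution differs from that of all other days.

    values: daily values of the climate element at one station
    types:  circulation type label for each of those days
    alpha:  significance level (hypothetical choice)
    """
    rejected = 0
    for t in sorted(set(types)):
        inside = [v for v, c in zip(values, types) if c == t]
        outside = [v for v, c in zip(values, types) if c != t]
        if not inside or not outside:
            continue
        d = ks_statistic(inside, outside)
        if ks_pvalue(d, len(inside), len(outside)) < alpha:
            rejected += 1
    return rejected
```

Dividing the returned count by the number of types gives the percentage of rejected tests used for ranking on slide 10.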

  9. ANALYSIS • preliminary results • maximum temperature (minimum temperature – very similar results) (precipitation – different) • domain 07 (central Europe) • 39 stations from ECA&D database • winter (DJF) • Jan 1961 – Dec 2000

  10. RANKING OF CLASSIFICATIONS • at all stations individually: • for each classification: number of rejected K-S tests counted • classifications ranked by the %age of rejected K-S tests (= well separated classes) • higher %age → better → lower rank • for each classification: ranks averaged over stations • area mean rank → ranking of the classification
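
The ranking step could look roughly like this. It is a sketch under assumptions: the input layout (a dictionary of stations, each mapping classification names to the percentage of rejected K-S tests) and the tie rule (tied classifications share the mean of their positions) are hypothetical, as the slides do not say how ties were handled.

```python
def rank_descending(scores):
    """Rank 1 = highest score; ties share the average of their positions."""
    order = sorted(scores, key=scores.get, reverse=True)
    ranks = {}
    pos = 0
    while pos < len(order):
        end = pos
        # Extend the block while the next score equals the current one.
        while end + 1 < len(order) and scores[order[end + 1]] == scores[order[pos]]:
            end += 1
        shared = (pos + 1 + end + 1) / 2
        for name in order[pos:end + 1]:
            ranks[name] = shared
        pos = end + 1
    return ranks

def area_mean_ranks(pct_rejected):
    """Average each classification's per-station rank over all stations.

    pct_rejected: {station: {classification: % of rejected K-S tests}}
    """
    collected = {}
    for station_scores in pct_rejected.values():
        for name, rank in rank_descending(station_scores).items():
            collected.setdefault(name, []).append(rank)
    return {name: sum(r) / len(r) for name, r in collected.items()}

if __name__ == "__main__":
    # Hypothetical station and classification names, for illustration only.
    demo = {
        "station_1": {"A": 90.0, "B": 50.0, "C": 50.0},
        "station_2": {"A": 60.0, "B": 80.0, "C": 40.0},
    }
    print(area_mean_ranks(demo))
```

A lower area mean rank indicates a classification that separates the climate element better on average across the stations.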

  11. Result 1: comparison of methods • area mean ranks averaged over 30 realizations of each method • result: order of the method, independent of any attribute (no. of types, sequencing, variable)

  12. Result 1: comparison of methods so the winner is…

  13. Result 1: comparison of methods NOTE: not all methods participated in the race!

  14. Result 2: sensitivity to the number of types • all pairs of classifications • differing in no. of types • 9 vs. 18 • 18 vs. 27 • with all other attributes equal • difference in rank is calculated • histogram of differences • t-test: equality of the difference to zero -106 ± 17

  15. Result 2: sensitivity to the number of types • all pairs of classifications • differing in no. of types • 9 vs. 18 • 18 vs. 27 • with all other attributes equal • difference in rank is calculated • histogram of differences • t-test: equality of the difference to zero -55 ± 12

  16. Result 3: effect of sequencing • all pairs of classifications • differing in sequencing (no vs. 4-days) • with all other attributes equal • difference in rank is calculated • histogram of differences • t-test: equality of the difference to zero -30 ± 11

  17. Result 4: effect of seasonality • all pairs of classifications • differing in the seasonality in their definition • with all other attributes equal • difference in rank is calculated • histogram of differences • t-test: equality of the difference to zero -44 ± 24
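
The paired comparison used for Results 2 to 4 can be sketched as below. The data layout (attribute tuples as keys), the sign convention (rank of the first value minus rank of the second), and the function names are assumptions for illustration. The slides do not define what the "±" figures denote, so the sketch reports only the mean difference and the one-sample t statistic.

```python
import math
from statistics import mean, stdev

def paired_differences(ranks, field, a, b):
    """Rank differences for pairs that differ only in one attribute.

    ranks: {attribute tuple: area mean rank}
    field: index of the attribute that is allowed to differ
    a, b:  the two attribute values compared (difference = rank[a] - rank[b])
    """
    diffs = []
    for key, rank_a in ranks.items():
        if key[field] != a:
            continue
        # The partner shares every attribute except the one under study.
        partner = key[:field] + (b,) + key[field + 1:]
        if partner in ranks:
            diffs.append(rank_a - ranks[partner])
    return diffs

def t_statistic(diffs):
    """One-sample t statistic for H0: mean difference = 0 (n - 1 d.o.f.)."""
    n = len(diffs)
    return mean(diffs) / (stdev(diffs) / math.sqrt(n))

if __name__ == "__main__":
    # Hypothetical ranks keyed by (method, number of types, sequencing).
    demo = {
        ("M1", 9, "no"): 10.0, ("M1", 18, "no"): 30.0,
        ("M2", 9, "no"): 20.0, ("M2", 18, "no"): 50.0,
        ("M3", 9, "4d"): 5.0, ("M3", 18, "4d"): 15.0,
    }
    d = paired_differences(demo, 1, 9, 18)
    print(mean(d), t_statistic(d))
```

Under this sign convention, a negative mean difference means that the first attribute value tends to receive the lower (better) rank.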

  18. Result 5: effect of additional variables +68 ± 18 +42 ± 24

  19. Result 5: effect of additional variables +41 ± 18 +61 ± 19

  20. CONCLUSIONS • various kinds of cluster analysis perform well • fewer types → better performance • sequencing adds value: surface temperature is better described by types of 4-day sequences than types of instantaneous fields • seasonal definition better than annual, but: • systematic difference in the number of types (7 vs. 9) • additional variables bring no benefit; in fact they worsen the synoptic-climatological applicability

  21. OUTLOOK • analysis to extend to • all domains • more variables (Tmin, Precip) • more comparisons will be possible → results may be more general • several other criteria as well • other datasets (gridded: ENSEMBLES, reanalyses)
