
29 October, 2001 Steven Worley National Center for Atmospheric Research



Presentation Transcript


  1. Scientific Investigations: Support from Research Data Archives • Computing in Atmospheric Sciences 2001 • 29 October, 2001 • Steven Worley • National Center for Atmospheric Research, Scientific Computing Division

  2. Key Steps of Scientific Investigations • Formulate the questions and review the state of understanding • Search and discover data • Access data • Analyze data • Community sharing and archive • Document new understandings

  3. Search and Discover Data • How? Web-based information server • Salient Features • 2,500+ HTML pages (metadata) • All datasets are described (500+) • Location of all data files in the MSS • Higher level information • Catalogs • Project-specific descriptions • Always-current dataset descriptions

  4. Features • Organization Navigation • Archive Navigation • Pull down menus • Search • Project Links

  5. Dataset Page • Title and Brief description • Systematic Navigation • Metadata highlights • Period of Record • Usage • Variables • Related Sites (NOAA) • Contact Person • Related Datasets

  6. Brief Archive History and Specifications • Started in the mid-1960s (35 years) • Managed by nine people • 211K data files • 17 TB in the MSS • 530 datasets of all sizes

  7. Global Observations • Usages: • Input for global atmospheric reanalysis • Basic long term climate assessment and case studies

  8. Operational and Composite Analyses • Daily SLP is a small but very popular dataset, e.g. for NAO evaluations • Two main operational centers provide the best current analyses

  9. Concerns • Restricted distribution • U.S. non-profits and UCAR members only • Need online authentication and authorization for easy access • Key Aspects • Medium-size archive: 170 gigabytes • Complex: multiple products, temporal resolutions, and spatial resolutions

  10. Highlights • Frequent updates to FNL, 1°, daily via FTP • High resolution N. America product, ETA at 40 km • No distribution restrictions or cost

  11. Reanalyses • Notes: • ERA-15 is finished, ERA-40 is running now • NCEP II is primarily an experimental run

  12. Outstanding Features • Three different coordinate surfaces • Very long analysis, 2+ terabytes in size • Unrestricted distribution • CD-ROMs are very popular

  13. Countries Receiving Reanalysis CDROMs • Highlights • Over 8900 CDROMs, 1997 to 09/2001 • Recipients: U.S. 46%, Japan 11%, (Canada, UK) 4%, (Germany, India) 3%, (Australia, S. Korea, Spain, Mexico, Norway, Russia, France) 2%

  14. Reanalysis Users for 2001 (4th qtr estimated) • 209 from the MSS [157 Jan.-Sep.] • 47 on CDROM [35] • 48 custom data orders on FTP or tape [36] • 540 from the online server [406] • 844 total served

  15. Reanalysis Data Distributed for 2001 (4th qtr estimated) • 9616 GB from the MSS [7230 GB Jan.-Sep.] • 808 GB on CD-ROM [935 CDROMs at 650 MB/CDROM] • 1383 GB custom orders, FTP and tape [1040 GB] • 88 GB from the online server [66 GB] • 11895 GB (11.9 TB) total
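
The totals on slides 14 and 15 can be checked with a few lines of arithmetic. The sketch below only re-adds the figures quoted on the slides; the category labels are shortened for illustration, and the per-user average is a derived number that does not appear in the presentation. It assumes decimal units (1 TB = 1000 GB), which is what reproduces the 11.9 TB figure.

```python
# Figures copied from slides 14 and 15 (2001, 4th quarter estimated).
users = {
    "MSS": 209,
    "CD-ROM": 47,
    "Custom orders (FTP or tape)": 48,
    "Online server": 540,
}
volume_gb = {
    "MSS": 9616,
    "CD-ROM": 808,
    "Custom orders (FTP or tape)": 1383,
    "Online server": 88,
}

total_users = sum(users.values())
total_gb = sum(volume_gb.values())
total_tb = total_gb / 1000  # assumption: decimal units, 1 TB = 1000 GB

# Derived quantity (not on the slides): average volume per user by method.
gb_per_user = {method: volume_gb[method] / users[method] for method in users}

print(f"Total users served: {total_users}")
print(f"Total data distributed: {total_gb} GB ({total_tb:.1f} TB)")
for method, avg in gb_per_user.items():
    print(f"  {method}: {avg:.2f} GB per user on average")
```

Both sums agree with the totals stated on the slides (844 users, 11895 GB).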

  16. GCIP Model Data Center Collection • High resolution atmospheric models focused on energy and hydrology cycles • Critical data for N. American mesoscale studies • Complete archive is about 1 terabyte • GCIP: GEWEX Continental-Scale International Project • GEWEX: Global Energy and Water Cycle Experiment

  17. University of Miami Ocean Model Data • Figure: 6-yr mean T at 5 meters • MICOM: Miami Isopycnic Coordinate Ocean Model, 1/12th degree, 70N to 28S, 16-20 layers

  18. Dataset Sizes and Scales • Today • ~800 unique users • ~12 terabytes of data transferred • 2 terabyte dataset size • Example: NCEP/NCAR Reanalysis • Near Future (excludes TB-PB Level 0 and 1 satellite data and the super-scale experimental models) • Number of users, ~same • Data transferred, 5x to 10x more? • Dataset size, 2-20 TB • Examples: • Ocean and atmosphere models • ECMWF Reanalysis (ERA-40)
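
As a rough illustration of what "5x to 10x more" means in absolute terms, the sketch below scales the ~12 TB transferred today by the two growth factors on slide 18. The result is plain multiplication of the slide's own numbers, not a figure given in the presentation.

```python
# Inputs taken from slide 18.
current_transfer_tb = 12      # ~12 TB transferred today
growth_factors = (5, 10)      # "5x to 10x more ?"

# Projected annual transfer range, assuming the factors apply to today's total.
low_tb, high_tb = (current_transfer_tb * factor for factor in growth_factors)

print(f"Projected transfer: {low_tb} to {high_tb} TB")
```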

  19. Access to Data • Methods • NCAR computers • From the local MSS • Web data server • Custom data packages by request (FTP, tape, CDROM) • Users • World class programmers • Research scientists • Graduate students • Undergraduate students

  20. Data Access in the Future • Do we continue doing what we are doing? “Absolutely.” • Why? It works. • Over 1000 users annually • Very diverse skills • The archive is a heterogeneous collection • Many formats (ASCII, binary, GRIB, BUFR, netCDF, HDF) • Many sizes (1 MB to 2 TB) • Capable of serving large and small projects • Maintain a variety of flexible methods

  21. Data Access in the Future • Keys to handling future larger collections • Plan to create useful data products • Condensed datasets from high resolution output • Group the most popular variables and products together • Serve many, e.g. CDROMs and WWW • Continue to develop emerging online data systems • User-driven subset selection with graphics and data download options • Server-side elementary analysis • Multi-dataset comparisons • Statistical summaries and basic meteorological calculations • Our development is the “Community Data Portal”
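
To make "user-driven subset selection" and "statistical summaries" concrete, here is a minimal sketch of the idea in plain Python. It is not the Community Data Portal's implementation: the function names `subset` and `summarize`, the (lat, lon, value) point layout, and the grid values are all hypothetical and synthetic, chosen only to show a latitude/longitude box selection followed by a basic summary.

```python
from statistics import mean


def subset(points, lat_range, lon_range):
    """Return the (lat, lon, value) points that fall inside the requested box."""
    (lat_min, lat_max), (lon_min, lon_max) = lat_range, lon_range
    return [
        (lat, lon, value)
        for lat, lon, value in points
        if lat_min <= lat <= lat_max and lon_min <= lon <= lon_max
    ]


def summarize(points):
    """Compute a basic statistical summary of the selected values."""
    values = [value for _, _, value in points]
    return {
        "count": len(values),
        "min": min(values),
        "max": max(values),
        "mean": mean(values),
    }


# Synthetic 10-degree grid; the values are made up and are not archive data.
grid = [
    (lat, lon, float(lat + lon))
    for lat in range(0, 50, 10)
    for lon in range(0, 50, 10)
]

# A user asks for the box 10-20 N, 20-30 E and a summary of what it contains.
box = subset(grid, lat_range=(10, 20), lon_range=(20, 30))
stats = summarize(box)
print(stats)
```

Running the selection on the server and returning only the box and its summary is what keeps the transfer small compared with shipping the full dataset.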

  22. Data Analysis • Tools • NCAR Command Language (NCL) software • Features in brief • I/O for many ‘standard’ data formats • Easy adaptations to read any format • Hundreds of meteorological functions • “Publication quality” graphics • The CDP is capable of analysis • NCL is one of several middleware packages

  23. Community Sharing • Support for the scientist • A place to distribute new data results • Possibly with authentication and authorization control • E.g. model outputs • Spin-off benefit • New data resources for the archive • Many users can then use the new product

  24. NCEP Operational Analyses blended with QSCAT Satellite data • Wind Stress Curl, 01/24/2000 1800 UTC • Three figure panels (a, b, c): NCEP Operational ONLY; NCEP + QSCAT swaths; OI blend of NCEP + QSCAT • Blending by Colorado Research Associates • We archive all three products.

  25. Key Steps of Scientific Investigations • Formulate the questions and review the state of understanding • Search and discover data • Access data • Analyze data • Community sharing and archive • Document new understandings
