Early Access to NCI Climate Data & Analysis Systems
Ben Evans (Ben.Evans@anu.edu.au)
NCI: Vision and Role
Vision: Provide Australian researchers with a world-class, high-end computing service
Aim
HSM (tape) – 1 or 2 copies at 2 locations
Scratch disk – one location, 2 speed options
Persistent disk – two locations, 2 speed options
Persistent data services – two locations, movable/synchronised VMs
Self-managed backup, synchronised data, or HSM
Filesystems layered on top of hardware options.
Specialised: Databases, Storage Objects, HDFS, …
Domain speciality: ESG, THREDDS/OpenDAP, Workflows, Data Archives (OAIS)
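As an illustration of the variable-level access these domain services allow, here is a minimal Python sketch using netCDF4 against an OPeNDAP endpoint; the URL and variable name are hypothetical placeholders, not real NCI paths.

```python
# Minimal sketch of variable-level access via OPeNDAP: only the requested
# hyperslab crosses the network, rather than the whole file.
# NOTE: the URL and variable name below are hypothetical placeholders.
from netCDF4 import Dataset

url = "http://example.nci.org.au/thredds/dodsC/cmip5/sample.nc"  # hypothetical
ds = Dataset(url)

tas = ds.variables["tas"]      # e.g. near-surface air temperature
subset = tas[0, :10, :10]      # the server returns only this slice
print(subset.shape)
ds.close()
```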
Special Features for the Data-Intensive Cloud
Astronomy Virtual Observatory
Less time at the telescope - more time in front of the computer!
E.g. the National Environmental Satellite Data Backbone at NCI – CSIRO, GA
Earth Observing (EO) sensors carried on space-borne platforms produce large (multiple TB/year) data sets serving multiple research and application communities.
NCI established a single national archive of raw (unprocessed) MODIS data for the Australian region, together with processing software common to all users and specialised tools for applications. LANDSAT is now being processed as well.
The high-quality historical archive is complemented by exploiting NCI's network connectivity to download and merge data acquired in real time directly from the spacecraft by local reception stations around Australia.
Data products and tools are available through web technologies and embedded workflows (a sketch follows below).
Collaborators: King, Evans, Lewis, Wu, Lymburner
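To give a concrete flavour of working with the archive, the hedged sketch below lists and reads one science dataset from a local MODIS HDF4 granule with GDAL; the granule name is illustrative only, not a specific archive holding.

```python
# Hedged sketch: inspect a MODIS HDF4 granule with GDAL before processing.
# NOTE: the granule name is illustrative, not a specific archive file.
from osgeo import gdal

granule = "MOD09GA.A2011263.h29v12.005.hdf"  # hypothetical file name
ds = gdal.Open(granule)

# MODIS granules bundle many science datasets; list them first.
for name, description in ds.GetSubDatasets():
    print(name, "->", description)

# Open the first science dataset and read it into a NumPy array.
sub = gdal.Open(ds.GetSubDatasets()[0][0])
data = sub.GetRasterBand(1).ReadAsArray()
print(data.shape, data.dtype)
```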
ESG – internationally significant climate model data
Data analysis capability
Mission: The DOE SciDAC-2 Earth System Grid Center for Enabling Technologies (ESG-CET) project provides climate researchers worldwide with access to the data, information, models, analysis tools, and computational resources required to make sense of enormous climate simulation datasets.
ESGF – Federation of worldwide sites providing data. Core nodes are: PCMDI (LLNL), BADC (UK), DKRZ (MPI), NCAR/JPL. NCI joined to provide an Australian Node.
NCI: supports the ESG as the Australian node and performs subsequent processing
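For orientation, here is a hedged sketch of querying the federation's search service: the request shape follows the public ESGF esg-search REST API, though the node hostname and facet values are illustrative.

```python
# Hedged sketch: query an ESGF index node for CMIP5 datasets.
# NOTE: the hostname and facet values are illustrative.
import requests

resp = requests.get(
    "https://esgf-node.llnl.gov/esg-search/search",
    params={
        "project": "CMIP5",
        "experiment": "historical",
        "variable": "tas",
        "format": "application/solr+json",
        "limit": 5,
    },
    timeout=30,
)
resp.raise_for_status()

# Solr-style response: matching dataset records live under response/docs.
for doc in resp.json()["response"]["docs"]:
    print(doc["id"])
```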
Status of Publishing – CSIRO-QCCCE mk3.6
Status of Publishing – CAWCR ACCESS
Status of ESG software
24 modelling groups, 25 platforms being described, 44 models, 65 grids, and 223 simulations
CAWCR - Centre for Australian Weather and Climate Research
CCCMA - Canadian Centre for Climate Modelling and Analysis
CCSM - Community Climate System Model
CMA-BCC - Beijing Climate Center, China Meteorological Administration
CMCC - Centro Euro-Mediterraneo per i Cambiamenti Climatici
CNRM-CERFACS - Centre National de Recherches Meteorologiques / Centre Europeen de Recherche et Formation Avancees en Calcul Scientifique
EC-Earth - Europe
FIO - The First Institute of Oceanography, SOA, China
GCESS - College of Global Change and Earth System Science, Beijing Normal University
GFDL - Geophysical Fluid Dynamics Laboratory
INM - Russian Institute for Numerical Mathematics
IPSL - Institut Pierre Simon Laplace
LASG - Institute of Atmospheric Physics, Chinese Academy of Sciences, China
MIROC - University of Tokyo, National Institute for Environmental Studies, and Japan Agency for Marine-Earth Science and Technology
MOHC - UK Met Office Hadley Centre
MPI-M - Max Planck Institute for Meteorology
MRI - Japanese Meteorological Institute
NASA GISS - NASA Goddard Institute for Space Studies, USA
NCAR - US National Centre for Atmospheric Research
NCAS - UK National Centre for Atmospheric Science
NCC - Norwegian Climate Centre
NIMR - Korean National Institute for Meteorological Research
QCCCE-CSIRO - Queensland Climate Change Centre of Excellence and Commonwealth Scientific and Industrial Research Organisation
RSMAS - University of Miami - RSMAS
~60 experiments within CMIP5
~20 modelling centres (from around the world) using
~several model configurations each
~2 million output “atomic” datasets
~tens of petabytes of output
~2 petabytes of CMIP5 requested output
~1 petabyte of CMIP5 “replicated” output
Of the replicated output:
~ 220 TB decadal
~ 540 TB long term
~ 220 TB atmos-only
~80 TB of 3hourly data
~215 TB of ocean 3d monthly data!
~250 TB for the cloud feedbacks!
~10 TB of land-biochemistry (from the long term experiments alone).
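As a sanity check on these figures, the three experiment categories alone sum to roughly 220 + 540 + 220 = 980 TB, consistent with the ~1 petabyte of replicated output quoted above.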
Slide sourced from the Metafor website, early 2011.
Data replication is supported by two methods:
Bulk fast transfers by ESG nodes
User-initiated data transfers at the variable level
The first method is fast but requires coordination between the sites and across international networks.
The second method can be very slow but is relatively "simple".
We will provide more details during our session today.
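A minimal sketch of the second, user-initiated method, assuming a list of per-variable file URLs and their published MD5 checksums (the entries below are placeholders): each file is streamed to disk and verified on arrival.

```python
# Sketch of user-initiated, variable-level transfer with checksum verification.
# NOTE: the URL and checksum below are placeholders, not real ESG file records.
import hashlib
import requests

files = [
    ("http://example-node.org/thredds/fileServer/tas_Amon_sample.nc", "0" * 32),
]

for url, expected_md5 in files:
    name = url.rsplit("/", 1)[-1]
    md5 = hashlib.md5()
    with requests.get(url, stream=True, timeout=60) as r:
        r.raise_for_status()
        with open(name, "wb") as out:
            for chunk in r.iter_content(chunk_size=1 << 20):
                out.write(chunk)
                md5.update(chunk)
    status = "OK" if md5.hexdigest() == expected_md5 else "CHECKSUM MISMATCH"
    print(name, status)
```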
Slow links to some sites can get swamped
ESG software not ready to manage official replicas
There is not yet a clear view of when model data will be available.
Data may need to be revoked as errors are found (see the manifest-check sketch after this list).
Data capacity being closely managed/prioritised.
CAWCR, the CoE, CSIRO, BoM and shareholders are monitoring for future expansion.
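On the revocation point above, one plausible local safeguard (a sketch under assumptions, not the official ESG mechanism) is to compare replica checksums against a manifest of currently published MD5s; the manifest format and directory layout here are assumed.

```python
# Hedged sketch: flag replica files that have been withdrawn or superseded,
# by comparing local MD5s against a (hypothetical) published manifest whose
# lines look like "<md5>  <filename>".
import hashlib
from pathlib import Path

def md5sum(path: Path) -> str:
    h = hashlib.md5()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()

manifest = {}
for line in Path("manifest.txt").read_text().splitlines():
    checksum, name = line.split(None, 1)
    manifest[name.strip()] = checksum

for path in sorted(Path("replica").glob("*.nc")):
    published = manifest.get(path.name)
    if published is None:
        print(path.name, "WITHDRAWN from manifest: candidate for revocation")
    elif md5sum(path) != published:
        print(path.name, "SUPERSEDED: re-download required")
```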