
The GEM Computational System and Recent Scientific Results

Andrea Donnellan. Third International ACES Meeting, May 10, 2002.


Presentation Transcript


  1. The GEM Computational System and Recent Scientific Results. Andrea Donnellan, Third International ACES Meeting, May 10, 2002

  2. Data Volumes from Observations • GRACE: 50 MB/day onboard, 8 GB/day derived product • ECHO: 100 GB/day onboard • SRTM: 12 TB raw data • ICESat: 1 GB/day onboard, 2 GB/day derived • SCIGN: 250 MB daily, 7.5 GB/day for real time • Airborne observations: LIDAR • VCL: 2 GB/day onboard, 4 GB/day derived • Hyperspectral imagery: 100 GB/day raw • Imaging LIDAR: >20 GB/day, >40 GB/day

  3. Volumes from Models • Geodynamo model: 1 GB of storage for one model run; 2010: 5 TB/run; minimal need of 10 runs • General earthquake/lithospheric models: 1 TB/run; 2010: 10 PB/run (multiple scales combined, many regions) • Gravity: 100 GB/run; 2010: 2 TB/run • Mantle convection models: 1 TB/run; 2010: 10 PB/run • Geomagnetic field models: 32 GB/run; 2010: 300 GB/run

  4. Where We Will Be in 2010 • Multiple solid earth missions flying • PetaBytes of data per year gathered in a distributed fashion • Data analyzed by widely distributed scientists using widely distributed computational resources • Growing need for integration of information from multiple sources on multiple scales into an integrated analysis

  5. Goal • World-wide computational systems supporting gathering of 3 PetaBytes of data per year, integrating analysis, visualization, simulation, and interpretation.
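A rough conversion of the 3 PetaBytes per year goal into daily and per-second rates may help relate it to the bandwidth figures on the other slides. This is only a sketch: the decimal unit definitions and the 365-day year are assumptions, and the function name `yearly_volume_rates` is hypothetical, not something from the presentation.

```python
# Back-of-envelope conversion of the slide's 3 PB/year goal.
# Assumptions (not from the slides): decimal units and a 365-day year.
PB = 10**15
TB = 10**12
SECONDS_PER_DAY = 86_400
DAYS_PER_YEAR = 365


def yearly_volume_rates(bytes_per_year):
    """Return (TB per day, average Gbit/s) for a yearly data volume."""
    bytes_per_day = bytes_per_year / DAYS_PER_YEAR
    tb_per_day = bytes_per_day / TB
    gbit_per_s = bytes_per_day * 8 / SECONDS_PER_DAY / 1e9
    return tb_per_day, gbit_per_s


tb_per_day, gbit_per_s = yearly_volume_rates(3 * PB)
print(f"{tb_per_day:.1f} TB/day, {gbit_per_s:.2f} Gbit/s averaged over the year")
```

Under these assumptions the goal works out to roughly 8 TB per day in aggregate, which is the same order as the "TeraBytes per day per mission" bandwidth requirement.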

  6. Requirements • Onboard adaptive processing • High space to ground bandwidth of TeraBytes per day per mission • Data transmission and handling • Reusable capabilities (framework) • Data processing (100 Petaflops per mission per year)

  7. Requirements (continued) • Product storage (National Virtual Solid Earth Science Observatory) using cooperative federated databases • Distributed computational environment for analysis (interoperable framework, portal) • Software tools • Hardware

  8. Hardware (Hierarchical) • Large central Petaflop computers with TeraBytes of memory • Single sign-on seamless access • Distributed computers for decomposable problems • Cluster computers (e.g. Beowulf for cost performance) • Heterogeneous computational capabilities (e.g. for storage, visualization, computing)

  9. Software • Problem Solving Environment • Visualization tools • Analysis algorithms • Data mining • Framework • Supports software integration into multidisciplinary analysis • Interoperability between data, software, and computer systems

  10. GEM/SERVO Components • Visualization • Model and algorithm development • IT: GRID technologies • Computational Environments/PSEs • Data handling/archiving • Assimilation • Datamining/pattern recognition • Data fusion • High speed networks • High end computers • Clusters • Laptops • Cycles needed and other infrastructure • Scalable system

  11. Solid Earth Research Virtual Observatory (SERVO) [Diagram: tiered SERVO architecture. Recoverable labels: Observations; Downlink; Archive; HPSS; ~TBytes/day; 1 PB per year data rate in 2010; Goddard, Ames, Langley; Tier 0 +1 HPSS Archive; 100 TeraFLOPs sustained; Tier 1; Tier 2 Center; Fully functional problem solving environment; Tier 3; Institute; 100 - 1000 Mbits/sec; Data cache; Tier 4 Workstations, other portals] • Program-to-program communication in milliseconds • Approximately 100 model codes • Plug and play composing of parallel programs from algorithmic modules • On-demand downloads of 100 GB in 5 minutes • 10^6 volume elements rendering in real-time

  12. Virtual Observatory Project • Solid earth research virtual observatory (SERVO) • On-demand downloads of 100 GB files from 40 TB datasets within 5 minutes. • Uniform access to 1000 archive sites with volumes from 1 TB to 1 PB [Timeline chart: capability versus year, 2003 to 2010. Milestone labels: Scaled to 100 sites; Prototype cooperative federated data base service integrating 5 datasets of 10 TB each; Prototype modeling service capable of integrating 5 modules; Decomposition into services with requirements; Prototype 1920x1080 pixels at 120 frames per second visualization service; Prototype data analysis service; Architecture & technology approach]
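The on-demand download target (100 GB within 5 minutes) implies a sustained transfer rate that can be compared with the 100 - 1000 Mbits/sec institute links shown on the SERVO diagram. The sketch below does that arithmetic, assuming decimal units and ignoring protocol overhead; the helper names are hypothetical.

```python
# Sustained bandwidth implied by "100 GB in 5 minutes".
# Assumptions (not from the slides): 1 GB = 10**9 bytes, no protocol overhead.

def required_gbit_per_s(gigabytes, minutes):
    """Sustained rate in Gbit/s needed to move `gigabytes` in `minutes`."""
    bits = gigabytes * 1e9 * 8
    return bits / (minutes * 60) / 1e9


def transfer_minutes(gigabytes, mbit_per_s):
    """Minutes needed to move `gigabytes` over a link of `mbit_per_s`."""
    bits = gigabytes * 1e9 * 8
    return bits / (mbit_per_s * 1e6) / 60


needed = required_gbit_per_s(100, 5)
at_fast_link = transfer_minutes(100, 1000)
at_slow_link = transfer_minutes(100, 100)
print(f"needed: {needed:.2f} Gbit/s")
print(f"100 GB at 1000 Mbit/s: {at_fast_link:.1f} min")
print(f"100 GB at 100 Mbit/s: {at_slow_link:.1f} min")
```

Under these assumptions the target needs about 2.7 Gbit/s sustained, so a single 1000 Mbits/sec link would take roughly 13 minutes for the same file.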

  13. Problem Solving Environment Project • Fully functional PSE used to develop models for building blocks for simulations. • Program-to-program communication in milliseconds using staging, streaming, and advanced cache replication • Integrated with SERVO • Plug and play composing of parallel programs from algorithmic modules • Extend PSE to include: • 20 users collaboratory with shared windows • Seamless access to high-performance computers linking remote processes over Gb data channels. [Timeline chart: capability versus year, 2003 to 2010. Milestone labels: Integrated visualization service with volumetric rendering; Plug and play composing of sequential programs from algorithmic modules; Prototype PSE front end (portal) integrating 10 local and remote services; Isolated platform dependent code fragments]

  14. Computational Environment • ~100 model codes with parallel scaled efficiency of 50% • ~10^4 PetaFLOPs throughput per subfield per year • ~100 TeraFLOPs sustained capability per model • ~10^6 volume elements rendering in real time [Timeline chart: capability versus year, 2003 to 2010. Labels: Access to mixture of platforms, low cost clusters (20-100) to supercomputers with massive memory and thousands of processors; 100's GigaFLOPs; 40 GB RAM; 1 Gb/s network bandwidth]

  15. The Ventura Basin is Actively Deforming

  16. Northridge Example • Northridge class simulation: 100,000 unknowns, 4000 time steps -> 8 hours on a high-end workstation. • Southern California system: 0.5 km resolution -> 100,000 processor hours, or 400 hours (17 days) on a dedicated 256-processor machine.
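The Southern California estimate on the Northridge Example slide can be reproduced by dividing processor hours by processor count. The sketch below assumes ideal parallel scaling by default; the `efficiency` parameter and the function name are hypothetical additions for illustration, not part of the presentation.

```python
# Wall-clock estimate from the slide's figures: 100,000 processor hours
# on a dedicated 256-processor machine.
# Assumption (not from the slide): perfect scaling unless `efficiency` is lowered.

def wall_clock_hours(processor_hours, processors, efficiency=1.0):
    """Wall-clock hours for a job of `processor_hours` on `processors` CPUs."""
    if processors <= 0:
        raise ValueError("processors must be positive")
    if not 0 < efficiency <= 1:
        raise ValueError("efficiency must be in (0, 1]")
    return processor_hours / (processors * efficiency)


hours = wall_clock_hours(100_000, 256)
days = hours / 24
print(f"{hours:.1f} hours, {days:.1f} days")
```

With ideal scaling this gives about 391 hours, or a little over 16 days, which the slide rounds to 400 hours (17 days). Passing a lower `efficiency` (for example 0.5) shows how quickly the wall-clock time grows when scaling is imperfect.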

  17. Steep Gradient Largely Attributable to Low Rigidity Basin Fill

  18. Coseismic Removed from the Interferogram; Postseismic Interferogram

  19. Results from Data Inversion Show Fault Afterslip as Primary Mechanism

  20. Comparison of InSAR and Seismic Anomalies • A similar anomaly shows up in both the postseismic deformation indicated by GPS and InSAR (Donnellan et al.) and the seismic anomalies identified using Principal Component Analysis (Rundle and Tiampo). • The Mojave Desert shows a similar correlation near Barstow and the Blackwater Fault (Rundle and Tiampo; Peltzer).

  21. Recent GPS Results • Similar to pre-seismic velocity field, particularly near the source.

  22. Residuals

  23. Anomalous Motion at JPL was Observed Related to the Northridge Earthquake [Figure: axis label "Residual Geodetic Longitude (cm)"; map labels "Sierra Madre Fault", "1 m"] • JPL is several fault dimensions away from the Northridge rupture. • The earthquake probably triggered slip on the Sierra Madre Fault in the upper 0.5 km. • Based on additional observations collected near JPL. • Later extent of anomaly is unknown due to lack of stations.

  24. California 3D Fault Simulations • Faults are shown as light lines; the earthquakes at model year 4526 are shown as dark lines. • Simulations indicate that major events are clustered in time, like the real events. • Simulations using a realistic heterogeneous earth structure are computationally intensive.

  25. Modeling Faults as Interacting Systems [Figures: Southern California Seismicity; Space-time Stress Diagram. Courtesy John Rundle] • Transients likely occur as a result of stress redistribution. • They are observed on different faults, sometimes a few fault dimensions away.

  26. Conclusions • 90% of Northridge postseismic motion was aseismic. • Afterslip on the mainshock rupture plane responsible for most of the deformation. • No evidence for lower crustal relaxation playing a major role in postseismic motions. • Recent deformation is consistent with that observed before the earthquake.

  27. More Conclusions • High velocity gradient largely attributable to a low rigidity basin. • Lower crust is a minor player in interseismic and postseismic motion in this region – consistent with a cold lower crust. • The earthquake probably triggered slip on the Sierra Madre fault.
