Cyberinfrastructure 2022 Application Challenges - PowerPoint PPT Presentation

Presentation Transcript

  1. Cyberinfrastructure 2022 Application Challenges • Ewa Deelman, USC Information Sciences Institute

  2. Southern California Earthquake Center • PI: Tom Jordan, USC • About 50% of the national seismic risk resides in Southern California • A collaboration of 16 core US institutions and 53 international partners • SCEC’s mission: • Gather data on earthquakes in Southern California and elsewhere • Integrate information into a comprehensive, physics-based understanding of earthquake phenomena • Communicate understanding to the world at large as useful knowledge for reducing earthquake risk • Great California ShakeOut Earthquake Preparedness Drill: 8.2 million participants in 2011 • Photo credit: Dietmar Quistorf

  3. Seismic Hazard Maps of the Los Angeles Area • Map panels: SCEC’s physics-based map; traditional attenuation-based map • Labeled locations: USC, San Onofre Nuclear Power Plant

  4. Hazard curve for each site on the map • Computational aspects of CyberShake • 2012: Map of California: 1,398 sites on an adaptive grid • ~2.5 million tasks per site • ~8.5 hours on 800 cores per site • 2009: Map of Southern California • Number of sites: 239 • Each site: 1 million tasks • ~14,000 CPU hours total • Total data footprint ~1 TB • 7.5 million data files • Min time per site ~3 hours on 800 cores • Executed using Pegasus on the TeraGrid
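To give a sense of the scale implied by the CyberShake figures on this slide, the following is a rough back-of-the-envelope sketch. It assumes every site costs the same as the quoted per-site figures and that "~8.5 hours on 800 cores" means wall-clock hours with all 800 cores busy; both are simplifying assumptions, and the function names are illustrative only.

```python
# Back-of-the-envelope totals from the slide's approximate per-site figures.
# Assumption: every site costs the same as the quoted per-site numbers.

SITES_2012 = 1_398               # sites on the 2012 California map
TASKS_PER_SITE_2012 = 2_500_000  # ~2.5 million tasks per site
HOURS_PER_SITE_2012 = 8.5        # ~8.5 wall-clock hours per site
CORES = 800                      # cores used per site

SITES_2009 = 239                 # sites on the 2009 Southern California map
TASKS_PER_SITE_2009 = 1_000_000  # 1 million tasks per site


def total_tasks(sites: int, tasks_per_site: int) -> int:
    """Total number of tasks across all sites."""
    return sites * tasks_per_site


def total_core_hours(sites: int, hours_per_site: float, cores: int) -> float:
    """Total core-hours if every site occupies `cores` for `hours_per_site`."""
    return sites * hours_per_site * cores


tasks_2012 = total_tasks(SITES_2012, TASKS_PER_SITE_2012)
tasks_2009 = total_tasks(SITES_2009, TASKS_PER_SITE_2009)
core_hours_2012 = total_core_hours(SITES_2012, HOURS_PER_SITE_2012, CORES)

print(f"2012 tasks:      {tasks_2012:,}")
print(f"2009 tasks:      {tasks_2009:,}")
print(f"2012 core-hours: {core_hours_2012:,.0f}")
```

Under these assumptions the 2012 map involves roughly 3.5 billion tasks, about fifteen times the 2009 task count, which is consistent with the scalability concerns raised on the next slide.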

  5. Some CyberShake Issues • Cyberinfrastructure is changing • Scalability • increased number of jobs and workflows • Increased need for storage (tens of TB at a time) • Increased number of CI errors which are hard to debug • Data management • Defining what to keep and how to best describe it • How to serve the data to share with the community

  6. LIGO’s inspiral analysis workflow Total 5402 jobs ~800 CPU hours cumulative TB-size data footprint

  7. Some LIGO issues • The workflows are very complex, developed by a small team • The workflows are run by a greater number of users, who are not CI experts • When errors occur, it is difficult to analyze the execution logs to pinpoint problems • Need for better application monitoring and debugging • Data and Metadata management for derived data products to enable sharing

  8. New applications are looking towards the Cloud • Generate an atlas of periodograms • Find extra-solar planets by • Wobbles in radial velocity of star, or • Dips in star’s intensity • Use NASA Kepler Mission data: 210K light curves released in July 2010 • Apply 3 algorithms to each curve • 3 different parameter sets • 210K input, 630K output files • 1 super-workflow • 40 sub-workflows • ~5,000 tasks per sub-workflow • 210K tasks total • Work with Bruce Berriman, Caltech • Figure labels: star, planet; light curve (brightness versus time)

  9. Challenge: Commercial Clouds are not usually a good solution • ~210K light curves X 3 algorithms X 3 parameter sets • Each parameter set was a different “Run”, 3 runs total • EC2 works great for small workloads, grid may be easier for large workloads • Compute is ~10X Transfer • Amazon: 16 x c1.xlarge instances = 128 cores • Ranger: 8-16 x 16 core nodes = 128-256 cores • Chart labels: actual cost, estimated cost
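The workload sizes quoted on the two periodogram slides can be cross-checked with a little arithmetic. This is a sketch under one reading of the slides: each run processes all 210K light curves with 3 algorithms (giving the 630K output files), and the 3 parameter sets correspond to 3 separate runs. The per-instance core count is derived from the slide's "16 instances = 128 cores".

```python
# Cross-check of the periodogram workload figures (approximate slide values).
# Assumption: one run = all light curves x 3 algorithms; 3 runs in total.

LIGHT_CURVES = 210_000   # ~210K Kepler light curves
ALGORITHMS = 3           # algorithms applied to each curve
PARAMETER_SETS = 3       # one run per parameter set
SUB_WORKFLOWS = 40       # sub-workflows per super-workflow

# Output files per run: matches the "630K output files" on the slide.
outputs_per_run = LIGHT_CURVES * ALGORITHMS

# Periodograms across all three runs.
outputs_all_runs = outputs_per_run * PARAMETER_SETS

# Tasks per sub-workflow if 210K tasks are split evenly (slide says ~5,000).
tasks_per_sub_workflow = LIGHT_CURVES // SUB_WORKFLOWS

# Core counts for the two platforms compared on the slide.
ec2_cores = 16 * 8                # 16 c1.xlarge instances, 8 cores each
ranger_cores = (8 * 16, 16 * 16)  # 8 to 16 nodes with 16 cores each

print(outputs_per_run, outputs_all_runs, tasks_per_sub_workflow)
print(ec2_cores, ranger_cores)
```

The even split gives 5,250 tasks per sub-workflow, in line with the slide's "~5,000", and the three runs together produce about 1.9 million periodograms.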

  10. DNA sequencing at a USC lab • Wet lab managed by a LIMS • Data generated at 4 sequencers • Needs to be filtered for noisy data • Needs to be aligned • Needs to be collected into a single map • Vendors provide some basic analysis tools but • you may want to try the latest alignment algorithm • you may want to use a remote cluster • ~1,000 tasks, ~120 cores, 2 hrs, 60 GB footprint

  11. Challenges: • Automation of analysis and integration with lab management system • Data and storage management • Poor application codes • Lack of knowledge and willingness to deal with CI and its problems • Can buy own systems but results in isolated and often poorly maintained hardware • Need to rely on campus infrastructure • which often has limited CI • Lack of willingness to install new software • Want to publish methods online • Need a good amount of handholding

  12. How to create a workflow? Execution on USC resources Work with Ben Berman

  13. Managing complex workloads • Diagram labels: work definition as a WORKFLOW; Local Resource; Pegasus Workflow Management System; work; data; Data Storage; Campus Cluster; XSEDE; Open Science Grid; Amazon Cloud
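As an illustration of what "work definition as a workflow" can mean, here is a minimal sketch that describes a workload as a directed acyclic graph of tasks and derives a valid execution order. It does not use the Pegasus API; it relies only on Python's standard `graphlib` module (Python 3.9+). The task names (`filter_*`, `align_*`, `merge`) are hypothetical and loosely follow the filter, align, and collect steps from the DNA sequencing slide.

```python
from graphlib import TopologicalSorter


def build_workflow(num_sequencers: int) -> dict[str, set[str]]:
    """Map each task name to the set of tasks it depends on.

    Hypothetical pipeline: for each sequencer, filter noisy reads, then
    align them; a final merge task collects all alignments into one map.
    """
    deps: dict[str, set[str]] = {}
    for i in range(num_sequencers):
        deps[f"filter_{i}"] = set()             # no prerequisites
        deps[f"align_{i}"] = {f"filter_{i}"}    # needs filtered data
    deps["merge"] = {f"align_{i}" for i in range(num_sequencers)}
    return deps


def execution_order(deps: dict[str, set[str]]) -> list[str]:
    """Return one ordering in which every task follows its prerequisites."""
    return list(TopologicalSorter(deps).static_order())


workflow = build_workflow(num_sequencers=4)
order = execution_order(workflow)
print(order)
```

A workflow management system does far more than order tasks (it also maps them to resources, stages data, and retries failures), but the dependency graph is the part the user authors, which is why it is the piece sketched here.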

  14. Our Philosophy • Work closely • with users to improve software, make it relevant • with CS colleagues to develop new capabilities, share ideas, and develop complex systems and algorithms • Users • Enable them to author workflows in a way comfortable for them • Provide reliability, scalability, performance • Software • Be a “good” CyberInfrastructure ecosystem member • Provide a number of interfaces to enter the system, and expose interfaces to other CI entities • Focus on one aspect of the problem and contribute solutions • Leverage existing solutions where possible (Condor, Netlogger, GlideinWMS) • Execution Environment • Use whatever we can, support heterogeneity

  15. Working with other scientists • It takes time! • Collaboration with LIGO since 2000 (GriPhyN) • Collaboration with Caltech IPAC 2002 (NVO) • Collaboration with SCEC 2003 (SCEC-CME) • Sciences progress in cycles, and computing is only part of the cycle, so you need patience • Don’t forget that CS is a science: derive knowledge and abstractions that can be broadly applied, and design new algorithms that solve pressing problems

  16. Challenges for CI 2022 • Make it easy to use • Provide interfaces that are • At a High-level of abstraction • Intuitive • Stable over time • Well defined • Robust

  17. In the meantime • How do we engage scientists to make CI relevant to them? • How do we increase CI-awareness? • How do we, as a community, have impact? • Need to develop a close relationship with campus infrastructure providers and other people who work with end users (librarians) • Need to educate a new generation of CS researchers, CI developers, and CI operators

  18. National Academy of Engineers