
TeraGrid Science Gateways in Barcelona

Nancy Wilkins-Diehr, TeraGrid Area Director for Science Gateways, wilkinsn@sdsc.edu





Presentation Transcript


  1. TeraGrid Science Gateways in Barcelona
     Nancy Wilkins-Diehr, TeraGrid Area Director for Science Gateways, wilkinsn@sdsc.edu
     EGEE 2009, September 21-25, 2009

  2. Thank you for the invitation to speak in such a beautiful place at such an important time for EGEE/EGI
     • European Grid Infrastructure
       • Sustainable grid infrastructure in Europe driven by the needs and requirements of the research community
       • Linking existing national grids
       • Facilitate the interaction and collaboration between the different NGIs
       • Provide a common managerial and operational framework for the pan-European grid infrastructure

  3. Four main functions of EGI
     • Operations and Security functions
       • Include all the services that are required to provide a large-scale production-quality grid for science in Europe
     • Middleware integration
       • All the tasks that are required to ensure the maintenance, support and interoperability of the middleware stacks currently deployed on the e-Infrastructure
     • User Community services
       • Essential interface and basic support for existing and new user communities
     • External Liaison tasks
       • Project outreach and dissemination, maintaining relations with other relevant grids and organisations

  4. Coincidentally, this is an important time for the TeraGrid too, in much the same way!
     • Dedicated high-speed, cross-country network
     • Staff & Advanced Support
     • 20 Petabytes Storage
     • 2 PetaFLOPS Computation
     • Visualization

  5. Follow-on program to TeraGrid now being competed: eXtreme Digital (XD)
     • The XD Solicitation (NSF 08-571) consists of 5 services.
     • Current planning grants address the CMS/AUSS/TEOS integrative services.
     Source: Richard Moore, SDSC

  6. XD as Integrator
     • XD will be in a position similar to TeraGrid Grid Infrastructure Group (GIG) and EGI
       • Deliver common services: e.g. allocations, central documentation/portal/helpdesk, standard user software environments, networking, advanced user support, training/education/outreach, etc.
       • Provide integration function(s) across Resource Providers (RPs), including governance
       • Define/implement a high-level integrating system architecture
       • Provide continuity of services for TeraGrid users
       • Work with TAIS awardee in an integrated fashion
     • XD will not
       • Deploy its own HPC systems
       • Prescribe how Resource Providers operate resources
         • But will work with RPs to define a common user environment
       • Define the science/engineering objectives for users
     Source: Richard Moore, SDSC

  7. TeraGrid/XD Resource Evolution
     • Ending service on or before 3/31/10*: IU Big Red; NCSA Cobalt and Mercury (IA-64); ORNL NSTG; PSC Big Ben; Purdue Brutus and Condor Pool; SDSC IA-64; UC/ANL IA-64
     • To be continued through 3/31/11*: IU Quarry; LONI Queen Bee; NCAR Frost; NCSA Abe; NCSA Lincoln; Purdue Steele; PSC Pople; SDSC Dash; TACC Lonestar
     • Continuing: TACC Ranger (Track 2A); NICS Kraken (Track 2B)
     • XD RV-DA Systems: TACC Longhorn (~12/09) & NICS Nautilus (~6/10)
     • Start dates TBD: Track 2C; Track 2D: Data-Intensive Computing; Track 2D: High-Throughput Computing; Track 2D: Experimental Architecture; Track 2D: Grid Testbed; other resources as designated by NSF
     * Check the TeraGrid Resource Catalog at www.teragrid.org for start/end dates.
     Source: Richard Moore, SDSC

  8. TeraGrid resources today include:
     • Tightly Coupled Distributed Memory Systems, 2 systems in the top 10 at top500.org
       • Ranger (TACC): Sun Constellation, 62,976 cores, 579 Tflop, 123 TB RAM
       • Kraken (NICS): Cray XT5, 66,048 cores, 608 Tflop, > 1 Pflop in 2009
     • Shared Memory Systems
       • Cobalt (NCSA): Altix, 8 Tflop, 3 TB shared memory
       • Pople (PSC): Altix, 5 Tflop, 1.5 TB shared memory
     • Clusters with Infiniband
       • Abe (NCSA): 90 Tflops
       • Lonestar (TACC): 61 Tflops
       • QueenBee (LONI): 51 Tflops
     • Condor Pool (Loosely Coupled)
       • Purdue: up to 22,000 CPUs
     • Gateway hosting
       • Quarry (IU): virtual machine support
     • Visualization Resources
       • TeraDRE (Purdue): 48 node nVIDIA GPUs
       • Spur (TACC): 32 nVIDIA GPUs
     • Storage Resources
       • GPFS-WAN (SDSC)
       • Lustre-WAN (IU)
       • Various archival resources
     Source: Dan Katz, U Chicago

  9. How did the Gateway program develop?
     A natural result of the impact of the internet on worldwide communication and information retrieval
     • Implications for the conduct of science are still evolving
       • 1980’s: Early gateways, e.g. the National Center for Biotechnology Information BLAST server; search results sent by email; still a working portal today
       • 1989: World Wide Web developed at CERN
       • 1992: Mosaic web browser developed
       • 1995: “International Protein Data Bank Enhanced by Computer Browser”
       • 2004: TeraGrid project director Rick Stevens recognized growth in scientific portal development and proposed the Science Gateway Program
       • Today: Web 3.0 and programmatic exchange of data between web pages
     • Simultaneous explosion of digital information
       • Growing analysis needs in many, many scientific areas
       • Sensors, telescopes, satellites, digital images and video
       • #1 machine on Top500 today is 1000x more powerful than all combined entries on the first list in 1993
     Only 17 years since the release of Mosaic!

  10. Access to supercomputers hasn’t changed much in 20 years
      But the world around them sure has!

  11. vt100 in the 1980s and a login window on Ranger today

  12. Why are gateways worth the effort?
      • Increasing range of expertise needed to tackle the most challenging scientific problems
      • How many details do you want each individual scientist to need to know?
        • PBS, RSL, Condor
        • Coupling multi-scale codes
        • Assembling data from multiple sources
        • Collaboration frameworks

      Condor-G submit file:

          # Full path to executable
          executable=/users/wilkinsn/tutorial/bin/mcell
          # Working directory, where Condor-G will write
          # its output and error files on the local machine.
          initialdir=/users/wilkinsn/tutorial/exercise_3
          # To set the working directory of the remote job, we
          # specify it in this globus RSL, which will be appended
          # to the RSL that Condor-G generates
          globusrsl=(directory='/users/wilkinsn/tutorial/exercise_3')
          # Arguments to pass to executable.
          arguments=nmj_recon.main.mdl
          # Condor-G can stage the executable
          transfer_executable=false
          # Specify the globus resource to execute the job
          globusscheduler=tg-login1.sdsc.teragrid.org/jobmanager-pbs
          # Condor has multiple universes, but Condor-G always uses globus
          universe=globus
          # Files to receive stdout and stderr.
          output=condor.out
          error=condor.err
          # Specify the number of copies of the job to submit to the condor queue.
          queue 1

      PBS batch script:

          #!/bin/sh
          #PBS -q dque
          #PBS -l nodes=1:ppn=2
          #PBS -l walltime=00:02:00
          #PBS -o pbs.out
          #PBS -e pbs.err
          #PBS -V
          cd /users/wilkinsn/tutorial/exercise_3
          ../bin/mcell nmj_recon.main.mdl

      Globus RSL:

          +(
            &(resourceManagerContact="tg-login1.sdsc.teragrid.org/jobmanager-pbs")
             (executable="/users/birnbaum/tutorial/bin/mcell")
             (arguments=nmj_recon.main.mdl)
             (count=128)
             (hostCount=10)
             (maxtime=2)
             (directory="/users/birnbaum/tutorial/exercise_3")
             (stdout="/users/birnbaum/tutorial/exercise_3/globus.out")
             (stderr="/users/birnbaum/tutorial/exercise_3/globus.err")
          )
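To make the point of slide 12 concrete, here is a minimal sketch of how a gateway might hide scheduler details: the user supplies only an executable, its arguments, and a working directory, and the gateway renders the PBS script and the Condor-G submit description. The names `JobRequest`, `render_pbs_script`, and `render_condor_g_submit` are hypothetical and not part of any TeraGrid software; the directives and submit keys are the ones shown on the slide. Unlike the slide's script, this sketch invokes the executable by its absolute path.

```python
from dataclasses import dataclass


@dataclass
class JobRequest:
    """The few choices a gateway user might make (hypothetical model)."""
    executable: str
    arguments: str
    workdir: str
    nodes: int = 1
    procs_per_node: int = 2
    walltime: str = "00:02:00"


def render_pbs_script(job: JobRequest, queue: str = "dque") -> str:
    """Render a PBS batch script using the directives shown on the slide."""
    lines = [
        "#!/bin/sh",
        f"#PBS -q {queue}",
        f"#PBS -l nodes={job.nodes}:ppn={job.procs_per_node}",
        f"#PBS -l walltime={job.walltime}",
        "#PBS -o pbs.out",
        "#PBS -e pbs.err",
        "#PBS -V",
        f"cd {job.workdir}",
        f"{job.executable} {job.arguments}",
    ]
    return "\n".join(lines) + "\n"


def render_condor_g_submit(job: JobRequest, scheduler: str) -> str:
    """Render a Condor-G submit description using the keys shown on the slide."""
    lines = [
        f"executable={job.executable}",
        f"initialdir={job.workdir}",
        f"globusrsl=(directory='{job.workdir}')",
        f"arguments={job.arguments}",
        "transfer_executable=false",
        f"globusscheduler={scheduler}",
        "universe=globus",
        "output=condor.out",
        "error=condor.err",
        "queue 1",
    ]
    return "\n".join(lines) + "\n"


# Values taken from the slide's example job.
job = JobRequest(
    executable="/users/wilkinsn/tutorial/bin/mcell",
    arguments="nmj_recon.main.mdl",
    workdir="/users/wilkinsn/tutorial/exercise_3",
)
pbs_text = render_pbs_script(job)
submit_text = render_condor_g_submit(
    job, "tg-login1.sdsc.teragrid.org/jobmanager-pbs"
)
print(pbs_text)
print(submit_text)
```

The scientist sees three fields; the queue name, resource directives, and grid contact string stay inside the gateway.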

  13. Gateways democratize access to high end resources
      • Almost anyone can investigate scientific questions using high end resources
        • Not just those in the research groups of those who request allocations
        • Gateways allow anyone with a web browser to explore
      • Opportunities can be uncovered via Google
        • My 11-year-old son discovered nanoHUB.org himself while his class was studying buckyballs
      • Foster new ideas, cross-disciplinary approaches
        • Encourage students to experiment
      • But used in production too
        • Significant number of papers resulting from gateways, including GridChem and nanoHUB
        • Scientists can focus on challenging science problems rather than challenging infrastructure problems

  14. Today, there are approximately 35 gateways using the TeraGrid

  15. Not just ease of use
      What can scientists do that they couldn’t do previously?
      • Linked Environments for Atmospheric Discovery (LEAD) - access to radar data
      • National Virtual Observatory (NVO) - access to sky surveys
      • Ocean Observing Initiative (OOI) - access to sensor data
      • PolarGrid - access to polar ice sheet data
      • SIDGrid - expensive datasets, analysis tools
      • GridChem - coupling multiscale codes
      • How would this have been done before gateways?

  16. What makes a gateway a TeraGrid gateway?
      • Gateways that use TeraGrid resources
      • Are they all developed by TeraGrid?
        • No, we don’t make the gateways you use, we make the gateways you use better
        • The strength of the program lies in the development of end user interfaces by those in the community
      • TeraGrid does provide staff to assist with gateway use of the resources
        • Anyone can request support via the same peer review process used to request CPU hours or a data allocation

  17. 3 steps to connect a gateway to TeraGrid
      • Request an allocation
        • Only a 1 paragraph abstract required for up to 200k CPU hours
      • Register your gateway
        • Visibility on public TeraGrid page
      • Request a community account
        • Run jobs for others via your portal
      • Staff support is available!
      • www.teragrid.org/gateways
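The community account in the third step means one shared account runs jobs on behalf of many portal users, so the gateway itself has to remember who used what. The sketch below is one hypothetical way to keep that bookkeeping; the class `CommunityAccount`, its fields, and the user names are illustrative assumptions, not a TeraGrid API. The 200,000 figure is the small-allocation ceiling mentioned on the slide.

```python
from dataclasses import dataclass, field


@dataclass
class CommunityAccount:
    """Hypothetical ledger for a shared gateway account."""
    name: str
    allocation_cpu_hours: float
    usage_by_user: dict = field(default_factory=dict)

    @property
    def used_cpu_hours(self) -> float:
        """Total CPU hours charged so far across all gateway users."""
        return sum(self.usage_by_user.values())

    def charge(self, gateway_user: str, cpu_hours: float) -> None:
        """Attribute a job's CPU hours to the portal user who launched it."""
        if cpu_hours < 0:
            raise ValueError("cpu_hours must be non-negative")
        if self.used_cpu_hours + cpu_hours > self.allocation_cpu_hours:
            raise RuntimeError("charge would exceed the allocation")
        previous = self.usage_by_user.get(gateway_user, 0)
        self.usage_by_user[gateway_user] = previous + cpu_hours


# Illustrative usage with made-up portal users.
account = CommunityAccount(name="gateway_community", allocation_cpu_hours=200_000)
account.charge("alice@portal", 1500)
account.charge("bob@portal", 250)
account.charge("alice@portal", 500)
print(account.usage_by_user)
print(account.used_cpu_hours)
```

Keeping per-user attribution on the gateway side is what lets a single allocation serve many people while still supporting accounting and security questions later.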

  18. Tremendous Opportunities Using the Largest Shared Resources - Challenges too!
      • What’s different when the resource doesn’t belong just to me?
        • Resource discovery
        • Accounting
        • Security
        • Proposal-based requests for resources (peer-reviewed access)
          • Code scaling and performance numbers
          • Justification of resources
          • Gateway citations
      • Tremendous benefits at the high end, but even more work for the developers
      • Potential impact on science is huge
        • Small number of developers can impact thousands of scientists
        • But need a way to train and fund those developers

  19. Gateways in the marketplace
      Kids control telescopes and share images
      • “In seconds my computer screen was transformed into a live telescopic view”
      • “Slooh's users include newbies and professional astronomers in 70 countries”
      • Observatories in the Canary Islands and Chile, Australia coming soon
      • 5000 images/month since 2003
      • Increases public support for investment in these facilities

  20. Linked Environments for Atmospheric Discovery (LEAD)
      • Providing tools that are needed to make accurate predictions of tornados and hurricanes
      • Data exploration and Grid workflow

  21. PolarGrid brings CI to polar ice sheet measurement
      • Cyberinfrastructure Center for Polar Science (CICPS)
        • Experts in polar science, remote sensing and cyberinfrastructure
        • Indiana, ECSU, CReSIS
      • Satellite observations show disintegration of ice shelves in West Antarctica and speed-up of several glaciers in southern Greenland
        • Most existing ice sheet models, including those used by IPCC, cannot explain the rapid changes

  22. Components of PolarGrid
      • Expedition grid consisting of ruggedized laptops in a field grid linked to a low power multi-core base camp cluster
      • Prototype and two production expedition grids feed into a 17 Teraflops "lower 48" system at Indiana University and Elizabeth City State (ECSU) split between research, education and training
      • Gives ECSU (a minority serving institution) a top-ranked 5 Teraflop high performance computing system
      • Access to expensive data
      • TeraGrid resources for analysis
        • Large level 0 and level 1 data sets require once and done processing and storage
        • Filters applied to level 1 data by users in real time via the web
      • Student involvement
      Source: Geoffrey Fox

  23. Social Informatics Data Grid
      Collaborative access to large, complex datasets
      • SIDGrid is unique among social science data archive projects
        • Streaming data which change over time
          • Voice, video, images (e.g. fMRI), text, numerical (e.g. heart rate, eye movement)
        • Investigate multiple datasets, collected at different time scales, simultaneously
      • Large data requirements
      • Sophisticated analysis tools
      http://www.ci.uchicago.edu/research/files/sidgrid.mov

  24. Viewing multimodal data like a symphony conductor
      • “Music-score” display and synchronized playback of video and audio files
        • Pitch tracks
        • Text
        • Head nods, pause, gesture references
      • Central archive of multi-modal data, annotations, and analyses
        • Distributed annotation efforts by multiple researchers working on a common data set
        • History of updates
      • Computational tools
        • Distributed acoustic analysis using Praat
        • Statistical analysis using R
        • Matrix computations using Matlab and Octave
      Source: Studying Discourse and Dialog with SIDGrid, Levow, 2008

  25. Future Technical Areas
      • Web technologies change fast
        • Must be able to adapt quickly
      • Gateways and gadgets
        • Gateway components incorporated into any social networking page
        • 75% of 18 to 24 year-olds have social networking websites
      • iPhone apps?
      • Web 3.0
        • Beyond social networking and sharing content
        • Standards and querying interfaces to programmatically share data across sites
        • Resource Description Framework (RDF), SPARQL
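As a rough illustration of the "programmatically share data across sites" idea, the sketch below stores facts as RDF-style (subject, predicate, object) triples and answers a single triple pattern, which is the basic building block of a SPARQL query. It is a toy in plain Python, not an RDF library or a SPARQL engine; the `ex:` identifiers and the `match` helper are made up for the example. A real gateway would more likely publish RDF and expose a SPARQL endpoint through an established toolkit.

```python
from typing import Iterator, List, Optional, Tuple

Triple = Tuple[str, str, str]

# Hypothetical facts a gateway might publish about its outputs.
TRIPLES: List[Triple] = [
    ("ex:dataset1", "ex:producedBy", "ex:gatewayA"),
    ("ex:dataset2", "ex:producedBy", "ex:gatewayB"),
    ("ex:dataset3", "ex:producedBy", "ex:gatewayA"),
    ("ex:dataset1", "ex:format", "netCDF"),
]


def match(
    triples: List[Triple],
    s: Optional[str] = None,
    p: Optional[str] = None,
    o: Optional[str] = None,
) -> Iterator[Triple]:
    """Yield triples matching the pattern; None acts like a query variable."""
    for subj, pred, obj in triples:
        if s is not None and subj != s:
            continue
        if p is not None and pred != p:
            continue
        if o is not None and obj != o:
            continue
        yield (subj, pred, obj)


# Roughly the pattern: SELECT ?d WHERE { ?d ex:producedBy ex:gatewayA }
datasets_from_a = [s for s, _, _ in match(TRIPLES, p="ex:producedBy", o="ex:gatewayA")]
print(datasets_from_a)
```

Because every site describes its data in the same triple shape, a consumer can ask the same question of any of them without a site-specific API.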

  26. Gateways can further investments in other projects
      • Increase access
        • To instruments, expensive data collections
      • Increase capabilities
        • To analyze data
      • Improve workforce development
        • Can prepare students to function in today’s cross-disciplinary world
      • Increase outreach
      • Increase public awareness
        • Public sees value in investments in large facilities
        • Pew 2006 study indicates that half of all internet users have been to a site specializing in science
        • Those who seek out science information on the internet are more likely to believe that scientific pursuits have a positive impact on society

  27. Where are we going with the program next?
      • Prebuilt VMs with gateway software
      • Better visibility of apps available through gateways in addition to those available at the command line on TeraGrid
      • More example-based documentation
        • Less talk, more action
      • Improved use of remote vis resources by gateways
      • Continued targeted support of an ever-changing portfolio of projects
        • Peer-reviewed requests for assistance
      • Helpdesk support expanded

  28. Tremendous Potential for Gateways
      • In only 17 years, the Web has fundamentally changed human communication
      • Science Gateways can leverage this amazingly powerful tool to:
        • Transform the way scientists collaborate
        • Streamline conduct of science
        • Influence the public’s perception of science
      • Reliability, trust, continuity are fundamental to truly change the conduct of science through the use of gateways
      • High end resources can have a profound impact
      • The future is very exciting!

  29. Thank you for your attention! Questions?
      Nancy Wilkins-Diehr, wilkinsn@sdsc.edu
      www.teragrid.org
