
Presentation Transcript


  1. Introduction to Grid & Cluster Computing. Sriram Krishnan, Ph.D. (sriram@sdsc.edu)

  2. Motivation: NBCR Example. [Architecture diagram: a set of biomedical applications (GAMESS, APBS, Continuity, Autodock, Gtomo2/TxBR) running on the resources; rich clients (QMView, PMV/ADT, Vision, APBSCommand, Continuity, Telescience Portal) and web portals reach them through the cyber-infrastructure: web services, workflow tools, and middleware.]

  3. Cluster Resources • “A computer cluster is a group of tightly coupled computers that work together closely so that in many respects they can be viewed as though they are a single computer.” [Wikipedia] • Typically built using commodity off-the-shelf hardware (processors, networking, etc.) • Differs from traditional “supercomputers” • Clusters now make up more than 70% of deployed Top500 machines • Useful for: high availability, load balancing, scalability, visualization, and high performance

  4. Grid Computing • “Coordinated resource sharing and problem solving in dynamic, multi-institutional virtual organizations.” [Foster, Kesselman, Tuecke] • Coordinated - multiple resources working in concert, e.g. disk & CPU, or instruments & databases • Resources - compute cycles, databases, files, application services, instruments • Problem solving - focus on solving scientific problems • Dynamic - environments that are changing in unpredictable ways • Virtual Organization - resources spanning multiple organizations and multiple administrative, security, and technical domains

  5. Grids are not the same as Clusters! • Foster’s 3-point checklist: • Resources not subject to centralized control • Use of standard, open, general-purpose protocols and interfaces • Delivery of non-trivial qualities of service • Grids are typically made up of multiple clusters

  6. Popular Misconception • Misconception: Grids are all about CPU cycles • CPU cycles are just one aspect, others are: • Data: For publishing and accessing large collections of data, e.g. Geosciences Network (GEON) Grid • Collaboration: For sharing access to instruments (e.g. TeleScience Grid), and collaboration tools (e.g. Global MMCS at IU)

  7. SETI@Home • Uses 1000s of internet-connected PCs to help in the search for extraterrestrial intelligence • When a computer is idle, the software downloads a ~1/2 MB chunk of data for analysis • Results of the analysis are sent back to the SETI team and combined with those of 1000s of other participants • Largest distributed computation project in existence • Total CPU time: 2,433,979.781 years • Users: 5,436,301 • (Statistics from 2006)

  8. NCMIR TeleScience Grid * Slide courtesy TeleScience folks

  9. NBCR Grid [Architecture diagram: clients (Gemstone, PMV/Vision, Kepler) talk to state management and application services, protected by the GAMA security services; the services submit through Globus to a PBS cluster, a Condor pool, and an SGE cluster.]

  10. Day 1 - Using Grids and Clusters: Job Submission • Scenario 1 - Clusters: • Upload data to the remote cluster using scp • Log on to the said cluster using ssh • Submit the job via the command line to schedulers such as Condor or the Sun Grid Engine (SGE) • Scenario 2 - Grids: • Upload data to the Grid resource using GridFTP • Submit the job via the Globus command-line tools (e.g. globusrun or globus-job-run) to remote resources • Globus services communicate with the resource-specific schedulers
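A minimal sketch of both scenarios, assuming a hypothetical job script job.sh, cluster login node cluster.example.edu, and GridFTP server gridftp.example.edu; globus-url-copy and globus-job-run are the classic pre-WS Globus clients alluded to above, and exact contact strings vary by site.

    # Scenario 1 - Clusters: stage data with scp, log in with ssh,
    # then hand the job to the local scheduler (SGE shown; Condor is analogous).
    scp input.dat user@cluster.example.edu:run/
    ssh user@cluster.example.edu
    qsub run/job.sh              # or condor_submit job.sub on a Condor pool

    # Scenario 2 - Grids: stage data over GridFTP, then submit through
    # Globus, which relays the job to the resource-specific scheduler.
    grid-proxy-init              # obtain a short-lived proxy credential first
    globus-url-copy file:///tmp/input.dat \
        gsiftp://gridftp.example.edu/tmp/input.dat
    globus-job-run cluster.example.edu/jobmanager-sge /bin/wc -l /tmp/input.dat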

  11. Day 1 - Using Grids & Clusters: Security

  12. Day 1 - Using Grids & Clusters: User Interfaces

  13. Day 2 - Managing Cluster Environments • Clusters are great price/performance computational engines • Can be hard to manage without experience • Failure rate increases with cluster size • Not cost-effective if maintenance is more expensive than the cluster itself • System administrators can cost more than the clusters they manage (a 1-Tflops cluster costs < $100,000)

  14. Day 2 - Rocks (Open Source Clustering Distribution) • Technology transfer of commodity clustering to application scientists • Making clusters easy • Scientists can build their own supercomputers • The Rocks distribution is a set of CDs: • Red Hat Enterprise Linux • Clustering software (PBS, SGE, Ganglia, Globus) • Highly programmatic software configuration management • http://www.rocksclusters.org
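As a concrete illustration, a hedged sketch of adding a roll (here the SGE roll) to an already-installed Rocks frontend; the commands follow Rocks 5.x conventions and the ISO file name is made up, so consult the Rocks documentation for the exact invocation on your release.

    # Register the roll ISO, enable it, and rebuild the distribution
    # that compute nodes install from.
    rocks add roll sge-5.0-0.x86_64.disk1.iso
    rocks enable roll sge
    (cd /export/rocks/install && rocks create distro)

    # Reinstall the compute nodes so they pick up the new roll.
    rocks run host compute "/boot/kickstart/cluster-kickstart"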

  15. Day 2 - Rocks Rolls

  16. Day 3 - Advanced Usage Scenarios: Workflows • Scientific workflows emerged as an answer to the need to combine multiple cyberinfrastructure components into automated process networks • A combination of: • Data integration, analysis, and visualization steps • An automated “scientific process” • Promotes scientific discovery

  17. Day 3 - The Big Picture: Scientific Workflows [Figure: from conceptual “napkin drawing” workflows to executable workflows; example shown: John Blondin (NC State), Astrophysics Terascale Supernova Initiative, SciDAC, DOE. Source: Mladen Vouk (NCSU).]

  18. Day 3 - Kepler Workflows: A Closer Look

  19. Day 3 - Advanced Usage Scenarios: MetaScheduling • Local schedulers are responsible for load balancing and resource sharing within each local administrative domain • Meta-schedulers are responsible for querying, negotiating access to, and managing resources that exist within different administrative domains in Grid systems
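To make the distinction concrete, a sketch contrasting the two levels, with made-up site names; globusrun-ws is the GT4 WS-GRAM client, and a meta-scheduler such as CSF4 (next slides) in effect automates the cross-domain half after querying and negotiating with each site.

    # Local scheduler: load balancing within one administrative domain;
    # SGE places the job on a node inside this cluster.
    qsub job.sh

    # Grid level: the same request dispatched to two different domains
    # through GT4 WS-GRAM; a meta-scheduler issues submissions like these.
    globusrun-ws -submit \
        -F https://siteA.example.edu:8443/wsrf/services/ManagedJobFactoryService \
        -c /bin/hostname
    globusrun-ws -submit \
        -F https://siteB.example.edu:8443/wsrf/services/ManagedJobFactoryService \
        -c /bin/hostname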

  20. Day 3 - MetaSchedulers: CSF4 • What is the CSF meta-scheduler? • Community Scheduler Framework • CSF4 is a group of Grid services hosted inside the Globus Toolkit (GT4) • CSF4 is fully WSRF-compliant • An open-source project, available at http://sourceforge.net/projects/gcsf • The CSF4 development team is from Jilin University, PRC

  21. Day 3 - CSF4 Architecture [Architecture diagram: the CSF4 services (Resource Manager Factory Service, Resource Manager Reservation Service, Resource Manager Job Service, and Queuing Service) run inside the Grid environment, query meta-information from WS-MDS, and dispatch jobs through WS-GRAM or a GT2 GateKeeper to local schedulers (LSF, PBS, SGE, Condor, Fork) on the local machines.]

  22. Day 4 - Accessing TeraScale Resources • I need more resources! What are my options? • TeraGrid: “With 20 petabytes of storage, and more than 280 teraflops of computing power, TeraGrid combines the processing power of supercomputers across the continent” • PRAGMA: “To establish sustained collaborations and advance the use of grid technologies in applications among a community of investigators working with leading institutions around the Pacific Rim”

  23. Day 4 - TeraGrid (Extensible Terascale Facility) • TeraGrid is a “top-down”, planned Grid • Members: IU, ORNL, NCSA, PSC, Purdue, SDSC, TACC, ANL, NCAR • 280 Tflops of computing capability • 30 PB of distributed storage • High-performance networking between partner sites • Linux-based software environment, uniform administration • Focus is a national, production Grid

  24. PRAGMA Grid Member Institutions • 31 institutions in 15 countries/regions (+ 7 in preparation): JLU, CNIC, GUCAS, LZU (China); AIST, OsakaU, UTsukuba, TITech (Japan); NCSA, BU, UUtah, SDSC (USA); UZurich (Switzerland); KISTI (Korea); UPRM (Puerto Rico); ASGC, NCHC (Taiwan); UoHyd (India); CICESE, UNAM (Mexico); CUHK (Hong Kong); NECTEC, ThaiGrid (Thailand); HCMUT, IOIT-HCM (Vietnam); ITCR (Costa Rica); APAC, QUT, MU (Australia); MIMOS, USM (Malaysia); BII, IHPC, NGO, NTU (Singapore); UCN, UChile (Chile); BESTGrid (New Zealand)

  25. Track 1: Agenda (9AM-12PM at PFBH 161) • Tues, July 31: Basic Cluster and Grid Computing Environment • Wed, Aug 1: Rocks Clusters and Application Deployment • Thurs, Aug 2: Workflow Management and MetaScheduling • Fri, Aug 3: Accessing National and International TeraScale Resources
