

  1. Experiences in the Grid: ‘the grid, the bad, and the ugly’. Surrey University e-Science Day, 2nd December 2002. Prof Simon Cox, Technical Director, Southampton Regional e-Science Centre, School of Engineering Sciences, University of Southampton

  2. What is the Grid?

  3. IT Drivers: Moore’s Law (i) “Moore’s Law, the prediction that transistor density would double every 18 to 24 months, has become a self-fulfilling prophecy. Computers have been gaining 10 times the processing power every five years. The exponential growth of chip transistor density will continue at least another decade.” “By 2010, the typical desktop computer will have a 30-GHz processor that performs 1 trillion instructions per second. Handheld computers will run at clock speeds of 5 GHz, faster than today's high-end systems.” (Pat Gelsinger, Intel Corp. CTO, at the FOSE 2002 trade show, Washington)

  4. IT Drivers: Moore’s Law (ii) • Moore’s law ⇒ highly functional end-systems • Compute and Data • Network exponentials produce dramatic changes in geometry and geography • 9-month doubling: double Moore’s law! • New modes of working and problem solving emphasize teamwork, computation • New business models and technologies facilitate outsourcing
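
The two exponentials on these slides can be sanity-checked directly: an 18-month doubling period compounds to roughly the tenfold gain in five years that Gelsinger quotes, while the 9-month network doubling compounds to about 100x over the same span. A minimal Python check:

```python
# Growth factor after a given number of years, for a stated doubling period.
def growth(doubling_months: float, years: float) -> float:
    return 2 ** (years * 12 / doubling_months)

print(f"Compute, 18-month doubling, 5 years: ~{growth(18, 5):.1f}x")  # ~10.1x
print(f"Network,  9-month doubling, 5 years: ~{growth(9, 5):.1f}x")   # ~101.6x
```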

  5. IT environment • Where we were: too expensive (compute/data, network); proprietary; homogeneous; couldn’t interoperate; couldn’t collaborate • What’s new: COMMODITY + OPEN STANDARDS (Web Services, Grid middleware, databases/XML, W3C)

  6. Grid Computing

  7. The Grid Problem “Flexible and secure sharing of resources among dynamic collections of individuals within and across organisations” • Resources = assets, capabilities, and knowledge • Capabilities (e.g. application codes, analysis tools) • Compute Grids (PC cycles, commodity clusters, HPC) • Data Grids • Experimental Instruments • Knowledge Services • Virtual Organisations • Utility Services Grid middleware mediates between these resources

  8. Opportunity • Grid aims to do for corporate IT what the web did for information • Unify and coordinate resources • compute, data, … • Grid paradigm • Lower TCO by reducing complexity • Facilitate new ways of operating • Leverage existing infrastructure • Seamless integration of new infrastructure • Interoperability: intra-company, inter-company

  9. A Brief History of Grid

  10. The Grid: A Brief History • Early 90s • Gigabit testbeds, metacomputing • Mid to late 90s • Early experiments (e.g., I-WAY), academic software projects (e.g., Globus, Legion), application experiments • 2002 • Dozens of application communities & projects • Major infrastructure deployments • Significant technology base (esp. Globus Toolkit™) • Growing industrial interest • Global Grid Forum: ~500 people, 20+ countries

  11. The Grid World: Current Status • Dozens of major Grid projects in scientific & technical computing/research & education • Deployment, application, technology • Some consensus on key concepts and technologies • Open source Globus Toolkit™ a de facto standard for major protocols & services • Far from complete or perfect, but out there, evolving rapidly, and large tool/user base • Global Grid Forum a significant force • Industrial interest emerging rapidly http://www.gridforum.org

  12. Summary • The Grid problem: resource sharing & coordinated problem solving in dynamic, multi-institutional virtual organizations • Grid architecture: protocol and service definitions for interoperability & resource sharing • Grid middleware • Globus Toolkit: a source of protocol and API definitions, and reference implementations • Open Grid Services Architecture: the next step in the evolution • Condor: high-throughput computing • Web Services & W3C: leveraging e-business • e-Science projects: applying Grid concepts to applications

  13. Grid Architecture

  14. Distributed Terascale Facility

  15. Distributed Terascale Facility

  16. UK e-Science Grid • Sites (map): Edinburgh, Glasgow, Belfast, Newcastle, DL (Daresbury Laboratory), Manchester, Cambridge, Hinxton, Oxford, RAL, Cardiff, London, Southampton

  17. UK e-Science Programme • Governance: DG Research Councils; E-Science Steering Committee; Grid TAG; Director (Tony Hey), with a management role and an awareness and co-ordination role • Generic Challenges: EPSRC (£15m), DTI (£15m) • Academic Application Support Programme: Research Councils (£74m), DTI (£5m), comprising PPARC (£26m), BBSRC (£8m), MRC (£8m), NERC (£7m), ESRC (£3m), EPSRC (£17m), CLRC (£5m) • £80m collaborative projects; industrial collaboration (£40m)

  18. Architecture of a Grid • Applications (simulations, data analysis, etc.) • Application toolkits (visualization, data publication/subscription, etc.); science portals and scientific workflow management systems; Web Services and portal toolkits • Execution support and frameworks (Globus MPI, Condor-G, CORBA-G) • Grid common services, i.e. standardized services and resource interfaces (operational services: Globus, SRB): grid information service, uniform resource access, uniform data access, brokering, co-scheduling, global queuing, global event services, data management, network cache, communication services, authentication/authorization/security services, fault management, monitoring, auditing, collaboration and remote instrument services • Distributed resources: clusters, Condor pools of workstations, national supercomputer facilities, scientific instruments, tertiary storage, network caches, high-speed communication services

  19. Combining Grid and Web Services • Clients: web browser, PDA, application portals, and discipline/application-specific portals (e.g. SDSC TeleScience); built from Python, Java, etc., JSPs, and CoG Kits implementing Web Services in servlets, servers, etc. (Apache Tomcat & WebSphere & Cold Fusion = JVM + servlet instantiation + routing; Apache SOAP, .NET, etc.) • Protocols: http, https, etc.; XML/SOAP over the Grid Security Infrastructure; Grid protocols and Grid Security Infrastructure; Grid Web Service description (WSDL) & discovery (UDDI) • Grid services (collective and resource access): job submission/control, GRAM, Grid ssh, GridFTP, file transfer, data management, data replica and metadata catalog, SRB/metadata catalogue, Condor-G, monitoring, Grid Monitoring Architecture, events, workflow management, credential management, X.509 certification authority, Grid information service, secure reliable group communication, MPI, composition frameworks (e.g. XCAT), problem solving environments (AVS, SciRun, Cactus), environment management (LaunchPad, HotPage), X Windows; other services: visualization, interface builders, collaboration tools, numerical grid generators, etc. • Resources: compute (many), storage (many), communication, instruments (various)

  20. Grid Middleware

  21. Grid Middleware (coordinates and authenticates use of grid services) • Globus (and GGF grid-computing protocols) • Security Infrastructure (GSI) • Resource Allocation Manager (GRAM) • Resource Information Service (GRIS) • Index Information Service (GIIS) • GridFTP • Metadirectory Service (MDS 2.0+) coupled to an LDAP server • Condor (distributed high-throughput computing system) • Condor-G allows us to dispatch jobs to our Globus system • Condor development started in 1985 at the University of Wisconsin (Miron Livny)
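
To make the Condor-G / GRAM hand-off above concrete, here is a sketch of the kind of submit description Condor-G used in the Globus Toolkit 2 era. This is illustrative only: the gatekeeper contact and file names are hypothetical, and keyword details vary between Condor releases.

```python
# Minimal sketch: build a Condor-G submit description that asks Condor
# to run a job on a remote Globus GRAM gatekeeper (GT2-era syntax).
# The gatekeeper contact below is a hypothetical placeholder.
submit_description = """\
universe        = globus
globusscheduler = gatekeeper.example.ac.uk/jobmanager-pbs
executable      = /bin/hostname
output          = job.out
error           = job.err
log             = job.log
queue
"""

with open("grid_job.submit", "w") as f:
    f.write(submit_description)

# The file would then be handed to Condor with:  condor_submit grid_job.submit
# Condor-G authenticates via GSI and dispatches the job through GRAM.
```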

  22. The Globus ProjectMaking Grid computing a reality • Close collaboration with real Grid projects in science and industry • Development and promotion of standard Grid protocols to enable interoperability and shared infrastructure • Development and promotion of standard Grid software APIs and SDKs to enable portability and code sharing • The Globus Toolkit: Open source, reference software base for building grid infrastructure and applications • Global Grid Forum: Development of standard protocols and APIs for Grid computing http://www.gridforum.org http://www.globus.org

  23. Web Services • Increasingly popular standards-based framework for accessing network applications • W3C standardization: Microsoft (.NET), IBM (WebSphere), Sun (J2EE), etc. • XML and XML Schema: representing data in a portable format • WSDL (Web Services Description Language): interface definition language for Web services • SOAP (Simple Object Access Protocol): XML-based RPC protocol; common WSDL target • WS-Inspection: conventions for locating service descriptions • UDDI (Universal Description, Discovery, & Integration): directory for Web services
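
Since SOAP is just XML-based RPC carried over HTTP, the whole stack can be illustrated with the Python standard library alone. In this sketch the endpoint URL, namespace, and getStatus operation are all hypothetical placeholders, not a real service:

```python
# Hand-build a SOAP 1.1 envelope and POST it over HTTP.
import urllib.request

envelope = """<?xml version="1.0" encoding="utf-8"?>
<soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/">
  <soap:Body>
    <getStatus xmlns="http://example.org/gridservice"/>
  </soap:Body>
</soap:Envelope>"""

request = urllib.request.Request(
    url="http://example.org/gridservice",        # hypothetical endpoint
    data=envelope.encode("utf-8"),
    headers={"Content-Type": "text/xml; charset=utf-8",
             "SOAPAction": '"getStatus"'},
)
with urllib.request.urlopen(request) as response:
    print(response.read().decode("utf-8"))       # XML response body
```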

  24. BPEL4WS (Business Process Execution Language for Web Services)

  25. Grid Services • Grid Services are defined in terms of Web Services Description Language (WSDL) interfaces, and provide the mechanisms to create and compose sophisticated distributed systems: • Lifetime management • Reliable and secure remote invocation • Change management • Credential management • Notification in a Web services environment
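
Lifetime management deserves a gloss: a client creates a transient service instance with a bounded lifetime, and the hosting environment reclaims it unless the client renews the lease. The toy sketch below illustrates that soft-state idea only; it is not the OGSA API, and every name in it is invented:

```python
import time

# Toy illustration of soft-state lifetime management (not the OGSA API):
# each service instance carries a termination time that clients may extend.
class TransientServiceInstance:
    def __init__(self, name: str, lifetime_seconds: float):
        self.name = name
        self.termination_time = time.time() + lifetime_seconds

    def request_extension(self, extra_seconds: float) -> None:
        """Client keeps the instance alive by renewing its lease."""
        self.termination_time += extra_seconds

    def expired(self) -> bool:
        return time.time() >= self.termination_time

# A hosting environment would periodically sweep away expired instances.
registry = [TransientServiceInstance("session-42", lifetime_seconds=30.0)]
registry = [inst for inst in registry if not inst.expired()]
```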

  26. Grid Applications: ‘e-Science’

  27. Scientific Grid Applications • A biochemist exploits 10,000 computers to screen 100,000 compounds in an hour • 1,000 physicists worldwide pool resources for peta-op analyses of petabytes of data • Civil engineers collaborate to design, execute, & analyze shake table experiments • Climate scientists visualize, annotate, & analyze terabyte simulation datasets • An emergency response team couples real time data, weather model, population data

  28. Online Access to Scientific Instruments • Advanced Photon Source pipeline: real-time collection, archival storage, tomographic reconstruction, wide-area dissemination, desktop & VR clients with shared controls • DOE X-ray grand challenge: ANL, USC/ISI, NIST, U.Chicago

  29. CERN’s Large Hadron Collider • 1800 physicists, 150 institutes, 32 countries • 100 PB of data by 2010; 50,000 CPUs?

  30. Grid Communities & Applications: Data Grids for High Energy Physics • Online system: there is a “bunch crossing” every 25 nsecs and 100 “triggers” per second; each triggered event is ~1 MByte in size; ~PBytes/sec come off the detector, with ~100 MBytes/sec passed to the offline processor farm (~20 TIPS) • Tier 0: CERN Computer Centre, archiving at ~100 MBytes/sec (HPSS) • Tier 1 (linked at ~622 Mbits/sec, or air freight, deprecated): FermiLab (~4 TIPS) and the France, Germany and Italy regional centres, each with HPSS • Tier 2 (linked at ~622 Mbits/sec): centres of ~1 TIPS each, e.g. Caltech • Institutes (~0.25 TIPS each), with physics data cached at ~1 MBytes/sec on institute servers • Tier 4: physicist workstations (Pentium II 300 MHz class) • 1 TIPS is approximately 25,000 SpecInt95 equivalents • Physicists work on analysis “channels”; each institute will have ~10 physicists working on one or more channels, and data for these channels should be cached by the institute server • www.griphyn.org www.ppdg.net www.eu-datagrid.org
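
The slide’s numbers are internally consistent, which is worth checking: 100 triggers per second at ~1 MByte per event is exactly the ~100 MBytes/sec stream into the offline farm, or roughly 3 PB per year if recorded continuously:

```python
trigger_rate_hz = 100        # "100 triggers per second"
event_size_mb = 1            # "each triggered event is ~1 MByte"

rate_mb_per_s = trigger_rate_hz * event_size_mb
print(f"Offline stream: ~{rate_mb_per_s} MBytes/sec")            # ~100 MB/s

seconds_per_year = 365 * 24 * 3600
pb_per_year = rate_mb_per_s * seconds_per_year / 1e9
print(f"~{pb_per_year:.1f} PB/year if recorded continuously")    # ~3.2 PB/year
```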

  31. Network for Earthquake Engineering Simulation • NEESgrid: national infrastructure to couple earthquake engineers with experimental facilities, databases, computers, & each other • On-demand access to experiments, data streams, computing, archives, collaboration • NEESgrid: Argonne, Michigan, NCSA, UIUC, USC

  32. Business Grid Applications • Engineers at a multinational company collaborate on the design of a new product • A multidisciplinary analysis in aerospace couples code and data in four companies • An insurance company mines data from partner hospitals for fraud detection • An application service provider offloads excess load to a compute cycle provider • An enterprise configures internal & external resources to support e-Business workload

  33. NASA Information Power Grid • Vision: To revolutionize how computing is used in NASA’s science and engineering by providing the middleware services for routinely building large-scale, dynamically constructed, and transient problem solving environments from distributed, heterogeneous resources • A persistent computing and data grid • William E. Johnston, Project Manager, NASA Advanced Supercomputing (NAS) Division, NASA Ames Research Center http://www.ipg.nasa.gov

  34. Multi-disciplinary Simulations: Aviation Safety Example • Virtual National Air Space (VNAS) data sources: FAA ops data, weather data, airline schedule data, digital flight data, radar tracks, terrain data, surface data • The vision for VNAS is that whole-system simulated aircraft are inserted into a realistic environment. This requires integrating many types of operations data as drivers for the simulations.

  35. (IPG deployment diagram) • A 300-node Condor pool and Information Power Grid-managed compute and data management resources spanning NASA and partner sites (ARC, GRC, LaRC, MSFC, JSC, KSC, GSFC, JPL, NCSA, SDSC, CMU, EDC, Boeing) over NREN, NGIX and NTON-II/SuperNet, feeding the NAS data warehouse • Grid services provide uniform access to distributed resources: grid information service, uniform resource access, uniform data access, data cataloguing, brokering, co-scheduling, global queuing, global event services, network cache, communication services, authentication/authorization/security services, fault management, monitoring, auditing, collaboration and remote instrument services • An application framework issues compute and data management requests for future aviation safety systems: wing models (ARC), stabilizer models, human models, airframe models, engine models (GRC), landing gear models (LaRC) • Live data feeds: West Coast TRACON/Center data (Performance Data Analysis & Reporting System (PDARS), AvSP/ASMM ARC), Atlanta Hartsfield International Airport (Surface Movement Advisor, AATT project), NOAA weather database (ATL terminal area), airport digital video (Remote Tower Sensor System)

  36. Aviation Safety Multiple sub-systems, e.g. a wing lift model operating at NASA Ames and a turbo-machine model operating at NASA Glenn, are combined using GRC’s NPSS (Numerical Propulsion System Simulation) application framework, which manages the interactions of multiple models and uses IPG services to coordinate computing and data storage systems across NASA.
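
As a purely illustrative sketch of this coupling pattern (not NPSS itself), the framework’s role is to shuttle interface data between independently hosted models until the shared state converges. All functions and coefficients below are invented placeholders:

```python
# Toy illustration of a coupling framework (not NPSS): iterate two models,
# each standing in for a code hosted at a different centre, until the
# shared interface variable converges.
def wing_lift_model(engine_thrust: float) -> float:
    return 0.5 * engine_thrust + 10.0      # placeholder physics

def turbomachine_model(wing_lift: float) -> float:
    return 0.3 * wing_lift + 5.0           # placeholder physics

thrust = 1.0
for iteration in range(50):
    lift = wing_lift_model(thrust)         # would be a remote call at ARC
    new_thrust = turbomachine_model(lift)  # would be a remote call at GRC
    if abs(new_thrust - thrust) < 1e-6:    # converged interface state
        break
    thrust = new_thrust

print(f"converged after {iteration} iterations: "
      f"lift={lift:.3f}, thrust={thrust:.3f}")
```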

  37. Grid Enabled Optimisation and Design Search for Engineering (GEODISE): Southampton, Oxford and Manchester • Simon Cox: Grid / W3C technologies and high performance computing; Global Grid Forum Apps Working Group • Andy Keane: Director of Rolls-Royce / BAE Systems University Technology Partnership in Design Search and Optimisation • Mike Giles: Director of Rolls-Royce University Technology Centre for Computational Fluid Dynamics • Carole Goble: ontologies and DARPA Agent Markup Language (DAML) / Ontology Inference Language (OIL) • Nigel Shadbolt: Director of Advanced Knowledge Technologies (AKT) IRC • Industrial partners: BAE SYSTEMS (engineering), Rolls-Royce (engineering), Fluent (computational fluid dynamics), Microsoft (software/Web Services), Intel (hardware), Compusys (systems integration), Epistemics (knowledge technologies), Condor (grid middleware)

  38. Design

  39. Design Challenges • Modern engineering firms are global and distributed: “Not just a problem of using HPC” • How to … ? • … improve design environments and cope with legacy code / systems (CAD and analysis tools, user interfaces, PSEs, and visualization) • … produce optimized designs (optimisation methods) • … integrate large-scale systems in a flexible way (management of distributed compute and data resources) • … archive and re-use design history (data archives, e.g. design / system usage) • … capture and re-use knowledge (knowledge repositories & knowledge capture and reuse tools)

  40. GEODISE GEODISE will provide grid-based seamless access to an intelligent knowledge repository, a state-of-the-art collection of optimisation and search tools, industrial-strength analysis codes, and distributed computing & data resources
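
The core GEODISE loop, an optimiser driving expensive analysis runs on remote resources, can be sketched as follows. Everything here is illustrative: the toy objective stands in for an industrial analysis code, and submit_to_grid is a placeholder for whatever middleware (e.g. Condor-G / GRAM) would actually execute the job.

```python
import random

# Placeholder for an expensive analysis code (e.g. a CFD run) that the
# middleware would execute on a remote grid resource.
def submit_to_grid(design: float) -> float:
    return (design - 3.0) ** 2 + 1.0   # toy objective: minimum at design = 3

# Simple random-search optimiser driving remote evaluations.
best_design, best_score = None, float("inf")
for _ in range(200):
    candidate = random.uniform(-10.0, 10.0)
    score = submit_to_grid(candidate)          # would block on a grid job
    if score < best_score:
        best_design, best_score = candidate, score

print(f"best design ~ {best_design:.2f}, objective ~ {best_score:.3f}")
```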

  41. Knowledge Technologies • Advanced Knowledge Technologies (AKT) IRC (Soton)

  42. Access Grid • Collaborative work among large groups • ~80 sites worldwide • Node equipment: presenter mic, presenter camera, ambient mic (tabletop), audience camera • Access Grid: Argonne, others www.accessgrid.org

  43. myGrid: Manchester, Newcastle, Nottingham, Sheffield, Southampton, IT Innovation Centre, European Bioinformatics Institute, AstraZeneca, GlaxoSmithKline, Merck KGaA, Epistemics Ltd, GeneticXchange, Network Inference, IBM, Sun • Building personalised extensible environments for data-intensive in silico experiments in biology • myGrid middleware: an environment that enables geographically distributed scientists to achieve research goals more effectively, while enabling their results to be used in developments elsewhere http://www.mygrid.org.uk

  44. myGrid demo (architecture sketch) • Portal with repository, workflow and ontology clients • Personal repository and workflow repository • Metadata: ontology; service type directory

  45. Grid ENabled Integrated Earth System Model (GENIE): Reading, Southampton, Bristol, CEH (Wallingford and Edinburgh), Hadley Centre, Imperial, UEA • Paul Valdes: Professor of Earth System Science, Reading • Simon Cox: Technical Director, Southampton e-Science Centre • Melvin Cannell: Director of CEH Edinburgh • John Darlington: Director of London e-Science Centre • Richard Harding: Head of Global Processes Section, CEH Wallingford • Tony Payne: Reader in Glaciology, Bristol • John Shepherd: Professor of Marine Sciences, Southampton • Andrew Watson: Professor of Environmental Sciences, UEA • Tim Lenton: Science Coordinator • Bob Marsh: Collaborator • Peter Cox: Collaborator (Hadley Centre) • Industrial partners: Intel (hardware), Compusys (systems integration), Condor (grid middleware)

  46. Earth System Science “By taking the ‘whole systems’ approach, we are more likely to find sustainable solutions to environmental problems.”

  47. Earth System Model • Atmosphere model: non-transient eddy-resolving planetary wave • Ocean model: 3-D, non-eddy-resolving, frictional geostrophic • Cryosphere model: dynamic ice sheet including fast flow; simple sea ice • Terrestrial carbon cycle model: simplified TRIFFID • Terrestrial hydrology model: simplified MOSES scheme • Ocean carbon and nutrient cycle model: including sediments • “What are the sources, sinks and transportation processes of carbon within the Earth system?”

  48. RealityGrid (www.realitygrid.org) A tool for investigating condensed matter and materials
