
High Performance Computing – CISC 811

High Performance Computing – CISC 811. Dr Rob Thacker Dept of Physics (308A) thacker@physics. Assignments. Assignment 2 almost done (1 question left to mark) Will hand back next week… Assignment 5 will be up tonight. Today’s Lecture. Grids. Part 1: Introduction Part 2: Globus



  1. High Performance Computing – CISC 811 Dr Rob Thacker Dept of Physics (308A) thacker@physics

  2. Assignments • Assignment 2 almost done (1 question left to mark) • Will hand back next week… • Assignment 5 will be up tonight.

  3. Today’s Lecture Grids • Part 1: Introduction • Part 2: Globus • Part 3: Grid in Canada, around the world A number of overheads are drawn from presentations by Ian Foster. See the Globus website http://www.globus.org Definitive book on the subject is: “The Grid: Blueprint for a New Computing Infrastructure” edited by Foster and Kesselman

  4. Part 1: Grids • Identifying issues, possible solutions • What is “The Grid”

  5. Caveat Emptor • There is an enormous amount of terminology surrounding Grid • Result of the ideas being new and under development, and the inherent complexity involved • Grid really does try to be everything for everyone! • Buzzwords unfortunately have become commonplace • See through the hype! Incredibly useful glossary: http://www.nesc.ac.uk/global/glossary.html

  6. Why Should You Care? 1) Grid is a disruptive technology [Vision] • It ushers in a virtualized, collaborative, distributed world 2) Grid addresses pain points now [Reality?] • Grids are built not bought, but are delivering benefits in commercial settings 3) An open Grid is to your advantage [Future] • Standards are being defined now that will determine the future of this technology

  7. Networking : Not quite a utopia • Grid concept: Connectivity is a commodity • Users should be able to access local PC or other machines connected to grid with similar ease • There must not be a strong perception of complexity • Ideally it should be fault tolerant as well (Figure label: Your local network)

  8. How to achieve this • Suppose resources are connected and operate via a system of resource allocators • Resources can be dynamically updated • Necessary if fault tolerance is to be included • Could potentially schedule many jobs across “The Grid” – A “metascheduler” • Assumes that there is excess useable capacity though
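The resource-allocator idea on this slide can be sketched in a few lines of Python. This is only an illustrative toy, not how any real Grid scheduler works: the `Resource` and `MetaScheduler` names, the CPU-count model, and the "most free CPUs wins" policy are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class Resource:
    """Hypothetical record a resource allocator might publish about one site."""
    name: str
    total_cpus: int
    busy_cpus: int = 0
    online: bool = True

    @property
    def free_cpus(self) -> int:
        # An offline resource contributes no capacity (crude fault tolerance).
        return self.total_cpus - self.busy_cpus if self.online else 0


class MetaScheduler:
    """Toy metascheduler: places jobs on whichever resource has spare capacity."""

    def __init__(self):
        self.resources = {}

    def update(self, resource: Resource) -> None:
        # Resources can be added or refreshed at any time (dynamic updates).
        self.resources[resource.name] = resource

    def submit(self, cpus: int):
        # Only resources with enough excess capacity are candidates.
        candidates = [r for r in self.resources.values() if r.free_cpus >= cpus]
        if not candidates:
            return None  # no excess usable capacity anywhere
        best = max(candidates, key=lambda r: r.free_cpus)
        best.busy_cpus += cpus
        return best.name
```

Note that `submit` returns `None` when no site has spare CPUs, which is the slide's last point in miniature: the scheme only works if excess capacity exists.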

  9. The electricity distribution grid analogy • Prior to the development of national distribution systems local generators were responsible for power production • Funds to purchase a new generator were frequently difficult to come by • Different systems could operate with different standards • Hindered deployment of electricity driven systems • Arrival of national grid system provided • Standardized service • More capacity • Universally accessible

  10. Fundamental differences between computational and power grids • Connectivity: • A few thousand power stations on production side connect to billions of appliances on the consumption side • Several million computers in academia but production/consumption difference is largely lost • Market: • Grid is seen to be a ubiquitous and egalitarian source of computing – cost is shared across users • Motivated in part by Gates’s argument that hardware costs will continue to fall • This philosophy will undoubtedly change WHEN Grid becomes more commercial • No overwhelming economic need for Grid in computing • Power Grid is absolutely vital in a modern economy though • Power grid is used extensively in energy trading • Leads to significant instability under load • Hopefully TCP/IP gets around this problem for Grid

  11. Segue: Decentralized Power Generation? • Some energy analysts suggest that it would be far better to follow the internet model for power generation! • Individual consumers are more prepared to absorb costs of generation equipment they “own” • Consider Ontario’s current problem: 40 billion dollar investment required to reinvigorate generation • 4 million households – net investment of 10,000 dollars each • Could supply approx 4000 MW from renewables, but it doesn’t come close to the 20000 MW average consumption

  12. More realistic perspective on Grid access • How about “access computing resources like we access Web content”? • We have no idea where a website is, or on what computer or operating system it runs • Two interrelated opportunities • Enhance economy, flexibility, access by virtualizing computing resources • Deliver entirely new capabilities by integrating distributed resources

  13. Virtualization (figure) • Application Virtualization: automatically connect applications to services; dynamic & intelligent provisioning • Infrastructure Virtualization: dynamic & intelligent provisioning; automatic failover • Figure layers: Applications (Delivery), Application Services (Distribution), Servers (Execution) Source: The Grid: Blueprint for a New Computing Infrastructure (2nd Edition), 2004

  14. Virtual Organizations • Grid is designed to emphasize cooperation and a dynamic administration structure • Individuals come and go within a community • Communities tend to be long term objects • Virtual Organization: set of individuals and/or institutions who agree upon resource sharing and share a common goal • e.g. Cycle providers, storage providers, high energy physics collaboration • VOs vary enormously both in size and scope • Underlying hope is that enough commonality exists between VOs that a general purpose grid architecture is useful for all VOs

  15. Examples of successful Grid applications • To date all successful grid applications share one common feature • They are all embarrassingly parallel applications • No communication is required to any process other than the computational server • Parameter searches and data analysis are naturally the most popular applications • Examples: • Prime number searches • SETI@home: Spectral analysis of radio data • Climateprediction.net: Simulation of different climate models • Folding@home: protein folding simulation comparing various different drugs • fightAIDS@home: Reviews drug candidates using AutoDock software
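A minimal sketch of what "embarrassingly parallel" means in practice: each work unit receives a parameter from the coordinator and returns a result, and never talks to any other work unit. The `evaluate` cost function and `parameter_search` helper are hypothetical, and a local thread pool stands in for the remote workers a real project such as SETI@home would use.

```python
from concurrent.futures import ThreadPoolExecutor


def evaluate(param):
    """Hypothetical work unit: score one parameter independently of all others."""
    return param, (param - 7) ** 2  # toy cost function with its minimum at 7


def parameter_search(params, max_workers=4):
    # The coordinator hands out parameters and collects results. Work units
    # never exchange data with each other, only with this coordinator.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        results = list(pool.map(evaluate, params))
    # Return the (parameter, cost) pair with the lowest cost.
    return min(results, key=lambda r: r[1])
```

Because the work units are independent, the order in which they finish does not matter, which is why this class of problem tolerates slow, unreliable, widely distributed workers.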

  16. Peer-to-peer networks compared to Grid • P2P and Grid share a lot of common ideals • Both are services communicating by messages on shared resources • P2Ps tend to be more dynamic than Grid (Grid resources are usually quite static) • P2P applications are long-lived (i.e. everyone on the network shares a similar goal of file sharing) • Grid applications tend to be transient • P2Ps often tend to be very fault tolerant • Multiple redundancy tends to be built in • Lack of security is a significant difference between P2Ps and Grid • P2Ps don’t support the idea of VOs effectively (but nothing to stop individuals organizing themselves)

  17. Grid Benefits: Improved resource sharing • A large amount of computation is used ineffectively • Weather forecasting is close to optimal – requires ~10^17 flops, and is distributed to 10^7 people • If everyone performed a weather forecast we would need 10^24 flops per day per region! • Achieving better distribution of results and cooperation is key to this idea working • Origin of the word “collaboratory”
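The weather-forecast figures on this slide are a one-line multiplication: one shared forecast costs roughly 10^17 operations, so 10^7 people each running their own would cost 10^17 x 10^7 = 10^24. A small sketch of that arithmetic, with constant and function names chosen only for illustration:

```python
# Order-of-magnitude figures quoted on the slide.
FLOP_PER_FORECAST = 10**17   # cost of one regional forecast
PEOPLE_PER_REGION = 10**7    # people who receive that forecast


def redundant_cost(flop_per_forecast=FLOP_PER_FORECAST,
                   people=PEOPLE_PER_REGION):
    """Total cost if every person ran the forecast instead of sharing one."""
    return flop_per_forecast * people
```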

  18. Idle capacity • This is a fundamental enabler for computation on grids • There must be excess capacity available for the idea to work • Estimates suggest that most academic computers are used on computation for 30% of their operating time • Only true for desktop type machines – production clusters frequently subscribe at 90+% utilization • Unclear how much excess capacity is really available for computation • Certainly works in a global environment for production machines • Machines in nighttime zone may well be sitting idle

  19. The Computational Grid • The Computational Grid is the infrastructure that will enable computation to be carried out in a manner equivalent to power distribution grids • Two main components • Hardware – both compute servers and networking • Software – For the idea to be practical all users must have common standards • Achieving standardization is extremely difficult • Competing ideals • Creating climate of cooperation is also difficult • Users have to give up a certain level of control of their local systems

  20. The Data Grid • Counterpart to the computation intensive “Computational Grid” • Fundamental problem is coordinated management of data (and related computation) • Not simply about moving data quickly • Need coordination without centralized control • More than just a network – provides new services • To date biggest driver behind data grids is particle physics • Large detectors create enormous databases that must be distributed to various sites for analysis • Ideally there would be no distinction between data and computational Grids

  21. Global LHC Data Grid Hierarchy (figure) • ~10s of Petabytes/yr by 2007-8; ~1000 Petabytes in < 10 yrs? • Levels and rates in figure order: CMS Experiment Online System, 0.1 - 1.5 GBytes/s; CERN Computer Center (Tier 0), 10-40 Gb/s; Tier 1, 2.5-10 Gb/s; Tier 2, 1-2.5 Gb/s; Tier 3 (Physics caches); Tier 4 (PCs) • Site labels: Korea, Russia, UK, USA; U Florida, Caltech, UCSD, FIU, Maryland, Iowa

  22. Grid versus Grids • Some people envision Grid as “…the web on steroids…” • Actually a fairly accurate description – as we’ll see in part 2 • Massive connectivity while important, is a collective function • Local networks that are imbued with services related to Grid can be viewed as a local Grid • “The Grid” can therefore be called a Grid of Grids • This hierarchy concept is being sold strongly by Vendors • Departmental grids are aggregated into institutional grids and so on

  23. Grid paradigm is overloaded • Global Grids: multiple enterprises, owners, platforms, domains, file systems, locations, and security policies (Legion, Avaki, Globus) • Enterprise “Grids”: single enterprise; multiple owners, platforms, domains, file systems, locations, and security policies (SUN SGE EE, Platform Multicluster) • Cluster & Departmental “Grids”: single owner, platform, domain, file system and location (SUN SGE, Platform LSF, PBS) • Desktop Cycle Aggregation: desktop only (United Devices, Entropia, Data Synapse) • WARNING! Not everything that has “G” in the name is Grid! (SGE, Oracle 10g, Condor-G etc) Graph borrowed from A.Grimshaw

  24. Who will use Grid? • Government • Both federal and local • Healthcare • Academics • The new computational economy • Insurance industry is a prime example

  25. Government usage • Expected to be a comparatively small group of individuals who need large scale computation for solutions to • Disaster response • National defense • Planning and long-term research • Production environments for weather prediction are less relevant • Little excess capacity available on these machines

  26. Healthcare • Telerobotic surgery is potentially life saving • Last resort when surgeon cannot reach patient • Remote cardiac analysis and monitoring by specialists • More detailed analysis of medical data-sets using Grid resources • Cross referencing of past case histories becomes feasible • Epidemiological simulations

  27. Academia • Driver for Grid is coming directly from academics • High Energy Physics – Next generation Large Hadron Collider will produce Exabytes of data by 2015 • Chemistry – vast amounts of data about thousands of chemical compounds is poorly stored and difficult to retrieve • CombeChem will provide a highly organized framework for storing and searching this data • Engineering – DAME • Engine health monitoring system with distributed databases • Environmental Science – GODIVA • Remote visualization of oceanography data

  28. Computational Market Economy • This will probably prove to be the “killer app” for Grid • Financial modelling and risk analysis • Interactive gaming • Graphics rendering • Compute resource providers • Will computational resources become a commodity?

  29. Leading adopters (Oct 2003) * • Financial services: 31% • Life sciences: 26% • Manufacturing: 18% Early Commercial Applications (figure, built on “Gridified” Infrastructure) • Manufacturing: Mechanical/Electronic Design, Process Simulation, Finite Element Analysis, Failure Analysis • Financial Services: Derivatives Analysis, Statistical Analysis, Portfolio Risk Analysis • LS / Bioinformatics: Cancer Research, Drug Discovery, Protein Folding, Protein Sequencing • Energy: Seismic Analysis, Reservoir Analysis • Entertainment: Digital Rendering, Massive Multi-Player Games, Streaming Media • Other: Web Applications, Weather Analysis, Code Breaking/Simulation, Academic *Grids 2004: From Rocket Science To Business Service, The 451 Group. Grid Services Market Opportunity 2005 (figure). Sources: IDC, 2000 and Bear Stearns- Internet 3.0 - 5/01 Analysis by SAI

  30. Fundamental Problems for Grid • Political • Users are strongly inclined toward control of facilities they have purchased • Similarly, users are less inclined to run programs on facilities which they have little control over • Especially when programs take a long time to run (several weeks) • Some people have championed the idea for their own (local) gains

  31. Grid: Bad Public Relations • Biggest hurdle is lack of understanding and terrible PR • Grid is different things to different people • Egregious claims have been made for what Grid could do • Biggest problem remains the belief that Grid could provide a single enormous metacomputer – this is largely a pointless concept • Some individuals call networks with scheduling software “a Grid” – not true, Grids are about infrastructure plus services • Hardware and software firms are utilizing it as a buzzword to generate FUD about being “left behind”

  32. Grid Hype

  33. Improving usability • Currently users must expend considerable time and effort to develop apps • Must use specialist environment e.g. • Need to transition from a low-level to high-level environment • This is one of the key problems in software development • Many difficulties to address • Dealing with resource allocation changes • Perhaps need better abstraction • Definite need for better code sharing and reuse (consider HTML example) • Fault tolerance • Interoperability • Overall design is helped by drawing up a layering of users and their requirements

  34. What might a grid protocol stack look like? (figure, layers listed top to bottom) • Application • Collective: “Coordinating multiple resources”: ubiquitous infrastructure services, app-specific distributed services • Resource: “Sharing single resources”: negotiating access, controlling use • Connectivity: “Talking to things”: communication (Internet protocols) & security • Fabric: “Controlling things locally”: Access to, & control of, resources • Shown alongside for comparison, the Internet Protocol Architecture: Application, Transport, Internet, Link • Layering is a fundamental component of Grid – complexity necessitates it!

  35. Fabric layer • Provides the local services of a given resource • Computational, storage, network… • In contrast to the OSI and IP protocol stacks, physical connectivity is no longer the lowest level • Virtualization of resources forces them to the lowest level

  36. Connectivity layer • Defines core communication and authentication protocols • Conduit for data exchanges between fabric layer resources • This is where security becomes important

  37. Resource Layer • Enables sharing of a single resource by many users • Built on top of the connectivity layer and calls fabric layer functions to control and access resources

  38. Collective Layer • Coordinates interactions across multiple resources • Ties everything together, including multiple resources and services

  39. Application layer • Users make use of the layers defined beneath them to perform operations within a VO
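The five layers described on the preceding slides can be sketched as a stack of small Python classes, each calling only the layer beneath it. This is a toy model for intuition only: the class names, the CPU-count resource model, and the trusted-user set used for authentication are assumptions of the example, not part of any Grid specification.

```python
class Fabric:
    """'Controlling things locally': one local resource, e.g. a compute site."""

    def __init__(self, name, free_cpus):
        self.name = name
        self.free_cpus = free_cpus

    def run(self, cpus):
        if cpus > self.free_cpus:
            raise RuntimeError(f"{self.name}: not enough free CPUs")
        self.free_cpus -= cpus
        return f"ran on {self.name}"


class Connectivity:
    """'Talking to things': communication plus authentication."""

    def __init__(self, trusted_users):
        self.trusted_users = set(trusted_users)

    def call(self, user, operation, *args):
        if user not in self.trusted_users:
            raise PermissionError(f"unknown user {user!r}")
        return operation(*args)


class ResourceLayer:
    """'Sharing single resources': controlled access to one fabric resource."""

    def __init__(self, fabric, connectivity):
        self.fabric = fabric
        self.connectivity = connectivity

    def submit(self, user, cpus):
        # Built on connectivity; calls a fabric function to do the work.
        return self.connectivity.call(user, self.fabric.run, cpus)


class Collective:
    """'Coordinating multiple resources': chooses among several resources."""

    def __init__(self, resources):
        self.resources = resources

    def submit(self, user, cpus):
        for res in self.resources:
            if res.fabric.free_cpus >= cpus:
                return res.submit(user, cpus)
        raise RuntimeError("no resource has enough free CPUs")


def application(collective, user):
    """Application layer: a VO member simply uses the layers beneath."""
    return collective.submit(user, cpus=8)
```

The point of the layering is visible in `application`: the user asks for 8 CPUs and never learns which site ran the job or how it was authenticated.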

  40. Grid Security • Security is one of the dominant concerns in Grid development • Increased usability means increased access for processes • User authentication becomes paramount, however it must not be overly pervasive (usability is lost!) • Virus and worm transfer has the potential to be greatly inflated in a grid environment • Potential for Denial of Service (DoS) attacks being launched is worrisome, but job scheduling tags mean things can be traced effectively • But what if the hacker could hide their jobs in the scheduler?

  41. Security Terminology • Authentication: establishing identity • Authorization: establishing rights to resources and services • Message protection • Integrity and confidentiality • Non-repudiation • Concerns over digital signatures and deniability • Certificate Authority • Trusted third party which issues digital certificates for use by others
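The first three terms on this slide (authentication, authorization, message integrity) are easy to confuse, so here is a small Python sketch that keeps them separate. It is a toy under loud assumptions: it uses a shared-secret HMAC rather than the certificate-based scheme that the Certificate Authority entry implies, and the `SECRETS` and `RIGHTS` tables are hypothetical stand-ins for a credential store and an access-control list.

```python
import hashlib
import hmac

SECRETS = {"alice": b"alice-secret"}   # hypothetical credential store
RIGHTS = {"alice": {"cluster_a"}}      # hypothetical access-control list


def sign(user_secret, message):
    """Produce a tag that proves who sent the message and that it is unaltered."""
    return hmac.new(user_secret, message, hashlib.sha256).hexdigest()


def authenticate(user, message, signature):
    """Authentication + integrity: is this really `user`, and is the message intact?"""
    secret = SECRETS.get(user)
    if secret is None:
        return False
    return hmac.compare_digest(sign(secret, message), signature)


def authorize(user, resource):
    """Authorization: does this (already identified) user have rights here?"""
    return resource in RIGHTS.get(user, set())


def handle_request(user, resource, message, signature):
    if not authenticate(user, message, signature):
        return "rejected: authentication failed"
    if not authorize(user, resource):
        return "rejected: not authorized"
    return "accepted"
```

A shared secret cannot provide non-repudiation, since the verifier could have produced the same tag; that property needs the digital signatures the slide mentions.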

  42. Factors that contribute to the security problem • Problems being studied on grids may well be sensitive, and the resources used highly valuable • Resources may have different usage guidelines and policies • Separate administrative domains! • Organization of resources is non-trivial • Not a simple client/server relationship • Delegation will be necessary • Any security standards must be both applicable and readily available • Need to be integrated into many different tools

  43. Classifying Grid users

  44. Open Grid Services Architecture – Putting Everything Together • Underlying concept of grid is virtualization • Applications run on virtual machines • Users form virtual organizations • Orienting design toward providing services helps to enable this approach • OGSA is designed to address a number of key issues for Grid • Utility and interoperability • Management of distributed services • Builds on web services standards • Becoming a hot topic for vendors – “Grid compliance” See “The Physiology of the Grid: An Open Grid Services Architecture for Distributed Systems Integration”, Foster, Kesselman, Nick, Tuecke, 2002

  45. Standardization Open Grid Services Architecture Domain-Specific Services Program Execution Data Services Core Services Open Grid Services Infrastructure WS Resource Framework Web Services Messaging, Security, Etc.

  46. Global Grid Forum • www.gridforum.org • Leads standardization effort • Several thousand individuals from industry and academia participate • Spans the gamut from researchers building Grid software to end users • All participants have equal voice • Modeled after Internet standards process • Focused on ensuring “best practices” become widespread • Publishes 4 types of documents • “Informational”: dispersal of information about useful technologies • “Experimental”: results of research • “Community Practice”: inform and influence community as approaches or processes become accepted by consensus • “Recommendations”: technical specifications

  47. Summary Part 1: The Grid Is … • A collaboration & resource sharing infrastructure for scientific applications • A distributed service integration and management technology • A disruptive technology that enables a virtualized, collaborative, distributed world • An open source technology & community • A marketing slogan • All of the above

  48. Part 2: Globus • Globus is the emerging standard for Grid software infrastructure • Good and bad points • “Microsoft for the Grid” • Open Source • Backed by IBM • Evolving on a short time scale • GT1: 1998, GT4: 2005

  49. Globus overview • Globus project is a joint venture between Argonne Nat. Lab and USC • Primarily performs infrastructure development • Have developed a prototype computational grid (GUSTO) • Most significant contribution is the Globus Toolkit • Comprehensive set of applications covering a gamut of issues from security to management of grids • Can co-exist with other software environments • Designed to bridge gaps between differences in local environments • Designed fundamentally to be a tool-based approach • Avoids imposing specific solutions on users

  50. Globus Hourglass – Simplify whenever possible • Focus on specifics of the Grid architecture • Provide a simple set of fundamental services which facilitate the basic infrastructure • Build high-level local solutions from the fundamental services • Design principles • Local control must be maintained • Minimize the cost of participation • Provide adaptable toolkit – do not over specify or lose ability to be flexible • “IP hourglass” model (figure, top to bottom): Applications, Diverse global services, Core Globus services, Local OS
