
A Framework for Grid and Utility Computing

KCCMG Fall IMPACT 2005. A Framework for Grid and Utility Computing: Automatic Provisioning and Load Management for Server Farms and Blade Servers. Joe Bell and Bob Equitz, Fujitsu Computer Systems.


Presentation Transcript


  1. KCCMG - Fall IMPACT 2005: A Framework for Grid and Utility Computing. Automatic Provisioning and Load Management for Server Farms and Blade Servers. Joe Bell, Bob Equitz, Fujitsu Computer Systems

  2. TRADEMARKS: SuperDome and HP Open View Automation Manager are trademarks of HP. PRIMEPOWER, PRIMEQUEST and Adaptive Services Control Center are trademarks of Fujitsu Ltd., Fujitsu Computer Systems and Fujitsu Siemens Corp. eServer, pSeries, p5, zSeries, z990, Tivoli Provisioning Manager (TPM), Tivoli Intelligent Orchestrator (TIO), JES2, JES3, NJE, YES/MVS, S/360 and SYSPLEX are trademarks of IBM. Sun Fire E25K, N1 Grid System, N1 Grid Service Provisioning System (GSPS), N1 Provisioning Server Blades Edition, N1 Grid Engine, and N1 System Manager are trademarks of SUN. OpForce is a trademark of Veritas. AutoStart is a trademark of EMC (Legato). All automated provisioning and virtualization concepts, and notes relating to such, in this presentation are based on Fujitsu Siemens Corporation documentation material.

  3. What are we going to cover? - Joe Bell -
  • Define Grid and/or Utility Computing
  • What's wrong with today's typical environment
  • The factors pushing Grid and Utility designs
    • Business drivers
    • Scientific & engineering
  • Why can't we just make everything run faster?
    • Speed of light
    • Moore's Law
    • Parallelism
    • Bottlenecks

  4. What are we going to cover? - Bob Equitz -
  • Review today's environment
  • What are the new technologies we need for Utilities/Grids?
  • Why virtualize, and what is the general process?
    • Remove boundaries
    • Implement management
    • Assign/provision services
  • The road map to Autonomic Systems
    • The process to maintain Autonomic Systems
    • Why they are important to our goals
  • General configuration end-to-end
  • Product review
  • Summary

  5. What is Grid or Utility Computing Anyway? Bell's simple-minded approach – start with the basics. Merriam-Webster – Grid: a framework, or lattice or network or net or web ... 1 : GRATING 2 a (1) : a perforated or ridged metal plate used as a conductor in a storage battery (2) : an electrode consisting of a mesh or a spiral of fine wire in an electron tube (3) : a network of conductors for distribution of electric power; also : a network of radio or television stations b : a network of uniformly spaced horizontal and perpendicular lines (as for locating points on a map); also : something resembling such a network

  6. What is Grid or Utility Computing Anyway? Merriam-Webster – Utility: usefulness, service, convenience, function ... 1 : fitness for some purpose or worth to some end 2 : something useful or designed for use 3 a : PUBLIC UTILITY b (1) : a service (as light, power, or water) provided by a public utility (2) : equipment or a piece of equipment to provide such service or a comparable service 4 : a program or routine designed to perform or facilitate especially routine operations (as copying files or editing text) on a computer

  7. What is Grid or Utility Computing Anyway? Bell's feeble mind – a Computing or IT Grid Utility: a network of heterogeneous computer server and storage platforms, providing the framework and infrastructure for the useful, convenient and functional computational services that support a given set of business and/or scientific IT requirements. Or roll your own variation within the previous definition boundaries.

  8. Is this really new stuff?
  • Shared Disk – Loosely Coupled Systems
  • What were JES2/NJE, JES3?
  • Home-grown load balancers?
  • Resource affinity scheduling?
  • Serially reusable resource controls across multiple systems?
  • SYSPLEX
  • YES/MVS

  9. Today's computing geography – static and unshared islands
  • Dedicated IT systems
  • Inflexible use of resources
  • Low resource utilization
  • Hard to manage
  • High TCO, low ROI

  10. Business Imperative: Increase Utilization

  11. Other Business Requirements & Issues
  • Databases / files too large for the required nightly linear or sequential massaging / reorganization ... split them up onto more isolated systems.
  • Peak processing requirements demanding huge amounts of CPU that sit unused during off-peak periods.
  • Outage costs that are prevented with duplicate idle hardware resources and software licenses (HA clustering).
  • Transactions that must process too much data to meet SLAs – more smarts needed in the application or middleware, or a very big CPU for a relatively small gain.
  • OSs and program products that don't manage more than one instance very well, and/or don't share resources (all for one – one for all).
  All of the above and more push IT toward lowered resource utilization or additional resources that are isolated and under-utilized – 180° away from the goal of higher utilization!

  12. Business Objectives [charts: Service before Grid Utility – typical bimodal; Service with Grid Utility; Utilization before Grid Utility; Utilization after Grid Utility]
  • Better utilization of hundreds of pooled systems becomes more achievable.
  • Administering and maintaining workloads – provisioning

  13. Today's Low Agility & Efficiency of IT
  • "To prevent the data center from consuming the entire IT budget, increased manageability and utilization through standardization and automation are essential" – Source: 2003 META Group, The Data Center of the Future
  • 75% of all IT staff is absorbed by maintaining the existing IT – Source: Andy Butler, Gartner Group, April 2004
  • Utilization of UNIX/Windows servers is low (< 25% over 24 hours across all servers) – Source: 2003 META Group, The Data Center of the Future

  14. Scientific and Engineering Requirements
  • The traditional scientific paradigm: first do theory on paper, then perform experiments to confirm or deny the theory.
  • The traditional engineering paradigm: first do a design, then build a laboratory prototype.
  • These paradigms are being replaced by numerical experiments and digital prototyping – why?
    • Real phenomena are too complicated to model on paper (e.g. climate prediction).
    • Real experiments are too hard, too expensive, too slow, or too dangerous for a laboratory, e.g. oil reservoir simulation, large wind tunnels, overall aircraft design, galactic evolution, whole factory or product life-cycle design and optimization, weather prediction, nuclear fusion control, etc.

  15. Why even use a grid parallel process? Scientific and engineering problems requiring the most computing power to simulate are commonly called "Grand Challenges" or "largest problems". For example, predicting the climate 50 years ahead is estimated to require computers computing at the rate of 1 TFLOP and with a memory size of 1 TB.
  • 1 MFLOP = 10^6 floating point operations per second
  • 1 GFLOP = 10^9 floating point operations per second
  • 1 TFLOP = 10^12 floating point operations per second

  16. A stake in the ground – weather forecasting
  • Weather prediction for one week requires 56 GFLOPS. Climate prediction for 50 years requires 4.8 TFLOPS.
  • The grid resolution used in climate codes today is 4 degrees of latitude by 5 degrees of longitude, or about 450 km by 560 km. A near-term goal is to improve this resolution to 2 degrees by 2.5 degrees, which is four times as much data.
  • NASA has launched weather satellites expecting to collect 1 TB of data per day for a period of years, totaling > 6 PB (10^15 bytes = petabytes) of data over time. No existing system is large enough to store this data today. The Sequoia 2000 Global Change Research Project is concerned with building this database.
  • http://appl.nasa.gov/pdf/61537main_eosdis_case_study_602904.pdf
  • Some other sites: http://www.cio.noaa.gov/hpcc/ and http://www.noaa.gov/
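
  The quoted volumes can be sanity-checked with a couple of lines of arithmetic. The sketch below is not from the presentation; it assumes decimal units (1 PB = 1000 TB) and simply re-derives the "four times as much data" claim from the resolution change.

```python
# Sanity check of the slide's data-volume figures (assumed: decimal units).
TB_PER_DAY = 1          # quoted satellite data rate
TARGET_PB = 6           # quoted "> 6 PB" archive size

days = TARGET_PB * 1000 / TB_PER_DAY
print(f"~{days / 365:.1f} years at 1 TB/day to accumulate 6 PB")   # ~16.4 years

# Halving the grid spacing in both latitude and longitude quadruples the
# number of grid cells, hence "four times as much data".
cells_before = (180 / 4) * (360 / 5)      # 4 deg x 5 deg grid
cells_after  = (180 / 2) * (360 / 2.5)    # 2 deg x 2.5 deg grid
print(f"{cells_after / cells_before:.0f}x more grid cells")        # 4x
```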

  17. Why Parallelism is Essential. The clock speed keeps increasing – can't we just push it all the way up to 1 THz? (1 flop/Hz would give 1 TFLOPS.) No – the speed of light sets a limit on the speed of a computer. Assume a completely sequential computer with 1 TB of memory running at 1 TFLOP. If the data has to travel a distance d to get from the memory to the CPU, and it has to travel this distance 10^12 times per second at the speed of light c = 3x10^8 m/s, then d <= 3x10^8 / 10^12 = 0.3 mm. So the computer theoretically has to fit into a 0.3 mm cube. Now consider the 1 TB memory. Memory is conventionally built as a planar grid of bits, in our case say a 10^6 x 10^6 grid of words. If this grid is 0.3 mm by 0.3 mm, then one word occupies about 3 Angstroms (Å) by 3 Angstroms (0.3x10^-3 m / 10^6 per side), or the size of a typical atom. Getting close to 3 Angstroms? 1 nm = 10 Å; 45 nm (current Fujitsu leading-edge chip etching) = 450 Å; 450 Å / 3 Å => about 150 atoms across the etched feature.
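
  The slide's back-of-envelope argument is easy to reproduce. A minimal sketch, using only the constants quoted above (speed of light, 1 TFLOP rate, a 10^6 x 10^6 word grid, a 45 nm feature):

```python
# Reproduce the speed-of-light limit argument from the slide above.
C = 3.0e8        # speed of light, m/s
RATE = 1.0e12    # 1 TFLOP = 1e12 operations per second

max_distance_m = C / RATE                      # farthest memory can sit from the CPU
print(f"max CPU-to-memory distance: {max_distance_m * 1e3:.2f} mm")   # ~0.30 mm

# Pack a 1 TB memory as a 1e6 x 1e6 planar grid of words into that 0.3 mm square:
word_pitch_m = max_distance_m / 1.0e6
print(f"space per word: {word_pitch_m * 1e10:.1f} Angstroms")          # ~3 A, atomic scale

# Compare with a 45 nm process feature (Fujitsu leading edge in 2005):
feature_angstroms = 45e-9 * 1e10               # 450 Angstroms
print(f"45 nm feature is roughly {feature_angstroms / 3:.0f} atom widths")   # ~150
```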

  18. How Small Can We Go? From the beginning to the present: on the left, an early computing machine built from mechanical gears; on the right, a state-of-the-art IBM chip with 0.25 micron features. The production version will contain 200 million transistors. http://www.qubit.org/library/intros/nano/nano.html

  19. Nanocomputing with Quantum Effects. The transition from microtechnology to nanotechnology. The structure on the right is a single-electron transistor (SET), carved by the tip of a scanning tunneling microscope (STM). According to classical physics, there is no way that electrons can get from the 'source' to the 'drain', because of the two barrier walls on either side of the 'island'. But the structure is so small that quantum effects occur, and electrons can, under certain circumstances, tunnel through the barriers (but only one electron at a time can do this!). Thus the SET wouldn't work without quantum mechanics. http://www.qubit.org/library/intros/nano/nano.html

  20. Fast and Parallel
  • As of January 1996, the fastest machine was an Intel Paragon with 6768 processors and a peak speed of 50 megaflops per processor, for an overall peak speed of 6768 x 50 = 338 GFLOPS. Doing Gaussian elimination, the machine got 281 GFLOPS on a 128600 x 128600 matrix; the whole problem takes 84 minutes.
  • The Linpack Benchmark (a component of SPECFP) sorts all machines by the speed with which they can solve systems of linear equations Ax = b, of various dimensions, using Gaussian elimination. The Netlib repository has a long list of computers, together with performance benchmark information.
  • As of June 2005, the fastest machine on the TOP500 list (see Top500.org) is the IBM Blue Gene, with a peak speed of 183 TFLOPS.
  • TRIPS, the Tera-op Reliable Intelligently adaptive Processing System: "Our goal is to exploit concurrency, ..." – Defense Advanced Research Projects Agency, Polymorphous Computing Architectures project. DARPA, which is contributing $15.4 million to TRIPS, is looking for a chip that can scale to 1 trillion sustained operations (tera-op) per second on many applications. http://www.computerworld.com/hardwaretopics/hardware/story/0,10801,104911,00.html?source=NLT_EMC&nid=104911
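
  The Paragon numbers quoted above are consistent with the usual (2/3)·n^3 operation count for Gaussian elimination; the sketch below just redoes that arithmetic (a check of the slide's figures, not a benchmark).

```python
# Re-derive the Intel Paragon figures quoted on the slide.
def ge_time_minutes(n: int, sustained_flops: float) -> float:
    """Time to solve an n x n dense system at a given sustained rate."""
    flops = (2.0 / 3.0) * n ** 3        # standard Gaussian-elimination count
    return flops / sustained_flops / 60.0

peak = 6768 * 50e6                       # 6768 processors x 50 MFLOPS each
print(f"peak: {peak / 1e9:.0f} GFLOPS")                       # ~338 GFLOPS

# At the sustained 281 GFLOPS measured on the 128600 x 128600 matrix:
print(f"time: {ge_time_minutes(128600, 281e9):.0f} minutes")  # ~84 minutes
```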

  21. Writing Fast Programs is Hard. Where do the FLOPS go? Why does the speed depend so much on the problem size? The answer lies in understanding the memory hierarchy. All computers, even cheap ones, have looked something like this since the IBM S/360 (the 3rd generation): [memory-hierarchy diagram, from fast registers at the top down to large, slow disk at the bottom]

  22. Writing Fast Programs is Hard
  • The memory at the bottom level of the hierarchy, disk, is large, slow and cheap.
  • Useful work, such as floating point operations, can only be done on data at the top of the hierarchy.
  • Transferring data among levels is slow – much slower than the rate at which we can do useful work on data in the registers. In fact, this data transfer is the bottleneck in almost all computation and numerical analysis: more time is spent moving data in the hierarchy than doing useful work.
  • These are the non-compute-related tasks that significantly limit the scalability of compute clusters; thus improving them offers future potential.

  23. Writing Fast Programs is Hard
  • Good algorithmic designs require keeping active data near the top of the hierarchy for as long as possible, as well as minimizing movement between levels.
  • For many problems, like Gaussian elimination, only if the problem is large enough is there enough work to do at the top of the hierarchy to mask the time spent transferring data between lower levels – otherwise, you're no better off than with a few sequential processors.
  • The more processors one has, the larger the problem has to be to mask this transfer time. These mechanisms are inherently inefficient.
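
  One way to see why "large enough" matters: for dense linear algebra the arithmetic grows like n^3 while the data only grows like n^2, so the flops available to hide each word of memory traffic grow linearly with n. A minimal sketch of that ratio (illustrative only, not tied to any particular machine):

```python
# Arithmetic intensity of Gaussian elimination: work grows as n^3, data as n^2,
# so the flops available per word moved -- the slack that masks transfer time --
# grows linearly with the problem size n.
def flops_per_word(n: int) -> float:
    work = (2.0 / 3.0) * n ** 3   # Gaussian-elimination flop count
    data = float(n * n)           # words in the matrix
    return work / data

for n in (100, 1_000, 10_000, 100_000):
    print(f"n = {n:>7}: ~{flops_per_word(n):,.0f} flops per word of matrix data")
```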

  24. Writing Fast Programs is Hard – Moore's Law
  • Speeds of basic microprocessors grow by approximately a factor of 2 every 18 months, because the number of transistors doubles every 18 months.
  • One of the reasons Moore's Law holds is that microprocessor manufacturers are adopting many of the tricks of parallel computing and accounting for memory hierarchies.
  • Getting the peak speed from the processor is becoming increasingly difficult.
  • Facet: there is no way around the issue today without radical new technology.

  25. To COMPUTE or To COMMUNICATE?
  • Which takes longer always depends upon the application in hand, the speed of the processor-memory architecture, and the speed of the network.
  • For a given problem, any of the above can be a huge "bottleneck" – whether in business or scientific computing.
  • The bottleneck can be reduced, at least partially, by introducing large SMP-based entities as elements of the Grid/Utility, with massive interconnect backbones (e.g., HP SuperDome, Fujitsu PRIMEPOWER & PRIMEQUEST, eServer p5 series, zSeries z990, Sun Fire E25K), to reconcile these mutually exclusive grid design constraints.
  • Analyze the requirement for speed and availability versus costs: several large SMPs versus large clusters of 1U commodity servers, both arranged into a grid structure. There is potential to produce a hybrid of both – a good R&D project!

  26. Pig in Mud. Which one would you challenge? "Arguing with an engineer is like wrestling in mud with a pig: after a while you realize the pig likes it!!" – Mark Simmons, Sr. Consulting Engineer and Marketing Product Specialist, FCS. James Montgomery Doohan (March 3, 1920 – July 20, 2005) was an Irish-Canadian character and voice actor best known for his portrayal of Scotty in the television and movie series Star Trek.

  27. Today's computing geography – static and unshared islands
  • Inefficient
  • Over / under provisioned
  • Hard to manage
  • Inflexible
  Remember: this is where most of us are today ...

  28. Required Core Technologies
  • Virtualization – separation of business applications and data from the need for dedicated technology
  • Automation – automatic adjustment of platforms and infrastructure to changes in operation & environment
  • Integration – low-cost, low-risk implementations & upgrades, re-usable technology, unified processes and services as validated product integration templates

  29. Business Efficiency through Virtualization
  • IT resources are shared, not isolated as in today's "islands of computing" model
  • Business priorities determine the allocation of IT resources
  • Service levels are predictable and consistent, despite the unpredictable demands for IT services
  [diagram: applications and services running over server virtualization and storage virtualization]

  30. Remove Server Boundaries – pooling and sharing of the overall resource [diagram: Services A–E spread across the pooled servers]

  31. Remove Server Boundaries, Consolidate Storage – pooling and sharing of the overall resource [diagram: Services A–E on pooled servers and consolidated storage]

  32. Remove Server Boundaries, Consolidate Storage, Establish Overall Management – pooling and sharing of the overall resource [diagram: Services A–E under overall management]

  33. Remove Server Boundaries, Consolidate Storage, Establish Overall Management, Assign Services – pooling and sharing of the overall resource [diagram: Services A–E assigned onto the pool]

  34. Automatic provisioning & load management for Applications [diagram: application QoS metrics, workload graph, application instances, resource allocation]

  35. QoS Monitoring & Management [chart: QoS metric over time against a target metric range bounded by a High Water Mark and a Low Water Mark]
  • When the measured QoS metric exceeds the specified maximum acceptable value, allocate more satellite nodes and deploy the needed application to meet the QoS target.
  • When the measured QoS metric is below the specified minimum acceptable value (too many resources), perform an orderly shutdown of some instances, reducing cost and freeing the resources for other work.
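
  The water-mark policy above amounts to a small control loop. The sketch below only illustrates that loop; the function and parameter names are invented here and are not the API of any product discussed later.

```python
# Illustrative high/low water-mark scaling decision (hypothetical names).
def adjust_instances(qos_metric: float,
                     high_water: float,
                     low_water: float,
                     instances: int,
                     min_instances: int = 1) -> int:
    """Return the new application-instance count for one monitoring pass."""
    if qos_metric > high_water:
        # Metric (e.g. response time) exceeds the maximum acceptable value:
        # allocate another satellite node and deploy the application to it.
        return instances + 1
    if qos_metric < low_water and instances > min_instances:
        # Too many resources for the load: shut one instance down in an
        # orderly way and free the node for other work.
        return instances - 1
    return instances                 # metric is inside the target range

# Example: 950 ms response time against a 200-800 ms target band, 3 instances.
print(adjust_instances(950, high_water=800, low_water=200, instances=3))   # -> 4
```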

  36. Virtualization: On the way to Autonomic Systems [staircase diagram; axes: FLEXIBILITY and COST]
  • Resource management – standardization, consolidation, automation
  • Virtualization – dynamic provisioning, allocation policies, consistent QoS
  • Autonomic Systems – self configuring, self optimizing, self protecting, self healing

  37. Autonomic system monitoring & control – managing the autonomic cycle [diagram]
  • The autonomic cycle runs Monitoring → Measures → Event Generation → Event Handling → Execution against the managed resources (HW, FW, OS, middleware, application, system), providing self configuring, self healing, self optimizing and self protecting behavior.
  • Monitoring reads each resource's Interface for Monitoring: SNMP values, commands, system parameters, ...
  • Event Generation applies event rules and thresholds, ...
  • Event Handling applies "autonomic" rules, scripts and policies, ...
  • Execution acts through the Interface for Control: commands, SNMP set, ...
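
  The cycle on this slide can be expressed as a short monitor → generate events → handle → execute loop. The sketch below is a generic illustration under assumed names (the thresholds, event names and actions are invented for the example, not the rule syntax of any product):

```python
# Minimal autonomic cycle: Monitoring supplies measures, Event Generation
# compares them with thresholds, Event Handling picks a policy, Execution
# applies the action through the resource's control interface.
from typing import Callable, Dict, List

thresholds = {"cpu_utilization": 90.0, "missed_heartbeats": 1}

# Policies: event name -> action (self optimizing / self healing responses).
policies: Dict[str, Callable[[], None]] = {
    "cpu_overload": lambda: print("execute: provision an additional node"),
    "service_down": lambda: print("execute: restart the failed service"),
}

def autonomic_cycle(measures: Dict[str, float]) -> None:
    events: List[str] = []
    if measures.get("cpu_utilization", 0.0) > thresholds["cpu_utilization"]:
        events.append("cpu_overload")                  # event generation
    if measures.get("missed_heartbeats", 0) >= thresholds["missed_heartbeats"]:
        events.append("service_down")
    for event in events:                               # event handling
        policies[event]()                              # execution

autonomic_cycle({"cpu_utilization": 95.0, "missed_heartbeats": 0})
```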

  38. Importance of Autonomic Functions. Benefits include: 1) fast and unattended adaptation of the IT infrastructure to changing business requirements; 2) automated monitoring and immediate reaction to changing workloads, without operator intervention and with lower operating risks; 3) no changes to applications or operating systems are required – Linux, Windows, etc. and their applications are managed transparently; 4) better response to changing demands; 5) easier accommodation of SLAs. Total effect: further reduction of wasted or over-utilized resources and a reduction in personnel monitoring and manually adjusting the systems, hence a further reduction of TCO per unit of work accomplished.

  39. Tie it all together – an End-to-End Solution. Adaptive Service Manager:
  • Automatic provisioning
  • Efficient management of large environments
  • Automatic workload management
  • Automated policy-based management
  • Centralized, fully automated operating system and application deployment
  • Single-image administration
  [diagram: PRIMERGY blades (e.g. BX600) plus a spare node on client, control and storage networks; shared storage (NAS) holding data areas and images; a deployment server with console, inventory, policies and actions; a terminal server; client monitoring, OS deployment, adaptation, restart and remote deploy]

  40. Products for Investigation
  • Fujitsu Siemens Adaptive Services Control Center 1.1 automatically provisions, deploys, monitors, and allocates resources – controlling load, utilization, and service level and quality metrics for each application service, per user requirements.
  • HP Open View Automation Manager is a data center automation solution that extends the Open View Change and Configuration Management solution to automatically re-provision resources in accordance with business priorities. Automation Manager runs under Windows and supports Windows and Linux servers.
  • IBM Tivoli Provisioning Manager (TPM) combined with Tivoli Intelligent Orchestrator (TIO) automates tasks in anticipation of, or in response to, changing conditions. TIO manages pooled resources and prioritizes allocations; TPM provisions resources; TIO monitors performance and decides what actions to take in order to maintain committed application service levels.

  41. Products for Investigation
  • EMC (Legato) AutoStart is a cluster solution integrating EMC's suite of storage products with application availability. AutoStart supports automated switching of servers, networks, and data.
  • SUN N1 Grid System, "a collection of architectures, products, and services...". Products available today: N1 Grid Service Provisioning System (GSPS), N1 Provisioning Server Blades Edition, N1 Grid Engine, and N1 System Manager. GSPS automates application provisioning on Solaris, Linux, AIX, and Windows servers.
  • Veritas OpForce is based on software acquired from Jareva Technologies in 2003. Veritas positions OpForce for server automation and provisioning, and for managing the IT resource lifecycle. OpForce automates tasks associated with controlling, provisioning, and updating heterogeneous data center environments, including bare-metal discovery, resource pooling, and application and OS software deployment.

  42. Summary
  • Requirements for Utility/Grid computing are driven by business TCO pressures and by scientific/engineering problem-solving requirements.
  • A full-featured framework for Utility/Grid computing provides high availability, high scalability, disaster recovery, automatic provisioning, and support for enterprise applications.
  • Multi-platform support: Solaris, Linux, Windows, VMware, ...
  • ... and easy installation / operation with instrumentation: self managing, self scaling, self healing, self adapting, automatic configuration updates.
