Grids, utility computing and a perspective on the future of IT infrastructure

Grids, utility computing and a perspective on the future of IT infrastructure. Washington Area CTO Forum March 31, 2006 Nirav Kapadia nhkapadia@gmail.com. Outline. Characterizing computing grids Grids as intended versus what we see today Common types of grids today


Presentation Transcript


  1. Grids, utility computing and a perspective on the future of IT infrastructure Washington Area CTO Forum March 31, 2006 Nirav Kapadia nhkapadia@gmail.com

  2. Outline • Characterizing computing grids • Grids as intended versus what we see today • Common types of grids today • Putting computing grids to work • Types of problems addressed by today’s grids • Operational considerations in deploying a grid • A perspective on the future of IT infrastructure • Cost pressures and technology commoditization • Grid and utility computing: the technology enablers

  3. Grids came about from a need for large scale, collaborative computing • Scale is measured in terms of users, nodes, organizations, geography, and heterogeneity • A grid in the strict sense of the word involves a large number of heterogeneous, shared resources • Collaboration is measured in terms of resource sharing and interoperability • A key characteristic is the ability to manage across organizational boundaries

  4. Systems for large scale, collaborative computing must meet key criteria • Group A • Scalable with users and resources • Support for heterogeneity • Group B • Support for interoperability • Scalable with geographical distances • Group C • Fully distributed (federated) architecture • Ability to compartmentalize along organizational boundaries • Diagram labels: Broad definition of computing grid; Strict definition of computing grid

  5. Many commercial grid solutions only meet the broad definition of a grid • Cluster management systems • Typically harness clusters of dedicated servers • Examples include Platform LSF, Sun Grid Engine • CPU-scavenging “master-slave” applications • Typically take advantage of idle desktop cycles • Examples include SETI@Home, distributed.net
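The CPU-scavenging "master-slave" model on slide 5 can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not how SETI@Home or distributed.net are implemented: local threads stand in for idle desktops, the per-unit computation (squaring a number) is a placeholder, and the names `worker` and `run_master` are hypothetical.

```python
import queue
import threading


def worker(name, work_units, results):
    """Pull work units until none remain, as an idle desktop would."""
    while True:
        try:
            unit = work_units.get_nowait()
        except queue.Empty:
            return  # no more work: the worker simply goes quiet
        # Placeholder compute step; a real application would do far more.
        results.put((unit, name, unit * unit))


def run_master(units, worker_names):
    """Master side: publish work units, wait for workers, collect results."""
    work_units = queue.Queue()
    results = queue.Queue()
    for unit in units:
        work_units.put(unit)
    threads = [
        threading.Thread(target=worker, args=(name, work_units, results))
        for name in worker_names
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    # All workers have finished, so the results queue is stable to drain.
    collected = {}
    while not results.empty():
        unit, _name, value = results.get()
        collected[unit] = value
    return collected
```

The workers pull work rather than having it pushed to them, which is what lets a desktop join or leave without the master tracking its state.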

  6. Many commercial grid solutions only meet the broad definition of a grid • Application-specific, custom-built grids • Typically built around a key business function • Examples include Acxiom, Oracle offerings

  7. Today, solutions that meet the strict definition of a grid have to be “built” • Grid solutions based on the Globus toolkit • Several vendors have Globus-based offerings • Univa Corp is commercializing Globus • Other grid solutions in academia and research • Most are custom-built and target a specific problem • Typically not appropriate for commercial use (today)

  8. Key takeaways • A grid is a distributed computing system that enables large scale, collaborative computing • Scalable across a large number of diverse and geographically dispersed resources • Many commercial “grid solutions” of today do not meet the strict definition of a grid • Limited ability to manage policies and resources across administrative boundaries

  9. Outline • Characterizing computing grids • Grids as intended versus what we see today • Common types of grids today • Putting computing grids to work • Types of problems addressed by today’s grids • Operational considerations in deploying a grid • A perspective on the future of IT infrastructure • Cost pressures and technology commoditization • Grid and utility computing: the technology enablers

  10. Even today’s grids can benefit users with large scale computing needs • High throughput computing (HTC) • Many independent (non-communicating) tasks • Large problems that break up into manageable, independent tasks • High performance computing (HPC) • Large problem that is not decomposable into manageable, independent tasks
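The high throughput pattern on slide 10 (a large problem broken into manageable, independent tasks) can be illustrated with a short Python sketch. It assumes a toy problem, a sum of squares chosen purely for illustration, and uses a local thread pool as a stand-in for grid nodes; `split_into_tasks`, `run_task`, and `run_htc` are hypothetical names, not part of any grid product.

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_tasks(n, chunk_size):
    """Break the range [0, n) into independent (start, stop) chunks."""
    return [(start, min(start + chunk_size, n)) for start in range(0, n, chunk_size)]


def run_task(bounds):
    """One independent task: needs no communication with any other task."""
    start, stop = bounds
    return sum(i * i for i in range(start, stop))


def run_htc(n, chunk_size=1000, workers=4):
    """Farm the chunks out to a pool and combine the partial results."""
    tasks = split_into_tasks(n, chunk_size)
    # Assumption: a local thread pool stands in for the nodes of a grid.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(run_task, tasks))
    return sum(partials)
```

Because no task depends on another, the chunks can run in any order and on any node. A high performance computing problem, by contrast, would need the tasks to exchange data while running, which this structure does not support.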

  11. High throughput computing is common in business environments • Large, legacy applications are best served by cluster management systems • Compute-intensive apps are preferable but a mix of compute- and data-intensive apps is manageable • Customizable apps that work on small slices of data work well with CPU-scavenging grids • Apps must be compute-intensive and preferably run within a sandbox

  12. High performance computing is seen more in targeted environments • Applications involving multiple, communicating tasks typically require custom-designed grid environments • Examples include Oracle grid offering and some test beds built with Globus • Other examples include distributed computing platforms such as PVM and MPI

  13. So… you’re ready to deploy a grid computing environment… • As with any other technology, there are several operational considerations… • Resources on the grid – dedicated or shared? • Access management – who needs access to what? • Data management – how does data get to the grid? • Security model employed by the grid

  14. Resources on the grid – should they be dedicated or shared? • Cluster Mgmt Systems • Cluster management systems work best with dedicated resources • Condor – from the U of Wisconsin – is a notable exception, but not commercially available • CPU-scavenging grids • As the name implies, resources are shared – and typically involve desktops • A custom screen saver is the most common vehicle for running the grid application

  15. Access management – who needs (gets) access to what? • Cluster Mgmt Systems • Option #1: jobs run in a guest account • Shared access across jobs • Option #2: accounts for everyone on all machines • Homogeneous uid pool highly recommended • Logins typically disabled • CPU-scavenging grids • Option #1: jobs run with user’s privileges • If downloaded by user • Option #2: jobs run in guest account • If set up by administrator • No direct remote user access to desktop

  16. Data management – how does data get to the apps? • Cluster Mgmt Systems • Transfer user-specified files via ftp, scp, etc. • File staging for large data • On-demand file transfer (system call traps) • Shared file systems • CPU-scavenging grids • Data embedded within application or retrieved via HTTP/Java call-backs • Limited data, typically no files
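The file staging option listed for cluster management systems on slide 16 follows a stage-in, run, stage-out cycle, which can be sketched as below. This is an illustrative sketch only: a local copy with `shutil` stands in for ftp or scp to a remote node, and `run_with_staging`, its parameters, and the `job` callable are hypothetical names introduced here.

```python
import shutil
import tempfile
from pathlib import Path


def run_with_staging(input_files, job, output_names, dest_dir):
    """Stage inputs into a scratch directory, run the job, stage outputs back."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    with tempfile.TemporaryDirectory() as scratch:
        scratch_dir = Path(scratch)
        # Stage in: a local copy stands in for ftp/scp to a remote node.
        for src in input_files:
            shutil.copy(src, scratch_dir / Path(src).name)
        # Run: the job sees only the scratch directory.
        job(scratch_dir)
        # Stage out: copy back only the outputs the user asked for.
        staged = []
        for name in output_names:
            produced = scratch_dir / name
            if produced.exists():
                staged.append(Path(shutil.copy(produced, dest / name)))
        return staged
```

Requiring the user to name inputs and outputs up front is what distinguishes staging from the on-demand and shared file system options on the same slide, where the job reaches for data as it runs.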

  17. Security model – user accountability is key today • Diagram labels: Access management (capability control); Opportunities for subversion • Systems shown: distributed.net, SETI@Home, etc; Globus; Java, PCCs; Condor; LSF, PBS, SGE; Unix; Ideal Grid • Other labels: Custom Applications; Source Code Modifications; Object Code Modifications; Basic system and kernel safeguards; Unchanged Binaries; Application Executable; Application Generation; Application Users; Run Time Environment

  18. Key takeaways • Today’s commercially available grid solutions primarily target high throughput computing • Cluster management systems and CPU-scavenging grids are the most common • Carefully consider the policy implications of grids in terms of access and data management • More of a concern for grids that span subnets or firewalls

  19. Outline • Characterizing computing grids • Grids as intended versus what we see today • Common types of grids today • Putting computing grids to work • Types of problems addressed by today’s grids • Operational considerations in deploying a grid • A perspective on the future of IT infrastructure • Cost pressures and technology commoditization • Grid and utility computing: the technology enablers

  20. Even as grids take hold, the IT landscape is changing rapidly… • Technology is rapidly being commoditized • Businesses are more willing and able to shop for IT services • In-house IT infrastructure is increasingly seen as complex and rigid © Harvard Business Review

  21. IT infrastructure is already a commodity from a business view • Outsourcing is pervasive; and standards-based, open systems are increasingly common • Cost pressures will continue driving businesses to streamline IT infrastructure • More often than not, customized in-house IT systems stand out for their cost and complexity • Common off-the-shelf solutions provide more value in the absence of direct competitive advantage

  22. In time, economics will drive IT infrastructure out of the enterprise • The technology enablers for this paradigm exist today, but are still nascent • (True) grids offer a way to manage computing resources across organizational boundaries • Utility computing solutions bring together grids, data center automation, and virtualization

  23. The technology implications of these changes are enormous • Computing infrastructure needs to become transparent to end users • Users only interact with applications and data • Policy management needs to be decoupled from system management • Cannot assume users can be held accountable • Components of computing systems need to be less tightly coupled • CPU, OS, data, apps may all be in different, remote locations

  24. A utility computing test bed at Purdue showcases this paradigm • Operating since 1995; now a joint development effort between Purdue and U of Florida • By 2001, allowed 3,000+ users from 30 countries to run ~100 applications in a utility environment • Extensively validated: ~400,000 runs (by 2001); highly peaked usage profile • Powers online simulations in the nanoHUB.org portal for the nanotechnology community

  25. nanoHUB.org – remote access to simulators and compute power • Real users and real usage: >10,687 users • Diagram (nanoHUB infrastructure) labels: Internet; nanoHUB.org Web site; Remote desktop (VNC); Physical Machine; Virtual Machine; Condor-G; Globus; TeraGrid Cluster; NMI Cluster • Slide courtesy of Gerhard Klimeck, Network for Computational Nanotechnology

  26. Custom computing environment assembled in real time • Inside nanoHUB.org • Diagram labels: Web Portal; Application Repositories; OS Repositories; Data Vaults; CPU Farms; Local Services; Utility Services; PUNCH; Virtual Machine

  27. In conclusion… • Today’s commercially available grids provide a valuable but narrow service • More efficient computing in a closed environment; limited support for cross-organizational sharing • In time, grid and utility computing technologies will move IT infrastructure out of the enterprise • Virtualization and data center automation products are visible precursors

  28. Questions? Comments? Email: nhkapadia@gmail.com
