Chapter 4:- Introduction to Grid and its Evolution

Chapter 4: Introduction to Grid and its Evolution. Prepared by: NITIN PANDYA, Assistant Professor, SVBIT. Overview: Background (What is the Grid?), Related technologies, Grid applications, Communities, Grid Tools, Case Studies.

    Presentation Transcript
    1. Chapter 4:-Introduction to Grid and its Evolution Prepared By:- NITIN PANDYA Assistant Professor SVBIT.

    2. Overview • Background: What is the Grid? • Related technologies • Grid applications • Communities • Grid Tools • Case Studies

    3. What is a Grid? • Many definitions exist in the literature • Early definitions: Foster and Kesselman, 1998: “A computational grid is a hardware and software infrastructure that provides dependable, consistent, pervasive, and inexpensive access to high-end computational capabilities” • Kleinrock, 1969: “We will probably see the spread of ‘computer utilities’, which, like present electric and telephone utilities, will service individual homes and offices across the country.”

    4. Grid computing (1) “Coordinated resource sharing and problem solving in dynamic, multi-institutional virtual organisations” (I. Foster)

    5. Grid computing (2) • Information grid • large access to distributed data (the Web) • Data grid • management and processing of very large distributed data sets • Computing grid • meta computer

    6. Parallelism vs grids: some reminders • Grids date back “only” to 1996 • Parallelism is older! (first classification in 1972) • Motivations: • need more computing power (weather forecasting, atomic simulation, genomics…) • need more storage capacity (petabytes and more) • in a word: improve performance! • 3 ways: • Work harder --> use faster hardware • Work smarter --> optimize algorithms • Get help --> use more computers!

    7. The performance? Ideally it grows linearly • Speed-up: • if $T_S$ is the best time to process a problem sequentially, • then the parallel processing time should be $T_P = T_S / P$ with $P$ processors • speedup $= T_S / T_P$ • the speedup is limited by Amdahl's law: any parallel program has a purely sequential part $F$ and a parallelizable part $T_{\parallel}$, so $T_S = F + T_{\parallel}$, • thus the speedup is limited: $\text{speedup} = \frac{F + T_{\parallel}}{F + T_{\parallel}/P} < P$ • Scale-up: • if $T_{P,S}$ is the time to solve a problem of size $S$ with $P$ processors, • then $T_{P,S}$ should also be the time to process a problem of size $n \cdot S$ with $n \cdot P$ processors
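
The speedup bound on this slide can be checked numerically. The following is a minimal sketch, assuming the sequential part F and the parallelizable part are given as times in the same unit; the function name and the sample values are illustrative and not part of the original slides.

```python
def amdahl_speedup(sequential_time: float, parallel_time: float, processors: int) -> float:
    """Return (F + T_par) / (F + T_par / P), the speedup predicted by Amdahl's law."""
    if processors < 1:
        raise ValueError("processors must be >= 1")
    total = sequential_time + parallel_time
    return total / (sequential_time + parallel_time / processors)


if __name__ == "__main__":
    # Illustrative workload: 10 time units sequential, 90 parallelizable.
    for p in (1, 2, 8, 64, 1024):
        print(p, round(amdahl_speedup(10.0, 90.0, p), 2))
```

With these sample values the speedup never reaches 10, however many processors are added, because the sequential 10 units always remain.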

    8. Why do we need Grids? • Many large-scale problems cannot be solved by a single computer • Globally distributed data and resources

    9. Background: Related technologies • Cluster computing • Peer-to-peer computing • Internet computing

    10. Cluster computing • Idea: put some PCs together and get them to communicate • Cheaper to build than a mainframe supercomputer • Different sizes of clusters • Scalable – can grow a cluster by adding more PCs

    11. Cluster Architecture

    12. Peer-to-Peer computing • Connect to other computers • Can access files from any computer on the network • Allows data sharing without going through central server • Decentralized approach also useful for Grid

    13. Peer to Peer architecture

    14. Internet computing • Idea: many idle PCs on the Internet • Can perform other computations while not being used • “Cycle scavenging” – rely on getting free time on other people’s computers • Example: SETI@home • What are advantages/disadvantages of cycle scavenging?

    15. Some Grid Applications • Distributed supercomputing • High-throughput computing • On-demand computing • Data-intensive computing • Collaborative computing

    16. Grid Users • Many levels of users • Grid developers • Tool developers • Application developers • End users • System administrators

    17. Some Grid challenges • Data movement • Data replication • Resource management • Job submission

    18. Computational grid • “Hardware and software infrastructure that provides dependable, consistent, pervasive and inexpensive access to high-end computational capabilities” (I. Foster) • Performance criteria: • security • reliability • computing power • latency • throughput • scalability • services

    19. Grid characteristics • Large scale • Heterogeneity • Multiple administrative domains • Autonomy… and coordination • Dynamicity • Flexibility • Extensibility • Security

    20. Levels of cooperation in a computing grid • End system (computer, disk, sensor…) • multithreading, local I/O • Cluster • synchronous communications, DSM, parallel I/O • parallel processing • Intranet/Organization • heterogeneity, distributed admin, distributed FS and databases • load balancing • access control • Internet/Grid • global supervision • brokers, negotiation, cooperation…

    21. Basic services • Authentication/Authorization/Traceability • Activity control (monitoring) • Resource discovery • Resource brokering • Scheduling • Job submission, data access/migration and execution • Accounting

    22. Layered Grid Architecture (By Analogy to Internet Architecture) • Grid layers, top to bottom: • Application • Collective: “Coordinating multiple resources”: ubiquitous infrastructure services, app-specific distributed services • Resource: “Sharing single resources”: negotiating access, controlling use • Connectivity: “Talking to things”: communication (Internet protocols) & security • Fabric: “Controlling things locally”: access to, & control of, resources • Internet Protocol Architecture layers, top to bottom: • Application • Transport • Internet • Link • From I. Foster

    23. Resources • Description • Advertising • Cataloging • Matching • Claiming • Reserving • Checkpointing

    24. Resource management (1) • Services and protocols depend on the infrastructure • Some parameters • stability of the infrastructure (same set of resources or not) • freshness of the resource availability information • reservation facilities • multiple resource or single resource brokering • Example of request: I need from 10 to 100 CE each with at least 512 MB RAM and a computing power of 150 Mflops
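
The example request on this slide can be read as a simple matching problem. The sketch below is only an illustration under stated assumptions: "CE" is taken to mean a computing element, the 150 Mflops figure is treated as a minimum, and `ComputeElement` and `match_request` are hypothetical names, not part of any Grid toolkit.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class ComputeElement:
    """Hypothetical description of one advertised computing element."""
    name: str
    ram_mb: int
    mflops: float


def match_request(available: List[ComputeElement], min_count: int, max_count: int,
                  min_ram_mb: int, min_mflops: float) -> List[ComputeElement]:
    """Return between min_count and max_count suitable elements, or [] if too few match."""
    suitable = [ce for ce in available
                if ce.ram_mb >= min_ram_mb and ce.mflops >= min_mflops]
    if len(suitable) < min_count:
        return []  # the request cannot be satisfied with the current pool
    return suitable[:max_count]


if __name__ == "__main__":
    # Made-up pool: every third node has too little RAM.
    pool = [ComputeElement(f"node{i}", 256 if i % 3 == 0 else 1024, 200.0)
            for i in range(30)]
    chosen = match_request(pool, min_count=10, max_count=100,
                           min_ram_mb=512, min_mflops=150.0)
    print(len(chosen), "elements selected")
```

A real broker would also have to deal with stale availability information and reservations, which this sketch ignores.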

    25. Resource management and scheduling (1) • Levels of scheduling • job scheduling (global level ; perf: throughput) • resource scheduling (perf: fairness, utilization) • application scheduling (perf: response time, speedup, produced data…) • Mapping/Scheduling process • resource discovery and selection • assignment of tasks to computing resources • data distribution • task scheduling on the computing resources • (communication scheduling)

    26. Resource management and scheduling (2) • Individual performance goals are not necessarily consistent with global (system) performance! • Grid problems • predictions are not definitive: dynamicity! • heterogeneous platforms • checkpointing and migration

    27. A Resource Management System Example (Globus) • Application: submits RSL (Resource Specification Language) • Broker: RSL specialization, using queries & info from the Information Service • Co-allocator: receives ground RSL • GRAM (one per resource): receives simple ground RSL • Local resource managers: LSF, Condor, NQE • LSF: Load Sharing Facility (task scheduling and load balancing; developed by Platform Computing) • NQE: Network Queuing Environment (batch management; developed by Cray Research)

    28. Resource information (1) • What is to be stored? • virtual organizations, people, computing resources, software packages, communication resources, event producers, devices… • what about data? • A key issue in such dynamic environments • A first approach: (distributed) directory (LDAP) • easy to use • tree structure • distribution • static • mostly read; updates are not efficient • hierarchical • poor procedural language

    29. Resource information (2) • Goal: • dynamicity • complex relationships • frequent updates • complex queries • A second approach: (relational) database

    30. Programming on the grid: potential programming models • Message passing (PVM, MPI) • Distributed Shared Memory • Data Parallelism (HPF, HPC++) • Task Parallelism (Condor) • Client/server - RPC • Agents • Integration system (CORBA, DCOM, RMI)

    31. Program execution: issues • Parallelize the program with the right job structure, communication patterns/procedures, algorithms • Discover the available resources • Select the suitable resources • Allocate or reserve these resources • Migrate the data • Initiate computations • Monitor the executions ; checkpoints ? • React to changes • Collect results

    32. Data management • It was long forgotten !!! • Though it is a key issue ! • Issues: • indexing • retrieval • replication • caching • traceability • (auditing) • And security !!!

    33. Some Grid-Related Projects • Globus • Condor • Nimrod-G

    34. Globus Grid Toolkit • Open source toolkit for building Grid systems and applications • Enabling technology for the Grid • Share computing power, databases, and other tools securely online • Facilities for: • Resource monitoring • Resource discovery • Resource management • Security • File management

    35. Data Management in Globus Toolkit • Data movement • GridFTP • Reliable File Transfer (RFT) • Data replication • Replica Location Service (RLS) • Data Replication Service (DRS)

    36. GridFTP • High performance, secure, reliable data transfer protocol • Optimized for wide area networks • Superset of Internet FTP protocol • Features: • Multiple data channels for parallel transfers • Partial file transfers • Third party transfers • Reusable data channels • Command pipelining

    37. More GridFTP features • Auto tuning of parameters • Striping • Transfer data in parallel among multiple senders and receivers instead of just one • Extended block mode • Send data in blocks • Know block size and offset • Data can arrive out of order • Allows multiple streams
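
The idea behind extended block mode (each block carries its size and offset, so blocks may arrive out of order over several streams) can be sketched as follows. This is a simplified illustration, not the GridFTP wire format; `reassemble` and the `(offset, payload)` tuples are assumptions made for the example.

```python
from typing import Iterable, Tuple


def reassemble(blocks: Iterable[Tuple[int, bytes]], total_size: int) -> bytes:
    """Rebuild a file from (offset, payload) blocks that may arrive in any order."""
    buffer = bytearray(total_size)
    received = 0
    for offset, payload in blocks:
        end = offset + len(payload)
        if offset < 0 or end > total_size:
            raise ValueError(f"block at offset {offset} falls outside the file")
        buffer[offset:end] = payload  # the offset tells us where the block belongs
        received += len(payload)
    if received != total_size:
        raise ValueError("missing or duplicated blocks")
    return bytes(buffer)


if __name__ == "__main__":
    # Blocks as they might arrive from two parallel streams, out of order.
    arrived = [(6, b"world"), (0, b"hello ")]
    print(reassemble(arrived, total_size=11))
```

Because every block is self-describing, the receiver does not need to know which stream delivered it.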

    38. Striping Architecture • Use “Striped” servers

    39. Limitations of GridFTP • Not a web service protocol (does not employ SOAP, WSDL, etc.) • Requires client to maintain open socket connection throughout transfer • Inconvenient for long transfers • Cannot recover from client failures

    40. GridFTP

    41. Reliable File Transfer (RFT) • Web service with “job-scheduler” functionality for data movement • User provides source and destination URLs • Service writes job description to a database and moves files • Service methods for querying transfer status

    42. RFT

    43. Replica Location Service (RLS) • Registry to keep track of where replicas exist on physical storage systems • Users or services register files in RLS when files are created • Distributed registry • May consist of multiple servers at different sites • Increase scale • Fault tolerance

    44. Replica Location Service (RLS) • Logical file name – unique identifier for contents of file • Physical file name – location of copy of file on storage system • User can provide logical name and ask for replicas • Or query to find logical name associated with physical file location
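
The two lookups described on this slide (logical name to replicas, and physical location back to logical name) amount to a two-way mapping. The sketch below is a toy in-memory registry under that assumption; `ReplicaRegistry` and the example URLs are hypothetical and do not reflect the actual RLS interface.

```python
from collections import defaultdict
from typing import Dict, List, Optional, Set


class ReplicaRegistry:
    """Toy registry mapping logical file names to physical replica locations."""

    def __init__(self) -> None:
        self._replicas: Dict[str, Set[str]] = defaultdict(set)  # logical -> physical
        self._logical_of: Dict[str, str] = {}                   # physical -> logical

    def register(self, logical_name: str, physical_name: str) -> None:
        """Record that physical_name holds a copy of logical_name."""
        self._replicas[logical_name].add(physical_name)
        self._logical_of[physical_name] = logical_name

    def replicas(self, logical_name: str) -> List[str]:
        """Return all known physical locations, sorted for stable output."""
        return sorted(self._replicas.get(logical_name, ()))

    def logical_name(self, physical_name: str) -> Optional[str]:
        """Reverse lookup: which logical file does this location hold?"""
        return self._logical_of.get(physical_name)


if __name__ == "__main__":
    rls = ReplicaRegistry()
    # Made-up file name and sites, for illustration only.
    rls.register("climate-run-42", "gsiftp://site-a.example.org/data/run42.nc")
    rls.register("climate-run-42", "gsiftp://site-b.example.org/cache/run42.nc")
    print(rls.replicas("climate-run-42"))
    print(rls.logical_name("gsiftp://site-b.example.org/cache/run42.nc"))
```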

    45. Data Replication Service (DRS) • Pull-based replication capability • Implemented as a web service • Higher-level data management service built on top of RFT and RLS • Goal: ensure that a specified set of files exists on a storage site • First, query RLS to locate the desired files • Next, create a transfer request using RFT • Finally, register the new replicas with RLS
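
The three steps on this slide (locate, transfer, register) can be sketched as a single loop. This is a minimal illustration under assumptions: the registry is a plain dictionary standing in for RLS, `transfer` is a caller-supplied function standing in for an RFT request, and all names and URLs are hypothetical.

```python
from typing import Callable, Dict, Iterable, List


def replicate(logical_names: Iterable[str],
              registry: Dict[str, List[str]],
              destination: str,
              transfer: Callable[[str, str], None]) -> List[str]:
    """Ensure each logical file has a replica under destination; return new replicas."""
    created = []
    for name in logical_names:
        sources = registry.get(name, [])        # step 1: locate existing replicas
        if any(url.startswith(destination) for url in sources):
            continue                            # already present at the destination
        if not sources:
            raise LookupError(f"no replica known for {name!r}")
        target = f"{destination}/{name}"
        transfer(sources[0], target)            # step 2: request the transfer
        registry[name].append(target)           # step 3: register the new replica
        created.append(target)
    return created


if __name__ == "__main__":
    # Made-up registry contents and a transfer stub that only logs the request.
    registry = {"run42.nc": ["gsiftp://site-a.example.org/data/run42.nc"]}
    log = []
    new = replicate(["run42.nc"], registry, "gsiftp://site-b.example.org/cache",
                    lambda src, dst: log.append((src, dst)))
    print(new)
    print(log)
```

Calling `replicate` a second time with the same arguments creates nothing new, which matches the stated goal of ensuring the files exist at the site rather than copying them unconditionally.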

    46. Condor • Original goal: high-throughput computing • Harvest wasted CPU power from other machines • Can also be used on a dedicated cluster • Condor-G – Condor interface to Globus resources

    47. Earth System Grid • Provide climate studies scientists with access to large datasets • Data generated by computational models – requires massive computational power • Most scientists work with subsets of the data • Requires access to local copies of data

    48. ESG Infrastructure • Archival storage systems and disk storage systems at several sites • Storage resource managers and GridFTP servers to provide access to storage systems • Metadata catalog services • Replica location services • Web portal user interface

    49. Earth System Grid

    50. Earth System Grid Interface