Grid Technology: Introduction & Overview


Presentation Transcript


  1. Grid Technology: Introduction & Overview
     Ian Foster, Argonne National Laboratory and University of Chicago
     LCG, 13.3.2002

  2. Grid Technologies: Expanding the Horizons of HEP Computing
     Enabling thousands of physicists to harness the resources of hundreds of institutions in pursuit of knowledge
     Including New Zealand!

  3. The Grid Problem
     Resource sharing & coordinated problem solving in dynamic, multi-institutional virtual organizations

  4. Some “Large” Grid Issues (H. Newman)
     • Consistent transaction management
     • Query (task completion time) estimation
     • Queuing and co-scheduling strategies
     • Load balancing (e.g., Self Organizing Neural Network)
     • Error Recovery: Fallback and Redirection Strategies
     • Strategy for use of tapes
     • Extraction, transport and caching of physicists’ object-collections; Grid/Database Integration
     • Policy-driven strategies for resource sharing among sites and activities; policy/capability tradeoffs
     • Network Performance and Problem Handling
     • Monitoring and Response to Bottlenecks
     • Configuration and Use of New-Technology Networks, e.g. Dynamic Wavelength Scheduling or Switching
     • Fault-Tolerance, Performance of the Grid Services Architecture

  5. How Large is “Large”?
     • Is the LHC Grid
         • Just the O(10) Tier 0/1 sites and O(20,000) CPUs?
         • + the O(50) Tier 2 sites: O(40,000) CPUs?
         • + the collective computing power of O(300) LHC institutions: perhaps O(60,000) CPUs in total?
     • Are the LHC Grid users
         • The experiments and their relatively few, well-structured “production” computing activities?
         • The curiosity-driven work of 1000s of physicists?
     • Depending on our answer, the LHC Grid is
         • A relatively simple deployment of today’s technology
         • A significant information technology challenge

  6. The Problem: Resource Sharing Mechanisms That …
     • Address security and policy concerns of resource owners and users
     • Are flexible enough to deal with many resource types and sharing modalities
     • Scale to large numbers of resources, many participants, many program components
     • Operate efficiently when dealing with large amounts of data & computation

  7. Aspects of the Problem
     • Need for interoperability when different groups want to share resources
         • Diverse components, policies, mechanisms
         • E.g., standard notions of identity, means of communication, resource descriptions
     • Need for shared infrastructure services to avoid repeated development, installation
         • E.g., one port/service/protocol for remote access to computing, not one per tool/application
         • E.g., Certificate Authorities: expensive to run
     • A common need for protocols & services

  8. Hence, Grid Architecture Must Address
     • Development of Grid protocols & services
         • Protocol-mediated access to remote resources
         • New services: e.g., resource brokering
         • “On the Grid” = speak Intergrid protocols
         • Mostly (extensions to) existing protocols
     • Development of Grid APIs & SDKs
         • Interfaces to Grid protocols & services
         • Facilitate application development by supplying higher-level abstractions
     • The (hugely successful) model is the Internet

  9. Grid Architecture (layers, top to bottom)
     • Application
     • Collective: “Coordinating multiple resources”: ubiquitous infrastructure services, app-specific distributed services
     • Resource: “Sharing single resources”: negotiating access, controlling use
     • Connectivity: “Talking to things”: communication (Internet protocols) & security
     • Fabric: “Controlling things locally”: access to, & control of, resources
     For more info: www.globus.org/research/papers/anatomy.pdf
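The layering on this slide can be written down as a small ordered type. The sketch below is only an illustration: the names `GridLayer`, `ROLE`, and `layers_below` are hypothetical, and the pairing of each layer with its catchphrase assumes that every label on the slide belongs to the description printed just before it (Fabric at the bottom, Application at the top).

```python
from enum import IntEnum


class GridLayer(IntEnum):
    """Layers of the Grid architecture, numbered from the bottom up."""
    FABRIC = 1
    CONNECTIVITY = 2
    RESOURCE = 3
    COLLECTIVE = 4
    APPLICATION = 5


# Catchphrases from the slide; the Application layer has none there.
ROLE = {
    GridLayer.FABRIC: "Controlling things locally",
    GridLayer.CONNECTIVITY: "Talking to things",
    GridLayer.RESOURCE: "Sharing single resources",
    GridLayer.COLLECTIVE: "Coordinating multiple resources",
}


def layers_below(layer):
    """Return the layers that a given layer can build on, nearest first."""
    return [other for other in sorted(GridLayer, reverse=True) if other < layer]


if __name__ == "__main__":
    for layer in sorted(GridLayer, reverse=True):
        print(layer.name.title(), "-", ROLE.get(layer, "(user applications)"))
```

Using an ordered enum keeps the "builds on the layers beneath it" relation explicit: a Resource-layer protocol, for example, may rely on Connectivity and Fabric but not on Collective services.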

  10. HENP Grid Architecture (H. Newman)
     • Physicists’ Application Codes
         • Reconstruction, Calibration, Analysis
     • Experiments’ Software Framework Layer
         • Modular and Grid-aware: Architecture able to interact effectively with the lower layers (above)
     • Grid Applications Layer (Parameters and algorithms that govern system operations)
         • Policy and priority metrics
         • Workflow evaluation metrics
         • Task-Site Coupling proximity metrics
     • Global End-to-End System Services Layer
         • Monitoring and Tracking Component performance
         • Workflow monitoring and evaluation mechanisms
         • System self-monitoring, evaluation and optimization mechanisms

  11. Architecture (1): Fabric Layer
     • Diverse resources that may be shared
         • Computers, clusters, Condor pools, file systems, archives, metadata catalogs, networks, sensors, etc.
     • Speak connectivity, resource protocols
         • The neck of the protocol hourglass
     • May implement standard behaviors
         • Reservation, pre-emption, virtualization
     • Grid operation can have profound implications for resource behavior
     (Figure: a Grid resource reached through registration, enquiry, management, and access protocol(s))

  12. Architecture (2): Connectivity Layer Protocols & Services
     • Communication
         • Internet protocols: IP, DNS, routing, etc.
     • Security: Grid Security Infrastructure (GSI)
         • Uniform authentication & authorization mechanisms in multi-institutional setting
         • Single sign-on, delegation, identity mapping
         • Public key technology, SSL, X.509, GSS-API (several Internet drafts document extensions)
         • Supporting infrastructure: Certificate Authorities, key management, etc.
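Of the GSI functions listed on this slide, identity mapping is the easiest to illustrate: once a user has been authenticated by certificate, the site translates the certificate subject into a local account. The sketch below assumes a text file of `"<subject name>" <local account>` lines, a layout modeled on the map files used at Globus sites. The slide itself does not describe any file format, and the subject names, accounts, and function names here are all hypothetical.

```python
import shlex


def parse_mapfile(text):
    """Parse lines of the form: "<certificate subject>" <local account>."""
    mapping = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        parts = shlex.split(line)  # honours the quotes around the subject
        if len(parts) != 2:
            raise ValueError(f"malformed mapping line: {line!r}")
        subject, account = parts
        mapping[subject] = account
    return mapping


def map_identity(mapping, subject):
    """Return the local account for an authenticated subject, or None."""
    return mapping.get(subject)


# Hypothetical entries, for illustration only.
EXAMPLE = '''
# subject name                                   local account
"/O=Grid/OU=example.org/CN=Jane Physicist" jphys
"/O=Grid/OU=example.org/CN=Sam Analyst" sanalyst
'''

if __name__ == "__main__":
    table = parse_mapfile(EXAMPLE)
    print(map_identity(table, "/O=Grid/OU=example.org/CN=Jane Physicist"))
```

Because the mapping is local to each site, a user keeps one global identity (which is what makes single sign-on possible) while each resource owner retains control over which local account, if any, that identity may use.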

  13. Architecture (3): Resource Layer Protocols & Services
     • Resource management: GRAM
         • Remote allocation, reservation, monitoring, control of [compute=>arbitrary] resources
     • Data access: GridFTP
         • High-performance data access & transport
     • Information/monitoring
         • MDS: Access to structure & state information
         • GMA
     • & others: database access, code repository access, virtual data, …
     • All integrated with GSI

  14. Grid Services Architecture (4): Collective Layer Protocols & Services
     • Community membership & policy
         • E.g., Community Authorization Service
     • Index/metadirectory/brokering services
         • E.g., Globus GIIS, Condor Matchmaker, DAGMAN
     • Replica management and replica selection
         • E.g., GDMP
         • Optimize aggregate data access performance
     • Co-reservation and co-allocation services
         • End-to-end performance
     • Middle tier services
         • MyProxy credential repository, portal services
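Replica selection, one of the collective services named above, can be sketched as picking the copy of a file with the lowest estimated transfer time. This is a toy cost model and not the method of GDMP or any other listed tool: the `Replica` fields, the site names, and the "latency plus size over bandwidth" estimate are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Replica:
    site: str               # hypothetical site name
    bandwidth_mbps: float   # estimated bandwidth to the requester, megabits/s
    latency_s: float        # estimated startup delay, seconds


def estimated_seconds(replica, size_mb):
    """Estimate transfer time for a file of size_mb megabytes."""
    return replica.latency_s + size_mb * 8.0 / replica.bandwidth_mbps


def select_replica(replicas, size_mb):
    """Return the replica with the smallest estimated transfer time."""
    if not replicas:
        raise ValueError("no replicas registered for this file")
    return min(replicas, key=lambda r: estimated_seconds(r, size_mb))


if __name__ == "__main__":
    copies = [
        Replica("site-a", bandwidth_mbps=100.0, latency_s=0.5),
        Replica("site-b", bandwidth_mbps=400.0, latency_s=5.0),
    ]
    for size in (10, 1000):
        print(size, "MB ->", select_replica(copies, size).site)
```

Even this simple model shows why selection is a service and not a fixed choice: the nearby low-latency copy wins for small files, while the high-bandwidth copy wins for large ones.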

  15. Evolution of Grid Architecture
     • Up to 1998
         • Basic mechanisms: Authentication, virtualization, resource management, information/monitoring
         • Condor, Globus Toolkit, SRB, etc.
         • Early application experiments on O(60) site testbeds
     • 1999-2001
         • Data Grid protocols and services: GDMP, GridFTP, DRM, etc.
         • First experiences with production operation
     • 2002-
         • Further evolution in protocol base (Web services)
         • Higher-level services, reliability, scalability

  16. The Grid Information Problem
     • Large numbers of distributed “sensors” with different properties
     • Need for different “views” of this information, depending on community membership, security constraints, intended purpose, sensor type

  17. Grid Information Architecture
     • Registration & enquiry protocols, information models, query languages
         • Provides standard interfaces to sensors
         • Supports different “directory” structures supporting various discovery/access strategies
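The split between registration and enquiry can be sketched with a toy in-memory directory: sensors announce themselves with a few attributes, and clients ask for the subset that matches their "view". The `Directory` class, its method names, the attribute keys, and the time-to-live on registrations are all assumptions made for this illustration. None of them is taken from the slide or from a specific Grid information service.

```python
import time


class Directory:
    """Toy directory: sensors register with a lifetime and are found by attribute."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._entries = {}  # sensor name -> (attributes, expiry time)

    def register(self, name, attributes, ttl_s):
        """Add or refresh a sensor; the entry lapses after ttl_s seconds."""
        self._entries[name] = (dict(attributes), self._clock() + ttl_s)

    def enquire(self, **wanted):
        """Return names of live sensors whose attributes match every filter."""
        now = self._clock()
        return sorted(
            name
            for name, (attrs, expires) in self._entries.items()
            if expires > now
            and all(attrs.get(key) == value for key, value in wanted.items())
        )


if __name__ == "__main__":
    directory = Directory()
    directory.register("cpu-load@site-a", {"kind": "cpu", "site": "site-a"}, ttl_s=30)
    directory.register("disk-free@site-a", {"kind": "disk", "site": "site-a"}, ttl_s=300)
    print(directory.enquire(kind="cpu"))
```

Letting registrations expire unless refreshed is one simple way to cope with large numbers of sensors that come and go; different directories can then be built over the same registrations to serve different communities or purposes.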

  18. Web Services
     • “Web services” provide
         • A standard interface definition language (WSDL)
         • Standard RPC protocol (SOAP) [but not required]
         • Emerging higher-level services (e.g., workflow)
     • Nothing to do with the Web
     • Useful framework/toolset for Grid applications?
         • See proposed Open Grid Services Architecture
     • Represent a natural evolution of current technology
         • No need to change any existing plans
         • Introduce in phased fashion when available
         • Maintain focus on hard issues: how to structure services, build applications, operate Grids
     For more info: www.globus.org/research/papers/physiology.pdf
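To make "standard RPC protocol (SOAP)" concrete, the sketch below builds a SOAP 1.1 request envelope with Python's standard XML library. The envelope namespace is the SOAP 1.1 one. The service namespace `urn:example:jobs`, the operation `submitJob`, and its parameter are hypothetical and do not correspond to any real Grid service interface.

```python
import xml.etree.ElementTree as ET

SOAP_ENV = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope namespace


def build_request(service_ns, operation, params):
    """Return a SOAP 1.1 envelope (as text) calling `operation` with `params`."""
    envelope = ET.Element(f"{{{SOAP_ENV}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_ENV}}}Body")
    call = ET.SubElement(body, f"{{{service_ns}}}{operation}")
    for name, value in params.items():
        ET.SubElement(call, name).text = str(value)
    return ET.tostring(envelope, encoding="unicode")


if __name__ == "__main__":
    # Hypothetical service namespace, operation, and argument.
    print(build_request("urn:example:jobs", "submitJob", {"executable": "/bin/date"}))
```

The message is plain XML that any transport could carry, which fits the slide's remark that Web services have "nothing to do with the Web" as such.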

  19. Identifying and Addressing Technology Challenges
     1) Identify and correct critical technology challenges
         • We don’t know all of the problems yet
     2) Develop coherent Grid technology architecture
         • To conserve scarce resources; for experiments
     • Both challenges can be addressed by a pragmatic, experiential strategy
         • Build and run joint testbeds of increasing size
         • Gain experience “at scale”
         • Mix and match technologies
         • Coordinated projects to resolve problems

  20. Summary
     • We have a solid base on which we can build
         • Still learning how to deploy and operate
     • Success of LCG (and EDG, GriPhyN, PPDG, …) requires
         • Focused, methodical effort to deploy and operate
         • Continued iteration on core components
         • Collaborative design and development of higher-level services
         • Early adoption and experimentation by experiments
     • We are not alone in these endeavors
         • Dozens of other Grid projects worldwide
         • Significant and growing industrial participation
