SA1 Execution Plan Status and Issues

Ian Bird, EGEE Operations Manager. EGEE is proposed as a project funded by the European Union under contract IST-2003-508833.

Presentation Transcript


  1. SA1 Execution Plan: Status and Issues. Ian Bird, EGEE Operations Manager. EGEE is proposed as a project funded by the European Union under contract IST-2003-508833. All Activity Meeting, 14 January 2004.

  2. SA1 Objectives
     • Core Infrastructure services:
       • Operate essential grid services
     • Grid monitoring and control:
       • Proactively monitor the operational state and performance
       • Initiate corrective action
     • Middleware deployment and resource induction:
       • Validate and deploy middleware releases
       • Set up operational procedures for new resources
     • Resource provider and user support:
       • Coordinate the resolution of problems from both Resource Centres and users
       • Filter and aggregate problems, providing or obtaining solutions
     • Grid management:
       • Coordinate Regional Operations Centres (ROC) and Core Infrastructure Centres (CIC)
       • Manage the relationships with resource providers via service-level agreements
     • International collaboration:
       • Drive collaboration with peer organisations in the U.S. and in Asia-Pacific
       • Ensure interoperability of grid infrastructures and services for cross-domain VOs
       • Participate in liaison and standards bodies in the wider grid community

  3. Operations Infrastructure

  4. SA1 Management

  5. SA1 Partners
     • CERN (OMC, CIC)
     • UK+Ireland (CIC, ROC)
     • France (CIC, ROC)
     • Italy (CIC, ROC)
     • Germany+Switzerland (ROC)
     • Northern Europe (ROC)
     • South West Europe (ROC)
     • South East Europe (ROC)
     • Central Europe (ROC)
     • Russia (CIC - M12, ROC)
     • 48 partners involved in SA1
     • ROCs in several regions are distributed across many sites
     Organisational Issues
     • ROCs must take responsibility:
       • Chapter in execution plan
       • Organisation within region
       • Reporting to overall SA1
     • CICs vs ROCs
       • Need to clarify overlap/separation: dedicated discussion

  6. Milestones & Deliverables

  7. Work breakdown: tasks, products
     • Basic level:
       • OMC, CIC, ROC, Security
     • Next levels will depend on refinement of roles and responsibilities

  8. Training requirements
     • New people:
       • LCG, EDG, Globus: tutorials?
     • New sites:
       • LCG procedures, middleware, policies, operations
     • Longer term:
       • Grid services, OGSA and associated technologies, etc.

  9. Risk Analysis
     • People:
       • Need to identify unfunded effort now
       • Hiring process will not have people in place by April 1
       • Short-term contracts:
         • Tend to attract inexperienced people
         • Continuity
     • Middleware:
       • Response to operational needs
       • Packaging and installation not flexible enough
       • Scope creep of middleware: bug fixes vs functional changes
     • Infrastructures:
       • National infrastructures cannot integrate in EGEE
         • UK e-Science, NorduGrid
       • Must be addressed in execution plans
     • Use LCG Deployment and Security Risk Analyses
       • http://lcg.web.cern.ch/LCG/PEB/risk/
       • http://proj-lcg-security.web.cern.ch/proj-lcg-security/RiskAnalysis/risk.html
       • Quite comprehensive
       • Many issues in common

  10. Issues related to other activities
     • JRA1:
       • Understand requirements-setting process
       • Agree on split of responsibilities for integration/certification/testing
       • Understand support issues for LCG-2 based middleware
       • Discussions in progress
     • SA2:
       • No real issues for execution plan
       • Operationally will work together: GOC vs NOC, understanding of SLAs
     • Security:
       • Operational security: relationship to JRA3
       • Catch-all CA: whose responsibility? (JRA3?)
     • JRA2:
       • Project and task breakdown and reporting structures focussed on products and not services

  11. Changes requested to TA
     • No significant changes foreseen
     • Execution plan will address:
       • Refinement of roles (CIC, ROC, OMC, etc.)
       • Interactions between them
       • Resource and service delivery

  12. General Issues
     • Peering with US Grid infrastructures
       • Needs interaction from project start
     • Federation chapters should detail how national infrastructures integrate/migrate to EGEE
     • Hiring
       • (No) People to work on execution plan
       • Focus on getting LCG-2 running: EGEE will have operational infrastructure at project start
     • Need to agree that ROCs (or defined people in each region) have organisational responsibility for the region
       • Cannot deal with 48 partners directly
       • All aspects:
         • Negotiate SLAs with Resource Centres and with EGEE
         • Operational responsibility
         • Organisational responsibility: planning and reporting (effort reports, etc.)

  13. Execution plan
     • Execution plan is not well advanced:
       • Many people focussed on getting LCG-2 going
     • Outline exists: sections and contents quite well defined
       • Template agreed for chapters for each region
       • Input and ideas for some sections
     • Much work still to do:
       • Refine roles and responsibilities for ROC, CIC
       • How to approach SLAs
       • Do we need a “GDB” (OAG)? (ROC and CIC managers)

  14. Execution Plan - TOC
     • Overview: introduction
     • Organisation
     • Clarify roles, responsibilities, interactions
     • Processes
     • Security group
     • Policy group: relation to NA5
     • Federation organisation
     • Chapter per region -> template
     • Service set up and deployment
     • What services, functions
     • Relation to US, Asia projects
     • Transition process
     • New centres: process
     • Training
     • Policies
     • Operations and tools
     • Certification, middleware support
     • CA and VO management
     • Operational security
     • Metrics: measurement, validation of services
     • SLAs
     • Management
     • PBS
     • WBS
     • Staffing and resources plan
     • From regions
     • Training plan
     • Risk assessment
     • Deliverables and responsibilities

  15. Execution plan: ROC template
     • Resources: Resource centres
       • Resources: at month 1, 6, 15?
       • When RC joins EGEE
       • FTE for support (need > min)
       • Name system manager, security manager
       • Policy for:
         • Which VO supported …
         • System support and availability (on-call hours, etc.) …
       • Expertise available
       • Ability to run general grid services (IS, RLS, etc.)
     • ROC organization:
       • Organization: teams, etc.
       • People and responsibilities: must name unfunded people (and funded if known)
       • Support: how it is organized and managed, especially for distributed ROCs
       • Operational support: how region is organized and managed

  16. Steps to project start-up
     • Hiring
     • Refine roles and interactions between CIC, ROC, NOC and other operational organisations
     • Clarify interactions with JRA1
     • Clarify security issues
     • Clarify relationship to international grid projects for VOs that are larger than EGEE (LCG, other HEP, …)
     • Understand national infrastructure integration issues
       • What requirements will this place on operations, middleware, processes, security, etc.?
     • Set up basic policy frameworks:
       • How VOs access resources, limitations, rules, etc.: sufficient that infrastructure can be exploited from project start
     • Identify sites willing to run development service (based on EDG tb sites?)
     • Set up OMC team
     • Organise membership of QA, Security, Requirements, etc. groups
     • Define transition process from LCG operations to EGEE operations
