SA1 Overview and Execution plan. Ognjen Prnjat, D-SEE ROC manager, GRNET. EGEE-SEE regional kick-off, April 7-8th, 2004. EGEE is a project funded by the European Union. Objectives of this session: understand SA1 organisation; overview of the execution plan.
SA1 Overview and Execution plan
Ognjen Prnjat
D-SEE ROC manager, GRNET
EGEE-SEE regional kick-off, April 7-8th, 2004
EGEE is a project funded by the European Union
Objectives of this session
• Understand SA1 organisation
• Overview of the execution plan
• Identify missing inputs and action points
Athens, 7-8th April - 2
Outline
• SA1 overview
  • High-level SA1 organisation and activities
  • Middleware background
• Execution plan
  • ROC Overview
  • Deliverables and milestones
  • ROC Organisation
    • Overall
    • Country-specific
  • Evolution of resources
  • WBS
  • TA effort estimate
• Communication procedures
• Action points
EGEE regions
SA1 objectives
• Create, operate, support and manage a production quality infrastructure
• Overall services offered
  • Middleware deployment and installation
  • Software and documentation repository
  • Grid monitoring and problem tracking
  • Bug reporting and knowledge database
  • VO services
  • Grid management services
SA1 objectives
• Core Infrastructure services
  • Operate essential grid services
• Grid monitoring and control
  • Proactively monitor the operational state and performance
  • Initiate corrective actions
• Middleware deployment and resource induction
  • Validate and deploy middleware releases
  • Set up operational procedures for new resources
• Resource provider and user support
  • Coordinate the resolution of problems from both Resource Centres and users
  • Filter and aggregate problems, providing or obtaining solutions
• Grid management
  • Coordinate Regional Operations Centres (ROC) and Core Infrastructure Centres (CIC)
  • Manage the relationships with resource providers via service-level agreements
• International collaboration
  • Drive collaboration with peer organisations in the U.S. and in Asia-Pacific
  • Ensure interoperability of grid infrastructures and services for cross-domain VOs
  • Participate in liaison and standards bodies in wider grid community
Operations centers: hierarchy
• Implement the objectives to provide
  • Access to resources
  • Operation of EGEE as a reliable service
  • Deploy new middleware and resources
  • Support resource providers and users
• With a clear layered structure
  • Operations Management Centre (CERN)
    • Overall grid operations coordination
  • Core Infrastructure Centers
    • CERN, France, Italy, UK, Russia (from M12)
    • Operate core grid services
  • Regional Operations Centers
    • One in each federation, in some cases these are distributed centers
    • Provide front-line support to users and resource centers
    • Support new resource centers joining EGEE in the regions
    • Support deployment to the resource centers
  • Resource Centers
    • Many in each federation of varying sizes and levels of service
    • Not funded by EGEE directly
• Instances per layer, in the order listed above: 1, 5, ~11, 50+
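The layered structure on this slide can be sketched as a small data model. This is an illustration only: the class, field and function names are hypothetical, and pairing the instance counts (1, 5, ~11, 50+) with the four layers in slide order is an assumption read from the slide, not something the slide states explicitly.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Layer:
    """One layer of the SA1 operations hierarchy (illustrative names)."""
    name: str
    role: str
    instances: str              # kept as text because the slide gives "~11" and "50+"
    funded_by_egee: bool = True


# Top-down order as listed on the slide; counts are ASSUMED to follow that order.
HIERARCHY = [
    Layer("Operations Management Centre", "Overall grid operations coordination", "1"),
    Layer("Core Infrastructure Centers", "Operate core grid services", "5"),
    Layer("Regional Operations Centers", "Front-line support and deployment per federation", "~11"),
    Layer("Resource Centers", "Provide resources at varying sizes and service levels", "50+",
          funded_by_egee=False),
]


def layers_above(start: str) -> List[str]:
    """Return `start` followed by every layer above it, bottom-up."""
    names = [layer.name for layer in HIERARCHY]
    index = names.index(start)  # raises ValueError for an unknown layer name
    return list(reversed(names[: index + 1]))
```

Calling `layers_above("Resource Centers")` walks from the Resource Centers up to the Operations Management Centre, which mirrors the layering on the slide without claiming any particular escalation procedure.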
Operations centers: hierarchy

Deployment
• Of middleware and resources by ROCs

Access
• Of resources by VOs

Support
• Of VOs by ROCs

Operation
• Of Grid by CICs
EGEE M/W: lifecycles
[Figure: middleware timeline. Labels: VDT, EDG, AliEn, LCG, ...; LCG-1, LCG-2, EGEE-1, EGEE-2; "Globus 2 based"; "Web services based"]
• From 1st April 2004
  • Production grid service based on the LCG infrastructure running LCG-2 grid m/w
• In parallel develop a “next generation” grid facility
  • Produce a new set of grid services according to evolving standards (web services)
  • Run a development service providing early access for evaluation purposes
  • Will replace LCG-2 on production facility in 2005
M/W: LCG
EGEE builds on the work of LCG to establish a grid operations service
• LCG: a collaboration of
  • The LHC experiments
  • The Regional Computing Centers
  • Physics institutes
• Mission:
  • Prepare and deploy the computing environment that will be used by the experiments to analyze the LHC data
• Strategy:
  • Integrate thousands of computers at dozens of participating institutes worldwide into a global computing resource
  • Rely on software being developed in advanced grid technology projects, both in Europe and in the USA
LCG components
• Computing Element (CE) is a set of Worker Nodes (WN) providing computing power, plus the Grid Gate (GG), which is the entry node to a set of WNs. The GG itself is also referred to as the CE.
• Storage Element (SE) provides uniform access to large storage spaces.
• Grid Index Information Server (GIIS) is the information server for a Grid site. Evolution towards BDII.
• Replica Location Service (RLS) provides information about data location and management. There should be one RLS per Virtual Organization (VO).
• Resource Broker (RB) performs workload management functions.
• Logging and Bookkeeping Service (LBS) logs all job management events. It is typically collocated with the RB.
• Proxy Server (PS) enables renewal of proxy certificates. In the table above this is referred to as PROX.
• Virtual Organisation Server (VOS) maps user certificates to user data and is used for authorisation purposes.
• User Interface (UI) provides user access to the Grid.
• LCFG is the automatic configuration server.
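The component abbreviations above recur throughout the rest of the deck, so a small lookup table may help when reading later slides. This is a minimal sketch: the dictionary contents come from the slide, while the `expand` helper and its error behaviour are hypothetical additions for illustration.

```python
# Abbreviation -> name, taken from the component list on this slide.
LCG_COMPONENTS = {
    "CE": "Computing Element",
    "WN": "Worker Node",
    "GG": "Grid Gate",
    "SE": "Storage Element",
    "GIIS": "Grid Index Information Server",
    "RLS": "Replica Location Service",
    "RB": "Resource Broker",
    "LBS": "Logging and Bookkeeping Service",
    "PS": "Proxy Server",
    "VOS": "Virtual Organisation Server",
    "UI": "User Interface",
    "LCFG": "automatic configuration server",
}


def expand(abbreviations):
    """Map abbreviations to full names, rejecting any that are not on the slide."""
    unknown = [a for a in abbreviations if a not in LCG_COMPONENTS]
    if unknown:
        raise KeyError("unknown component(s): " + ", ".join(unknown))
    return [LCG_COMPONENTS[a] for a in abbreviations]
```

For example, `expand(["CE", "SE"])` yields the full names of the two components, and an abbreviation that the slide does not define raises `KeyError` rather than being silently ignored.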
SA1 execution plan: overview
• Planning organised by regional federation
• Coordination between partners is the responsibility of ROC managers
• Execution plan
  • Main part: overall organisation, responsibilities, procedures, task breakdown etc. Core is provided.
  • Detailed planning: internal Local Operations Centre task breakdown, staffing, planning, organisation etc. TBD per country.
ROC responsibilities (summary)
• Certification of new middleware releases
• Coordinate release installation and configuration in the RCs and validate RCs
• Provide operational support to RC managers and VO users, interact with Application/VO specific support
• Solve, or refer and escalate, middleware problems to relevant teams
• Distribute operational monitoring, authorisation and accounting tools to RCs
• Collaborate with CIC to read and check monitoring information and to react to bad performance of the running Grid services
• Establish the tools measuring the service level provided by the RCs' computing and storage services and by other Grid services located inside their region/federation
Milestones & Deliverables

SEE partners
AP: Confirm FTEs

D-SEE ROC organization: vertical
D-SEE ROC organization: vertical
• ROC Manager, supported by the Alternating ROC Manager, coordinates the distributed ROC operations
  • In collaboration with the peers in all SEE countries
• Each country’s manager coordinates the country’s RC operations
• In each RC there is an RC manager and a set of administrators responsible for different aspects of the operations
SA1 key individuals
AP: Confirm Key Personnel
D-SEE ROC organization: horizontal
• Certification and distribution of middleware. This is the responsibility of the ROC Release Group.
• Installation and deployment. This is the responsibility of the ROC Management Team.
• Operational and user support. This is the responsibility of the Operational and User Support Service Team.
• Monitoring. This is the responsibility of the Monitoring Team.
• Logical grouping within each of the D-SEE-ROCs.
• Roles can be taken by different combinations of people (funded and unfunded) within each region/ROC/RC
Local organization: Greece

Greece: vertical
Greece: horizontal
• Next step is to identify horizontal teams
  • Certification team: Demokritos university
  • Deployment team: M/W + system + n/w admin at RCs
  • Operational + support team: M/W + n/w admin at RCs
  • Monitoring team: M/W + network admin at RCs
• AP: Some effort needed to establish teams in ALL countries
Greece: Identification of personnel & roles
AP: Identify ALL personnel and roles for ALL countries
ROC branch evolution
• Phased introduction of services
  • Basic RC configuration: CEs, WNs, SEs, UIs
  • Operation and monitoring
  • Followed by core services for the region (RB, RLS, VOMS)
• AP: identification of the RC evolution, and core services identification
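The phasing above can be sketched as a simple readiness check. This is only an illustration under stated assumptions: the function names are hypothetical, the "operation and monitoring" phase is left out because the slide names no services for it, and treating a complete basic configuration as the precondition for regional core services is one reading of "followed by", not a rule stated in the deck.

```python
# Service sets named on the slide.
BASIC_RC = {"CE", "WN", "SE", "UI"}        # basic RC configuration
REGIONAL_CORE = {"RB", "RLS", "VOMS"}      # core services for the region


def missing_services(deployed):
    """List, per phase, the services that are not yet deployed."""
    deployed = set(deployed)
    return {
        "basic": sorted(BASIC_RC - deployed),
        "core": sorted(REGIONAL_CORE - deployed),
    }


def ready_for_core_services(deployed):
    """ASSUMPTION: core services are introduced only once the basic set is complete."""
    return BASIC_RC <= set(deployed)
```

A site that so far runs only a CE and WNs would report SE and UI as missing from the basic phase and would not yet be considered ready for RB, RLS or VOMS under this reading.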
Evolution of RCs
AP: Identify ALL resources envisaged
WBS
• The initial work breakdown structure
  • Deliverable tasks
  • ROC setups
    • Various steps
  • To be performed by ROC, per RC centre
    • Middleware certification (validation/adaptation)
    • Deployment (including operational policies and SLA setup)
    • Site certification
    • Operational/user support
    • Monitoring
Activity plan (example)
AP: Finalize activity plan

Effort
• Effort needs to be identified
  • Real PMs/partner needed
  • Real people needed

TA effort estimate: PMs

Detailed names
Communications procedures
• egee-see-sa1@grnet.gr
• project-egee-sa1@cern.ch
• CERN Document Server (CDS): http://agenda.cern.ch/displayLevel.php?fid=41
• EDMS server (in OTHERS/EGEE/): https://edms.cern.ch
• Main site: http://egee-intranet.web.cern.ch/egee-intranet/gateway.html
• Restricted members area: http://egee-members.web.cern.ch/egee-members/index.html
• CERN NICE/EDMS accounts: Marie-Laure.Bourgeois@cern.ch
Action points
• AP: Confirm partners’ FTEs
• Confirm Key Personnel
• For ALL countries:
  • Establish teams (horizontal, vertical)
  • Identify ALL personnel (names!) and roles
  • Identify RC evolution, and core services identification
• Finalize activity plan
  • But keep activities coarse-grained
• Effort:
  • Real PMs/partner needed
  • Real people needed
• Deadline execution plan draft: 15th April
  • In order to have a good estimate for Cork conference