
GridPP: Executive Summary

Tony Doyle



Outline

  • Exec2 Summary

  • Grid status

  • High level view

  • 2006 Outturn

  • Performance Monitoring

  • Outlook for 2007

  • Beyond GridPP2


Oversight Committee

Exec2 Summary

  • 2006 was the second full year for the UK Production Grid

  • More than 5,000 CPUs and more than 1/2 Petabyte of disk storage

  • The UK is the largest CPU provider on the EGEE Grid, with total CPU used of 15 GSI2k-hours in 2006

  • The GridPP2 project has met 69% of its original targets with 92% of the metrics within specification

  • The initial LCG Grid Service is now starting and will run for the first 6 months of 2007

  • The aim is to continue to improve reliability and performance ready for startup of the full Grid service on 1st July 2007

  • The GridPP2 project has been extended by 7 months to April 2008

  • The outcome of the GridPP3 proposal to PPARC is awaited

  • We anticipate a challenging period from Sept. 2007 onwards


Grid Overview

  • Aim: by 2008 (full year’s data taking)

  • CPU: ~100 MSI2k (100,000 CPUs)

  • Storage: ~80 PB

  • Involving >100 institutes worldwide

  • Build on complex middleware being developed in advanced Grid technology projects, both in Europe (gLite) and in the USA (VDT)

  • Prototype went live in September 2003 in 12 countries

  • Extensively tested by the LHC experiments in September 2004

  • February 2006: 25,547 CPUs, 4,398 TB storage

Status in February 2007: 177 sites, 32,412 CPUs, 13,282 TB storage

Monitoring via Grid Operations Centre
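
As a rough illustration of the year-on-year growth implied by the February 2006 and February 2007 figures, the Python sketch below computes the ratio between the two snapshots. The helper name `growth_factor` is hypothetical; only the four input numbers come from the slide.

```python
def growth_factor(before: float, after: float) -> float:
    """Return after / before, the multiplicative growth between two snapshots."""
    if before <= 0:
        raise ValueError("before must be positive")
    return after / before


# Figures quoted on the slide (February 2006 and February 2007).
cpus_2006, cpus_2007 = 25_547, 32_412
storage_tb_2006, storage_tb_2007 = 4_398, 13_282

cpu_growth = growth_factor(cpus_2006, cpus_2007)
storage_growth = growth_factor(storage_tb_2006, storage_tb_2007)

print(f"CPU count grew by a factor of {cpu_growth:.2f}")
print(f"Storage grew by a factor of {storage_growth:.2f}")
```

On these numbers storage roughly tripled over the year, while the CPU count grew by a little over a quarter.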


2006 CPU Usage by Region

Via APEL accounting


2006 Outturn


  • "Promised" is the total that was planned at the Tier-1/A (in the March 2005 planning) and Tier-2s (in the October 2004 Tier-2 MoU) for CPU and storage

  • "Delivered" is the total that was physically installed for use by GridPP, including LCG and SAMGrid at Tier-2 and LCG and BaBar at Tier-1/A

  • "Available" is available for LCG Grid use, i.e. declared via the EGEE mechanisms with storage via an SRM interface

  • "Used" is as accounted for by the Grid Operations Centre


Resources Delivered

Total Tier-1 and Tier-2 delivery is impressive and usage has improved


CPU: 8.5 MSI2k

Storage: 1.7 PB

Disk: 0.54 PB

Delivery of Tier-1 disk


CPU: 15 GSI2k-hours

Disk: 0.26 PB

Usage of Tier-2 CPU, disk

Request: PPARC acceptance of the 2006 outturn



(measured by UK Tier-1 for all VOs)

A CPU efficiency of ~90% is acceptable, given I/O bottlenecks

Concern that efficiency is currently ~75%


Each experiment needs to improve its system and deployment practices, anticipating e.g. hanging GridFTP connections during batch work
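
The efficiency figures can be made concrete with a small Python sketch. It assumes the usual definition of CPU efficiency as CPU time divided by wall-clock time for a job; the slide does not spell out the definition, and the function name and the example job are hypothetical.

```python
def cpu_efficiency(cpu_seconds: float, wall_seconds: float) -> float:
    """Return CPU time divided by wall-clock time for one job."""
    if wall_seconds <= 0:
        raise ValueError("wall_seconds must be positive")
    return cpu_seconds / wall_seconds


# Hypothetical job: one hour of wall-clock time, of which 15 minutes
# are spent waiting (e.g. on a stalled transfer) rather than computing.
stalled_job = cpu_efficiency(cpu_seconds=2700, wall_seconds=3600)
print(f"{stalled_job:.0%}")  # 75%
```

Under this definition, a job that occupies a batch slot while waiting on a hung connection lowers efficiency even though it consumes no CPU.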



(Tier-1 CPUs brought online on Jan 10)

  • Tier-1 CPU fully utilised throughout 2006 (Grid & non-Grid)

  • Added 64 Intel twin dual-core Woodcrests on Jan 10

  • Busy with Grid jobs within 30 minutes



(Estimated utilisation based on gstat job slots/usage)

UKI mirrors overall EGEE utilisation

Average utilisation for Q3 2006: 66%

Compared to target of ~70%

CPU utilisation was a major T2 issue, but is now improving


CPU by experiment


UK Resources

2006 CPU Usage by Experiment


LCG Disk Usage


File Transfers

(individual rates)

Current goals:

>250Mb/s inbound-only

>250Mb/s outbound-only

>200Mb/s inbound and outbound

Aim: to maintain data transfers at a sustainable level as part of experiment service challenges
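
To put the rate goals in context, a sustained rate can be converted into a daily data volume. The Python sketch below assumes that "Mb/s" means megabits per second and uses decimal units (1 Mb = 10^6 bits, 1 TB = 10^12 bytes); the slide does not state this, and the function name is hypothetical.

```python
SECONDS_PER_DAY = 86_400


def daily_volume_tb(rate_megabits_per_s: float) -> float:
    """Convert a sustained rate in megabits/s into terabytes moved per day."""
    bytes_per_s = rate_megabits_per_s * 1e6 / 8  # bits -> bytes
    return bytes_per_s * SECONDS_PER_DAY / 1e12  # bytes -> terabytes


# The goals quoted on the slide, under the unit assumption above.
for goal in (250, 200):
    print(f"{goal} Mb/s sustained = {daily_volume_tb(goal):.2f} TB/day")
```

Under these assumptions, 250 Mb/s held for a full day moves 2.7 TB, and 200 Mb/s moves about 2.16 TB.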


Tier-1 Resource

  • Approval for new (shared) machine room – ETA Summer 2008. Space for 300 racks.

  • Procurement

    • March 06: 52 AMD 270 units, 21 disk servers (168 TB data capacity)

    • FY 06/07: 47 disk servers (282 TB disk capacity), 64 twin dual-core Intel Woodcrest 5130 units (550 kSI2k)

    • FY 06/07 upcoming: further 210 TB disk capacity plus high-availability systems (redundant PSUs, hot-swappable paired HDDs)

  • Storage commissioning saga

    • Ongoing problems with the March kit have now been solved by firmware updates. (WD drives on Areca 1170 controllers in RAID 6 experienced multiple dropouts during testing.)

  • Move to CASTOR

    • Very support-heavy, but made available for CSA06 and performing well

  • General

    • Air-con problems, with high temperatures triggering high-pressure cut-outs in refrigerator gas circuits

    • July security incident

    • 10 Gb CERN line in place. Second 10 Gb line scheduled for Q1 2007

T2 Resources

e.g. Glasgow: UKI-SCOTGRID-GLASGOW


August 28

  • 800 kSI2k

  • 100 TB DPM

    Needed for LHC start-up

September 1

  • IC-HEP

    • 440 kSI2k

    • 52 TB dCache

  • Brunel

    • 260 kSI2k

    • 5 TB DPM

October 13

October 23

GridPP Middleware incorporates:

  • Workload Management

  • Grid Data Management

  • Network Monitoring

  • Information Services

  • Storage Interfaces


MSN Outlook

  • The results of the GridPP2+ project extension proposal to PPARC were made known to GridPP in November 2006

  • The effects on MSN are significant and particularly damaging, with the overall effort reduced by more than a third, from 13 to 8.3 FTEs

  • WMS testing and contributions to EGEE SA3 will reduce

  • GridPP work on metadata will cease and UK leadership will be lost, but this is known to be an area the experiments are keen to see tackled

  • The reduction in Information and Monitoring effort will severely impact re-engineering work and support for R-GMA, and compromises UK obligations under the EGEE contract

  • GridPP has recognised the importance of finishing the R-GMA re-engineering, thus meeting the R-GMA deliverables to EGEE, and has therefore agreed (in consultation with PPARC) to meet the costs of maintaining the current staffing levels to the end of EGEE-II from within existing allocations

  • The reduction in networking activities is likely to impact GridPP’s ability to optimise its use of the underlying JANET network

  • Staff whose contracts will not be extended beyond the end of August 2007 have been informed


e.g. ATLAS Tier-2 Testing

  • Most of the experiments are now well advanced in addressing highly pragmatic deployment issues, particularly ahead of LHC data-taking at the end of 2007


Applications Outlook

  • Products developed by GridPP are in mainstream use, and will form a vital component of the computing system of each LHC experiment for first data-taking and analysis

  • However, almost all explicit funding for the further development and support of such products will terminate in September 2007, since it is now clear that this area will be supported neither via GridPP3 (as planned) nor the Rolling Grants round (as requested)

  • This is a matter of concern both for the UK collaborations and the experiments as a whole

  • Recovery plans are being prepared within each experiment, attempting to use non-specialist RA effort in tension with physics and hardware support, but there will be profound negative consequences for the continuation and maintenance of these projects


Dissemination Outlook

  • Dissemination was one of the areas not fully funded in GridPP2+

  • The Dissemination Officer post was funded at 0.5 FTE (as at present), but the PPRP did not allocate funds to continue the Events Officer position

  • Due to a large number of events and activities planned for the end of 2007, we aim to fund this position for some months out of the current dissemination budget


Hardware Outlook: Planning for 2007

  • A profiled ramp-up of resources is planned throughout 2007 to meet the UK requirements of the LHC and other experiments

  • The results are available for the Tier-1 and Tier-2s

  • The Tier-1/A Board reviewed UK input to International MoU negotiations for the LHC experiments as well as providing input to the International Finance Committee for BaBar

  • An impasse was reached in planning for 2007

  • No new investment in the BaBar Tier A analysis facility hardware is planned

  • For LCG, the 2007 commitment for disk and CPU capacity can be met out of existing hardware already delivered


  • 31st March – PPARC Call

  • 16th June – GridPP16 at QMUL

  • 13th July – Bid Submitted

  • 6th September – 1st PPRP review

  • 1st November – GridPP17

  • 8th November – PPRP “visiting panel”

  • 30th November – GridPP2+ outcome

  • GridPP3 outcome

Phases: Proposal Writing, Proposal Defence

~10 month process to propose/defend/define future programme


Scenario Planning – Resource Requirements [TB, kSI2k]

GridPP requested a fair share of global requirements, according to experiment requirements

Changes in the LHC schedule prompted a(nother) round of resource planning - presented to CRRB on Oct 24th

New UK resource requirements have been derived and incorporated in the scenario planning e.g. Tier-1


Input to Scenario Planning – Hardware Costing

  • Empirical extrapolations with extrapolated (large) uncertainties

  • Hardware prices have been re-examined following recent Tier-1 purchase

  • CPU (Woodcrest) was cheaper than expected from an extrapolation of the previous 4 years of data


Scenario Planning

An example 70% “minimum viable level” scenario [£m]


Beyond GridPP2

  • The separation between GridPP2+ and GridPP3 was primarily designed to ensure an early decision could be made on the extension in order to retain key staff

  • Approval for the extension was received in late November but included major cuts in the middleware support area

  • This is problematic in two ways:

    • EU-CCLRC contractual obligation

    • crucial 7-month ramp-up period: the worst time to cut back

  • Problems are severely compounded by the outcome of the Rolling Grant round where much of the Applications support work will be lost during this same critical period

  • We currently await the outcome of the GridPP3 bid in order to be able to assess the whole picture

  • We anticipate a highly challenging period from September 2007 onwards
