
Prof. David Britton GridPP Project leader University of Glasgow

GridPP: Status and Review. Prof. David Britton, GridPP Project Leader, University of Glasgow. GridPP Oversight Committee, 25th May 2011. Factors of Ten, 2008 - 2011: during GridPP3, the CPU delivered by the UK grew by a factor of 10.



Presentation Transcript


  1. GridPP: Status and Review. Prof. David Britton, GridPP Project Leader, University of Glasgow. GridPP Oversight Committee, 25th May 2011. IET, Oct 09

  2. Factors of Ten, 2008 - 2011. During GridPP3, the CPU delivered by the UK grew by a factor of 10. [Timeline figure, 2001 to 2014, showing the project phases GridPP1, GridPP2, GridPP2+, GridPP3 and GridPP4, with the captions: From Web to Grid; From Prototype to Production; From Production to Exploitation; Computing in the LHC era.] Seneca: "Every new beginning comes from some other beginning's end."

  3. Progress 912b 1380b? 480b

  4. CPU Delivery by Region CPU delivered this year CPU delivered during GridPP3

  5. CPU Delivery over GridPP3 Tier-1 Tier-1 + Tier-2s

  6. CPU Shares First half of GridPP3 80%:20% split between LHC and non-LHC VOs. Second half of GridPP3 90%:10% split between LHC and non-LHC VOs.

  7. CPU Efficiency CPU Efficiency this year CPU Efficiency during GridPP3

  8. Reliability Reliability is consistently above target and comparable with peers

  9. GridPP3 ProjectMap Green and orange mark milestones and metrics that meet or are close to target. Red flags those not close to being met. Lilac and black flag those that can’t be measured or are suspended.

  10. Evolution of Milestones and Metrics Milestones Metrics Eight metrics and two milestones were not met at end of GridPP3 – discussed in the Project Status report.

  11. GridPP4 ProjectMap A ProjectMap has been developed for GridPP4, focused on supporting the experiments. The coloured banners reflect the proposal WPs. There are 260 entries, two-thirds of which are metrics. This will evolve after the first quarter reports.

  12. Financial Outturn • SLA pay costs have increased • Tier-1 HW spend reduction adjusted. • Tape drives did not arrive in time (£207k). • Tier-2 HW grants were almost all spent within FY. GridPP3 spent 98.3% of the funds made available, which was 91.6% of the original award. The LHC-delay allowed savings to be made.
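The two percentages on the Financial Outturn slide can be combined to give the share of the original award that was actually spent. The following is a minimal sketch of that arithmetic; the variable names are illustrative and not taken from the presentation.

```python
# Figures quoted on the Financial Outturn slide.
spent_of_available = 0.983   # GridPP3 spent 98.3% of the funds made available
available_of_award = 0.916   # the funds made available were 91.6% of the original award

# Fraction of the original award that was actually spent.
spent_of_award = spent_of_available * available_of_award

print(f"Spent {spent_of_award:.1%} of the original award")
```

On these figures, roughly 90% of the original award was spent.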

  13. Effort Delivered • It takes 30%-40% more effort to run the Grid than is actually funded by GridPP. • 100% of funded effort was delivered in FY10, with 98% over the whole project. • Shortfall came predominantly from a hiring moratorium at STFC and a slow subsequent recruitment process (no financial cost because on SLA). FY10 GridPP3

  14. Risk Registers

  15. Current Risks The PMB recently reviewed the risks and the key issues identified are: • 6: Failure of Tier-1 to meet SLA or MoU. Elevated due to worries over staffing levels and the effect on critical services. Recruitment is underway with top priority at the Tier-1. • 10: Recruitment and retention problems at RAL. Four post-holders have resigned at the Tier-1 since the last Oversight Committee meeting (plus one death in service). However, all remaining fixed-term contract posts have now been renewed until 2015 and are open-ended. The situation at the Tier-2 sites is more stable now that GridPP4 grants have been issued for the first two years and verbal confirmation of four years has been provided. • 23: Changes to the LHC schedule. This risk is still elevated, as the consequences of the LHC decision to run during 2012 (and possibly beyond?) are assimilated into the project. There are costs associated with bringing forward hardware purchases and with running at higher instantaneous luminosities than planned. • 25: Technology Shifts. This risk is highlighted as a longer-term issue (i.e. over the lifetime of GridPP4): as new technology becomes prevalent, the existing infrastructure may need to change radically. Virtualisation, multi/many-core CPUs, GPUs, network strategies and evolving Cloud offerings can all affect the experiments' future computing models and requirements at sites.

  16. Observations on GridPP3 HW-£ LHC OPN CASTOR SSC R89 Disk CSR Disk … but I doubt the fates are out of ammo yet! We dodged a lot of bullets…

  17. GridPP4 Financial Plan • Two main issues: • Experiment resource requirements continue to evolve but at present there is pressure on the Tier-2 hardware budget, with an estimated shortfall of £345k. This has been helped recently by the decision to increase the assumed Tier-2 efficiency from 60% to 66.6%. It can be further helped if the (possible) surplus of £150k on the Tier-1 line could be used to purchase PPD hardware. • Staff costs have come in 2.5% higher than estimated and there is a projected shortfall of £510k. This will be managed by small reductions to grants/SLA in the second half of GridPP4.
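The effect of the two mitigations mentioned for the Tier-2 hardware budget can be roughed out as follows. This is a sketch under a stated assumption, not the project's own costing: it assumes that the installed capacity needed to deliver a fixed amount of CPU scales as 1/efficiency, which the slide does not state. All names are hypothetical.

```python
def installed_capacity_needed(delivered: float, efficiency: float) -> float:
    """Installed capacity needed to deliver `delivered` units of CPU.

    ASSUMPTION: efficiency = delivered / installed (not stated in the slides).
    """
    return delivered / efficiency

# Assumed Tier-2 efficiency before and after the change quoted on the slide.
old_need = installed_capacity_needed(1.0, 0.60)
new_need = installed_capacity_needed(1.0, 0.666)

# Relative reduction in installed capacity for the same delivered CPU.
reduction = 1 - new_need / old_need

# Estimated shortfall and the (possible) Tier-1 surplus, both in thousands of pounds.
shortfall_k = 345
possible_surplus_k = 150
remaining_shortfall_k = shortfall_k - possible_surplus_k

print(f"Capacity requirement falls by about {reduction:.1%}")
print(f"Shortfall if the surplus is applied: £{remaining_shortfall_k}k")
```

Under that assumption the efficiency change trims the capacity requirement by about a tenth, and applying the possible £150k surplus would leave £195k of the £345k estimate uncovered.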

  18. Hardware Costs Current estimates: Disk+CPU: £3.5m at Tier-1; £3.3m at Tier-2. GridPP4 proposal: Disk+CPU: £2.8m at Tier-1; £2.7m at Tier-2. GridPP26
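The gap between the current estimates and the GridPP4 proposal figures can be expressed as a percentage increase per tier. A minimal sketch using only the four numbers on the slide; the dictionary layout and names are illustrative.

```python
# Disk+CPU costs in millions of pounds, as quoted on the Hardware Costs slide.
proposal = {"Tier-1": 2.8, "Tier-2": 2.7}
current = {"Tier-1": 3.5, "Tier-2": 3.3}

# Relative increase of the current estimate over the proposal, per tier.
increase = {tier: current[tier] / proposal[tier] - 1 for tier in proposal}

# Combined increase across both tiers.
total_increase = sum(current.values()) / sum(proposal.values()) - 1

for tier, frac in increase.items():
    print(f"{tier}: {frac:+.1%}")
print(f"Total: {total_increase:+.1%}")
```

On these figures the estimates are about 25% above the proposal at the Tier-1 and about 22% above it at the Tier-2s.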

  19. Tier-2 Resources The four Tier-2 centres have delivered the 2011 MoU resources. Slight delays at individual sites are compensated by over-delivery at others, demonstrating one of the strengths of the distributed Tier-2 model.

  20. Tier-2 Resources ++ GridPP has commenced a ~6-month accounting period to help determine the split of experimental resource between institutes. The experiments have established algorithms according to their own priorities and abilities to monitor accurately. ATLAS example shown below.

  21. Networks - LHCONE • The objective of LHCONE is to provide a collection of access locations that are effectively entry points into a network that is private to the LHC T1/2/3 sites. LHCONE is not intended to replace the LHCOPN but rather to complement it. LHCONE is being pushed by the US and a few EU countries (different funding models). Experiments are evolving computing models to better use networks as a resource. • Notes from NREN meeting in April 2011: • JANET (David Salmon) • GridPP in UK represents the Tier-2 view • There is no evidence of poor WAN connectivity at the moment. There have been some local network issues, and this is where the bottleneck risk is perceived to be greater • The costs of a dedicated network could be quite high nationally (~500k install and 200k pa) and the users don’t have a budget for this. Also, because the backbone is highly over-provisioned, no real issues on the JANET backbone are expected • JANET therefore prefers to continue the current IP access solution for now but will monitor developments closely • JANET have asked the UK sites to ensure that T2 traffic levels are monitored and to build up a better picture of traffic volumes • Need to be aware of other projects coming on stream (LOFAR), as these may also add large loads on the backbones that either the standard IP services need to be capable of meeting or a segregated model is needed • GridPP is monitoring this; is working with experiments to explore any current limitations; has formed an interim policy; and is/will be looking at network performance on a site-by-site basis.

  22. GridPP4 Structure The Oversight Committee changes from bi-annual reviews to a mid-term and a full-term review (TBC). The CB stays the same: group leaders from each institute. The User Board becomes entirely virtual, with a User Coordinator responsible for resource scheduling and community contact. The PMB stays largely the same. The dTeam becomes the ops-Team and is re-aligned with the new Tier-2 structure and the development of a UK NGI.

  23. Summary • In GridPP4, driven by the increasing demands from LHC data, we need to strive towards greater efficiency and effectiveness. This was the underlying theme of the GridPP26 meeting. • It is important because in the next few years we are likely to experience, for the first time, real and sustained contention for resources. This will likely be against a backdrop of financial austerity, higher-than-planned hardware prices, and increased experiment resource requests. • We need to remain focused on the sometimes changing needs of the experiments and not lose sight of the changing technological environment. • GridPP offers its profound thanks for the help and encouragement of all the generations of our Oversight Committee over the last decade, who have been instrumental in our success.
