
Southgrid Status Report

Presentation Transcript


  1. Southgrid Status Report Rhys Newman: September 2004 GridPP 11 - Liverpool

  2. Southgrid Member Institutions • Oxford • RAL PPD • Cambridge • Birmingham • Bristol • Warwick

  3. Tier 2 Management Board • The Tier 2 Board meets regularly, every 3 months. • Where possible this is a face-to-face meeting, although a couple of people “phone in”. • MOU status is still in progress. • The most difficult Tier 2 to organise, as it has the most institutes. • Many concerns about imposing security policy onto the institutes. • Confusion as to who at each site is authorised to sign the MOU. • Final signatures are being collected as I speak.

  4. Status at Warwick • A recent addition to Southgrid. • Third line institute – no resources as yet, but remains interested in being involved in the future. • Will not receive GridPP resources and so does not need to sign the MOU yet.

  5. Operational Status • RAL PPD • Cambridge • Bristol • Birmingham • Oxford

  6. Status at RAL PPD • Always on the leading edge of software deployment (co-located with RAL Tier 1) • Currently (10 Sept) up to LCG 2.2 • CPUs: 24 × 2.4 GHz, 18 × 2.8 GHz • 100% Dedicated to LCG • 0.5 TB Storage • 100% Dedicated to LCG

  7. Status at Cambridge • Consistently the first institute to keep up with LCG releases. • Currently LCG 2.1.1 (since its date of release); will upgrade by October. • CPUs: 32 × 2.8 GHz – increase to 40 soon. • 100% Dedicated to LCG • 3 TB Storage • 100% Dedicated to LCG

  8. Status at Bristol • Limited involvement for the last 6 months due to a manpower shortage. • Current plans are to switch the BaBar farm to LCG by October. • 1.25 FTE of computer support is to be filled soon, which should improve the situation (a Bristol initiative, not GridPP). • CPUs: 80 × 866 MHz PIII (planned BaBar farm) • Shared with LHC under an LCG install. • 2 TB Storage (planned) • Shared with LHC under an LCG install. • A possible new computing centre (>500 CPUs) is still under discussion. • A possible new post is also still under discussion.

  9. Status at Birmingham • Second line institute, reliably up to date with software within about 6 weeks of release. • Currently LCG 2.2 (since mid August). • Southgrid’s “Hardware Support Post” is to be allocated here to assist. • CPUs: 22 × 2.0 GHz Xeon (+48 soon) • 100% LCG • 2 TB Storage awaiting “Front End Machines” • 100% LCG.

  10. Status at Oxford • Second line institute; has only recently come online. Until May it had limited resources. • Currently LCG 2.1.1 (since early August). • Hosted the LCG2 Administrator’s Course, which impacted the installation timeline. • CPUs: 80 × 2.8 GHz • 100% LCG • 1.5 TB Storage – upgrade to 3 TB planned • 100% LCG.

  11. Resource Summary • CPU (3 GHz equivalent): 155.2 total • Storage: 7 TB total
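
The 3 GHz-equivalent figure is consistent with summing the clock speeds of the dedicated CPUs on the preceding slides and dividing by 3 GHz, provided Bristol’s planned BaBar farm and planned 2 TB of storage are left out (an assumption – the slide does not say exactly which resources are counted). A minimal sketch of that arithmetic in Python:

    # Rough cross-check of the Resource Summary totals.
    # Assumption: Bristol's planned BaBar farm (80 x 866 MHz) and planned
    # 2 TB of storage are excluded, as they are not yet running under LCG.
    sites = {
        "RAL PPD":    {"cpus": [(24, 2.4), (18, 2.8)], "storage_tb": 0.5},
        "Cambridge":  {"cpus": [(32, 2.8)],            "storage_tb": 3.0},
        "Birmingham": {"cpus": [(22, 2.0)],            "storage_tb": 2.0},
        "Oxford":     {"cpus": [(80, 2.8)],            "storage_tb": 1.5},
    }

    total_ghz = sum(n * ghz for s in sites.values() for n, ghz in s["cpus"])
    cpu_3ghz_equiv = total_ghz / 3.0                            # 465.6 / 3 = 155.2
    storage_tb = sum(s["storage_tb"] for s in sites.values())   # 7.0

    print(f"CPU (3 GHz equiv): {cpu_3ghz_equiv:.1f}")  # 155.2
    print(f"Storage (TB): {storage_tb:.1f}")           # 7.0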

  12. LCG2 Administrator’s Course • The main activity at Oxford in the weeks leading up to July. • Received very well – a lack of machines was identified as a problem, even though we used 20 servers! • A good measure of the complexity: • An expert could do an LCG install in 1 day. • A novice could do it with expert help in 3 days. • A novice alone could take weeks! • A lot of interest in a repeat, especially when the 8.5 “Hardware Support” posts are filled (suggestions welcome).

  13. Ongoing Issues • Complexity of the installation. Can’t compare with “Google Compute” – is winning a PR exercise useful? • Difficulty sharing resources – almost all of those listed are 100% LCG due to difficult sharing issues. • How will we manage clusters without LCFGng? Quattor has a learning curve (uses a new language) – should we all get training?

  14. Future Issues • We need 100 000 1 GHz machines “… to scale up the computing power available by a factor of ten …” – Tony Doyle (GridPP summary of the All Hands meeting). • What are we learning now? gLite (aka EGEE1) may be completely different. • Can’t we get some cycle stealing? 20 000 “decent” machines in Oxford University alone!
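
For scale (using only the numbers on these slides): 100 000 machines at 1 GHz is 100 000 GHz of capacity, against Southgrid’s current dedicated total of 155.2 × 3 GHz ≈ 466 GHz – a gap of roughly a factor of 200. On the same crude measure, the 20 000 “decent” machines in Oxford University would by themselves supply at least a fifth of the stated 100 000-machine target, assuming each runs at 1 GHz or better.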

  15. LHC At Home!! (Thanks Mark) • LHC at Home: http://lhcathome.cern.ch • Started 1st September. • Still in beta. • 1004 computers already. • How can we leverage this???
