
RAL PPD Tier 2 (and stuff) Site Report

This report, presented at HEP SysMan, outlines the activities of the RAL PPD Tier 2 site over the past year: performance, hardware purchases, issues encountered, and plans for the coming year. It also covers non-*nix topics and goals for improving reliability and infrastructure.

Presentation Transcript


  1. RAL PPD Tier 2 (and stuff) Site Report • Rob Harper • HEP SysMan • 30th June 2009

  2. Steve Dallison

  3. Outline • The last year • Things we did and stuff we bought • Where we are now • The next year • What is coming up • Some non *nix stuff

  4. Over the last year • Performance • Hardware • Other stuff • Issues

  5. Performance • 1,382,736 jobs in year to 2009-05-31 • Cluster has been underutilised much of the time • Availability 93% over last 6 months • Lower than we had hoped • But not entirely our fault!

  6. Jobs Run

  7. Hardware • Most purchasing was for GridPP • 70 new SuperMicro twins (1120 CPU cores) • 25 new 20TB Viglen storage nodes (about 450TB of usable space) • Assorted new hardware for service nodes, etc. • Not all new kit is yet fully commissioned

  8. Other Stuff • Tested SL5 WN • Logging with rsyslog (to 2 hosts) • Setting up machines and services for assorted projects • Desktop Linux • Talked to users • One test (SL5) box set up

  9. Issues • Air conditioning • Several failures, including twice in one week after Xmas • Site power • Big power cut • Planned work (on short notice) • The Sun • Building Management System • R89 delays and movement schedules

  10. Where We Are Now • ~613 TB Storage • dCache version 1.9.1-7 • 1584 CPU cores • 2740 kSI2k • Ahead of 2009 GridPP pledges

  11. The next year • Consolidate storage in R1, CPU in lab R27 • Replacing RGMA and BDII hosts with virtual machines • Local private network for IPMI, RAID, APC, etc. • New machines to host VO software. • An SL5 CE • Getting ready to move to all SL5 some time. • CREAM for added yumminess. • Desktop Linux (again!) • LHC data? • Networking...

  12. Networking: now • Nortel 55xx stack in R1 • Similar switches installed in R27 but not yet stacked • Interconnected through RAL site networking • 2 × 1Gb links to R27 • 1 × 10Gb to R1

  13. Networking: planned • Establish stack in R27 • Connect machine rooms with direct 10Gb fibre link • Hopefully add second 10Gb link later

  14. Not just *nix… • Windows • Macs • Other services

  15. Windows • Moving to Windows 2008 domain (as I speak!) • Recent desktop machines running Vista Business • VMs for some legacy software • Laptops running Vista Ultimate with BitLocker encryption (older laptops encrypted with Pointsec) • Remote reporting/updating to our own WSUS (Windows Updates) service and Sophos Enterprise server

  16. Mac • An increasing number of Apple computers in the department • Support still not “official” but provided on a best efforts basis • Sophos antivirus • if you want to heckle this, wait for next year when we announce this for Linux! • Pointsec encryption

  17. Some General Stuff • Visitors’ network • Authenticated wireless access coming up... • Promoting the use of WebDAV for remote file access, rather than PPTP (CERN already use this)

  18. So… • We have the technology… • We’re doing pretty well, but are underutilised • Now we need to work on • Reliability • Getting everything commissioned promptly • Improving our infrastructure
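
As a rough cross-check of the figures quoted on slides 5, 7 and 10, the following minimal Python sketch derives a few per-unit numbers from them. The day counts (365 days for the year, 182 days for "6 months") are assumptions rather than figures from the slides, and the name `summarise` is purely illustrative.

```python
# Figures transcribed from the slides.
JOBS_IN_YEAR = 1_382_736   # slide 5: jobs in year to 2009-05-31
AVAILABILITY = 0.93        # slide 5: availability over last 6 months
NEW_TWINS = 70             # slide 7: new SuperMicro twins
NEW_CORES = 1120           # slide 7: CPU cores in those twins
STORAGE_NODES = 25         # slide 7: new Viglen storage nodes
RAW_TB_PER_NODE = 20       # slide 7: 20TB per node
USABLE_TB = 450            # slide 7: "about 450TB of usable space"
TOTAL_CORES = 1584         # slide 10
TOTAL_KSI2K = 2740         # slide 10

# Assumptions (NOT from the slides): length of the reporting periods.
DAYS_IN_YEAR = 365
DAYS_IN_SIX_MONTHS = 182


def summarise():
    """Return per-unit figures derived from the slide numbers above."""
    raw_tb = STORAGE_NODES * RAW_TB_PER_NODE
    return {
        "jobs_per_day": JOBS_IN_YEAR / DAYS_IN_YEAR,
        "unavailable_days": (1 - AVAILABILITY) * DAYS_IN_SIX_MONTHS,
        "cores_per_twin": NEW_CORES / NEW_TWINS,
        "raw_tb": raw_tb,
        "usable_fraction": USABLE_TB / raw_tb,
        "ksi2k_per_core": TOTAL_KSI2K / TOTAL_CORES,
    }


if __name__ == "__main__":
    for name, value in summarise().items():
        print(f"{name}: {value:.2f}")
```

On these assumptions the 70 twins work out at 16 cores each, and the 25 storage nodes at 500TB raw, of which about 450TB (90%) is usable.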
