RAL Tier1 Review Fabric

Martin Bly, Tier1 Fabric Manager. 21 November 2007.
Presentation Transcript


  1. RAL Tier1 Review Fabric
     Martin Bly, Tier1 Fabric Manager
     21 November 2007

  2. Overview
     • Procurements
     • Disk Storage
     • CPU Capacity
     • Database systems
     • Tape systems
     • Networking
     • Services
     • Installation
     RAL Tier1 Fabric - Martin Bly

  3. Capacity Procurement
     • Procurements for commodity hardware
       • Annual capacity procurements
       • 3 years full maintenance (NBD)
       • 4th year: 10% withdrawn for spares
       • 5th year: decommissioning, disposal
     • EU restricted tenders
       • Longer process, but larger range of suppliers and technologies offered
     • Split each procurement into Lots
       • Reduced risk of losing whole procurement to bad technology
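The hardware lifecycle on this slide can be written down as a small lookup. This is only an illustrative sketch: the function names are hypothetical, treating every year after the fifth as "decommissioning" is an assumption, and the 182-server batch is borrowed from the storage slide purely as an example input.

```python
def lifecycle_stage(year_of_service: int) -> str:
    """Map a 1-based year of service to the lifecycle stage named on the slide."""
    if year_of_service < 1:
        raise ValueError("year_of_service must be >= 1")
    if year_of_service <= 3:
        return "full maintenance"
    if year_of_service == 4:
        return "10% withdrawn for spares"
    # Assumption: anything from the 5th year onward is being retired.
    return "decommissioning and disposal"


def spares_withdrawn(batch_size: int, percent: int = 10) -> int:
    """Systems pulled from a batch as spares in year 4 (rounded down)."""
    return batch_size * percent // 100


if __name__ == "__main__":
    for year in range(1, 6):
        print(year, lifecycle_stage(year))
    print("spares from a 182-server batch:", spares_withdrawn(182))
```

Integer arithmetic is used for the 10% figure so the result does not depend on floating-point rounding.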

  4. Storage Capacity
     • Total usable storage ~800TB
       • Including non-capacity storage
       • First generation IDE/SCSI arrays decommissioned
     • 2007/8
       • 1638TB, 3ware PCI-e RAID6, 182 servers (14 racks)
       • Delivery week beginning 7 Jan 08, production March 08
     • 2008/9
       • Procurement will aim at delivery 1 Sep 08
     • SRMs - dCache, Castor2, xrootd
     • NFS for legacy data and VO software

  5. CPU capacity
     • ~600 systems, ~1500kSI2K, ~1200 job slots
     • 1U twin chip systems, varied CPU type, specification and OEM, RAM, local disk size, NIC
     • Strategy: 2GB RAM/core, 50GB disk/core, dual GbE NIC per system
     • 2007/8
       • Procurement of at least 1500kSI2K (gcc), 2 Lots
       • Offered quad-cores in 1U, 1U twin and blade systems
       • In evaluation
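The per-core sizing strategy above translates directly into a per-system specification. The sketch below is a hedged illustration, not the Tier1's tooling: `SystemSpec` and `size_system` are hypothetical names, and the dual-socket quad-core configuration is an assumed example rather than the system that was actually procured.

```python
from dataclasses import dataclass

# Figures taken from the slide's stated strategy.
RAM_GB_PER_CORE = 2
DISK_GB_PER_CORE = 50
GBE_NICS_PER_SYSTEM = 2


@dataclass(frozen=True)
class SystemSpec:
    cores: int
    ram_gb: int
    disk_gb: int
    gbe_nics: int


def size_system(sockets: int, cores_per_socket: int) -> SystemSpec:
    """Derive RAM and local disk for one system from its core count."""
    cores = sockets * cores_per_socket
    return SystemSpec(
        cores=cores,
        ram_gb=cores * RAM_GB_PER_CORE,
        disk_gb=cores * DISK_GB_PER_CORE,
        gbe_nics=GBE_NICS_PER_SYSTEM,
    )


if __name__ == "__main__":
    # Assumed example: two quad-core CPUs in one system.
    print(size_system(sockets=2, cores_per_socket=4))
```

Under that assumed configuration, an 8-core system would need 16GB of RAM and 400GB of local disk to meet the strategy.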

  6. Database systems
     • Oracle RACs for 3D
       • HA systems with redundant hot-swap PSUs, RAID system drives in hot-swap bays
       • Storage on FC/SATA SAN array
     • Expansion of RACs
       • Extra storage and RAC nodes
       • Provide instances for FTS, ATLAS TAG, LFC
     • 4 MySQL hosts
     • All backed up to data store

  7. Tape Robots
     • 10000 slot Sun SL8500 tape silo: 8 hand-bots on 4 levels
     • GridPP share is:
       • 5000 slots
       • 6 x 9940B drives (200GB/tape) (phase out soon)
       • 6 x T10K drives (500GB/tape)
     • Expect this technology (with drive refreshes) to remain in service at least until 2012.
     • Current GridPP capacity: 1150TB
     • Purchases:
       • Framework purchasing agreement for drives and media
       • 850TB media to be purchased soon
       • Further 6-10(?) T10K drives to purchase in 2007
       • New double density T10K-type drives expected summer 2008
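The tape figures on this slide can be cross-checked with some simple arithmetic. The sketch below is a back-of-envelope illustration under stated assumptions: decimal units (1TB = 1000GB), all 5000 slots filled with T10K media for the ceiling figure, and the planned 850TB purchase consisting entirely of 500GB T10K cartridges. None of these assumptions are confirmed by the slide, and the function names are hypothetical.

```python
GB_PER_TB = 1000  # assumption: decimal units

GRIDPP_SLOTS = 5000      # from the slide
T10K_GB_PER_TAPE = 500   # from the slide
CURRENT_TB = 1150        # from the slide
PLANNED_MEDIA_TB = 850   # from the slide


def capacity_tb(cartridges: int, gb_per_tape: int) -> float:
    """Total capacity in TB of a number of identical cartridges."""
    return cartridges * gb_per_tape / GB_PER_TB


def cartridges_for(tb: int, gb_per_tape: int) -> int:
    """Cartridges needed to hold `tb` terabytes, rounded up."""
    total_gb = tb * GB_PER_TB
    return (total_gb + gb_per_tape - 1) // gb_per_tape


# Ceiling if every GridPP slot held a T10K cartridge (assumption).
slot_ceiling_tb = capacity_tb(GRIDPP_SLOTS, T10K_GB_PER_TAPE)
# Capacity once the planned media purchase is added.
after_purchase_tb = CURRENT_TB + PLANNED_MEDIA_TB
# Cartridges in the planned purchase, if all are T10K (assumption).
new_cartridges = cartridges_for(PLANNED_MEDIA_TB, T10K_GB_PER_TAPE)

if __name__ == "__main__":
    print("slot ceiling (TB):", slot_ceiling_tb)
    print("after purchase (TB):", after_purchase_tb)
    print("new cartridges:", new_cartridges)
```

On these assumptions the planned purchase would bring capacity to 2000TB, below the 2500TB that 5000 slots of 500GB media could hold.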

  8. Intra-T1 Networking
     • Based on commodity components
       • Nortel 5530 and 5510 units
       • Configured as multiple stacks forming large numbers of GbE ports in virtual switches
       • High speed backplane: storage and CPUs on same switch stacks
       • 10GbE backbone links
     • 10GbE links to OPN for T0-T1 and T1-T1 data
     • Bypass to firewall for T1-T2 traffic in OPN subnet
     • Core switch upgrade planned 2007/8
       • Will need new core capacity in new building
     • Ancillary networking with older 10/100 switches

  9. Services
     • Traditionally provided using 10% of systems held back from batch capacity
       • Moving to dedicated high availability hardware
     • Full set of WLCG services
       • 3 RBs, 2 CEs, UK BDII, Site BDII, MyProxy, R-GMA server, MON, 6 x UIs, 3 x VOBOX
     • System services
       • Batch server/job scheduler (Torque/MAUI), NIS, Nagios, Ganglia, Mail, home file system, Cacti, DHCP, TFTP …

  10. Node Installation
      • Systems provisioned using PXE/kickstart
        • Re-install hosts quickly
      • Three-stage process
        • Kickstart file: main OS installation
        • Second stage script (all systems): updates OS, installs standard and common components
        • Third stage script: system personality installation
      • Review of provisioning and version control
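The kickstart, second-stage and third-stage steps listed above can be modelled as an ordered pipeline in which each stage depends on the one before it. This is a toy model only: it installs nothing, every function name and dictionary key is hypothetical, and the real scripts used at the Tier1 are not shown in the slides.

```python
from typing import Callable, Dict, List

Host = Dict[str, object]
Stage = Callable[[Host], None]


def kickstart_stage(host: Host) -> None:
    """Main OS installation (driven by the kickstart file)."""
    host["os_installed"] = True


def common_stage(host: Host) -> None:
    """Second stage, run on all systems: OS updates and common components."""
    if not host.get("os_installed"):
        raise RuntimeError("second stage needs an installed OS")
    host["os_updated"] = True
    host["common_components"] = True


def personality_stage(host: Host) -> None:
    """Third stage: apply the system's personality (its role)."""
    if not host.get("common_components"):
        raise RuntimeError("third stage needs the common components")
    host["configured_as"] = host["personality"]


STAGES: List[Stage] = [kickstart_stage, common_stage, personality_stage]


def provision(name: str, personality: str) -> Host:
    """Run every stage in order and return the resulting host record."""
    host: Host = {"name": name, "personality": personality}
    for stage in STAGES:
        stage(host)
    return host


if __name__ == "__main__":
    # "node001" and "worker" are made-up example values.
    print(provision("node001", "worker"))
```

Keeping the personality step last means the first two stages are identical for every host, which is what makes quick re-installation practical.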
