Computing and IT Update Jefferson Lab User Group Roy Whitney, CIO & CTO 10 June 2009


Presentation Transcript


  1. Computing and IT Update Jefferson Lab User Group Roy Whitney, CIO & CTO 10 June 2009

  2. Sci Comp – Physics Farm
     • Moving 2 racks of 6n nodes (bought by base funds) from LQCD to the farm (3rd upgrade funded by LQCD)
     • Adding a new cluster of 10 nodes
     • 2.8 GHz dual Nehalem (much faster than current farm nodes)
     • 64-bit OS: CentOS 5.3
     • Desktop CUE Level 1/2 machines at RHEL 5.3 64-bit will be supported
     • Primary user (and funder) is the muon beam simulation project, but the farm will end up getting lots of cycles and a testbed for 64-bit
     • Decommissioning oldest nodes (not worth the electricity anymore)
     • Planning for FY2010 upgrades
     • Cache disk capacity & performance
     • More new Nehalem nodes (64-bit)
     • Some new work disk
     • http://scicomp.jlab.org/

  3. Sci Comp – Tape Library
     • Almost finished moving 3 PB of data from old silos to new library
     • When finished, will have 3x the bandwidth of the old system
     • During the past year, 25%-50% of bandwidth was used for the transfer
     • Higher performance revealed problems in slow cache nodes (performance mismatch)
     • Capacity is limited!
     • High tape usage on a limited budget means that, from now on, not all data will fit into the tape library
     • Oldest tapes are now being ejected and put into storage
     • Re-mount will take up to a week
     • Capacity upgrade will come in 2010, but… the intention is to hold a sliding window of N years of data (N TBD)

  4. Lattice QCD / High Performance Computing
     • JLab currently runs ~660 nodes with ~3700 cores for LQCD computing, as part of the USQCD collaboration
     • JLab will host a new $5M project for LQCD computing funded by ARRA through the Nuclear Physics office
     • $3.2M for computing (6x - 12x capacity gain at JLab)
     • GPUs will be used as compute accelerators for part or all of the cluster
     • JLab will again be on the Top100 list of fastest computers in the world
     • ~$0.3M for disk (over 250 Tbytes)
     • 2 phases, first to be installed in November, second in January
     • http://lqcd.jlab.org/

  5. Computing and Networking Infrastructure
     • Helpdesk hours during the summer: 8-12, 1-4:30
     • Network
     • Registration of computers and automatic port configuration
     • Wireless changes coming: make it function like wired and put all unmanaged laptops on guest
     • E-mail list management will be moving from Majordomo to Mailman
     • Telecom: testing VoIP
     • Cyber Security
     • Managed desktops have resulted in fewer vulnerabilities discovered by scanning and faster remediation
     • Phishing/spear phishing is currently the attack of choice
     • We do not ask for passwords or personal information in email

  6. Management Information Systems
     • The online user registration form is undergoing improvements.
     • Remember: submit all publications related to JLab research into the publications database!

  7. Power Outage
     • As part of preventing/managing potential power outages/brownouts, and as we get our power under the Commonwealth of Virginia’s contract, we are participating in a power management program.
     • There will be a test at midday tomorrow, 11 June, during which the farm and LQCD clusters will go offline for several hours.
     • This should have no effect on the network, servers, desktops, email, file access, etc.
     • Some lighting, HVAC, etc. will be off during the test.
     • Please support the test, as success will lower our power bill!
     • If there are severe electrical issues subsequently during the summer in the Eastern Mid-Atlantic, we may be asked to drop power up to 12 times.
     • We are not required to do so!
     • It is not anticipated that this will be likely.
