
Hall D Computing Facilities


Presentation Transcript


  1. Hall D Computing Facilities Ian Bird 16 March 2001

  2. Overview
• Comparisons – Hall D computing
• Estimates of needs
  • As an illustration only – actual needs require a computing model
• Costs
• Staffing
• Timeline
• Other projects – Data Grids
• Some comments

  3. Some comparisons: Hall D vs other HENP

  4. Process
• For the CDR: a computing/analysis chapter
  • Define the Hall D computing model
    • Distributed architecture (facilities)
    • Data model
    • Software architecture
    • Collaboratory tools and infrastructure
  • Estimate of costs and funding profile, and a management plan
    • Can set bounds (best/worst case) based on technology guesses
• All this must be based on analysis models – e.g. how will the PWA be done, etc.
• Needs strong management – hire a computing professional "now" to lead this
• Write a Computing Technical Design Report
  • This can come after the CDR; it fixes the ideas from the CDR and provides a detailed implementation plan

  5. JLAB Facilities for Hall D
• Some crude estimates follow –
  • There is no computing model yet – that has to come first
  • Maybe too soon to fix technologies

  6. Mass storage – at JLAB
• How much needs to be on tape depends on the computing model and on how well the activities are managed
• Assume:
  • 0.75 PB/year raw data
  • 0.75 PB/year reconstructed
  • 0.3 PB/year other
  • All simulated data stored off-site
• 1.8 PB/year (minimum) to be stored (worked through in the sketch below)
  • @ 300 GB/tape = 6000 tapes = 1 silo
  • @ 750 GB/tape = 2400 tapes = ½ silo
  • Other tapes are needed as well (DSTs on fast-access, lower-density media)
• Keep data available for 2 years:
  • Pessimistic – 2 silos; optimistic – 1 silo (for Hall D)
  • Realistic guess – Hall D should have at least 2 dedicated silos
• Number of drives depends on technology and on the access model
  • Experience shows at least 30 drives are needed
  • The lab needs more for other parts of the program
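
The tape and silo counts above follow from simple arithmetic. A minimal sketch, assuming 1 PB = 10^6 GB and a silo capacity of roughly 6000 cartridges (inferred from the slide's "6000 tapes = 1 silo"; not stated explicitly):

```python
# Back-of-envelope tape and silo counts for the slide's assumptions.
# The ~6000-cartridge silo capacity is an assumption inferred from
# "6000 tapes = 1 silo"; all other numbers come from the slide.

ANNUAL_DATA_GB = (0.75 + 0.75 + 0.3) * 1_000_000  # raw + reconstructed + other, GB/year
SILO_SLOTS = 6000                                  # cartridges per silo (assumed)

for gb_per_tape in (300, 750):
    tapes_per_year = ANNUAL_DATA_GB / gb_per_tape
    silos_for_two_years = 2 * tapes_per_year / SILO_SLOTS
    print(f"{gb_per_tape} GB/tape: {tapes_per_year:,.0f} tapes/year, "
          f"{silos_for_two_years:.1f} silos to keep 2 years on-line")
```

The two cases reproduce the slide's pessimistic (2 silos) and optimistic (about 1 silo) bounds for keeping two years of data available.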

  7. Other storage
• Disk
  • Again, the amount and type depend strongly on the computing model
  • Not unreasonable to expect to want ~20% of the data on disk
    • 200 TB ?
  • Current cost – ~$10K per TB (IDE); expect a factor of ~10 improvement? (see the cost sketch below)
  • Cost and type depend on requirements
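
What that means in dollars, as a minimal sketch; the 200 TB figure and the 2001 price are from the slide, and the factor-of-10 price drop is the slide's own guess, not a firm number:

```python
# Rough disk-cost estimate from the slide's figures. Illustrative only.

disk_tb = 200                # ~20% of the data on disk (slide estimate)
cost_per_tb_2001 = 10_000    # dollars per TB of IDE disk in 2001
price_drop = 10              # assumed improvement by the time Hall D runs

print(f"at 2001 prices:        ${disk_tb * cost_per_tb_2001 / 1e6:.1f}M")
print(f"with 10x cheaper disk: ${disk_tb * cost_per_tb_2001 / price_drop / 1e3:.0f}K")
```

So a 200 TB disk cache is a ~$2M item at 2001 prices, but closer to $200K if the expected price improvement materializes.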

  8. CPU & Networks
• Reconstruction is not a computing problem
• All significant computing is in:
  • Simulation – most of it not at JLAB?
  • The Level 3 trigger farm
• It will be cheaper to compute more and to store (and move) less
• Conservative assumption – 500 SI95/processor
  • 2 processors per 1U box, ~40 boxes per rack → 40,000 SI95/rack (see the sketch below)
• Networking
  • Will be of critical importance to the success of Hall D
    • Distributed computing model
    • Transparent access to all data for all users
  • Expect 10-Gigabit Ethernet (perhaps a first deployment of the subsequent generation)
  • Assume JLAB will have OC12 (622 Mb/s) to ESnet
    • Even today this is just a configuration change
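
The rack figure is simple multiplication. A minimal sketch, where the ~40 usable 1U slots per rack is an assumption (a standard 42U rack) rather than a number from the slide:

```python
# SI95 capacity per rack under the slide's conservative CPU assumption.
# SI95 = SPECint95, the benchmark unit in use at the time.

si95_per_processor = 500   # conservative per-CPU estimate (slide)
processors_per_box = 2     # dual-processor 1U box (slide)
boxes_per_rack = 40        # assumed usable 1U slots in a standard rack

print(processors_per_box * si95_per_processor * boxes_per_rack, "SI95 per rack")  # 40000
```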

  9. Staffing
• The experiment needs a strong, dedicated computing group
• Computer Center needs depend on the facilities, which depend on the computing model
• Estimate (FTE):
  • Support of the Hall D Level 3 farm: 0.5
  • Support of the offline MSS and farm: 3.0
  • Additional network support: 0.5
  • Development/experiment support: 2.0
  • Total: 6.0

  10. Costs
• The real cost will be well above the $3M in the report – probably closer to $5–6M
• Cf. the RHIC computing facility, a $12M project over 5 years
• Costs cannot be defined without a clear vision for the computing model
• A new Computer Center is already in the lab building plan

  11. Integration
• The technologies will be there
• The challenge is in the software (middleware): integrating all the distributed pieces into a seamless system that is usable and responsive

  12. Development activities – Grid computing, collaboratory environments, and Data Grids

  13. LHC Concept of a Computing Hierarchy – Data Grid
• LHC Grid hierarchy example (sketched as a tree below):
  • Tier 0: CERN
  • Tier 1: National "Regional" Center
  • Tier 2: Regional Center
  • Tier 3: Institute Workgroup Server
  • Tier 4: Individual Desktop
• Total: 5 levels
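
A minimal sketch of that hierarchy as a parent/child tree, just to make the five levels concrete; the class and field names are illustrative and not part of any LHC software:

```python
# Five-level LHC-style tier hierarchy from the slide, as a simple tree.
from __future__ import annotations
from dataclasses import dataclass, field

@dataclass
class Tier:
    level: int
    role: str
    children: list[Tier] = field(default_factory=list)

grid = Tier(0, "CERN", [
    Tier(1, "National 'Regional' Center", [
        Tier(2, "Regional Center", [
            Tier(3, "Institute Workgroup Server", [
                Tier(4, "Individual Desktop")])])])])

def depth(t: Tier) -> int:
    """Number of levels below and including this tier."""
    return 1 + max((depth(c) for c in t.children), default=0)

print(depth(grid))  # 5 levels, as on the slide
```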

  14. Data Grid activities
• Particle Physics Data Grid (PPDG)
  • DOE funded – labs (including JLAB) + universities
• GriPhyN (Grid Physics Network)
  • NSF funded
• Computing grids are heavily funded
  • US, Europe, Japan, …
• LHC computing relies on these technologies
• Not just academic interest – industry too

  15. PPDG
• Has been funded for the last 2 years
• A new PPDG proposal using DOE SciDAC funds has just been submitted
• Other (complementary) proposals relevant to JLAB or Hall D:
  • FSU/IU proposal – Hall D portal
  • FIU proposal
  • LQCD
• PPDG will:
  • "…provide a distributed (grid-enabled) data access and management service for the large collaborations of current and future particle and nuclear physics experiments. It is a collaborative effort between physicists and computer scientists at several DOE laboratories and universities. This is accomplished by applying existing grid middleware to current problems and providing feedback to middleware developers on additional features required or shortcomings in the current implementations."
• For JLAB it will provide directly useful services for the current program
• These funds could be targeted next year at Hall D development activities (a context is needed first)

  16. Comments
• Absolute requirement: a clear vision for the computing/analysis model
• Computing requires a dedicated group within Hall D – the leader of that group should be found now
• Management
  • Badly managed computing costs real money
  • Well managed – calibration and reconstruction are immediate, so less long-term storage is needed and simulated data need not be kept, …
  • A badly managed software architecture will kill the L3 trigger – you have to trust it
• The computing task is not trivial; it is not overwhelming, but it is at least as complex as the detector
  • It must be recognized and treated as such by the collaboration
