
HEPiX Large Cluster SIG Report

Alan Silverman, 25th October 2002, HEPiX 2002, FNAL


Presentation Transcript


  1. HEPiX Large Cluster SIG Report
     Alan Silverman, 25th October 2002, HEPiX 2002, FNAL

  2. Overview of the talk
     • Large Cluster Workshop
     • Large Site Surveys
     • LCCWS Plans

  3. Large Cluster Workshop - 1
     • A workshop to share practical experiences in building and running large clusters.
     • Gather the information to write the definitive guide to building and running a cluster: how to choose, select and test the hardware; software installation and upgrade tools; performance management, logging, accounting, alarms, security, etc.
     • Then document what exists and what might scale to large clusters.
     • And, by implication, what does not scale.

  4. Large Cluster Workshop - 2
     • First instance was May 22nd to 25th 2001 at Fermilab.
     • 60 people attended; summaries were prepared and published/presented; see the web site http://conferences.fnal.gov/lccws/
     • Must have been successful in some respects because …

  5. Large Cluster Workshop - 3
     • … a second workshop was held this week.
     • Two themes: practical experiences again, and technology choices to build, configure and run a cluster.
     • 90+ participants this time, over 2 days.
     • Overheads from (almost) all talks should be on the web within a week or so, and full proceedings will be published within a month or two.

  6. LCCWS2 Highlights - 1
     • HEP is starting to get practical experience in running large clusters, practically all on Linux running on commodity hardware.
     • More and more of these share the resources among several or many client groups.
     • Management overhead is increasing and ways are being sought to automate as much as possible, but there is no silver bullet.
     • Users are starting to expect production services from the Grid.

  7. LCCWS2 Highlights - 2
     • Grid deployment is facing resistance from local fabric managers against having to accept masses of incoming software packages and tailoring. Not yet clear if this is an unreasonable fear or one Middleware developers will just have to accept and work around.
     • Developing Grids by large multi-site collaborations raises many social issues within the teams, more management overhead, more committees and working groups.

  8. LCCWS2 Highlights - 3
     • Tape and network trends match our perceived needs but CPU trends need to be interpreted.
     • Intel still reigns in terms of number of nodes, but AMD is better for floating point at this time and is appearing in more and more HEP sites. Will Itanium be important?
     • Disk sizes are growing OK but not much faster, and tape is still cheaper. File systems are more of an issue.
     • MOSIX is not yet ready for large-scale use.
     • The larger the cluster, the more professional you must become at all levels from the ground up (literally).

  9. Site Surveys - 1
     • Surveyed the major sites (BNL, Caltech, CERN, FNAL, RAL, SLAC, NERSC).
     • First survey was for computer centre services such as power backup options and operator cover.
     • Later added a review of videoconference support offerings and anti-virus tools in use.

  10. Site Surveys - 2
     • Seems to be of interest and not too disturbing.
     • Recently surveyed user support features and choices of PC hardware; the results will be published soon after I get back to CERN.
     • Proposal to survey speed and type of Ethernet connections to the desktop.
     • Worth continuing? More site surveys as requested (but not more than one per ??????)

  11. Plans for LCCWS
     • The LCCWS workshops appear to fill a niche by addressing practical issues in an HEP environment, so the series should probably continue.
     • The format which seems to be popular is to co-locate it with HEPiX but continue to keep HEP out of the name, to encourage participation by other sciences running large clusters.
     • But keep the HEPiX link by having it driven by the Large Cluster SIG and working in harmony with HEPiX in respect of the scheduling of the meeting itself and of the talks.
     • Use it as a place to discuss technical issues coming up in grid development? Then it probably needs to be held more often, perhaps at every HEPiX?
     • Must remain something driven by a theme and seeded with invited talks which focus on that theme.
